So Jim and I were playing around with a few different data sets in Voyant when we each decided to try analyzing a source in our respective languages. I chose a Japanese newspaper article because I didn’t have any other Japanese-language sources handy. I was really pleasantly surprised at how well Voyant was able to distinguish individual words in a non-English language that doesn’t use spaces to separate them. The word cluster function was especially interesting to me.
The article itself is about the suicide of two boys at a Japanese high school, but suicide was mentioned only half as often as the investigation itself (調査) or the bereaved families (遺族), and only one third as often as committee members (委員). I think this clearly reflects the author’s narrative focus, which emphasizes the aftermath rather than sensationalizing the suicides themselves. That may be part of a larger trend in Japanese journalism worth investigating.
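
For anyone who wants to sanity-check ratios like these outside Voyant, here is a minimal Python sketch. It is not how Voyant works internally: it just counts exact substring matches, which sidesteps word segmentation entirely. The sample text is a made-up placeholder rather than the article, the function names are my own, and using 自殺 as the term for "suicide" is an assumption about the article's wording.

```python
def term_counts(text, terms):
    """Count non-overlapping occurrences of each term as a plain substring."""
    return {term: text.count(term) for term in terms}


def relative_to(counts, baseline):
    """Express every count as a ratio to the baseline term's count."""
    base = counts[baseline]
    if base == 0:
        raise ValueError(f"baseline term {baseline!r} does not occur in the text")
    return {term: n / base for term, n in counts.items()}


# Placeholder text for illustration only (NOT the newspaper article).
sample = "委員が調査した。遺族は調査を求めた。委員と委員が自殺を調査。"

# 自殺 (suicide) is an assumed search term; the other three come from the post.
TERMS = ["自殺", "調査", "遺族", "委員"]

counts = term_counts(sample, TERMS)
ratios = relative_to(counts, "自殺")

for term in TERMS:
    print(term, counts[term], ratios[term])
```

Substring counting will also match a term when it appears inside a longer compound, so the numbers can differ from what a real tokenizer reports. For a quick comparison of a handful of known terms, though, it is usually close enough.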