New paper on Russia’s international propaganda during the Ukraine crisis

My paper on Russia’s international propaganda during the Ukraine crisis, The spread of the Kremlin’s narratives by a western news agency during the Ukraine crisis, has been published in the Journal of International Communication. It is very timely, because people are talking about the spread of “fake news”! The description of the Ukraine crisis as an ‘information […]

Handling multi-word features in R

Multi-word verbs (e.g. “set out”, “agree on” and “take off”) and names (e.g. “United Kingdom” and “New York”) are very important features of texts, but they are often difficult to preserve in bag-of-words text analysis, because tokenizers usually split strings at spaces. You can preprocess texts to concatenate multi-word features with underscores like […]
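
As a rough illustration of the underscore idea, here is a minimal sketch in base R. The helper `compound_phrases()`, the sample sentence and the phrase list are all hypothetical, made up for this example; they are not taken from the post, which may well use a different approach.

```r
# Hypothetical helper (not from the post): join each listed multi-word
# feature with underscores so that a whitespace tokenizer keeps it whole.
compound_phrases <- function(text, phrases) {
  for (p in phrases) {
    joined <- gsub(" ", "_", p, fixed = TRUE)   # "New York" -> "New_York"
    text <- gsub(p, joined, text, fixed = TRUE) # literal replacement in text
  }
  text
}

# Made-up sample sentence and phrase list
txt <- "The United Kingdom and New York agree on the plan."
phrases <- c("United Kingdom", "New York", "agree on")

compounded <- compound_phrases(txt, phrases)

# Split on single spaces, as a simple bag-of-words tokenizer would
tokens <- strsplit(compounded, " ", fixed = TRUE)[[1]]
print(tokens)
```

Because the matching is literal, this sketch would also rewrite a phrase that occurs inside a longer word (for example “agree on” inside “disagree on”), so a real pipeline would probably match on token boundaries instead.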

Word segmentation of Japanese and Chinese texts with stringi

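Many people believe that morphological analysis with tools such as Mecab or Chasen is indispensable for segmenting Japanese text into words, but that does not necessarily seem to be the case. I learned this while examining quanteda’s tokenization function: when I passed a Japanese text to it, the text was split cleanly into words, just as it is with Mecab.

> txt_jp <- "政治とは社会に対して全体的な影響を及ぼし、社会で生きるひとりひとりの人の人生にも様々な影響を及ぼす複雑な領域である。"
> quanteda::tokens(txt_jp)
tokens from 1 document.
Component 1 :
 [1] "政治" "と" "は" "社会" "に対して" "全体" "的" "な"
 [9] "影響" "を" "及" "ぼ" "し" "、" "社会" "で"
[17] "生きる" "ひとりひとり" "の" "人" "の" "人生" "に" "も"
[25] "様々" "な" "影響" "を" "及ぼす" "複雑" "な" "領域"
[33] "で" "ある" "。"

quanteda has no morphological analysis capability, so it was a surprise that its tokenization function also segmented a Chinese text cleanly.

> txt_cn […]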
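
Since the post title attributes the segmentation to stringi, the sketch below calls stringi’s word-boundary splitter directly. It assumes the stringi package is installed, and the sample sentence is simply the tokens printed above joined back together. Exactly where the words are cut depends on the ICU dictionary that stringi was built with, so the result may differ from the quanteda output.

```r
library(stringi)  # assumption: the stringi package is installed

# Sample sentence, obtained by joining the tokens shown in the excerpt
txt_jp <- "政治とは社会に対して全体的な影響を及ぼし、社会で生きるひとりひとりの人の人生にも様々な影響を及ぼす複雑な領域である。"

# Split at ICU word boundaries; no morphological analyzer is involved
words <- stri_split_boundaries(txt_jp, type = "word")[[1]]
print(words)
```

The pieces are contiguous substrings of the input, so pasting them back together reproduces the original sentence; punctuation is returned as separate pieces.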
