My co-authored paper on temporal orientation of texts appeared in Research & Politics. In this study we applied latent semantic scaling (LSS) to a corpus of English and German texts to identify features related to the future or the past automatically. With only a set of common verbs as seed words, the algorithm could classify sentences […]
New papers on distributed LDA for sentence-level topic classification
I have been studying and developing an LDA algorithm for classification of sentences since 2022. Sentence-level topic classification allows us to analyze associations between topics and other properties, such as sentiment, within documents. Also, sentence-level analysis has become more common in text analysis in general thanks to highly capable transformer models in recent years. My […]
Measuring emotional distress during COVID through words and emojis on Twitter
My co-authored article on public mental health has appeared recently in the Journal of Medical Internet Research. In this study, we combined survey research and social media analysis to infer Japanese people’s mental health during the COVID pandemic. The methodological novelty of this study is that (1) we collected individual characteristics (age, gender, occupation, income […]
Encyclopedia entries on text analysis from fresh perspectives
The Elgar Encyclopedia of Technology and Politics was published earlier this month. Andrea Ceron, the editor, compiled entries by many young political scientists to make the volume full of fresh perspectives. I have contributed to it by writing an entry on “text as data” (preprint) with an emphasis on the “string-of-words” approach that would improve […]
New research on the effect of nuclear threats in news on leader popularity
Since 2019, the most important research project of mine has been about media coverage of North Korea and Iran’s nuclear threats and its political implications. I started this project in 2019 with Elad Segev (Tel Aviv) and Atsushi Tago (Waseda) supported by a Japanese funding agency. We have analyzed how Japanese (Asahi and Yomiuri) and […]
Replication data for The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats
I have received emails from readers of my paper, The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats, which appeared in International Studies Quarterly (ISQ) last year. It is great that my paper is still attracting attention, but they said they cannot find the replication dataset… There is a page […]
Preprint on nuclear threats using LSS
I have been leading a project with Elad Segev (Tel Aviv University) and Atsushi Tago (Waseda University) on the implications of security threats for domestic politics. Since the beginning of the project in 2019, we have completed a content analysis of newspapers and a simultaneous survey experiment in both Japan and Israel. One of the goals […]
New report on the Kremlin’s influence on Twitter
My co-authored report on Russia’s influence on Twitter during the 2020 US presidential election has been published by Free Russia Foundation. Maria Snegovaya and I conducted a representative online survey of Americans during the election campaign along with quantitative content analysis of their Twitter posts over a year. We aimed to reveal the relationship between […]
New research paper on how to choose seed words for semi-supervised models
In recent years, I have been developing and applying semi-supervised models, such as seeded-LDA, Newsmap and LSS, for classification and document scaling, aiming to broaden the scope of quantitative text analysis. These models are very cost-efficient because they only require a small set of “seed words” to learn categories or dimensions of interest. However, […]