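Last summer, I gave a seminar on diachronic text analysis at the Institute of Social Science, University of Tokyo. The seminar was based on the first half of my chapter, "Time-dynamic Analysis", in the International Communication Association Handbook of Computational Communication Research, which is scheduled for publication this year. In the seminar, I combined topic models and sentiment analysis to test whether speeches by the prime minister and the foreign minister changed significantly over time. The chapter manuscript is now finished, so please read it together with the slides and data.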
Chapter in upcoming ICA Handbook of CCR
I recently submitted my chapter in the ICA Handbook of Computational Communication Research to the editors. Among the wide range of topics covered in the volume, my chapter, Time-dynamic Analysis, explains how to analyze textual data collected over an extended period: In communication research, scholars often analyze news articles, speech transcripts or social media […]
A new topic model for analyzing imbalanced corpora
I have been developing and testing a new topic model called Distributed Asymmetric Allocation (DAA) because latent Dirichlet allocation (LDA) takes a long time to fit to a large corpus, but does not always discover topics that I am interested in. I know that these are also problems for many other users, so I decided […]
New paper on semantic temporality analysis
My co-authored paper on the temporal orientation of texts appeared in Research & Politics. In this study we applied latent semantic scaling (LSS) to a corpus of English and German texts to automatically identify features related to the future or the past. With only a set of common verbs as seed words, the algorithm could classify sentences […]
New papers on distributed LDA for sentence-level topic classification
I have been studying and developing an LDA algorithm for classification of sentences since 2022. Sentence-level topic classification allows us to analyze the association between topics and other properties, such as sentiment, within documents. Also, sentence-level analysis has become more common in text analysis in general thanks to highly capable transformer models in recent years. My […]
Measuring emotional distress during COVID through words and emojis on Twitter
My co-authored article on public mental health has appeared recently in the Journal of Medical Internet Research. In this study, we combined survey research and social media analysis to infer Japanese people’s mental health during the COVID pandemic. The methodological novelty of this study is that (1) we collected individual characteristics (age, gender, occupation, income […]
Encyclopedia entries on text analysis from fresh perspectives
The Elgar Encyclopedia of Technology and Politics was published earlier this month. Andrea Ceron, the editor, compiled entries by many young political scientists to make the volume full of fresh perspectives. I have contributed to it by writing an entry on “text as data” (preprint) with an emphasis on the “string-of-words” approach that would improve […]
New research on the effect of nuclear threats in news on leader popularity
Since 2019, my most important research project has been about media coverage of North Korea and Iran’s nuclear threats and its political implications. I started this project with Elad Segev (Tel Aviv) and Atsushi Tago (Waseda), supported by a Japanese funding agency. We have analyzed how Japanese (Asahi and Yomiuri) and […]
Replication data for The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats
I have received emails from readers of my paper, The Geopolitical Threat Index: A Text-Based Computational Approach to Identifying Foreign Threats, which appeared in International Studies Quarterly (ISQ) last year. It is great that my paper is still attracting attention, but they said they cannot find the replication dataset… There is a page […]
Preprint on nuclear threats using LSS
I have been leading a project with Elad Segev (Tel Aviv University) and Atsushi Tago (Waseda University) on implications of security threats for domestic politics. We have completed a content analysis of newspapers and a simultaneous survey experiment in both Japan and Israel since the beginning of the project in 2019. One of the goals […]
