I presented my paper on geographical classification at the methodology pre-conference at ICA in Fukuoka, Japan. The pre-conference has historical significance as the first methodology group at a major international conference of media and communication studies. There were a lot of interesting presentations, but, to my great surprise, I won a Best Paper Award from […]
Newsmap in R
I have been using Newsmap as one of the key tools in many of my research projects, but I was not able to share it with other people because it was a complex Python system. To make the tool widely available, I recently implemented Newsmap in R. The R version is dependent on another text […]
Analysis of Russian media
Applying techniques developed with English-language texts to other languages is not easy, but I managed to adapt my LSS system to the Russian language for a project on Russian media framing of street protests. In the project, I am responsible for data collection and analysis of Russian-language news collected from state-controlled […]
Countries with state-owned news agencies
It is little recognized, even among students of mass media, that the international news system is a network of national or regional news agencies, and that many of those are state-owned. Fully commercial agencies like Reuters are very rare, and even international news agencies, such as AFP, are often subsidized by the government. In […]
ITAR-TASS’s coverage of annexation of Crimea
My main research interest is the estimation of media biases using text analysis techniques. I did a very crude analysis of ITAR-TASS’s coverage of the Ukraine crisis two years ago, but it is time to redo everything with more sophisticated tools. I created positive-negative dictionaries for democracy and sovereignty, and applied them to see how […]
Russia’s foreign policy priority
Methodological papers are tasteless and boring without nice examples. As an example application of my Newsmap, I downloaded all the news stories published by the ITAR-TASS news agency from 2009 to 2014, both in English and Russian. From a public diplomacy point of view, I was interested in which countries are receiving the highest coverage in […]
Sentence segmentation
I believe that the sentence is the optimal unit of sentiment analysis, but splitting whole news articles into sentences is often tricky because there are a lot of quotations in news. If we simply chop up texts based on punctuation, quoted text gets split into different sentences. This code is meant to avoid such […]
Nexis news importer updated
I posted the code for the Nexis importer last year, but it turned out that the HTML format of the database service is less consistent than I thought, so I changed the logic. The new version depends less on the structure of the HTML files and more on the format of the content. library(XML) #might need […]
The Latent Semantic Scaling
I have previously posted document scaling results on this blog for different dimensions, such as political left-right and immigration positive-negative, but I did not explain the details of the technique, called Latent Semantic Scaling. LSS is a type of lexicon expansion technique based on Latent Semantic Analysis. Please have a look at […]