Analysis of Russian media

Applying techniques developed for English-language texts to other languages is not easy, but I managed to adapt my LSS system to the Russian language for a project on Russian media framing of street protests. In the project, I am responsible for the collection and analysis of Russian-language news from state-controlled […]

Sentence segmentation

I believe that the sentence is the optimal unit of sentiment analysis, but splitting whole news articles into sentences is often tricky because news contains many quotations. If we simply chop up texts at punctuation marks, quoted text gets split across different sentences. This code is meant to avoid such […]
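To illustrate the problem, here is a minimal Python sketch of quotation-aware splitting. It is not the code from the post: the function name `split_sentences` and the sets of quote characters are assumptions made for this example. It treats `.`, `!` and `?` as sentence boundaries only when they fall outside a pair of double quotes or guillemets.

```python
# Hypothetical helper for illustration only, not the code from the post.
OPENERS = {"\u201c", "\u00ab"}   # opening curly quote and opening guillemet
CLOSERS = {"\u201d", "\u00bb"}   # closing curly quote and closing guillemet

def split_sentences(text):
    """Split text at ., ! or ? but never inside a quotation."""
    sentences = []
    start = 0
    depth = 0            # nesting depth of directional quotes
    in_straight = False  # toggled by straight double quotes
    for i, ch in enumerate(text):
        if ch in OPENERS:
            depth += 1
        elif ch in CLOSERS:
            depth = max(0, depth - 1)
        elif ch == '"':
            in_straight = not in_straight
        elif ch in ".!?" and depth == 0 and not in_straight:
            # Accept a boundary only before whitespace or at the end of the text.
            if i + 1 == len(text) or text[i + 1].isspace():
                sentences.append(text[start:i + 1].strip())
                start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

example = '"We will stay. We will not leave," she said. The police arrived later.'
for sentence in split_sentences(example):
    print(sentence)
```

The quoted statement stays in one piece with its attribution, and the following sentence is split off as usual. The sketch does not handle abbreviations or unbalanced quotation marks, and a quotation that ends with its own full stop is merged with whatever follows it.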

The Latent Semantic Scaling

I have previously posted document scaling results on different dimensions on this blog, such as political left-right and immigration positive-negative, but I did not explain the details of the technique, called Latent Semantic Scaling (LSS). LSS is a type of lexicon expansion technique based on Latent Semantic Analysis. Please have a look at […]
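The general idea of expanding a lexicon with Latent Semantic Analysis can be sketched in a few lines of Python with NumPy. This is a toy illustration under my own assumptions, not the LSS implementation: the corpus, the seed words, the choice of `k = 2` and the averaging rule for scores are all made up for the example.

```python
import numpy as np

# Toy corpus and seed words: both are assumptions made for this example.
docs = [
    "good nice happy", "good nice", "nice happy good",
    "bad awful sad", "bad awful", "awful sad bad",
]
seeds = {"good": 1.0, "bad": -1.0}

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-by-document count matrix.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# LSA step: truncated SVD keeping the k largest singular dimensions.
k = 2
U, s, _ = np.linalg.svd(X, full_matrices=False)
term_vecs = U[:, :k] * s[:k]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Polarity of a word: seed-weighted mean cosine similarity to the seed words.
polarity = {
    w: sum(wt * cosine(term_vecs[index[w]], term_vecs[index[sd]])
           for sd, wt in seeds.items()) / len(seeds)
    for w in vocab
}

def score(text):
    """Mean polarity of the known words in a text (0.0 if none are known)."""
    words = [w for w in text.split() if w in polarity]
    return sum(polarity[w] for w in words) / len(words) if words else 0.0

print(score("nice happy"), score("awful sad"))
```

Words that co-occur with the positive seed end up with positive polarity and words that co-occur with the negative seed with negative polarity, so a text can be scored even if it contains none of the seed words themselves.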

Geographical dictionary making technique

My new draft paper, "Newsmap: Dictionary expansion technique for geographical classification of very short longitudinal texts", explains how to create a large geographical dictionary for text classification. Its algorithm is an updated version of the one used in the International Newsmap, and it is simpler and more statistically grounded. As I argue in the paper, this technique could […]
