My new draft paper, “Newsmap: Dictionary expansion technique for geographical classification of very short longitudinal texts”, explains how to create a large geographical dictionary for text classification. Its algorithm is an updated version of the one behind the International Newsmap, and it is simpler and more statistically grounded. As I argue in the paper, this technique could […]
Terrorism Dictionary 2014
After seeing the mass media’s strong response to the extremists’ attack against Charlie Hebdo, I started thinking about what I could do on this increasingly important topic. One simple contribution is a dictionary containing keywords related to terrorism, so I created the Terrorism Dictionary 2014. This dictionary is made from newswires submitted by the Associated Press […]
Left-right policy position dictionary
Latent Semantic Scaling (LSS) works well not only with positive-negative sentiment but also with left-right positions on economic policy. The seed words for this dimension are {deficit, austerity, unstable, recession, inflation, currency, workforce} for the right and {poor, poverty, free, benefits, prices, money, workers} for the left. The left-right policy position dictionary was created from UK […]
Immigration dictionary
This is probably the final version of my immigration dictionary. This text analysis dictionary was created from a British newspaper corpus using a technique called Latent Semantic Scaling, which is based on Latent Semantic Analysis. The results of automated content analysis with this dictionary correspond strongly to manual coding by Amazon’s Mechanical Turk workers […]
Text analysis dictionary on psychology
My automated dictionary creation project is making good progress, and I have created a psychology dictionary from a large corpus of UK news on psychology from 1990 to 2011. The score given to each entry word is interpreted as the strength of its association with psychology, and the list can be truncated based on the scores. The words are […]
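To illustrate what truncating the list by score could look like, here is a minimal Python sketch. The function name `truncate_dictionary`, the entry words, the scores, and the 0.4 cut-off are all made up for illustration; they are not taken from the actual psychology dictionary.

```python
def truncate_dictionary(scores, min_score):
    """Keep entries scoring at least min_score, ordered strongest first."""
    kept = {word: score for word, score in scores.items() if score >= min_score}
    # Sort by score in descending order so the strongest associations come first.
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))


# Hypothetical entries and scores, invented for this example only.
example_scores = {"brain": 0.41, "therapy": 0.82, "weather": 0.03, "anxiety": 0.75}

# The threshold is an arbitrary assumption; a real cut-off would need validation.
truncated = truncate_dictionary(example_scores, min_score=0.4)
```

A fixed threshold is only one option; keeping the top N entries would be an equally simple alternative.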
Testing immigration dictionary
After making some changes to my automated dictionary creation system, I ran a test to validate the word choice for the new immigration dictionary. Unlike the original version, the latest version contains fewer intuitively negative words with positive scores. The test was performed by comparing the computer content analysis with human coding of the 2010 UK manifestos. […]
Text analysis dictionary on immigration policy
Dictionary-based text analysis has a number of good properties, but it is always difficult to make a new dictionary, so text analysts often use existing dictionaries, such as the General Inquirer dictionaries, which were originally created decades ago, or their derivatives. However, I believe that it is time to create new dictionaries from scratch using […]