Group polarity words with different colors in LSS

We should always ensure that we are measuring what we want to measure in text analysis. In Latent Semantic Scaling (LSS), we can assess the validity of measurement by inspecting the polarity scores of words using LSX::textplot_terms(). This function automatically selects words with very high or low polarity scores and highlights them. We can confirm that the measurement is valid if these words appear where they should, but we sometimes do not know where that is. For example, an LSS model that I fitted on news articles with sentiment seed words has “coca-cola” as one of the most positive words, but we don’t know if this is correct.

It is easier to validate the measurement if we highlight words with known polarity in the plot. In the latest version of the LSX package (v1.4.0), I upgraded the function to highlight words from multiple lists and group them with different colors. If we pass the Lexicoder Sentiment Dictionary (LSD) to the function, as in textplot_terms(lss, highlighted = data_dictionary_LSD2015[1:2]), positive and negative words are colored in blue and red.
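
Here is a minimal sketch of how that call fits into a complete workflow. It is not the model behind the plots in this post: the inaugural address corpus (data_corpus_inaugural from quanteda) is only a small stand-in for the news corpus, and the preprocessing steps, min_termfreq = 5 and k = 300 are illustrative assumptions.

```r
library(quanteda)
library(LSX)

# Stand-in corpus (assumption): US inaugural addresses, split into sentences
corp <- corpus_reshape(data_corpus_inaugural, to = "sentences")

# Tokenize and build a document-feature matrix without stopwords or rare terms
toks <- tokens(corp, remove_punct = TRUE, remove_symbols = TRUE,
               remove_numbers = TRUE)
dfmt <- dfm(toks)
dfmt <- dfm_remove(dfmt, stopwords("en"))
dfmt <- dfm_trim(dfmt, min_termfreq = 5)

# Generic sentiment seed words that ship with LSX
seed <- as.seedwords(data_dictionary_sentiment)

# Fit the LSS model (k = 300 is an illustrative choice)
lss <- textmodel_lss(dfmt, seeds = seed, k = 300)

# Highlight the first two LSD keys as separate color groups
p <- textplot_terms(lss, highlighted = data_dictionary_LSD2015[1:2])
print(p)
```

Any quanteda dictionary with several top-level keys should work in place of the LSD, with one color group per key.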

The new visualization function can be used not only for validation but also for analysis. In the next plot, polarity words in an LSS model about security threats are highlighted with four different colors depending on their association with China, North Korea, Iran or Russia. I fitted LSS and newsmap models on news articles published in the US in 2022, and combined them in my project. Interestingly, words with high polarity scores are related to Russia’s invasion of Ukraine, North Korea’s long-range ballistic missiles, and Iran’s Shahed 136 drones, but polarity scores of words associated with China are relatively low. This seems to suggest that China’s security threat is still much smaller than that of other adversaries.
