Sentiment Analysis Using Pseudo Nearest Neighbor and TF-IDF Text Vectorizer

Yogi Pratama, Abdiansyah Abdiansyah, Kanda Januar Miraswan

Abstract

Twitter is one of the social media platforms most often used by researchers as a data source for sentiment analysis. A recurring problem in this field is that many factors, such as colloquial or informal language, can affect the sentiment results. Improving sentiment classification therefore requires a good information extraction process. One word weighting method that can be applied during information extraction is the TF-IDF Vectorizer. This study examines the effect of TF-IDF Vectorizer weighting on sentiment analysis using the Pseudo Nearest Neighbor method. With the TF-IDF Vectorizer, the f-measure of the sentiment classification was 89% at k = 2, 89% at k = 3, 71% at k = 4, and 75% at k = 5. Without the TF-IDF Vectorizer, it was 90% at k = 2, 92% at k = 3, 84% at k = 4, and 89% at k = 5. Classification without the TF-IDF Vectorizer thus gave a higher f-measure at every tested value of k, by 1 to 14 percentage points.
