Effect of N-Gram on Document Classification on the Naïve Bayes Classifier Algorithm

Fitria Khoirunnisa, Novi Yusliani, M.T., Desty Rodiah, M.T.

Abstract

News has become a basic need for everyone, because it is how we obtain the information we need. News is distributed through print mass media, electronic mass media, and online media. The channels for distributing news have grown very rapidly, so the amount of information to be managed keeps increasing and the number of words that must be processed for classification is also large. Therefore, a system is needed to classify unstructured documents. In this study, the words in a document are processed with N-Gram as the feature generation step, and the documents are classified with the Naïve Bayes Classifier algorithm. The study examines the effect of N-Gram on document classification with the Naïve Bayes Classifier algorithm. The classification accuracy is 32.68% when N-Gram is applied and 84.97% when it is not. The decrease is attributed to the number of features produced by N-Gram splitting that are unique to, or dominant in, another category. These results show that applying N-Gram to document classification with the Naïve Bayes Classifier algorithm reduces classification performance.
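
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state whether the N-Grams are word-level or character-level, which value of N was used, or how smoothing was handled, so this sketch assumes word-level N-Grams, a multinomial Naïve Bayes model with add-one (Laplace) smoothing, and a small invented toy corpus. The names `word_ngrams` and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n):
    """Lowercase the text, split on whitespace, and return its word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, docs, labels, n=1):
        self.n = n
        self.class_docs = Counter(labels)            # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()                           # all n-grams seen in training
        for text, label in zip(docs, labels):
            feats = word_ngrams(text, n)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_docs.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feat in word_ngrams(text, self.n):
                if feat in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy corpus, for illustration only.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the bank raised interest rates",
    "the stock market fell sharply",
]
train_labels = ["sport", "sport", "economy", "economy"]

unigram_model = NaiveBayes().fit(train_docs, train_labels, n=1)
bigram_model = NaiveBayes().fit(train_docs, train_labels, n=2)

print(unigram_model.predict("the goal in the match"))
print(bigram_model.predict("the bank raised rates"))
```

In this sketch, raising `n` makes each feature rarer: a new document shares fewer N-Grams with the training data, and when none match, the prediction depends on the class prior alone. This is one plausible mechanism behind sparse or category-specific features, but the abstract does not confirm that it explains the reported accuracy drop.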
