Text Summarization with K-Means Method

Ari Firdaus, Novi Yusliani, Desty Rodiah

Abstract

Text summarization automatically produces a shortened version of a text that retains the important information the user needs. In this study, text summarization was performed on Indonesian news articles using the K-Means method. The articles were taken from CNN Indonesia, with no restriction on topic. K-Means is used to group the weighted sentences of each article into two clusters: summary sentences and non-summary sentences. The initial centroids are the sentence with the largest weight and the sentence with the smallest weight. The method was tested on 50 Indonesian news articles, and the feasibility of the summaries was assessed with a questionnaire. K-Means produced summaries averaging 27.3% of the original article length and received an 87% "good summary" rating from questionnaire respondents.
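The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes each sentence already has a single numeric weight (the abstract does not say how the weights are computed), and the function names `two_cluster_kmeans` and `summarize` are hypothetical, chosen only for this illustration. It is not the authors' implementation.

```python
def two_cluster_kmeans(weights, max_iter=100):
    """Split 1-D sentence weights into summary (1) and non-summary (0).

    Assumption: one scalar weight per sentence. The initial centroids are
    the largest and the smallest weight, as described in the abstract.
    """
    if not weights:
        return []
    hi, lo = max(weights), min(weights)  # initial centroids
    labels = []
    for _ in range(max_iter):
        # Assign each sentence to the nearer centroid (ties go to summary).
        new_labels = [1 if abs(w - hi) <= abs(w - lo) else 0 for w in weights]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        top = [w for w, lab in zip(weights, labels) if lab == 1]
        bottom = [w for w, lab in zip(weights, labels) if lab == 0]
        # Move each centroid to the mean of its cluster.
        if top:
            hi = sum(top) / len(top)
        if bottom:
            lo = sum(bottom) / len(bottom)
    return labels


def summarize(sentences, weights):
    """Keep the sentences in the summary cluster, in their original order."""
    labels = two_cluster_kmeans(weights)
    return [s for s, lab in zip(sentences, labels) if lab == 1]


if __name__ == "__main__":
    # Hypothetical sentences and weights, for illustration only.
    sentences = ["S1", "S2", "S3", "S4", "S5"]
    weights = [0.9, 0.2, 0.8, 0.1, 0.3]
    print(summarize(sentences, weights))
```

Seeding the centroids with the extreme weights makes the result deterministic, unlike random initialization, and ensures the summary cluster is never empty because the highest-weighted sentence always stays in it.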


