Bully Comments Classification on TikTok Using Support Vector Machine and Chi-Square Feature Selection

Amelia Putri, Abdiansyah Abdiansyah, Alvi Syahrini Utami

Abstract

TikTok has been named the world’s most popular social media platform. Its heavy use makes it easier for irresponsible users to behave unethically, for example by posting hateful comments on someone’s account. TikTok developers can help prevent bullying through policies such as word detection and filtering features that indicate whether a comment is a bullying or non-bullying comment. We therefore conducted this study to classify bullying comments with machine learning methods, with the aim of making TikTok more comfortable to use. We used a Support Vector Machine (SVM) to classify the data and Chi-Square for feature selection. Tests were carried out using the linear, polynomial, and RBF kernel functions, each with the C parameter set to 0.1, 1, and 10. The results show that SVM with Chi-Square feature selection performs better, as shown by the increase in accuracy for the RBF kernel with C = 0.1, which was 0.20.
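
The Chi-Square feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes the textbook 2x2 term-versus-class statistic, chi2 = N(AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D)), where A and B count the documents containing the term inside and outside the target class, and C and D count the documents without it. The toy comments, the labels, and the helper names `chi_square_scores` and `select_top_k` are hypothetical and are not taken from the study; the abstract does not state the exact formulation or tooling the authors used.

```python
def chi_square_scores(docs, labels, positive_label):
    """Score every term with the 2x2 chi-square statistic against one class.

    Assumption: whitespace tokenization and term presence/absence per
    document; the study's own preprocessing is not described in the abstract.
    """
    n = len(docs)
    token_sets = [set(doc.lower().split()) for doc in docs]
    vocab = set().union(*token_sets)
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = n - n_pos

    scores = {}
    for term in vocab:
        # a: in-class docs containing the term, b: out-of-class docs containing it
        a = sum(1 for toks, y in zip(token_sets, labels)
                if term in toks and y == positive_label)
        b = sum(1 for toks, y in zip(token_sets, labels)
                if term in toks and y != positive_label)
        c = n_pos - a  # in-class docs without the term
        d = n_neg - b  # out-of-class docs without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy comments, invented purely for illustration.
docs = [
    "you are so stupid",
    "stupid ugly video",
    "you are so talented",
    "love this video",
]
labels = ["bully", "bully", "non-bully", "non-bully"]

scores = chi_square_scores(docs, labels, "bully")
top_terms = select_top_k(scores, 1)
```

Terms that appear equally often in both classes (such as "you" here) score zero and would be dropped, while terms concentrated in one class score highest. Only the retained terms would then be passed to the SVM classifier.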
