HIGH-PERFORMANCE MULTICLASS CYBERBULLYING DETECTION: A ROBERTA-LARGE ENSEMBLE WITH MIXED PRECISION

Muhammad Syifaaul Jinan
Maya Rini Handayani
Masy Ari Ulinuha
Khothibul Umam


DOI: https://doi.org/10.29100/jipi.v10i3.8056

Abstract


Cyberbullying continues to grow in digital environments and has become a serious global concern, causing significant harm and underscoring the urgent need for automated detection systems. The primary aim of this study is to develop and evaluate an effective multiclass cyberbullying classification system that identifies the age, ethnicity, gender, and religion classes while distinguishing them from not_cyberbullying and other_cyberbullying content. The study is experimental in design and focuses on fine-tuning a large language model for text classification. The methodology fine-tunes RoBERTa-Large on a labeled multiclass dataset of 47,692 tweets. To improve robustness and generalization, ensemble learning is applied through soft voting over three RoBERTa-Large models trained with different seeds. Training uses mixed precision (FP16) for computational efficiency. The main results show that the ensemble achieves solid and competitive performance on the test set for multiclass cyberbullying detection, with an accuracy of 0.87 and a weighted F1-score of 0.86. The model performs very well on the age, ethnicity, gender, and religion classes but still struggles to classify the not_cyberbullying and other_cyberbullying classes. In conclusion, the system demonstrates the effectiveness of RoBERTa-Large in an ensemble configuration for multiclass cyberbullying detection, with strong overall detection ability and very good performance on specific categories, providing a solid foundation for real-world cyberbullying prevention applications.
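The soft-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logits are made-up numbers standing in for the outputs of three fine-tuned models, the label order is an assumption, and the helper names `softmax` and `soft_vote` are hypothetical. The sketch only shows the general idea of averaging per-class probabilities across models and picking the class with the highest mean.

```python
import math

# Class names come from the abstract; their order here is an assumption.
LABELS = ["age", "ethnicity", "gender", "religion",
          "not_cyberbullying", "other_cyberbullying"]


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits):
    """Average class probabilities across models; return (label, mean_probs)."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    mean_probs = [sum(p[i] for p in probs) / n_models
                  for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=lambda i: mean_probs[i])
    return LABELS[best], mean_probs


# Hypothetical logits for one tweet from three models trained with different
# seeds. The second model alone would pick not_cyberbullying.
example_logits = [
    [0.2, 0.1, 2.5, 0.0, 1.0, 0.3],
    [0.1, 0.0, 1.0, 0.2, 1.4, 0.5],
    [0.3, 0.2, 2.2, 0.1, 0.9, 0.4],
]

label, mean_probs = soft_vote(example_logits)
print(label)
```

In this toy case the averaged probabilities favor gender even though one model disagrees, which is the kind of seed-to-seed variation that soft voting is meant to smooth out.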

Keywords


Cyberbullying; Deep Learning; Ensemble Learning; Text Classification; Mixed Precision; RoBERTa
