Implementation of Indonesian-Language Clickbait News Headline Detection with a Pre-trained Multilingual BERT Model in a Chrome Extension-Based Application

Girinoto Girinoto, Dhana Arvina Alwan, Gusti Agung Ngurah Gde K.T. D, Olga Geby Nabila, Arizal Arizal, Dimas Febriyan Priambodo

Abstract


Clickbait news titles are often used by online news portals. The purpose of clickbait is to entice readers to open and read the news. News with clickbait titles can also have a negative impact by diluting the substance of important news. A clickbait detection tool is therefore needed to help readers avoid clickbait news titles. A Chrome extension was chosen for this study because extensions are supported by Chrome-based browsers such as Google Chrome, Chromium, Microsoft Edge, and Opera, so the application can reach many users. In this study, a Chrome extension-based application was designed and integrated with an artificial intelligence model. The application uses a pre-trained Multilingual BERT model for Natural Language Processing (NLP) to predict whether a news title is clickbait. Multilingual BERT was chosen because it has been pre-trained on 104 languages, including Bahasa Indonesia, and has shown strong performance. The resulting application detects clickbait news titles with an AUC-ROC value of 92%.
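For readers unfamiliar with the reported metric, the following is a minimal sketch of how an AUC-ROC value can be computed from a classifier's scores. It is not the authors' evaluation code: the function name `auc_roc` and the labels and scores below are hypothetical, made-up values chosen only for illustration. AUC-ROC equals the probability that a randomly chosen clickbait title receives a higher score than a randomly chosen non-clickbait title, with ties counted as half.

```python
def auc_roc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of AUC-ROC.

    labels: 1 for clickbait, 0 for non-clickbait (hypothetical encoding).
    scores: predicted clickbait scores, higher means more likely clickbait.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.91, 0.78, 0.40, 0.55, 0.20, 0.08]

print(round(auc_roc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

On this reading, an AUC-ROC of 92% means that in roughly 92 of 100 such random pairs the model ranks the clickbait title above the non-clickbait one.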


Keywords


Chrome Extension, Clickbait, BERT, Natural Language Processing, Transformers.





DOI: http://dx.doi.org/10.30646/sinus.v20i2.624



 


STMIK Sinar Nusantara

KH Samanhudi 84 - 86 Street, Laweyan Surakarta, Central Java, Indonesia
Postal Code: 57142, Phone & Fax: +62 271 716 500 

Email: ejurnal @ sinus.ac.id | https://p3m.sinus.ac.id/jurnal/e-jurnal_SINUS/

ISSN: 1693-1173 (print) | 2548-4028 (online)


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
