Comparison of the Word2vec Skipgram Model Method Linkaja Application Review using Bidirectional LSTM Algorithm and Support Vector Machine

Puji Ayuningtyas, Henri Tantyoko

Abstract


Word embedding is a stage of text processing that converts each word into a vector representation. Word2Vec is a word embedding method that is widely used in natural language processing research. Choosing an appropriate classification algorithm can improve the performance of a word embedding method on text classification tasks. This research compares the Bidirectional LSTM (BiLSTM) deep learning algorithm with the Support Vector Machine (SVM) machine learning algorithm. The data were obtained by crawling reviews from the Google Play Store using the LinkAja application ID, yielding a dataset of 35,560 rows. Each review was labeled with one of two target classes: positive (score 1) or negative (score 0). At the vectorization stage, this study uses Word2Vec with the skip-gram architecture, configured through four parameters: vector size, window, min count, and sg. The BiLSTM architecture is a sequential model consisting of three neural network layers: embedding, bidirectional, and dense. The SVM uses the Radial Basis Function (RBF) kernel. In the final testing stage, the BiLSTM achieved an accuracy of 0.9505, higher than the SVM's accuracy of 0.93.
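To make the roles of the window and min count parameters concrete, the following is a minimal sketch of how skip-gram training pairs might be formed from tokenized reviews. It is not the authors' implementation: the function name `skipgram_pairs` and the two toy reviews are hypothetical, and the fixed-size window is a simplification. The abstract does not name the library used; its parameter names happen to match gensim's `Word2Vec` (`vector_size`, `window`, `min_count`, `sg`), where `sg=1` selects skip-gram and `vector_size` sets the embedding dimension, neither of which is exercised here.

```python
from collections import Counter


def skipgram_pairs(sentences, window=2, min_count=1):
    """Return (center, context) pairs of the kind a skip-gram model trains on.

    Hypothetical helper for illustration only; a real Word2Vec trainer
    also learns the vectors, which this sketch does not do.
    """
    # Count word frequencies over the whole corpus so rare words can be dropped.
    counts = Counter(word for sent in sentences for word in sent)
    pairs = []
    for sent in sentences:
        # min_count: ignore words that occur fewer than min_count times.
        kept = [w for w in sent if counts[w] >= min_count]
        for i, center in enumerate(kept):
            # window: maximum distance between the center word and a context word.
            lo = max(0, i - window)
            hi = min(len(kept), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, kept[j]))
    return pairs


# Made-up toy reviews (not taken from the LinkAja dataset).
reviews = [
    ["aplikasi", "bagus", "sekali"],
    ["aplikasi", "sering", "error"],
]

pairs = skipgram_pairs(reviews, window=1, min_count=1)
for center, context in pairs:
    print(center, "->", context)
```

Raising `min_count` to 2 on this toy corpus leaves only "aplikasi" in each review, so no pairs remain; this illustrates how the parameter prunes infrequent vocabulary before training.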


Keywords


Bidirectional LSTM, Skipgram, Support Vector Machine, Word2Vec





DOI: http://dx.doi.org/10.26418/justin.v12i1.72530


Copyright (c) 2024 JUSTIN (Jurnal Sistem dan Teknologi Informasi)

All articles in JUSTIN are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.