Optimizing Mortality Prediction in Heart Failure: A Comparative Analysis of Ensemble Learning Algorithms and Data Balancing Techniques on the Dataset

Retno Waluyo, Andhar Siraj Munir

Abstract


Heart disease is the leading cause of death worldwide, including in Indonesia. Identifying cardiovascular disease (CVD) requires weighing many factors, such as high blood pressure, cholesterol levels, and diabetes, and symptoms can differ between the sexes. Although angiography is considered an accurate method, it is expensive and out of reach for low-income families. The cost of cardiovascular disease also places a significant financial burden on health systems. To improve heart disease prediction, this study applies ensemble learning methods, namely Random Forest, Bagging, AdaBoost, Gradient Boosting, and XGBoost, with hyperparameter tuning. Experiments on a heart failure dataset show that applying the Synthetic Minority Over-sampling Technique (SMOTE) with the Extreme Gradient Boosting (XGBoost) algorithm gives the best results, reaching an accuracy of 88.9%, an F1-score of 87.7%, and a Matthews Correlation Coefficient (MCC) of 75.8%. The choice of data balancing method, such as SMOTE, random oversampling (ROS), or random undersampling (RUS), significantly affects algorithm performance, which highlights the importance of selecting a method suited to the characteristics of the dataset. These results have important implications for earlier and more cost-effective prediction and management of mortality risk in heart failure patients.
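To make the balancing step and the three reported metrics concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation; the study most likely relied on library implementations. The function names `smote_sketch` and `binary_metrics`, the toy feature vectors, and the confusion-matrix counts are all hypothetical and chosen only for illustration. The sketch follows the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, and computes accuracy, F1-score, and MCC from confusion-matrix counts.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (simplified SMOTE sketch).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F1-score, MCC) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc


if __name__ == "__main__":
    # Hypothetical two-feature minority class (not taken from the study).
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote_sketch(minority, n_new=3, k=2))

    # Hypothetical confusion-matrix counts (not the study's results).
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

Random oversampling would instead duplicate existing minority samples, and random undersampling would discard majority samples. In either case, resampling is normally applied to the training split only, so that the test set keeps the original class distribution.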


Keywords


Heart Failure; Mortality Prediction; Ensemble Learning Algorithms; Data Balancing Techniques






DOI: https://doi.org/10.26418/justin.v12i2.75158



Copyright (c) 2024 JUSTIN (Jurnal Sistem dan Teknologi Informasi)


Creative Commons License
All articles in JUSTIN are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.