Analysis of Lung Cancer Occurrence in Patients Using a Decision Tree: Applying the C4.5 Algorithm and RapidMiner to Determine Cancer Risk from Patient Symptoms


  • Deigo Anugrah Pratama, Universitas Bina Sarana Informatika
  • Ibnu Rizal Mutaqin, Universitas Bina Sarana Informatika
  • Kevin Rafael Manuela, Universitas Bina Sarana Informatika



Keywords: Data Analysis, Data Mining, RapidMiner, Decision Tree, C4.5 Algorithm


This research applies a decision tree approach to modeling lung cancer patient data. We used the Decision Tree machine learning algorithm to analyze a dataset containing detailed patient information, including age, gender, family history, and other medical test results. After studying the data, we compiled a relevant dataset and defined the classification target: whether or not a patient is categorized as likely to have lung cancer. Input variables were grouped carefully to keep the analysis accurate. The Decision Tree analysis identified the factors that are most significant when predicting lung cancer from patient symptoms. We interpret the results and report model evaluation metrics, such as accuracy and precision, to show how reliable the generated model is. The findings have implications for understanding and managing lung cancer: applying this method can improve the accuracy of cancer status prediction, assist clinical decision-making, and ultimately improve the quality of patient care.
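
As a rough illustration of the two ideas named in the abstract, the sketch below computes the gain ratio that C4.5 uses to rank candidate split attributes, and the accuracy and precision used to evaluate a classifier. It is a minimal sketch in plain Python, not the RapidMiner process used in the study. The function names, the attribute names, and the four toy patient records are hypothetical and invented for illustration; they are not the study's data. Only categorical attributes are handled; C4.5's treatment of continuous attributes, missing values, and pruning is omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """C4.5 split criterion for a categorical attribute:
    information gain divided by split information."""
    total = len(rows)
    # Group the target labels by each value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])

    base = entropy([row[target] for row in rows])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    split_info = -sum(
        len(p) / total * log2(len(p) / total) for p in partitions.values()
    )
    gain = base - remainder
    # An attribute with a single value cannot split the data.
    return gain / split_info if split_info > 0 else 0.0


def accuracy_and_precision(actual, predicted, positive="yes"):
    """Accuracy over all cases and precision for the positive class."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(a == positive and p == positive for a, p in pairs)
    false_pos = sum(a != positive and p == positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0
    return accuracy, precision


# Hypothetical toy records, invented for illustration only.
patients = [
    {"family_history": "yes", "gender": "M", "lung_cancer": "yes"},
    {"family_history": "yes", "gender": "F", "lung_cancer": "yes"},
    {"family_history": "no", "gender": "M", "lung_cancer": "no"},
    {"family_history": "no", "gender": "F", "lung_cancer": "no"},
]

for name in ("family_history", "gender"):
    print(name, gain_ratio(patients, name, "lung_cancer"))

actual = [row["lung_cancer"] for row in patients]
predicted = ["yes", "yes", "yes", "no"]  # hypothetical model output
print(accuracy_and_precision(actual, predicted))
```

In this toy data, family history separates the two classes perfectly while gender carries no information, so a C4.5-style tree would split on family history first. The study's actual attribute ranking depends on its own dataset.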






How to Cite

Deigo Anugrah Pratama, Ibnu Rizal Mutaqin, & Kevin Rafael Manuela. (2023). Analisis Terjadinya Kanker Paru-Paru Pada Pasien Menggunakan Decision Tree: Penerapan Algoritma C4.5 Dan RapidMiner Untuk Menentukan Risiko Kanker Pada Gejala Pasien. Jurnal Teknik Mesin, Industri, Elektro Dan Informatika, 2(4), 156–170.