BREAST CANCER PREDICTION WITH GRADIENT BOOSTING CLASSIFIERS
Keywords:
Machine Learning, Gradient Boosting Machines, Oncology, Healthcare AI, Clinical Decision Support Systems
Abstract
Breast cancer poses a significant global health challenge: it is the most prevalent cancer in women and a leading cause of cancer-related mortality. As the number of diagnoses rises, there is a pressing need for better approaches to detection and treatment. This study evaluates four boosting algorithms, namely AdaBoost, XGBoost, CatBoost, and LightGBM, for predicting breast cancer using machine learning (ML). It differs from previous studies by focusing specifically on gradient boosting algorithms. The research uses a dataset sourced from Kaggle comprising 569 instances, of which 357 are labelled Benign and 212 Malignant. Preprocessing steps, including feature selection and normalization, were applied to improve the models' learning ability and to put all features on a comparable scale during training. Hyperparameter tuning, covering the learning rate, the number of trees, and the tree depth, was carried out to optimize each model's performance. After training, the models were assessed using accuracy, sensitivity (recall), precision, specificity, ROC-AUC, and the confusion matrix. AdaBoost achieved an accuracy of 97.37%, precision of 0.9545, recall of 0.9767, specificity of 0.9718, and an AUC-ROC of 0.9967. CatBoost achieved an accuracy of 97.49%, precision of 0.9756, recall of 0.9302, specificity of 0.9859, and an AUC-ROC of 0.9977. LightGBM achieved an accuracy of 96.49%, precision of 0.9535, recall of 0.9535, specificity of 0.9718, and an AUC-ROC of 0.9961....
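The threshold-based metrics named in the abstract (accuracy, precision, recall, specificity) all follow from the four cells of a binary confusion matrix. The sketch below shows one way to compute them in Python, treating Malignant as the positive class. The function name `classification_metrics` and the example counts are assumptions made for illustration: the abstract does not report the confusion matrices or the size of the test split, so these counts are hypothetical and were chosen only because they produce values close to the AdaBoost figures.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from binary confusion-matrix counts.

    tp/fn: positive (here: Malignant) cases predicted correctly/incorrectly.
    tn/fp: negative (here: Benign) cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class or prediction is absent.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": (tp + tn) / total,
        "precision": ratio(tp, tp + fp),    # correct among predicted positives
        "recall": ratio(tp, tp + fn),       # sensitivity, true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical counts (an assumption, not data reported in the abstract).
metrics = classification_metrics(tp=42, fp=2, tn=69, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

ROC-AUC is not included because it depends on the ranking of predicted scores across all thresholds rather than on a single confusion matrix.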
License
Copyright (c) 2024 Kingsley Chibueze, Lucy Ifeyinwa Ezigbo, Anthony Kwubeghari
This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an open-access journal which means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author.
The Authors own the copyright of the articles.