A Cost Sensitive SVM and Neural Network Ensemble Model for Breast Cancer Classification

Tina Elizabeth Mathew

Abstract


Breast cancer has surpassed all other categories of cancer in incidence and is the most prevalent form of cancer in women worldwide. According to WHO statistics, the incidence rate is highest in Belgium. Among developing countries, and in India specifically, it has overtaken other cancers and ranks first in incidence and mortality. The major factors identified as affecting prognosis and survival in the country are chiefly late diagnosis of the disease and the diverse conditions prevailing in different parts of the country, including lack of diagnostic facilities, lack of awareness, and fear of undergoing existing procedures. This is also true for many other countries. Early diagnosis is a vital factor for survival. Machine learning techniques for cancer prediction, diagnosis and classification can assist medical practitioners as a supplementary diagnostic tool. In this work, an ensemble model combining a polynomial kernel-based Support Vector Machine and a Gradient Descent with Momentum Backpropagation Artificial Neural Network is proposed for breast cancer classification. Feature selection using Genetic Search is applied to identify the best feature set, and data sampling techniques, such as a combination of oversampling and undersampling, together with cost-sensitive learning are applied to the individual Neural Network and Support Vector Machine classifiers to deal with class imbalance. The ensemble model shows superior performance in comparison with other models, producing an accuracy of 99.12%.

Keywords


Breast cancer; Classification; Cost Sensitive Learning; Neural Networks; Support Vector Machines
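
The abstract does not state how the outputs of the two classifiers are combined or which misclassification costs are used. The following is a minimal sketch, not the author's method: it assumes soft voting (a weighted average of the malignant-class probabilities from the SVM and the neural network) followed by a cost-sensitive decision threshold. Note that the paper applies cost-sensitive learning to the individual classifiers during training; the threshold rule here is a simpler stand-in that shows the same idea of penalising missed malignant cases more heavily. The function names, the weight `w_svm`, and the cost values are hypothetical.

```python
def soft_vote(p_svm: float, p_ann: float, w_svm: float = 0.5) -> float:
    """Weighted average of two malignant-class probabilities (assumed combiner)."""
    if not 0.0 <= w_svm <= 1.0:
        raise ValueError("w_svm must lie in [0, 1]")
    return w_svm * p_svm + (1.0 - w_svm) * p_ann


def cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability threshold that minimises expected cost.

    Predicting malignant costs (1 - p) * cost_fp in expectation, predicting
    benign costs p * cost_fn, so malignant is cheaper when
    p >= cost_fp / (cost_fp + cost_fn). Correct predictions are assumed free.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def classify(p_svm: float, p_ann: float,
             cost_fp: float = 1.0, cost_fn: float = 4.0) -> str:
    """Combine both classifiers, then apply the cost-sensitive threshold."""
    p_malignant = soft_vote(p_svm, p_ann)
    return "malignant" if p_malignant >= cost_threshold(cost_fp, cost_fn) else "benign"


if __name__ == "__main__":
    # Hypothetical probabilities from the two base classifiers for one sample.
    print(classify(0.2, 0.4))                            # false negatives cost 4x
    print(classify(0.2, 0.4, cost_fp=1.0, cost_fn=1.0))  # equal costs
```

With a false negative weighted more heavily than a false positive, the threshold drops below 0.5, so a borderline case is flagged as malignant even when neither classifier is confident on its own.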




Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

This work is licensed under a Creative Commons Attribution 4.0 International License.
