Enhancing Accuracy for Classification Using the CNN Model and Hyperparameter Optimization Algorithm

Dai Nguyen Quoc, Ngoc Thanh Tran

Abstract


The Convolutional Neural Network (CNN) is a widely used deep learning model that is particularly effective for image recognition and classification. Its performance depends not only on its architecture but also, critically, on its hyperparameters, so hyperparameter optimization is essential for improving CNN performance. In this study, the authors propose using three optimization algorithms, Random Search (RS), Bayesian Optimization with Gaussian Processes (BO-GP), and Bayesian Optimization with Tree-structured Parzen Estimators (BO-TPE), to tune the hyperparameters of a CNN model. The optimized CNN is compared with traditional machine learning models: Random Forest (RF), Support Vector Classification (SVC), and K-Nearest Neighbors (KNN). Experiments use the MNIST and Olivetti Faces datasets. In the training procedure on the MNIST dataset, the CNN achieved a minimum accuracy of 97.85%, surpassing the traditional models, whose maximum accuracy across all optimization techniques was 97.50%. Similarly, on the Olivetti Faces dataset, the CNN achieved a minimum accuracy of 94.96%, while the traditional models reached a maximum of 94.00%. In the training-testing procedure, the CNN achieved accuracy above 99.31% on MNIST and above 98.63% on Olivetti Faces, outperforming the traditional models, whose maximum accuracies were 98.69% and 97.50%, respectively. The study also compares the CNN's performance under each of the three optimization algorithms. The results show that combining a CNN with these optimization techniques yields higher prediction accuracy than the traditional models.
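
To make the simplest of the three tuning strategies concrete, the following is a minimal sketch of Random Search in Python. It is not the authors' implementation: the search space, the hyperparameter names (`learning_rate`, `dropout`, `num_filters`), and the function `toy_validation_accuracy` are hypothetical stand-ins, and the synthetic objective replaces the expensive step of training a CNN and measuring its validation accuracy.

```python
import math
import random

# Hypothetical search space: names and ranges are illustrative only,
# not the hyperparameters or bounds used in the study.
SEARCH_SPACE = {
    "learning_rate": ("log_uniform", 1e-4, 1e-1),
    "dropout": ("uniform", 0.0, 0.5),
    "num_filters": ("choice", [16, 32, 64, 128]),
}


def sample_config(rng):
    """Draw one hyperparameter configuration independently at random."""
    config = {}
    for name, spec in SEARCH_SPACE.items():
        kind = spec[0]
        if kind == "log_uniform":
            # Sample uniformly in log space so each decade is equally likely.
            low, high = math.log(spec[1]), math.log(spec[2])
            config[name] = math.exp(rng.uniform(low, high))
        elif kind == "uniform":
            config[name] = rng.uniform(spec[1], spec[2])
        else:  # "choice"
            config[name] = rng.choice(spec[1])
    return config


def toy_validation_accuracy(config):
    """Synthetic stand-in for 'train a CNN, return validation accuracy'."""
    lr_penalty = 0.02 * (math.log10(config["learning_rate"]) + 2.5) ** 2
    dropout_penalty = 0.4 * (config["dropout"] - 0.25) ** 2
    filter_bonus = 0.01 * math.log2(config["num_filters"] / 16)
    score = 0.95 - lr_penalty - dropout_penalty + filter_bonus
    return max(0.0, min(1.0, score))


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(toy_validation_accuracy, n_trials=50)
print(best_config, round(best_score, 4))
```

Random Search samples every trial independently of the earlier ones. BO-GP and BO-TPE differ in that they fit a surrogate model to the scores observed so far and use it to propose the next configuration, which is why they are usually expected to need fewer trials.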

Keywords


CNN; RS; BO-GP; BO-TPE



Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

Creative Commons Licence

This work is licensed under a Creative Commons Attribution 4.0 International License.
