Ransomware Detection Using Stacked Autoencoder for Feature Selection

Mike Nkongolo Wa Nkongolo, Mahmut Tokmak

Abstract


In response to escalating malware threats, we propose a ransomware detection and classification method that combines a stacked autoencoder for feature selection with a Long Short-Term Memory (LSTM) classifier. The process involves preprocessing the UGRansome dataset, training an unsupervised stacked autoencoder for feature selection, and fine-tuning via supervised learning to improve the LSTM model's classification performance. We analysed the autoencoder's learned weights and activations to identify the features essential for distinguishing 17 ransomware families from other malware, and used them to build a streamlined feature set for classification. The resulting stacked autoencoder-based LSTM model achieves high precision, recall, and F1 scores across the 17 ransomware families, and its balanced average scores indicate that it generalises across malware types. To optimise the proposed model, we conducted experiments with up to 400 epochs and varying learning rates, reaching 98.5% accuracy in ransomware classification. This result surpasses traditional machine learning classifiers, including the Extreme Gradient Boosting (XGBoost) algorithm, which achieved 95.5% accuracy on the same task of identifying signature attacks; we attribute the improvement primarily to the stacked autoencoder feature selection mechanism. In addition, a prediction of ransomware financial impact using the proposed model indicates that, while Locky, SamSam, and WannaCry still incur substantial cumulative costs, their attacks may not be as financially damaging as those of NoobCrypt, DMALocker, and EDA2.
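The abstract does not state how the autoencoder's learned weights are turned into a feature ranking. The following is a minimal sketch of one plausible reading, under two assumptions that are not taken from the paper: a single linear encoder layer stands in for one layer of the stack, and each input feature is scored by the L2 norm of its encoder weights. The function names `train_linear_autoencoder` and `rank_features` are hypothetical, and the data are synthetic rather than drawn from UGRansome.

```python
import numpy as np


def train_linear_autoencoder(X, n_hidden=1, lr=0.02, epochs=2000, seed=0):
    """Fit a one-hidden-layer linear autoencoder by full-batch gradient descent.

    Assumption: a linear layer is used only to keep the sketch short; the
    paper's stacked autoencoder architecture is not specified in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W_enc = rng.normal(scale=0.1, size=(n_features, n_hidden))
    W_dec = rng.normal(scale=0.1, size=(n_hidden, n_features))
    for _ in range(epochs):
        H = X @ W_enc                # encode inputs into the bottleneck
        err = H @ W_dec - X          # reconstruction error
        # Gradients of the reconstruction loss sum(err**2) / n_samples
        grad_dec = (2.0 / n_samples) * (H.T @ err)
        grad_enc = (2.0 / n_samples) * (X.T @ (err @ W_dec.T))
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec


def rank_features(W_enc):
    """Score each input feature by the L2 norm of its encoder weight row
    (an assumed scoring rule) and return indices sorted by descending score."""
    scores = np.linalg.norm(W_enc, axis=1)
    return np.argsort(scores)[::-1], scores


# Synthetic data: features 0-2 share one latent signal, features 3-5 are noise.
rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 1))
informative = latent + 0.1 * rng.normal(size=(300, 3))
noise = rng.normal(size=(300, 3))
X = np.hstack([informative, noise])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise each column

W_enc, _ = train_linear_autoencoder(X)
order, scores = rank_features(W_enc)
selected = sorted(order[:3].tolist())      # keep the three top-ranked features
print("selected feature indices:", selected)
```

In a full pipeline, the selected columns would then be passed to the LSTM classifier; that stage is omitted here.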

Keywords


Ransomware Classification and Detection, Machine Learning, Cybersecurity, Cryptology, Cyberintelligence



Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

This work is licensed under a Creative Commons Attribution 4.0 International License.
