Advanced Multimodal Emotion Recognition for Javanese Language Using Deep Learning

Fatchul Arifin, Aris Nasuha, Ardy Seto Priambodo, Anggun Winursito, Teddy Surya Gunawan

Abstract


This research uses multimodal audio and video datasets to build a robust emotion recognition system for the Javanese language, for which few such systems have been developed. The study compares three models for extracting emotional characteristics: the Spectrogram-Image Model (Model 1), which converts audio inputs into spectrogram images and combines them with facial images for emotion labeling; the Convolutional-MFCC Model (Model 2), which applies convolution techniques to images and uses Mel-frequency cepstral coefficients (MFCCs) for audio; and the Multimodal Feature-Extraction Model (Model 3), which processes video and audio features separately before integrating their recognition results. Comparative testing shows that the Multimodal Feature-Extraction Model achieves the highest accuracy at 93%, followed by the Convolutional-MFCC Model at 85% and the Spectrogram-Image Model at 71%. These results indicate that effective multimodal integration, particularly through separate feature extraction, significantly improves emotion recognition accuracy. The research provides deeper insight into Javanese emotional expression and contributes to the development of emotion recognition technologies, with potential applications in communication systems, human-computer interaction, healthcare, and cultural studies.
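As an illustration of the idea behind Model 3 (separate per-modality recognition followed by integration of the results), the following is a minimal sketch of late fusion. It is not the authors' implementation: the abstract does not say how the recognition results are integrated, so weighted averaging of class probabilities is an assumption here, and the label set `EMOTIONS`, the functions `late_fusion` and `predict`, and the parameter `audio_weight` are all hypothetical names.

```python
from typing import Dict, List

# Hypothetical label set; the abstract does not list the emotion classes.
EMOTIONS: List[str] = ["angry", "happy", "neutral", "sad"]


def late_fusion(audio_probs: List[float],
                video_probs: List[float],
                audio_weight: float = 0.5) -> Dict[str, float]:
    """Fuse per-modality class probabilities by weighted averaging.

    Assumption: each modality has already been classified separately and
    yields one probability per emotion, in the order given by EMOTIONS.
    """
    if len(audio_probs) != len(EMOTIONS) or len(video_probs) != len(EMOTIONS):
        raise ValueError("each modality must give one probability per emotion")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")

    video_weight = 1.0 - audio_weight
    fused = [audio_weight * a + video_weight * v
             for a, v in zip(audio_probs, video_probs)]

    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores sum to zero; nothing to normalise")

    # Renormalise so the fused scores form a probability distribution.
    return {label: score / total for label, score in zip(EMOTIONS, fused)}


def predict(audio_probs: List[float],
            video_probs: List[float],
            audio_weight: float = 0.5) -> str:
    """Return the emotion label with the highest fused probability."""
    fused = late_fusion(audio_probs, video_probs, audio_weight)
    return max(fused, key=fused.get)


if __name__ == "__main__":
    audio = [0.1, 0.6, 0.2, 0.1]  # made-up audio classifier output
    video = [0.5, 0.3, 0.1, 0.1]  # made-up video classifier output
    print(predict(audio, video))
```

With equal weights, the made-up inputs above fuse to a distribution whose largest entry is "happy"; setting `audio_weight=0.0` reduces the decision to the video classifier alone. Other integration rules (for example, a trained classifier over the concatenated outputs) would fit the same description in the abstract.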

Keywords


Javanese emotion recognition; multimodal deep learning; audio-visual integration; emotion detection models; cultural emotion analysis; human-computer interaction



Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

This work is licensed under a Creative Commons Attribution 4.0 International License.
