Deep Learning–Based Emotion Classification Models for Chinese and Korean OST Music
DOI:
https://doi.org/10.56979/1002/2026/1244
Keywords:
Music Emotion Recognition (MER), Deep Learning, Chinese OST, Korean OST, PMEmo, EMOPIA, Attention Mechanism
Abstract
Music Emotion Recognition (MER) has advanced significantly with deep learning; however, existing models tend to exhibit a cultural bias and perform poorly when recognizing emotion in non-Western musical structures. This paper proposes a deep learning framework designed specifically for emotion classification in Chinese and Korean Original Soundtracks (OSTs), which feature distinctive tonal dynamics and high emotional variance. We propose a Dual-Stream Convolutional Recurrent Neural Network (CRNN) with self-attention that captures both the spectral-spatial characteristics and the temporal melodic developments commonly found in Asian cinematic music. To validate the model, we use two region-specific datasets: PMEmo (Chinese popular music) and EMOPIA (Korean/Asian piano OSTs). Experimental results show that the proposed architecture achieves an accuracy of 88.4% and an F1-score of 0.87, outperforming baseline models (ResNet-50 and a standard LSTM) by a 5.2% margin. The findings confirm that culturally aware training data is vital for accurate affective computing in the music domain.
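The abstract does not give implementation details, but a dual-stream CRNN with self-attention of the kind described could be sketched as follows. All layer sizes, the four-class output, and the fusion strategy (a CNN stream for spectral-spatial features in parallel with a BiGRU stream for temporal development, pooled by attention and concatenated) are illustrative assumptions, not the paper's reported configuration.

```python
# Hypothetical sketch of a dual-stream CRNN with self-attention pooling;
# all dimensions and the fusion design are assumptions for illustration.
import torch
import torch.nn as nn

class DualStreamCRNN(nn.Module):
    def __init__(self, n_mels=128, n_classes=4):
        super().__init__()
        # Spectral stream: 2-D CNN over the mel-spectrogram.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # -> (batch, 32, 1, 1)
        )
        # Temporal stream: bidirectional GRU over spectrogram frames.
        self.rnn = nn.GRU(n_mels, 64, batch_first=True, bidirectional=True)
        # Self-attention pooling over the GRU time steps.
        self.attn = nn.Linear(128, 1)
        self.fc = nn.Linear(32 + 128, n_classes)

    def forward(self, x):                        # x: (batch, 1, n_mels, time)
        spec = self.cnn(x).flatten(1)            # (batch, 32)
        frames = x.squeeze(1).transpose(1, 2)    # (batch, time, n_mels)
        h, _ = self.rnn(frames)                  # (batch, time, 128)
        w = torch.softmax(self.attn(h), dim=1)   # attention weights over time
        temp = (w * h).sum(dim=1)                # attention-weighted pooling
        return self.fc(torch.cat([spec, temp], dim=1))   # emotion logits

logits = DualStreamCRNN()(torch.randn(2, 1, 128, 100))   # -> shape (2, 4)
```

The parallel-stream fusion here is one plausible reading of "dual-stream"; a stacked CNN-into-RNN variant would also fit the CRNN label.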
License
This is an Open Access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under a CC BY 4.0 International License.



