SENTIMENT ANALYSIS ON SPEECH SIGNALS: LEVERAGING MFCC-LSTM TECHNIQUE FOR ENHANCED EMOTIONAL UNDERSTANDING

Journal: Proceedings on Engineering Sciences (Vol.6, No. 3)

Page : 1391-1402

Keywords : Sentiments; Long short-term memory network; Speech emotion recognition; Mel Frequency Cepstral Coefficients; Deep learning;

Abstract

The analysis of emotions expressed in spoken language plays a pivotal role in human communication, artificial intelligence, and human-computer interaction. While emotion recognition in text has advanced considerably, recognizing emotional states in speech presents distinct challenges and opportunities. This research introduces an approach that combines Mel Frequency Cepstral Coefficients (MFCC) with Long Short-Term Memory (LSTM) networks to recognize emotions in speech signals. The MFCC-LSTM framework pairs an established audio feature extraction method with deep learning: MFCCs provide a compact representation of spectral features over time, while LSTM networks excel at modeling temporal dependencies. The system classifies emotional states such as sadness, anger, neutrality, and happiness from the speaker's utterances. Several performance assessments were carried out on the proposed MFCC-LSTM model, and they show a significant improvement in recognition rates compared with other currently available models. The proposed hybrid model reached a recognition rate of 96%.
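
The MFCC-LSTM pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no architecture or hyperparameters, so the frame sizes, filter count, hidden size, the synthetic input signal, and the helper names (mfcc, lstm_classify) are all assumptions chosen for illustration. The LSTM weights are random and untrained, so the output probabilities are meaningless; only the data flow (audio, then MFCC frames, then a recurrent pass, then a softmax over four emotion classes) is the point.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_coeffs=13):
    # All default values here are assumptions, not taken from the paper.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)        # windowed frames
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II (unnormalized) over the filter axis gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None]
                   * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_mel @ basis.T                            # (n_frames, n_coeffs)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_classify(features, W, U, b, W_out, b_out):
    # Single-layer LSTM over the frame sequence; the last hidden state
    # feeds a linear layer and a softmax.
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        z = x @ W + h @ U + b
        i, f, g, o = np.split(z, 4)                     # gate pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    logits = h @ W_out + b_out
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Synthetic one-second "utterance" (assumption: a noisy 220 Hz tone at 16 kHz).
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220.0 * t) + 0.05 * rng.standard_normal(sr)

feats = mfcc(signal, sr=sr)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)

# Class names follow the abstract; the weights are random, NOT trained.
labels = ["sadness", "anger", "neutral", "happiness"]
hidden = 16
W = 0.1 * rng.standard_normal((feats.shape[1], 4 * hidden))
U = 0.1 * rng.standard_normal((hidden, 4 * hidden))
b = np.zeros(4 * hidden)
W_out = 0.1 * rng.standard_normal((hidden, len(labels)))
b_out = np.zeros(len(labels))

probs = lstm_classify(feats, W, U, b, W_out, b_out)
print(feats.shape, dict(zip(labels, np.round(probs, 3))))
```

In practice the feature step would typically come from an audio library and the LSTM from a deep learning framework, with the weights learned from labeled emotional speech; the sketch only makes the shapes explicit: one 13-coefficient vector per 10 ms hop, consumed frame by frame by the recurrent layer.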

Last modified: 2024-09-02 04:04:07