Thursday August 8, 2024 12:15pm - 2:15pm IST
Authors - Sri Samyu Tankasala, Sai Harshitha Peddi, Sushma Bodpati, Jahnavi Kuddigana, Prathibhamol C.P
Abstract - Applications of speech emotion recognition range from mental health evaluation to human-computer interaction. To analyze emotional expression in speech signals, this research introduces a novel method that combines convolutional neural networks (CNNs)[1] with attention mechanisms. Speech emotion recognition systems have advanced in several ways, including the application of deep learning models and new temporal and acoustic features. This research combines a two-dimensional CNN with a long short-term memory (LSTM)[2] network to develop a self-attention-based deep learning model. It expands on previous research by conducting extensive experiments on various combinations of spectral and rhythmic features to determine which perform best for this task. By modelling speech as Mel-spectrograms, we allow CNNs to capture spatial information while also accounting for the temporal dynamics of emotions. Our Parallel CNN-Transformer network achieves an accuracy of 74 percent, followed by the Parallel CNN-BLSTM-Attention network at 60 percent, outperforming standalone models. Notably, our solution requires fewer parameters, increasing efficiency while maintaining performance.
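The abstract does not describe how the Mel-spectrogram inputs are computed. As an illustration only, and not the authors' pipeline, the following is a minimal sketch of a triangular mel filterbank applied to one frame of a power spectrum. It assumes the common HTK-style mel formula; the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `mel_frame`) and the parameter values are hypothetical.

```python
import math


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Centre frequency of each FFT bin in Hz.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    filters = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            # Triangle: rises from lo to mid, falls from mid to hi, zero elsewhere.
            row.append(max(0.0, min(rising, falling)))
        filters.append(row)
    return filters


def mel_frame(power_spectrum, filters):
    """Project one frame of a power spectrum onto the mel filters."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in filters]


# Toy usage: 8 mel bands, 64-point FFT, 16 kHz audio, flat spectrum.
filters = mel_filterbank(n_mels=8, n_fft=64, sample_rate=16000)
frame = mel_frame([1.0] * 33, filters)
```

Stacking such frames over time gives the two-dimensional time-frequency image that a 2D CNN can treat like a picture, while the frame order preserves the temporal structure that recurrent or attention layers can model.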
Virtual Room C Goa, India
