
SPEECH RECONSTRUCTION USING LSTM NETWORKS

Journal: International Journal of Electrical Engineering and Technology (IJEET) (Vol.12, No. 3)

Publication Date:

Authors : ; ;

Page : 269-277

Keywords : recurrent neural network; mel frequency cepstral coefficient; speech; assistive; long short term memory;

Abstract

This paper aims to build a real-time system that reconstructs partially formed words spoken by people with speech disabilities. Recurrent neural networks (RNNs) have been used extensively for the conversion and reconstruction of partially spoken words. However, traditional RNNs suffer from exploding and vanishing gradients, which slow or destabilize learning and degrade the overall performance of the system. To avoid this, we use long short-term memory (LSTM) networks. Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction, and the text obtained from the network is converted into an audio signal. An LSTM cell offers more control over its output than a plain RNN cell: it uses gates to regulate what is stored and emitted, whereas a plain RNN cell has only one function controlling its output. We aim to create a system that can reconstruct words spoken by people with speech disorders such as stuttering. Results indicate that LSTM-based RNNs offer a promising solution, provided the model is trained well.
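
The contrast the abstract draws between a plain RNN cell and an LSTM cell can be illustrated with a minimal sketch. This is not the paper's implementation: the layer sizes, the random weight initialisation, the choice of 13 coefficients per frame, and the helper names (`affine`, `init_params`, `rnn_step`, `lstm_step`, `run_lstm`) are all assumptions made for illustration, and the code uses the textbook LSTM formulation with forget, input, and output gates.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W and list vectors x, b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_params(n_in, n_hidden, names, seed=0):
    """Hypothetical helper: small random weights, zero biases, one pair per name."""
    rng = random.Random(seed)
    params = {}
    for name in names:
        params["W" + name] = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
            for _ in range(n_hidden)
        ]
        params["b" + name] = [0.0] * n_hidden
    return params


def rnn_step(x, h_prev, params):
    """Plain RNN cell: a single tanh decides the whole new hidden state."""
    z = affine(params["W"], x + h_prev, params["b"])
    return [math.tanh(v) for v in z]


def lstm_step(x, h_prev, c_prev, params):
    """Standard LSTM cell: three sigmoid gates regulate the cell state and output."""
    xh = x + h_prev  # concatenate input frame and previous hidden state
    f = [sigmoid(v) for v in affine(params["Wf"], xh, params["bf"])]    # forget gate
    i = [sigmoid(v) for v in affine(params["Wi"], xh, params["bi"])]    # input gate
    o = [sigmoid(v) for v in affine(params["Wo"], xh, params["bo"])]    # output gate
    g = [math.tanh(v) for v in affine(params["Wg"], xh, params["bg"])]  # candidate values
    # Additive cell-state update: the forget gate scales the old state directly,
    # which is what eases the vanishing-gradient problem of plain RNNs.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def run_lstm(frames, n_hidden=4, seed=0):
    """Feed a sequence of feature frames (e.g. MFCC vectors) through one LSTM cell."""
    params = init_params(len(frames[0]), n_hidden, ["f", "i", "o", "g"], seed)
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for frame in frames:
        h, c = lstm_step(frame, h, c, params)
    return h, c


if __name__ == "__main__":
    # Placeholder frames: 5 time steps of 13 values each (13 is an assumed size,
    # not a figure from the paper).
    rng = random.Random(1)
    frames = [[rng.uniform(-1.0, 1.0) for _ in range(13)] for _ in range(5)]
    h, c = run_lstm(frames)
    print("final hidden state:", h)
```

In a trained system the weights would be learned from data rather than drawn at random, and the final hidden states would feed a classifier over words or characters; both steps are outside the scope of this sketch.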

Last modified: 2021-04-08 20:37:12