SPEECH RECONSTRUCTION USING LSTM NETWORKS
Journal: International Journal of Electrical Engineering and Technology (IJEET) (Vol. 12, No. 3)
Publication Date: 2021-03-31
Authors : Lani Rachel Mathew Arun Manohar Nidheesh S S. Sainath; Arsha Vijayan K. Gopakumar;
Page : 269-277
Keywords : recurrent neural network; mel frequency cepstral coefficient; speech; assistive; long short term memory;
Abstract
This paper aims to produce a real-time system that reconstructs the partially formed words of persons with speech disabilities. Recurrent neural networks (RNNs) have been used extensively in the conversion and reconstruction of partially spoken words. However, traditional RNNs suffer from exploding and vanishing gradient problems, which impair learning and degrade the overall performance of the system. To avoid this, we use long short-term memory (LSTM) networks. Mel frequency cepstral coefficients (MFCCs) are used for feature extraction, and the text obtained from the network is converted into an audio signal. LSTM cells offer more control over the output than plain RNN cells, which have only one function controlling the output of a cell. We aim to create a system that can reconstruct words spoken by people with disorders such as stuttering. Results indicate that LSTM-based RNNs offer a promising solution, provided the model is trained well.
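The contrast the abstract draws between a plain RNN cell (one function controlling the output) and an LSTM cell (several gates) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the stacked weight layout, the hidden size, and the choice of 13 coefficients per MFCC frame are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used for the LSTM gates."""
    return 1.0 / (1.0 + np.exp(-x))


def rnn_step(x, h_prev, W, U, b):
    """Plain RNN cell: a single tanh produces the new hidden state."""
    return np.tanh(W @ x + U @ h_prev + b)


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step.

    Assumed (hypothetical) layout: W is (4H, D), U is (4H, H), b is (4H,),
    with the input, forget, output, and candidate blocks stacked in that order.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate cell content
    c = f * c_prev + i * g       # additive cell update, which eases gradient flow
    h = o * np.tanh(c)
    return h, c


def run_lstm(frames, W, U, b):
    """Feed a sequence of feature frames through the cell; return the last hidden state."""
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, H, T = 13, 8, 20  # assumed: 13 MFCCs per frame, 8 hidden units, 20 frames
    frames = rng.normal(size=(T, D))  # stand-in for real MFCC frames
    W = rng.normal(scale=0.1, size=(4 * H, D))
    U = rng.normal(scale=0.1, size=(4 * H, H))
    b = np.zeros(4 * H)
    print(run_lstm(frames, W, U, b).shape)
```

In a working system the frames would come from an MFCC extractor applied to recorded speech, and the final hidden state would feed a classifier over candidate words; both stages are outside the scope of this sketch.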
Last modified: 2021-04-08 20:37:12