OPTIMIZING SPEECH-TO-TEXT CONVERSION: DEVELOPING AN EFFICIENT MATLAB-BASED SPEECH RECOGNITION SYSTEM
Journal: International Journal of Mechanical Engineering and Technology (IJMET) (Vol. 9, No. 2)
Publication Date: 2018-12-26
Authors : Poonam Verma;
Page : 954-963
Keywords : speech-to-image translation; voice signals; visuals; teacher-student learning; generative adversarial models; embedding feature; adversarial generative network;
Abstract
Speech-to-image translation without text is an interesting and useful problem because of its potential applications in human-computer interaction, art creation, computer-aided design, and similar areas, and because many languages have no written form. However, as far as we are aware, how to convert voice signals directly into images, and how well they can be translated, has not yet been thoroughly studied. In this work, we draw on advances in teacher-student learning and generative adversarial models to convert voice signals into image signals without a transcription stage. To improve generalization to new classes, a speech encoder is designed to represent the input speech signal as an embedding feature. The embedding feature produced by the speech encoder is then used to train a stacked generative adversarial network to synthesize high-quality images. The overall process consists of input, pre-processing, feature extraction, and classification stages. The recognized voice signal and the related item are displayed as the final output. Experimental results on dataset signals indicate that the proposed technique effectively converts raw voice signals into images without an intermediate text representation. An ablation study offers further insight into the approach. The basic goal of the technique is to recognize speech from audio and then transform the recognized speech into a text image. A further objective is to increase the precision of the process.
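The four stages named in the abstract (input, pre-processing, feature extraction, classification) can be illustrated with a minimal sketch. This is not the author's MATLAB implementation. It is written in Python, it substitutes simple hand-crafted features (log energy and zero-crossing rate) for the learned speech-encoder embedding, it uses a nearest-centroid rule in place of the paper's model, and the image-generation stage is omitted. All function names, parameter values, and the synthetic tones used as input are assumptions made for illustration only.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """Pre-processing: first-order filter that boosts high frequencies."""
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]

def frame_features(signal, frame_len=200, hop=100):
    """Feature extraction: mean log energy and mean zero-crossing rate over frames."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        energies.append(math.log(energy + 1e-12))  # small offset avoids log(0)
        zcrs.append(crossings / (frame_len - 1))
    n = len(energies)
    return (sum(energies) / n, sum(zcrs) / n)

def classify(features, centroids):
    """Classification: label of the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

def tone(freq_hz, sample_rate=8000, duration_s=0.25):
    """Input: a synthetic sine tone standing in for a recorded voice signal."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

def extract(signal):
    """Run pre-processing followed by feature extraction."""
    return frame_features(pre_emphasize(signal))

# Hypothetical two-class setup: one reference tone per class.
centroids = {"low": extract(tone(200)), "high": extract(tone(1500))}
label = classify(extract(tone(1400)), centroids)
print(label)
```

In a real system the synthetic tones would be replaced by recorded utterances and the two-value feature vector by a richer representation, but the order of the stages stays the same.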
Last modified: 2023-06-09 16:45:14