A Combination of Deep Neural Networks for Acoustic modeling of Vietnamese LVCSR

Journal: International Journal of Advances in Computer Science and Technology (IJACST) (Vol. 5, No. 1)

Publication Date:

Authors : ; ; ;

Page : 1-7

Keywords : Bottleneck feature; Deep neural network; Vietnamese LVCSR

Abstract

In this work, we propose a deep neural network architecture that combines two popular applications of deep neural networks for Vietnamese large vocabulary continuous speech recognition (LVCSR). First, a deep neural network is trained to extract bottleneck features from frames that combine Mel-frequency cepstral coefficients (MFCCs) with a tonal feature. This network is then applied as a nonlinear discriminative feature-space transformation for hybrid network training, in which acoustic modeling is performed by denoising auto-encoder pre-training and back-propagation. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that the system using the combined deep neural network architecture obtains relative improvements of 4.1% over the best hybrid HMM/DNN system and 51.4% over the baseline system. Adding the tonal feature as an input feature of the network yields a relative improvement in recognition performance of around 18%.
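
The bottleneck feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the feature dimensions (13 MFCCs plus 2 tonal values), the sigmoid activations, the linear bottleneck output, and the helper names `make_layer`, `forward` and `bottleneck_features` are all assumptions made for illustration, and random weights stand in for a trained network. It only shows how a frame of concatenated MFCC and tonal features is passed through the network up to the narrow layer, whose output would then serve as the input feature for the hybrid network.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) with random weights; a stand-in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layer, x, activation=True):
    """Affine transform of x, optionally followed by a sigmoid."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation:
        out = [1.0 / (1.0 + math.exp(-z)) for z in out]
    return out


def bottleneck_features(layers, bottleneck_index, frame):
    """Propagate a frame up to the bottleneck layer and return its linear output."""
    h = frame
    for i, layer in enumerate(layers[: bottleneck_index + 1]):
        # Assumption: the bottleneck layer itself is read out before any nonlinearity.
        h = forward(layer, h, activation=(i != bottleneck_index))
    return h


rng = random.Random(0)

# Hypothetical dimensions, chosen only for illustration.
N_MFCC, N_TONAL = 13, 2
HIDDEN, BOTTLENECK, N_TARGETS = 32, 8, 10

# Network shape: input -> hidden -> bottleneck -> hidden -> output targets.
sizes = [N_MFCC + N_TONAL, HIDDEN, BOTTLENECK, HIDDEN, N_TARGETS]
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
BOTTLENECK_INDEX = 1  # layers[1] maps HIDDEN -> BOTTLENECK

# One synthetic frame: MFCCs concatenated with tonal features.
mfcc = [rng.gauss(0.0, 1.0) for _ in range(N_MFCC)]
tonal = [rng.gauss(0.0, 1.0) for _ in range(N_TONAL)]
frame = mfcc + tonal

bnf = bottleneck_features(layers, BOTTLENECK_INDEX, frame)
print(len(frame), len(bnf))
```

In this sketch the layers after the bottleneck are only needed while training the extractor; once it is trained, they are discarded and the bottleneck output replaces the raw frame as the input to the second network.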

Last modified: 2016-02-25 10:10:40