Audio-Visual Integration in Multimodal Speech Recognition

Journal: The Journal of the Institute of Internet, Broadcasting and Communication (Vol.1, No. 1)

Publication Date: 2001-06-30

Authors : Chun-Woo Lee; In-Kyu Kim; Sam-Taek Kim;

Page : 9-15



Abstract

Surrounding noise is the main factor that lowers the speech recognition rate. A filter bank at the preprocessing stage is the usual way to reduce the effect of noise. In this paper, we instead recognize spoken numeral digits using 2-D LPC to extract speech and image features. We first obtained speech-only recognition results using 13th-order LPC coefficients. Then, for the digits whose speech recognition results were distorted ('O', '4', '5', '6', '9'), we added image parameters, here 12th-order 2-D LPC coefficients. The 2-D LPC coefficients were extracted at each frame, and the recognizer was simulated with the two parameter sets, speech and image. With both parameter sets, a good result was obtained for numeral recognition.
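To make the feature pipeline concrete, the following is a minimal sketch of how a 13th-order LPC vector could be computed for one speech frame and combined with a 12-element image feature vector. It is not the authors' implementation: the autocorrelation method with the Levinson-Durbin recursion is a standard way to obtain LPC coefficients but the abstract does not say which method was used, the names `autocorrelation`, `lpc`, and `fuse_features` are hypothetical, fusion by simple concatenation is an assumption, and the 2-D LPC analysis of the image is not reproduced here (its output is treated as a given vector).

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """LPC coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n - k].

    Assumption: autocorrelation method solved by the Levinson-Durbin
    recursion; the paper does not state which LPC method it used.
    """
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        # Silent frame: no energy, so return an all-zero predictor.
        return [0.0] * order

    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err        # reflection coefficient for this step
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
        if err <= 0.0:
            # Frame is already perfectly predicted; stop early.
            break
    return a[1:]


def fuse_features(speech_lpc, image_lpc):
    """Combine per-frame speech and image features.

    Assumption: plain concatenation. The abstract only says the
    recognizer was simulated with both parameter sets.
    """
    return list(speech_lpc) + list(image_lpc)
```

With the orders given in the abstract, `fuse_features(lpc(frame, 13), image_features)` would yield a 25-element vector per frame, where `image_features` stands for the 12 2-D LPC coefficients extracted from the corresponding image frame.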
