
BUILDING LEXICON FOR TELUGU SPEECH RECOGNITION

Journal: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY (Vol.5, No. 1)

Publication Date:

Authors : ; ;

Page : 7-14

Keywords : Lexicon; Phonetization; Rule Generation; FSA; Regular Grammar;


Abstract

Speech recognizers usually consist of a language model, a lexicon, and a collection of phone models. A good lexicon is important to the accuracy of a speech recognizer for a given language. Traditionally, building a lexicon was a significant piece of work, taking several expert linguists perhaps several years to reach reasonable coverage. We describe a method that can cut this time significantly. The basic idea is to add the most common words to a lexicon, with the user of the system explicitly supplying each new word, and then to build letter-to-sound rules automatically from this initial data. An entered word may be of arbitrary length. Over multiple passes the lexicon and the letter-to-sound rules improve: at each pass the rules are regenerated with the new data, making them more accurate. This paper presents the work done in building a lexicon for the Telugu language. The major objective is to make speech systems for Telugu more proficient. This technique has proved successful for a number of languages, cutting the time and effort to checking perhaps thousands of words rather than tens of thousands. It is also a structured method that requires only basic knowledge of the language to carry out.
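
The multi-pass idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the function names (train_rules, predict, bootstrap), the naive one-letter-to-one-phone alignment, the user check modelled as an oracle function, and the made-up romanized toy words (not real Telugu data).

```python
from collections import Counter, defaultdict


def train_rules(lexicon):
    """Learn, for each letter, its most frequent phone (assumes 1:1 alignment)."""
    counts = defaultdict(Counter)
    for word, phones in lexicon.items():
        if len(word) != len(phones):
            continue  # this naive aligner cannot handle unequal lengths
        for letter, phone in zip(word, phones):
            counts[letter][phone] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in counts.items()}


def predict(word, rules):
    """Apply the letter-to-sound rules; '?' marks a letter with no rule yet."""
    return [rules.get(letter, "?") for letter in word]


def bootstrap(seed, batches, oracle):
    """Each pass: regenerate rules, predict new words, let the oracle correct them."""
    lexicon = dict(seed)
    corrections = []  # number of wrong predictions fixed in each pass
    for batch in batches:
        rules = train_rules(lexicon)  # rules are rebuilt from all data so far
        fixed = 0
        for word in batch:
            truth = oracle(word)  # stands in for the human check
            if predict(word, rules) != truth:
                fixed += 1
            lexicon[word] = truth
        corrections.append(fixed)
    return lexicon, train_rules(lexicon), corrections


# Hypothetical toy data: each letter is pronounced as itself.
seed = {"ka": ["k", "a"], "mi": ["m", "i"]}
batches = [["ku", "mu"], ["kumi"]]
oracle = lambda word: list(word)

lexicon, rules, corrections = bootstrap(seed, batches, oracle)
print(corrections)
```

In this toy run the first pass needs manual corrections because the letter "u" has no rule yet, while the second pass needs none, which mirrors the claim that the rules improve as the lexicon grows. A realistic system would need a proper letter-to-phone aligner and context-dependent rules.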

Last modified: 2016-06-29 19:56:27