IMPLEMENTATION OF HIDDEN MARKOV MODEL (HMM) FOR PARTS OF SPEECH TAGGING IN TELUGU LANGUAGE
Journal: International Education and Research Journal (Vol. 8, No. 5)
Publication Date: 2022-05-15
Authors : V. Suresh;
Page : 99-102
Keywords : Telugu; Parts-of-speech tagger; corpus; TDIL proposed Telugu tag set; HMM technique;
Abstract
Parts-of-speech (POS) tagging, which assigns each word the correct tag from a set of available tags, is a fundamental task in NLP applications such as grammar checking, speech processing and machine translation. Tagger accuracy remains the biggest challenge today. Researchers have proposed many taggers for different languages (Telugu, Tamil, Kannada, Punjabi, Hindi, Bengali, etc.) using techniques such as HMM (Hidden Markov Model), SVM (Support Vector Machine) and ME (Maximum Entropy). A Telugu POS tagger based on the HMM is one of them. That tagger uses the Hidden Markov Model, a statistical technique, to tag words in Telugu with the 630-tag set developed by Rama Sree, R.J. and Kusuma Kumari (2007). Such a large tag set (630 tags) leads to a data sparseness problem. To cope with this problem, this paper reports an experiment with the reduced POS tag set (36 tags) proposed by Technology Development for Indian Languages (TDIL), used to improve the tagging accuracy of the HMM-based POS tagger. Finally, the results were evaluated manually by a linguist.
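As an illustration of the kind of HMM tagging the abstract refers to, the following is a minimal sketch of a bigram HMM tagger with Viterbi decoding. It is not the paper's implementation: the function names (`train_hmm`, `log_prob`, `viterbi`), the add-one smoothing, the three placeholder tags (`N`, `V`, `PR`) and the romanized toy tokens are all assumptions made for the example, and none of them come from the paper's corpus or from the TDIL tag set.

```python
import math
from collections import defaultdict

def train_hmm(tagged_sentences):
    """Count tag-transition and word-emission frequencies from tagged sentences."""
    trans = defaultdict(lambda: defaultdict(int))  # trans[prev_tag][tag]
    emit = defaultdict(lambda: defaultdict(int))   # emit[tag][word]
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab

def log_prob(counts, context, event, num_outcomes):
    """Add-one smoothed log P(event | context); smoothing choice is an assumption."""
    row = counts.get(context, {})
    total = sum(row.values())
    return math.log((row.get(event, 0) + 1) / (total + num_outcomes))

def viterbi(words, trans, emit, tags, vocab):
    """Return the most probable tag sequence for `words` under the bigram HMM."""
    if not words:
        return []
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unseen words
    # Each column maps tag -> (best log score so far, backpointer to previous tag).
    best = [{t: (log_prob(trans, "<s>", t, n_tags)
                 + log_prob(emit, t, words[0], n_words), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            e = log_prob(emit, t, word, n_words)
            column[t] = max(
                (best[-1][p][0] + log_prob(trans, p, t, n_tags) + e, p)
                for p in tags
            )
        best.append(column)
    # Follow backpointers from the best final tag to recover the path.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))

# Hypothetical toy training data: placeholder tokens and tags, invented for this sketch.
toy_corpus = [
    [("ramu", "N"), ("pustakam", "N"), ("chadivadu", "V")],
    [("sita", "N"), ("pata", "N"), ("padindi", "V")],
    [("atanu", "PR"), ("pustakam", "N"), ("konnadu", "V")],
]
trans, emit, tags, vocab = train_hmm(toy_corpus)
predicted = viterbi(["sita", "pustakam", "chadivadu"], trans, emit, tags, vocab)
print(predicted)
```

The sketch also gives a rough sense of why tag-set size matters for data sparseness: a bigram transition table over 630 tags has 630 × 630 = 396,900 cells to estimate, against 36 × 36 = 1,296 cells for a 36-tag set, so far more of the larger table goes unseen in a corpus of a given size.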