IMPROVES THE ACCURACY TO COMBINE SEVERAL POS TAGGERS FOR TEXTS IN TELUGU LANGUAGE

Journal: International Journal OF Engineering Sciences & Management Research (Vol.4, No. 5)

Publication Date:

Authors : ;

Page : 77-83

Keywords : NLP; Telugu morphological analyzer; POS tagger; Sentence Tokenizer; word sense disambiguator;

Abstract

POS taggers are developed by modeling the morpho-syntactic structure of natural language text. POS tagging is the process of assigning the correct part-of-speech tag (e.g., noun, verb, adjective, adverb) to each word of a sentence. This work aims to improve the accuracy of existing Telugu POS taggers by combining three of them with a voting algorithm: (1) a rule-based POS tagger, (2) a Brill tagger, and (3) a Maximum Entropy tagger, developed with accuracies of 97.014%, 93.248%, and 85.914%, respectively. An annotated corpus of 14,000 words is used to train the last two taggers. To improve the accuracy of these taggers, an error analysis is carried out to identify the errors made by the three taggers, and methods to reduce them are then examined. As a first step, a voting algorithm is proposed to build an ensemble Telugu POS tagger with better results. The tagged output could be used for word sense disambiguation (WSD), in retrieving Telugu documents, and in a variety of natural language processing (NLP) applications.
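
The abstract does not say how the votes are combined, so the following is only a minimal sketch of one common scheme: per-word majority voting over the three taggers' outputs. The function name `ensemble_tag`, the tag labels, and the tie-breaking rule (fall back to the first tagger, assumed here to be the rule-based one because it reports the highest accuracy) are illustrative assumptions, not the authors' method.

```python
from collections import Counter


def ensemble_tag(outputs, fallback=0):
    """Combine tag sequences from several taggers by per-word majority vote.

    outputs  -- list of tag sequences, one per tagger, all for the same sentence
    fallback -- index of the tagger to trust when no tag has a strict majority
                (assumption: 0 = rule-based tagger, the most accurate one)
    """
    if len({len(seq) for seq in outputs}) != 1:
        raise ValueError("all taggers must tag the same number of words")

    combined = []
    for tags in zip(*outputs):  # tags proposed for one word
        tag, count = Counter(tags).most_common(1)[0]
        # Accept the tag only if more than half of the taggers agree on it.
        combined.append(tag if count > len(tags) / 2 else tags[fallback])
    return combined


# Hypothetical outputs for a three-word sentence (tag labels are placeholders).
rule_based = ["NN", "VB", "JJ"]
brill = ["NN", "NN", "RB"]
max_entropy = ["VB", "NN", "NN"]

print(ensemble_tag([rule_based, brill, max_entropy]))
```

With three taggers, either at least two agree on a word or all three disagree, so the fallback applies only in the three-way split. A weighted vote using each tagger's reported accuracy would be a natural variation, though nothing in the abstract indicates whether the authors used one.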

Last modified: 2017-05-26 21:53:05