Sindhi Stemmer using Affix Removal Method

Journal: International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE) (Vol.10, No. 3)

Publication Date:

Authors : ;

Page : 2447-2451

Keywords : Natural language processing; Information Retrieval; Sindhi; Lexicon; Yet another Suffix Stripper; Hidden Markov Model

Abstract

Stemming is the process of mapping the various inflections of a word to its base form. A stemmer is an essential component of Information Retrieval (IR) systems and of many Natural Language Processing pipelines. This research reports the development and evaluation of a stemmer for Sindhi, a resource-poor language. The stemmer uses a lexicon-based affix removal technique: the developed lexicon holds the base forms, and the algorithm consults this lexicon during the affix removal process. The overall performance accuracy is evaluated and the stemming error rate is calculated. The results show an overall accuracy of 89.57%.
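
The abstract does not give the affix lists, the lexicon, or the exact matching rules, so the following is only a minimal Python sketch of lexicon-based affix removal in general, not the authors' implementation. The function names `stem` and `evaluate`, the candidate ordering (longest affix first, suffixes before prefixes), and the definition of error rate as one minus accuracy are all assumptions made for illustration. The sample data in the usage note is toy English, chosen only to keep the example readable; it is not Sindhi data from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def stem(word: str, lexicon: Set[str],
         prefixes: Iterable[str] = (), suffixes: Iterable[str] = ()) -> str:
    """Strip known affixes, accepting a result only if it is in the lexicon.

    Hypothetical sketch: the affix lists and lexicon are supplied by the caller.
    """
    if word in lexicon:
        return word  # already a base form

    # Affixes that actually match this word, longest first; empty strings ignored.
    pre = sorted((p for p in prefixes if p and word.startswith(p)),
                 key=len, reverse=True)
    suf = sorted((s for s in suffixes if s and word.endswith(s)),
                 key=len, reverse=True)

    # Candidate base forms: suffix only, prefix only, then both together.
    candidates = [word[:-len(s)] for s in suf]
    candidates += [word[len(p):] for p in pre]
    candidates += [word[len(p):-len(s)]
                   for p in pre for s in suf
                   if len(word) > len(p) + len(s)]

    for candidate in candidates:
        if candidate in lexicon:  # the lexicon guards against over-stemming
            return candidate
    return word  # no lexicon match: leave the word unchanged


def evaluate(pairs: Sequence[Tuple[str, str]], lexicon: Set[str],
             prefixes: Iterable[str], suffixes: Iterable[str]) -> Tuple[float, float]:
    """Return (accuracy, error_rate) over (word, expected_stem) pairs.

    Assumption: error rate is taken to be 1 - accuracy.
    """
    correct = sum(1 for word, gold in pairs
                  if stem(word, lexicon, prefixes, suffixes) == gold)
    accuracy = correct / len(pairs)
    return accuracy, 1 - accuracy
```

With the toy lexicon `{"play", "happy", "kind"}`, prefixes `("un",)`, and suffixes `("ing", "ed", "s", "ness")`, `stem("playing", ...)` returns `"play"` and `stem("unkindness", ...)` returns `"kind"`, while `stem("sing", ...)` returns `"sing"` unchanged because stripping `"ing"` leaves `"s"`, which is not in the lexicon. The lexicon check is what distinguishes this approach from purely rule-based suffix stripping.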

Last modified: 2021-08-05 14:21:36