Construction of a generic stopwords list for Hindi language without corpus statistics
Journal: International Journal of Advanced Computer Research (IJACR) (Vol. 8, No. 34)
Publication Date: 2018-01-08
Authors: Sifatullah Siddiqi; Aditi Sharan
Pages: 35-40
Keywords: Stop word; Stop words list; Hindi language; Information retrieval; Text mining; Corpus statistics
Abstract
Most research in the field of information retrieval (IR) has focused on the English language, but recently there has been considerable effort to develop IR systems for other languages. Research and experimentation on IR for the Hindi language are relatively new and limited compared with the work done on English, which has long dominated the field. A fundamental tool in IR is the stop word list. Stop words have no retrieval value in IR. To date, many stop word lists have been developed for English, other European languages, and Chinese. However, no standard stop word list has been constructed for the Hindi language. In this paper, an approach to constructing a generic stop word list for the Hindi language is presented. Our list contains more than 800 stop words.