POS Tagging of Hindi-English Code Mixed Text from Social Media

Journal: International Journal of Science and Research (IJSR) (Vol.5, No. 10)

Publication Date:

Authors : ; ;

Page : 1018-1021

Keywords : Multilingual; virtual; vocabulary; tokenizing;

Abstract

Language is a way of expressing ideas and feelings through movement, symbols, and sounds, in a particular style of speaking and writing. It takes two forms: spoken and written. Spoken language is a form of communication in which words drawn from a large vocabulary (usually around 10,000 words), together with a diverse variety of names, are uttered aloud, while written language is the representation of a language by means of a writing system. Hundreds of millions of people around the world are multilingual and routinely use two or more languages in their daily lives. Social media is social interaction in which people create and share information and ideas in virtual communities and networks. One frequently updated social media feature is the status. Through a status, users can report activities and news, voice opinions, exchange ideas, conduct business, and so on. They can also comment on or respond to the latest statuses of other users. Social media users sometimes mix several languages when updating their status or commenting on a friend's status, for example when chatting with other people on Facebook or in web chat. Information retrieval deals with storing and retrieving information from all types of resources, including social media, which is very difficult to tokenize and process.

Last modified: 2021-07-01 14:45:37