
Universal Dependencies for Urdu Noisy Text

Journal: International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE) (Vol.10, No. 3)

Publication Date:

Authors : ;

Page : 1751-1757

Keywords : ;


Abstract

In this paper, the process of creating a dependency treebank for tweets in Urdu, a morphologically rich and less-resourced language, is described. The treebank of 500 Urdu tweets is created by manually annotating it with lemmas, POS tags, and morphological and syntactic relations using the Universal Dependencies annotation scheme, adapted to the peculiarities of Urdu social media text. The annotation process is evaluated through inter-annotator agreement for dependency relations; a total agreement of 94.5% and a resultant weighted Kappa of 0.876 were observed. The treebank is evaluated through 10-fold cross-validation using MaltParser with various feature settings. Results show an average UAS of 74%, LAS of 62.9%, and LA of 69.8%.
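
For readers unfamiliar with the three parsing scores reported above, the following is a minimal sketch of how they are conventionally defined: UAS (unlabeled attachment score) counts tokens whose predicted head is correct, LAS (labeled attachment score) counts tokens whose head and dependency label are both correct, and LA (label accuracy) counts tokens whose label is correct regardless of head. The function name `attachment_scores` and the toy sentence are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation pipeline.

```python
from typing import Dict, List, Tuple

# One token's annotation: (index of its head, dependency relation label).
# Head index 0 denotes the artificial root, as in CoNLL-U files.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Compute UAS, LAS and LA over aligned gold and predicted tokens.

    Hypothetical helper for illustration; real evaluations usually read
    CoNLL-U files and may exclude punctuation, which is not done here.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must align token by token")
    if not gold:
        raise ValueError("cannot score an empty sequence")

    n = len(gold)
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))   # head only
    label_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))  # label only
    both_ok = sum(g == p for g, p in zip(gold, pred))         # head and label

    return {"UAS": head_ok / n, "LAS": both_ok / n, "LA": label_ok / n}


# Toy four-token sentence (invented data, not from the treebank).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]

scores = attachment_scores(gold, pred)
print(scores)
```

In this toy case the third token has the right head but the wrong label, and the fourth has the right label but the wrong head, so UAS and LA are both 0.75 while LAS is 0.5. The same pattern explains why the LAS reported in the abstract is lower than both the UAS and the LA.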

Last modified: 2021-06-11 19:47:00