
On the Automatic Categorization of Arabic Articles Based on Their Political Orientation

Proceeding: Third International Conference on Informatics Engineering and Information Science (ICIEIS2014) (ICIEIS)

Publication Date:

Authors : ; ; ; ;

Page : 302-309

Keywords : Arabic Language Processing; Authorship Authentication; Stylometric Features; Bag of Words; Classification;


Abstract

Dynamic online web pages (such as social networks, forums, and personal blogs), which now cover all fields (such as social, economic, and political events), allow Internet users to interact with their content, for example by writing comments and articles. On politics and political events, users post comments and articles that reflect their beliefs and ideologies. The ability to automatically determine the political orientation of an article can be of great benefit in many areas, from academia to security. This work addresses this important yet largely understudied problem for Arabic texts as a supervised learning problem. In addition to collecting and manually labeling a dataset of articles from different political orientations in the Arab world, the work studies the two most popular feature extraction approaches for such a problem (the TC approach and the stylometric features approach). Moreover, four classifiers are considered in order to study how different kinds of feature reduction techniques, such as stemming and feature selection, affect their effectiveness. Although the experimental results show the superiority of the TC approach over the stylometric features approach, they also show that the latter can be significantly improved by adding new and more discriminating features.
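
To make the contrast between the two feature families concrete, here is a minimal sketch in Python. It is illustrative only and is not the authors' implementation: the function names, the three style measures, and the English sample sentence are all assumptions chosen for brevity. The bag-of-words view (listed in the keywords) represents an article by which words occur and how often, while the stylometric view represents it by how it is written. Stemming, feature selection, and the classifiers themselves are omitted.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Content-based view: count each lowercased word token (no stemming)."""
    return Counter(re.findall(r"\w+", text.lower()))


def stylometric_features(text):
    """Style-based view: a few illustrative measures (hypothetical feature set)."""
    words = re.findall(r"\w+", text)
    # Split on Latin sentence enders and the Arabic question mark.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    punctuation = re.findall(r"[^\w\s]", text)
    n_words = len(words)
    return {
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "punctuation_per_word": len(punctuation) / n_words if n_words else 0.0,
    }


# Hypothetical sample text; real input would be labeled Arabic articles.
sample = "The vote was fair. The vote was free!"
bow = bag_of_words(sample)
style = stylometric_features(sample)
```

In a supervised setup, either representation would be turned into a numeric vector per article and paired with its manually assigned orientation label to train a classifier. The regular-expression tokenizer here is a simplification; diacritized Arabic text would need a proper tokenizer and stemmer.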

Last modified: 2014-09-23 23:01:04