ResearchBib Share Your Research, Maximize Your Social Impacts
Sign for Notice Everyday Sign up >> Login

An Approach for Detecting Spam in Arabic Opinion Reviews

Journal: The International Arab Journal of Information Technology (Vol.12, No. 1)

Publication Date:

Authors : ; ;

Page : 9-16

Keywords : Opinion mining; arabic opinion mining; spam review; spam detection.;

Source : Downloadexternal Find it from : Google Scholarexternal

Abstract

For the rapidly increasing amount of information available on the Internet, little quality control exists, especially over the user-generated content. Manually scanning through large amounts of user-generated content is time-consuming and sometime impossible. In this case, opinion mining is a better alternative. Although, it is recognized that the opinion reviews contain valuable information for a variety of applications, the lack of quality control attracts spammers who have found many ways to draw their benefits from spamming. Moreover, the spam detection problem is complex because spammers always invent fresh methods that can't be easily recognized. Therefore, there is a need to develop a new approach that works to identify spam in opinion reviews. We have some in English; we need one in Arabic language in order to identify Arabic spam reviews. To the best of our knowledge, there is still no published study to detect spam in Arabic reviews. In this research, we propose a new approach for performing spam detection in Arabic opinion reviews by merging methods from data mining and text mining in one mining classification approach. Our work is based on the state-of-the-art achievements in the Latin-based spam detection techniques keeping in mind the specific nature of the Arabic language. In addition; we overcome the drawbacks of the class imbalance problem by using sampling techniques. The experimental results show that the proposed approach is effective in identifying Arabic spam opinion reviews. Our designed machine learning achieves significant improvements. In the best case, our F-measure is improved to 99.59%.

Last modified: 2019-11-14 20:01:05