
DETECTION OF FAKE NEWS ON TWITTER USING MACHINE LEARNING: AN XGBOOST-BASED APPROACH WITH SENTIMENT AND SOURCE CHARACTERISTIC ANALYSIS

Journal: International Journal of Advanced Research (Vol.12, No. 08)

Publication Date:

Authors : ;

Page : 948-964

Keywords : Fake News; Social Media; Twitter; Machine Learning; Ensemble Learning; Sentiment Analysis; Detection Algorithm; XGBoost; TruthSeeker 2023 Dataset


Abstract

The spread of fake news on social media platforms is an increasingly alarming problem, as fake news becomes more deceptive and harder to detect. Twitter, in particular, poses a significant threat because fake news spreads faster than real news on the platform, amplifying misinformation and leading to serious consequences. This project presents a machine learning-based approach for detecting fake news tweets on Twitter using the TruthSeeker 2023 dataset from the University of New Brunswick. As the largest ground truth dataset for fake news detection on social media, it contains over 130,000 crowdsourced tweets, enabling the creation of a broader model that is more applicable to real-world scenarios. The algorithm employed in this study uses gradient-boosted decision trees (XGBoost) to develop a novel method for classifying tweets as fake or real news. The proposed model preprocesses the data by extracting additional features for each tweet, such as detailed sentiment analysis of both the tweet and the related news statement, as well as features pertaining to the author. These features are appended to the tweet's feature vector. The enhanced feature vectors are then fed into an XGBoost model, with hyperparameters tuned through a grid search, to perform binary classification. The additional extracted features increase the robustness of the model by highlighting key differentiating factors between real and fake tweets. The results of this study demonstrate the effectiveness of the proposed algorithm, which achieves an accuracy of 0.9335 on over 13,000 unseen tweets.
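The feature-augmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the sentiment analyzer or the specific author features, so the tiny word lexicon, the `Tweet` fields (`followers`, `following`, `verified`), and the helper names `sentiment_score` and `augment` are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Toy sentiment lexicon: a hypothetical stand-in for a real sentiment analyzer.
POSITIVE = {"good", "great", "true", "confirmed", "win"}
NEGATIVE = {"bad", "fake", "hoax", "lie", "disaster"}


@dataclass
class Tweet:
    text: str        # the tweet itself
    statement: str   # the news statement the tweet refers to
    followers: int   # hypothetical author features (assumed, not from the paper)
    following: int
    verified: bool


def sentiment_score(text):
    """Return (positive - negative word count) / token count, a value in [-1, 1]."""
    tokens = [t.strip(".,!?#@\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def augment(base_vector, tweet):
    """Append sentiment and author features to an existing feature vector."""
    tweet_sent = sentiment_score(tweet.text)
    stmt_sent = sentiment_score(tweet.statement)
    # +1 in the denominator avoids division by zero for accounts following nobody.
    follower_ratio = tweet.followers / (tweet.following + 1)
    return list(base_vector) + [
        tweet_sent,
        stmt_sent,
        abs(tweet_sent - stmt_sent),  # sentiment gap between tweet and statement
        follower_ratio,
        float(tweet.verified),
    ]


example = Tweet("This is a hoax!", "Vaccine confirmed safe", 99, 9, True)
enhanced = augment([1.0, 2.0], example)
```

Vectors built this way could then be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`, with hyperparameters searched by a tool like scikit-learn's `GridSearchCV`; the grid actually used in the study is not given in the abstract.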

Last modified: 2024-09-17 19:59:07