Enhancing intrusion detection with imbalanced data classification and feature selection in machine learning algorithms

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol. 11, No. 112)

Page : 405-419

Keywords : Machine learning (ML); Intrusion detection system (IDS); UNSW-NB15 dataset; XGBoost algorithm; Convolutional neural network (CNN)

Abstract

An organization's ability to detect and prevent computer network (CN) attacks depends heavily on the performance of its intrusion detection systems (IDS) and intrusion prevention systems (IPS). This research focuses on machine learning (ML) based IDS, which it argues are effective and accurate at detecting network attacks. The models are trained and tested on the UNSW-NB15 network intrusion detection dataset. A filter-based attribute reduction (feature selection) approach is implemented using the extreme gradient boosting (XGBoost) algorithm. The reduced feature space is then used to train several classifiers: support vector machine (SVM), logistic regression (LR), k-nearest neighbour (KNN), decision tree (DT), and convolutional neural network (CNN). A suitable feature selection approach is essential for eliminating features that contribute little to classification. The study also notes that many ML-based IDS suffer from limited detection accuracy and a higher false positive rate (FPR) when trained on highly imbalanced datasets. Both binary and multiclass classification configurations are considered. Results indicate that XGBoost-based feature selection lets methods such as DT raise the test accuracy of the binary classification scheme from 88.13% to 90.85%. The XGBoost-KNN and XGBoost-DT configurations also show improved performance.
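The abstract's point about imbalanced data and the false positive rate can be made concrete with a small calculation. The sketch below is not the paper's code: the helper name `binary_metrics` and the labels are hypothetical and synthetic, chosen only to show how accuracy can look acceptable while FPR is at its worst. It assumes label 1 means "attack" and label 0 means "normal".

```python
from collections import Counter


def binary_metrics(y_true, y_pred):
    """Return (accuracy, false positive rate) for 0/1 labels, where 1 = attack."""
    counts = Counter(zip(y_true, y_pred))  # keys are (true label, predicted label)
    tp = counts[(1, 1)]
    tn = counts[(0, 0)]
    fp = counts[(0, 1)]  # normal traffic flagged as an attack
    fn = counts[(1, 0)]  # attack that was missed
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # FPR = FP / (FP + TN); defined as 0.0 here when there are no normal records.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Synthetic, illustrative labels (NOT drawn from UNSW-NB15):
# 90 attack records and 10 normal records, i.e. a heavily imbalanced set.
y_true = [1] * 90 + [0] * 10
# A degenerate classifier that labels every record as an attack.
y_pred = [1] * 100

accuracy, fpr = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, FPR={fpr:.2f}")
```

On this toy data the degenerate classifier scores 90% accuracy while misclassifying every normal record, which is why reporting FPR alongside accuracy matters when the classes are imbalanced.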

Last modified: 2024-04-04 15:29:13