Evaluating the efficacy of decision tree-based machine learning in classifying intrusive behaviour of network users
Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.11, No. 114)
Publication Date: 2024-05-31
Authors : Ashalata Panigrahi; Manas Ranjan Patra
Page : 736-758
Keywords : Machine learning; Cross-validation; Discriminant power; Geometric mean; Random forest; Naïve Bayes tree
Abstract
Building network intrusion detection models that detect the intrusive behaviour of malicious users is a major challenge in protecting network resources. In this study, decision tree (DT) based machine learning (ML) classification techniques, namely, best first tree (BFT), functional tree (FT), J48, naïve Bayes tree (NBT), random forest (RF), random tree (RT), reduced error pruning tree (REPT), and simple classification and regression tree (Simple CART), have been employed to build an anomaly-based network intrusion detection model. Further, in order to remove irrelevant features from the intrusion data, three categories of feature selection techniques have been applied: (i) entropy based (gain ratio (GR), information gain (IG), and symmetrical uncertainty (SU)), (ii) statistical based (chi-squared, one-r, and relief-f), and (iii) search based (exploratory data analysis (EDA), feature subset harmony search (FSHS), linear forward search (LFS), and feature vote harmony search (FVHS)). The proposed method was evaluated using the widely recognized NSL-KDD dataset. The efficacy of all combinations of the eight classifiers and ten feature selection methods (eighty models) was analysed using seventeen evaluation metrics, including sensitivity, false positive rate (FPR), Matthews correlation coefficient (MCC), Kappa coefficient (KC), geometric mean (GM), and discriminant power (DP). Experimental results showed that the LFS+RF model achieved the highest accuracy of 0.9989, sensitivity of 0.9982, F-value of 0.9988, specificity of 0.9994, false negative rate (FNR) of 0.0018, MCC of 0.9977, GM of 0.9988, and DP of 7.6156 on the NSL-KDD dataset. The proposed model outperformed existing models such as support vector machine (SVM), JRip, bagging, deep learning, and neural network (NN).
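Several of the metrics named in the abstract can be derived directly from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, and the formulas are the commonly used definitions (GM as the square root of sensitivity times specificity, DP as a scaled sum of the log-odds of sensitivity and specificity), which the abstract itself does not spell out. It assumes both classes are present in the data, so that no denominator is zero.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common confusion-matrix metrics for a binary classifier.

    Assumes at least one positive and one negative instance
    (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)

    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    # Geometric mean of sensitivity and specificity.
    gm = math.sqrt(sensitivity * specificity)

    # Discriminant power (commonly used definition, assumed here);
    # undefined when sensitivity or specificity is exactly 0 or 1.
    if 0 < sensitivity < 1 and 0 < specificity < 1:
        dp = (math.sqrt(3) / math.pi) * (
            math.log(sensitivity / (1 - sensitivity))
            + math.log(specificity / (1 - specificity))
        )
    else:
        dp = float("nan")

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1 - specificity,
        "fnr": 1 - sensitivity,
        "mcc": mcc,
        "gm": gm,
        "dp": dp,
    }


# Hypothetical counts for illustration only; these are not results from the paper.
example = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

Under these definitions, GM and DP reward a classifier only when it performs well on both the attack and the normal class, which is presumably why they are reported alongside plain accuracy.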
Last modified: 2024-06-04 23:16:02