LOGISTIC REGRESSION APPROACH FOR OUTLIER MINING IN HIGH DIMENSIONAL DATASET

Journal: International Journal of Advanced Research in Engineering and Technology (IJARET) (Vol.12, No. 01)

Publication Date:

Authors : ;

Page : 569-579

Keywords : Outliers Detection; High Dimensional Data; Synthetic dataset; Logistic regression; NCSS software;


Abstract

Outlier mining is the most commonly used technique for finding infrequent or exceptional instances in real-world scenarios. In the last few years, outlier detection has become a significant research area in data mining. The key objective of this research article is to identify the objects or patterns in large datasets that differ significantly from the normal patterns, i.e. objects with unpredictable, dissimilar, infrequent and abnormal behavior with respect to most of the dataset. Several algorithms have been proposed to address the challenges in the field of outlier mining, but these methods are unable to yield high accuracy in such environments. Developing an efficient method for detecting outliers in a huge database is therefore a crucial task. In this research article, a Lasso Regression technique is proposed for outlier detection in high dimensional datasets. The proposed methodology is implemented in the NCSS statistical software. Here, parameters such as Sum of Squares Error 0.76343, Model R² 0.10401, Mean Squares Error 1.07935081, Specificity 0.61000, Specificity 0.51556, RMSE 0.89333 and Coefficient of Variation 0.95889 are evaluated using a synthetic dataset. The experimental results illustrate that the proposed method identifies outliers with potentially higher precision in high dimensional datasets.
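
The evaluation measures named in the abstract have common textbook definitions. The sketch below shows one conventional way to compute them from binary outlier labels and predicted scores. It is an illustration only: the abstract does not say how NCSS derives these values, so the function name `evaluation_metrics`, the 0.5 threshold, the choice of dividing by n for the mean squared error, the definition of the coefficient of variation as RMSE divided by the mean response, and the toy data are all assumptions and are not taken from the paper.

```python
import math

def evaluation_metrics(y_true, y_score, threshold=0.5):
    """Textbook fit and classification measures for a binary outlier model.

    Hypothetical helper: assumes y_true holds 0/1 labels (1 = outlier),
    y_score holds predicted probabilities, and both classes are present.
    """
    n = len(y_true)
    mean_y = sum(y_true) / n

    # Sum of squared errors between labels and predicted scores.
    sse = sum((y - s) ** 2 for y, s in zip(y_true, y_score))
    # Total sum of squares around the mean label.
    sst = sum((y - mean_y) ** 2 for y in y_true)

    mse = sse / n              # assumption: divide by n, not n - p
    rmse = math.sqrt(mse)
    r2 = 1.0 - sse / sst       # proportion of variation explained

    # Threshold the scores to get hard predictions.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)

    return {
        "sse": sse,
        "mse": mse,
        "rmse": rmse,
        "r2": r2,
        "specificity": tn / (tn + fp),  # true negative rate
        "cv": rmse / mean_y,            # assumption: RMSE / mean response
    }

# Toy data (made up for illustration, not from the paper).
y_true = [0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.6, 0.3, 0.8, 0.4]
metrics = evaluation_metrics(y_true, y_score)
```

Under these definitions RMSE is always the square root of the mean squared error, which gives a quick consistency check when comparing reported values.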

Last modified: 2021-03-25 20:54:50