An Improved Synthetic Minority Over Sampling Technique for Imbalanced Datasets Classification

Journal: International Journal of Computer Science and Mobile Computing - IJCSMC (Vol.7, No. 12)

Publication Date:

Authors : ; ;

Page : 277-290

Keywords : Movie Recommendation System; Memory-Based Collaborative Filtering; Model-Based Collaborative Filtering; Stochastic Gradient Descent;

Abstract

With the rapid growth of data availability in many large-scale, complex, networked systems, such as surveillance, security, the Internet, and finance, it has become critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Learning from imbalanced data is a relatively new challenge that has attracted growing attention from both academia and industry. The problem concerns the distribution of samples between the majority and minority classes. When the minority class has far fewer instances than the other classes, the dataset exhibits class imbalance, and a trained classification model is likely to perform poorly on the minority class because minority class instances are misjudged as majority class instances. One solution is to balance the distribution by generating artificial minority class examples, and many algorithms have been designed based on this concept. We propose an improved algorithm, ISMOTE, to solve the class imbalance problem. ISMOTE differs from previous algorithms in that it considers not only the distribution of the minority class but also the relative dominance of the minority and majority classes in the density distribution, and uses this as the basis for weighting. In addition, our method generates artificial instances between minority class instances and their nearest neighbors. This reduces the difficulty of classifier learning caused by erroneous artificial instances, and the artificial examples generated in this way better help the classifier learn.
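
The idea of generating artificial minority examples between an instance and its nearest neighbors can be illustrated with a minimal sketch. This is plain SMOTE-style interpolation, not the ISMOTE algorithm itself: the density-based weighting described in the abstract is not modeled, and the function name `smote_like_oversample`, its parameters, and the sample points are all hypothetical choices made for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbors.

    Assumption: plain SMOTE-style sampling. The ISMOTE weighting step is
    NOT implemented here.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance; enough for ranking neighbors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of base, excluding base itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic

# Hypothetical two-dimensional minority class samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=5)
```

Each synthetic point lies on the line segment between two existing minority points, so it stays inside the region the minority class already occupies. A weighted variant along the lines the abstract describes would presumably bias the choice of `base` rather than drawing it uniformly.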

Last modified: 2018-12-30 18:25:45