A Local and Global Discretization Method
Journal: International Journal of Information Engineering (IJIE) (Vol. 3, No. 1)
Publication Date: 2013-03-29
Authors : Yu Sang; Pingfei Zhu; Keqiu Li; Heng Qi; Yueting Zhu;
Page : 6-17
Keywords : Machine Learning; Data Mining; Discretization; Merging Criterion; Independence Measure;
Abstract
Most machine learning and data mining algorithms require that the training data contain only discrete attributes, which makes it necessary to discretize continuous numeric attributes. Bottom-up discretization algorithms are a well-known family of methods for this task. They mainly discretize data based on either a local or a global independence measure. In this paper, we present a novel bottom-up discretization method that combines local and global independence measures. First, we present a novel merging criterion that captures, both locally and globally, the independence between the discretized attributes and the decision class; this is done by evaluating pairs of intervals with a proposed measure and by developing a measure of the significance of an interval pair among attributes. The advantage of the proposed merging criterion is further analyzed. Moreover, we develop an algorithm to find the best discretization based on the new merging criterion. Detailed analysis shows that the proposed method brings higher accuracy to the discretization process. Finally, we conduct extensive experiments on 18 real-world datasets to evaluate the performance of the proposed method in comparison with existing methods. The experimental results show that the proposed method outperforms existing methods on the performance metrics considered.
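
For readers unfamiliar with bottom-up discretization, the following is a minimal sketch of the general merging loop, assuming the classic chi-square statistic as a purely local independence measure (in the style of ChiMerge). It is not the merging criterion proposed in the paper, which the abstract does not specify; the function names `chi2_pair` and `bottom_up_discretize` and the `max_intervals` stopping rule are hypothetical choices made for illustration.

```python
from collections import Counter


def chi2_pair(a, b, classes):
    """Chi-square statistic for the class counts of two adjacent intervals."""
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a.get(c, 0) + b.get(c, 0)
            expected = row_total * col_total / total
            if expected > 0:  # skip classes absent from both intervals
                stat += (row.get(c, 0) - expected) ** 2 / expected
    return stat


def bottom_up_discretize(values, labels, max_intervals):
    """Merge adjacent intervals until at most max_intervals remain.

    Assumption: a fixed interval count is the stopping rule; real methods
    usually stop on a significance threshold instead.
    Returns the cut points (lower bounds of every interval but the first).
    """
    classes = sorted(set(labels))
    # Start with one interval per distinct attribute value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    intervals = [(v, counts[v]) for v in sorted(counts)]

    while len(intervals) > max_intervals:
        # Score every adjacent pair; a low score means the class
        # distribution looks independent of which interval a sample is in.
        scores = [
            chi2_pair(intervals[i][1], intervals[i + 1][1], classes)
            for i in range(len(intervals) - 1)
        ]
        i = scores.index(min(scores))
        merged = (intervals[i][0], intervals[i][1] + intervals[i + 1][1])
        intervals[i:i + 2] = [merged]

    return [lower for lower, _ in intervals[1:]]


cuts = bottom_up_discretize([1, 2, 3, 4, 5, 6], list("aaabbb"), max_intervals=2)
print(cuts)
```

In this sketch each pair of adjacent intervals is scored in isolation, which is what makes the criterion local. A global measure would instead score the whole discretization scheme, and the paper's contribution, per the abstract, is a criterion that takes both views into account.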
Last modified: 2013-06-29 22:43:17