
A Review on OCCT: A One Class Clustering Tree (OCCT) for Implementing One-to-Many Data Linkage

Journal: International Journal of Scientific Engineering and Research (IJSER) (Vol.3, No. 8)

Publication Date:

Authors : ; ;

Page : 58-61

Keywords : Data linkage; One Class Clustering Tree; Splitting; Pruning;



There is growing awareness in several countries of the potential of record linkage for recommender systems, data leakage detection, and fraud detection. Record linkage compares records in one data set with records in another data set in order to match them. Traditionally, record linkage is performed between tables to cluster the data. The proposed method aims to perform one-to-many data linkage, i.e. to associate one record in Table $T_A$ with one or more matching records in Table $T_B$, using a One Class Clustering Tree (OCCT). The OCCT supports one-to-one as well as one-to-many record linkage between objects of the same or different types that do not share a common attribute. The OCCT is easy to build and to convert into linkage rules. Its inner nodes contain attributes from Table $T_A$, and each leaf holds a compact representation of the subset of records from Table $T_B$ that are most likely to be linked with a matching record from Table $T_A$. The $T_A$ attribute values that correspond to a leaf are given by the path from the root of the tree to that leaf. The induced OCCT is small because of the splitting and pruning methods used; keeping the number of nodes low helps avoid overfitting. Earlier methods take a long time to perform one-to-many record linkage. The OCCT is based on a one-class approach, that is, it considers only positive (matching) examples. Hence, the proposed method provides better precision and recall than previous record linkage methods.
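The tree structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `Node` and `Leaf`, the functions `find_leaf` and `link`, the example attributes `age_group` and `genre`, the per-attribute frequency tables used as the leaf's "compact representation", and the score threshold are all assumptions made for illustration. Tree induction (splitting and pruning) is omitted, so the tree is built by hand.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Leaf:
    # Compact representation of the T_B records assigned to this leaf:
    # one value-frequency table per T_B attribute (an illustrative choice,
    # not necessarily the representation used in the paper).
    counts: dict = field(default_factory=dict)
    total: int = 0

    def add(self, b):
        """Record one matching T_B record (a positive example)."""
        self.total += 1
        for attr, value in b.items():
            self.counts.setdefault(attr, Counter())[value] += 1

    def score(self, b):
        """Product of the relative frequencies of b's attribute values."""
        if self.total == 0:
            return 0.0
        s = 1.0
        for attr, value in b.items():
            s *= self.counts.get(attr, Counter())[value] / self.total
        return s


@dataclass
class Node:
    attribute: str  # attribute of T_A tested at this inner node
    children: dict = field(default_factory=dict)  # value -> Node or Leaf


def find_leaf(tree, a):
    """Follow the T_A record's attribute values from the root to a leaf."""
    node = tree
    while isinstance(node, Node):
        node = node.children.get(a[node.attribute])
        if node is None:  # no branch for this attribute value
            return None
    return node


def link(tree, a, table_b, threshold=0.5):
    """Return every T_B record whose leaf score reaches the threshold."""
    leaf = find_leaf(tree, a)
    if leaf is None:
        return []
    return [b for b in table_b if leaf.score(b) >= threshold]


# Hand-built tree with one inner node and two leaves (hypothetical data).
adult, minor = Leaf(), Leaf()
for b in [{"genre": "drama"}, {"genre": "drama"}, {"genre": "thriller"}]:
    adult.add(b)
minor.add({"genre": "cartoon"})
tree = Node("age_group", {"adult": adult, "minor": minor})

table_b = [{"genre": "drama"}, {"genre": "thriller"}, {"genre": "cartoon"}]
matches = link(tree, {"age_group": "adult"}, table_b, threshold=0.3)
```

In this toy example a single $T_A$ record with `age_group = "adult"` is linked to two $T_B$ records, which is the one-to-many case. Only matching pairs are used to populate the leaves, which mirrors the one-class idea of learning from positive examples alone.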

Last modified: 2021-07-08 15:26:54