A Survey on Duplicate Detection in Hierarchical Data
Journal: International Journal of Science and Research (IJSR) (Vol. 3, No. 12)
Publication Date: 2014-12-05
Authors : Nikhil Gawande; S. R. Todamal;
Page : 751-754
Keywords : duplicate detection; record linkage; entity resolution; XML; Bayesian networks; data cleaning; optimization;
Abstract
Although a great deal of work has been done on identifying duplicates in relational data, only a few solutions focus on identifying duplicates in more complex hierarchical structures such as XML data. In this paper, we present a novel method for XML duplicate detection called XMLDup. XMLDup uses a Bayesian network to determine the probability that two XML nodes are duplicates, considering not only the information within the nodes but also the way that information is structured. In addition, to increase the efficiency of the network evaluation, a novel pruning strategy is presented that is capable of significant gains over the unoptimized version of the algorithm. Through experiments and comparisons, we show that our algorithm achieves high precision and recall scores on several datasets. XMLDup thus improves both the efficiency and the effectiveness of XML duplicate detection.
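The idea of scoring two XML nodes by both their values and their structure, and of abandoning a comparison early, can be illustrated with a small sketch. This is not the XMLDup network itself: the abstract does not give its conditional probabilities or its pruning rule, so the code below makes its own assumptions. The function names `text_sim` and `dup_prob`, the use of `difflib.SequenceMatcher` as the value similarity, the averaging of child scores, and the upper-bound pruning test are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def text_sim(a, b):
    """String similarity in [0, 1] for two leaf values (assumed measure)."""
    a, b = (a or "").strip().lower(), (b or "").strip().lower()
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def dup_prob(x, y, threshold=0.0):
    """Illustrative score in [0, 1] that elements x and y are duplicates.

    Leaves are compared by value; inner nodes average the scores of
    their child groups, so structure influences the result. Text inside
    elements that also have children is ignored in this sketch.
    Returns 0.0 early once the score can no longer reach `threshold`.
    """
    if x.tag != y.tag:
        return 0.0
    kids_x, kids_y = list(x), list(y)
    if not kids_x and not kids_y:
        return text_sim(x.text, y.text)

    tags = sorted({c.tag for c in kids_x} | {c.tag for c in kids_y})
    total = 0.0
    for i, tag in enumerate(tags):
        xs = [c for c in kids_x if c.tag == tag]
        ys = [c for c in kids_y if c.tag == tag]
        if xs and ys:
            # Each child of x is paired with its best-scoring child of y.
            best = [max(dup_prob(cx, cy) for cy in ys) for cx in xs]
            total += sum(best) / len(best)
        # Pruning: even if every remaining tag scored 1.0, could the
        # final average still reach the threshold? If not, stop here.
        remaining = len(tags) - (i + 1)
        if (total + remaining) / len(tags) < threshold:
            return 0.0
    return total / len(tags)


a = ET.fromstring(
    "<movie><title>The Matrix</title><year>1999</year>"
    "<cast><actor>Keanu Reeves</actor></cast></movie>")
b = ET.fromstring(
    "<movie><title>Matrix, The</title><year>1999</year>"
    "<cast><actor>Keanu Reves</actor></cast></movie>")
c = ET.fromstring(
    "<movie><title>Titanic</title><year>1997</year>"
    "<cast><actor>Kate Winslet</actor></cast></movie>")

print(dup_prob(a, b), dup_prob(a, c))
```

In this sketch the pair `a`, `b` scores higher than the pair `a`, `c`, and calling `dup_prob(a, c, threshold=0.9)` stops after the first child group because the remaining groups could not lift the average to 0.9. A real Bayesian network would replace the plain averages with conditional probabilities attached to each node.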