Progressive Detection of Duplicate Data
Journal: International Journal of Science and Research (IJSR), Vol. 6, No. 1
Publication Date: 2017-01-05
Authors: Deepa Bhattacharya; Sapna Patle
Pages: 1647-1649
Keywords: Duplicate detection; entity resolution; progressiveness; data cleaning
Abstract
Duplicate detection is the process of identifying multiple representations of the same real-world entity in a dataset. Today, duplicate detection methods must process ever-larger datasets in ever-shorter time while maintaining data quality, which makes identifying duplicated entities increasingly difficult. This application focuses on duplicates in hierarchical data such as XML files. The loaded datasets pass through processing, extraction, cleaning, separation, and detection stages to remove duplicated data. Comprehensive experiments show that our progressive algorithms can double the efficiency over time of traditional duplicate detection and significantly improve upon related work.
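The abstract does not describe the algorithm in detail, but a common way to realize progressiveness is a progressive sorted-neighborhood approach: sort records by a key, then compare pairs at increasing rank distance, so the most similar candidate pairs are checked first and detection can be stopped at any time with useful partial results. The sketch below is a minimal illustration of that idea; the record list, the token_jaccard similarity measure, and the threshold are illustrative assumptions, not the authors' implementation, and it operates on flat strings rather than the hierarchical XML data the paper targets.

```python
def token_jaccard(a, b):
    """Jaccard similarity over lowercase whitespace tokens (illustrative measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def progressive_duplicates(records, key, similarity, threshold=0.8):
    """Progressive sorted-neighborhood sketch: sort records by a sorting key,
    then compare pairs at rank distance 1, 2, 3, ... so likely duplicates
    (close neighbors in sort order) are emitted first."""
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    n = len(order)
    for dist in range(1, n):              # widen the comparison window progressively
        for i in range(n - dist):
            a, b = records[order[i]], records[order[i + dist]]
            if similarity(a, b) >= threshold:
                yield a, b

# Usage: report duplicate candidates among person-like records,
# with the closest pairs in sort order examined first.
people = ["John A. Smith", "Jon Smith", "Jane Doe", "John Smith", "J. Doe"]
for a, b in progressive_duplicates(people, key=str.lower,
                                   similarity=token_jaccard, threshold=0.5):
    print(a, "<->", b)
```

Because the generator yields matches as soon as they are found, a caller can consume only as many results as its time budget allows, which is the sense in which progressive methods deliver more duplicates per unit of time than a traditional exhaustive pass.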