Parallel and Multiple E-Data Distributed Process with Progressive Duplicate Detection Model

Journal: Bonfring International Journal of Software Engineering and Soft Computing (Vol.8, No. 1)

Publication Date:

Authors : ; ;

Page : 23-25

Keywords : --;

Abstract

At present, duplicate detection methods need to process ever larger datasets in ever shorter time, which makes the datasets difficult to maintain. This project presents progressive duplicate detection algorithms that increase the efficiency of finding duplicates when execution time is limited: they maximize the gain of the overall process within the available time by reporting most results early. The experiments show that progressive algorithms can double the efficiency over time of traditional duplicate detection. Progressive duplicate detection identifies most duplicate pairs early in the detection process. Instead of reducing the overall time needed to finish the entire process, this approach tries to reduce the average time needed to find a duplicate.
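
The progressive idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm, which the abstract does not specify; it assumes a sorted-neighborhood style ordering in which records are sorted by a key and compared at increasing rank distance, so the most promising pairs are checked first and the run can stop whenever a comparison budget is used up. The function name progressive_duplicates, the parameters max_comparisons and threshold, the use of difflib for similarity, and the sample records are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def progressive_duplicates(records, key, max_comparisons, threshold=0.85):
    """Yield likely duplicate pairs, trying the most promising pairs first.

    Hypothetical sketch: records are sorted by `key`, then compared at rank
    distance 1, 2, 3, ... until `max_comparisons` comparisons have been made.
    """
    ordered = sorted(records, key=key)
    comparisons = 0
    # Distance 1 compares direct neighbours in sort order, which are the most
    # likely duplicates; larger distances are tried only while budget remains.
    for distance in range(1, len(ordered)):
        for i in range(len(ordered) - distance):
            if comparisons >= max_comparisons:
                return  # budget (a stand-in for limited execution time) is spent
            comparisons += 1
            a, b = ordered[i], ordered[i + distance]
            if SequenceMatcher(None, key(a), key(b)).ratio() >= threshold:
                yield a, b


# Made-up sample data, for illustration only.
records = ["john smith", "jon smith", "mary jones", "marry jones", "alice brown"]
found = list(progressive_duplicates(records, key=str.lower, max_comparisons=4))
print(found)
```

With five records there are ten possible pairs; under the assumptions above, a budget of four comparisons covers only the directly neighbouring pairs in sort order, which is where this sketch expects near-identical records to sit. A fixed comparison count is used instead of a wall-clock limit so that the behaviour is reproducible.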

Last modified: 2018-10-27 15:40:21