Enhancing Generic Pipeline Model for Code Clone Detection using Divide and Conquer Approach
Journal: The International Arab Journal of Information Technology (Vol. 12, No. 5)
Publication Date: 2015-09-01
Authors : Al-Fahim Mubarak-Ali; Sharifah Syed-Mohamad; Shahida Sulaiman;
Page : 510-517
Keywords : Code clone detection; divide and conquer approach; generic pipeline model;
Abstract
Code clones are identical copies of the same instances or fragments of source code in software. Current code clone research focuses on the detection and analysis of code clones to help software developers identify clones in source code and reuse that code, thereby decreasing maintenance cost. Many approaches, such as textual-based, token-based, and tree-based comparison, have been used to detect code clones. As software grows and becomes a legacy system, the complexity of applying these approaches increases, which makes code clones more difficult to detect. The generic pipeline model is the most recent code clone detection model; it comprises five processes, namely parsing, pre-processing, pooling, comparing, and filtering. This research highlights an enhancement of the generic pipeline model using a divide and conquer approach that involves a concatenation process. The aim of this approach is to produce a better input for the generic pipeline model by processing smaller parts of the source code files before handling the large chunk of source code in a single pipeline. We implement and apply the proposed approach with the support of a tool called Java Code Clone Detector (JCCD). The results show an improvement in the rate of code clone detection and in overall runtime performance compared to the existing generic pipeline model.
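To make the five processes and the concatenation step concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use the JCCD API. It assumes a toy line-level textual comparison, and it assumes that "divide and conquer" means parsing and pre-processing each file on its own before concatenating the results into one pipeline input. All function names and the `min_len` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def parse(name, source):
    """Parsing: split one file into (file, line_no, text) units."""
    return [(name, i, line) for i, line in enumerate(source.splitlines(), start=1)]


def preprocess(units):
    """Pre-processing: normalize whitespace and drop blank lines."""
    cleaned = []
    for name, line_no, text in units:
        norm = " ".join(text.split())
        if norm:
            cleaned.append((name, line_no, norm))
    return cleaned


def pool(units):
    """Pooling: group unit locations by their normalized text."""
    pools = defaultdict(list)
    for name, line_no, norm in units:
        pools[norm].append((name, line_no))
    return pools


def compare(pools):
    """Comparing: every pair of locations in a pool is a clone candidate."""
    return {
        (text, a, b)
        for text, locations in pools.items()
        for a, b in combinations(sorted(locations), 2)
    }


def filter_clones(candidates, min_len):
    """Filtering: discard trivial matches such as a lone closing brace."""
    return {c for c in candidates if len(c[0]) >= min_len}


def detect_clones(files, min_len=8):
    """Assumed divide-and-conquer wrapper around the five processes."""
    # Divide: parse and pre-process each file separately (small inputs).
    parts = [preprocess(parse(name, src)) for name, src in files.items()]
    # Concatenate: merge the per-file results into a single pipeline input.
    merged = [unit for part in parts for unit in part]
    # Conquer: pool, compare, and filter over the merged input.
    return filter_clones(compare(pool(merged)), min_len)


if __name__ == "__main__":
    files = {
        "A.java": "int x = compute(a, b);\n}\n",
        "B.java": "  int x  = compute(a, b);\n}\n",
    }
    for clone in sorted(detect_clones(files)):
        print(clone)
```

In this sketch the per-file stage is independent for each file, so it could be run in parallel or cached. Whether the reported runtime gains come from a similar property is not stated in the abstract.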
Last modified: 2019-11-17 16:33:11