Enhancing the Software Clone Detection in BigCloneBench

In a software system, copying and pasting code snippets within the same software or into other software results in cloning. The basic cause of cloning is either programmer constraints or language constraints. The major drawback of code clones is an increase in software maintenance cost, so clone detection techniques are required to remove or refactor clones. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of the source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.

A Qualitative Interpretation of Code Clone Detection Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.d1429.029420 ◽ 2020 ◽ Vol 9 (4) ◽ pp. 1295-1300

Keyword(s): Clone Detection ◽ Code Clones ◽ Detection Techniques ◽ Software Developer ◽ Advantages And Disadvantages ◽ Code Clone ◽ Qualitative Interpretation ◽ Software Code

Clones are software code snippets that are similar to each other with minor modifications. Clones typically make up 10-20 percent of a software system. Many techniques have been developed for detecting them. With code clone detection, the software developer gets an idea of which clones to remove or refactor. Code clones have both advantages and disadvantages for a given software system. In this paper, we explore the types of code clones, their advantages and disadvantages, and the reasons for cloning. We then describe various detection techniques in terms of several parameters. Lastly, we discuss gaps in the research.

A Survey on Software Code Clone Detection to Improve the Maintenance Effort and Maintenance Cost of the Software

International Journal of Computer Sciences and Engineering ◽ 10.26438/ijcse/v6si3.188192 ◽ 2018 ◽ Vol 06 (03) ◽ pp. 188-192

Author(s): V. Guna ◽ M. Sunil Kumar

Keyword(s): Maintenance Cost ◽ Clone Detection ◽ Code Clone ◽ Software Code

Metric level based code clone detection using optimized code manager

International Journal of Engineering & Technology ◽ 10.14419/ijet.v7i2.27.13763 ◽ 2018 ◽ Vol 7 (2.27) ◽ pp. 144

Author(s): Gundeep Kaur ◽ Sumit Sharma

Keyword(s): Semantic Analysis ◽ Performance Metrics ◽ Object Oriented Programming ◽ Software Systems ◽ Clone Detection ◽ Major Drawback ◽ Syntax Analysis ◽ Code Clone ◽ SVM Algorithm ◽ Optimized Code

Object-oriented programming is today the main paradigm in typical software development. Code cloning occurs throughout the design and development of software systems. Detection can be based on textual, lexical, syntax, semantic, hybrid, or metric analysis. The major drawback of present research is that it focuses on fragments of copied code and overlooks the possibility that the duplicated fragments may be part of a larger replicated program structure. As a result, existing techniques take a lot of time and add complexity. In our research, source code is scanned to detect its methods using an “OPTIMIZED SVM ALGORITHM”; the method definitions are extracted, collected by means of a CLONE CODE, and saved for further reference. To evaluate performance, we calculate the LOC, the number of repetitions, and the maximum and minimum LOC. The aim is to improve precision, recall, and accuracy and to reduce the error rate and time complexity.
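The evaluation metrics named in this abstract (LOC, number of repetitions, maximum and minimum LOC) can be illustrated with a minimal sketch. It is a stand-in under stated assumptions, not the paper's method: it uses Python's standard `ast` module instead of the optimized SVM algorithm, it counts two methods as a repetition only when their bodies are structurally identical, and `method_metrics` is a hypothetical name chosen for illustration.

```python
import ast
from collections import defaultdict
from typing import Dict, List


def method_metrics(source: str) -> Dict[str, object]:
    """Extract function definitions and compute simple LOC-based metrics.

    Assumption: a "repetition" is a group of functions whose bodies have
    identical AST dumps (exact duplicates apart from layout and comments).
    """
    tree = ast.parse(source)
    loc_by_name: Dict[str, int] = {}
    groups: Dict[str, List[str]] = defaultdict(list)

    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Lines spanned by the definition, from `def` to its last statement.
            loc_by_name[node.name] = node.end_lineno - node.lineno + 1
            # Fingerprint the body only, so the function name is ignored.
            fingerprint = "".join(ast.dump(stmt) for stmt in node.body)
            groups[fingerprint].append(node.name)

    repetitions = [sorted(names) for names in groups.values() if len(names) > 1]
    locs = list(loc_by_name.values())
    return {
        "loc": loc_by_name,
        "repetitions": repetitions,
        "max_loc": max(locs, default=0),
        "min_loc": min(locs, default=0),
    }
```

Because the fingerprint keeps identifier names, this sketch only finds exact duplicates; renamed or restructured clones would need a normalization step or a learned classifier such as the one the abstract describes.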

Multi-Agent based Sequence Algorithm for Detecting Plagiarism and Clones in Java Source Code using Abstract Syntax Tree

International Journal of Computer Applications ◽ 10.5120/15796-4494 ◽ 2014 ◽ Vol 90 (15) ◽ pp. 19-24 ◽ Cited By ~ 1

Author(s): D. Poongodi ◽ G. Tholkkappia Arasu

Keyword(s): Source Code ◽ Abstract Syntax ◽ Agent Based ◽ Abstract Syntax Tree ◽ Syntax Tree ◽ Multi Agent

Integrated Reasoning Engine for Code Clone Detection

ABC Journal of Advanced Research ◽ 10.18034/abcjar.v3i2.575 ◽ 2014 ◽ Vol 3 (2) ◽ pp. 143-152 ◽ Cited By ~ 5

Author(s): Naresh Babu Bynagari

Keyword(s): Clone Detection ◽ Code Clones ◽ High Pitch ◽ Detection Process ◽ Code Clone ◽ Similar Code ◽ Reasoning Engine

This article examines integrated reasoning for code clone detection and how it can be carried out effectively, given the amount of analysis such work usually involves. Detecting clones requires close familiarity with cloning systems and how they work, so discovering similar code segments, commonly called code clones, is not an easy task. The detection process serves key purposes in vulnerability discovery, refactoring, and plagiarism detection. The article shows that finding identical code segments is a serious challenge, especially for large code bases [1; 2; 3; 4]. Detection involves several approaches and considerable technical depth; drawing on the resources it surveys, the article identifies the simplest approach to adopt for these problems.

Using Dynamic Time Warping to Detect Clones in Software Systems

International Journal of Software Innovation ◽ 10.4018/ijsi.2021010103 ◽ 2021 ◽ Vol 9 (1) ◽ pp. 20-36

Author(s): Mostefai Abdelkader

Keyword(s): Time Series ◽ Dynamic Time Warping ◽ Software Systems ◽ Clone Detection ◽ Code Clones ◽ Time Warping ◽ Code Clone ◽ Software Modules ◽ Software Clone Detection ◽ Dynamic Time

Software clone detection has been a widely researched area over the last two decades. Code clones are fragments of code judged similar by some metric of similarity. This paper proposes an approach for code clone detection using the dynamic time warping (DTW) technique. DTW is a well-known algorithm for aligning time series and measuring their similarity, and it has been found effective in many domains where similarity plays an important role, such as speech and gesture recognition. The proposed approach finds clones in three steps. First, software modules are extracted. Then, the extracted modules are converted into time series. Finally, the time series are compared using the DTW algorithm to find clones. The results of an experiment conducted on a well-known benchmark show that the approach can detect clones effectively in software systems.
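The last two steps of this pipeline can be sketched as follows. `dtw_distance` is the textbook dynamic-programming form of DTW with absolute difference as the local cost. The abstract does not say how modules are converted into time series, so `module_to_series` is a purely hypothetical encoding (one value per non-blank line, equal to its stripped length) used only to make the example runnable.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier element of b
                cost[i][j - 1],      # b[j-1] also matches an earlier element of a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def module_to_series(source: str) -> List[float]:
    """Hypothetical encoding: stripped length of each non-blank line."""
    return [float(len(line.strip())) for line in source.splitlines() if line.strip()]
```

A small distance between two encoded modules suggests a clone candidate. Because DTW lets one element align with several, a module with a duplicated or inserted line can still score close to the original; the decision threshold is something the user would need to calibrate on labeled data.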

WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection

Scientific Programming ◽ 10.1155/2017/7809047 ◽ 2017 ◽ Vol 2017 ◽ pp. 1-8 ◽ Cited By ~ 12

Author(s): Deqiang Fu ◽ Yanyan Xu ◽ Haoran Yu ◽ Boyang Yang

Keyword(s): Kernel Method ◽ Source Code ◽ Detection Methods ◽ Abstract Syntax ◽ Plagiarism Detection ◽ Abstract Syntax Tree ◽ Syntax Tree ◽ Tree Kernel ◽ Document Frequency ◽ Abstract Syntax Trees

In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education. Unlike other plagiarism detection methods, WASTK takes into account aspects other than the similarity between programs. WASTK first converts the source code of a program into an abstract syntax tree and then obtains the similarity by calculating the tree kernel of two abstract syntax trees. To avoid misjudgment caused by trivial code snippets or frameworks given by instructors, an idea similar to TF-IDF (Term Frequency-Inverse Document Frequency) from the field of information retrieval is applied: each node in an abstract syntax tree is assigned a weight by TF-IDF. WASTK is evaluated on different datasets and, as a result, performs much better than other popular methods such as Sim and JPlag.
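The idea of down-weighting AST nodes that appear in every submission can be illustrated with a simplified sketch. This is not the WASTK algorithm: it replaces the tree kernel with a bag of AST node types compared by cosine similarity, it parses Python with the standard `ast` module, and the function names and the smoothed IDF formula are assumptions made for illustration.

```python
import ast
import math
from collections import Counter
from typing import Dict, List


def node_type_counts(source: str) -> Counter:
    """Term frequencies: how often each AST node type occurs in a program."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def idf_weights(corpus: List[Counter]) -> Dict[str, float]:
    """Smoothed inverse document frequency per node type (assumed formula)."""
    n = len(corpus)
    doc_freq: Counter = Counter()
    for counts in corpus:
        doc_freq.update(counts.keys())
    # Node types present in every program get the smallest weight.
    return {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def weighted_similarity(a: Counter, b: Counter, idf: Dict[str, float]) -> float:
    """Cosine similarity between TF-IDF weighted node-type vectors."""
    keys = set(a) | set(b)
    va = {k: a[k] * idf.get(k, 1.0) for k in keys}
    vb = {k: b[k] * idf.get(k, 1.0) for k in keys}
    dot = sum(va[k] * vb[k] for k in keys)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Computing `idf_weights` over all submissions, including any instructor-provided framework code, reduces the influence of structure that every program shares. A bag of node types ignores tree shape, which is exactly what a tree kernel would add.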
