code clones Latest Research Papers

Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval

Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. To facilitate code clone reuse, we previously presented DeepClone, a novel deep learning approach that models code clones along with non-cloned code to predict the next set of tokens (possibly a complete clone method body) based on the code written so far. The probabilistic nature of language modeling, however, can lead to code output with minor syntax or logic errors. To resolve this, we propose a novel approach called Clone-Advisor. We apply an information retrieval technique on top of the DeepClone output to recommend real clone methods that closely match the predicted clone method, thus improving the original output of DeepClone. In this paper, we discuss and refine our previous work on DeepClone in much more detail. Moreover, we quantitatively evaluate the performance and effectiveness of Clone-Advisor in clone method recommendation.
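The retrieval step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which information retrieval technique is used, so the bag-of-tokens cosine similarity below is an assumption chosen for simplicity, and the names `tokenize`, `cosine`, and `recommend` are hypothetical. The idea is only to show how a possibly imperfect predicted method body could be matched against a corpus of real clone methods.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> list[str]:
    # Hypothetical tokenizer: identifiers/keywords as one token,
    # every other non-space character as its own token.
    return re.findall(r"[A-Za-z_]\w*|\S", code)


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two token-frequency vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(predicted: str, corpus: list[str], top_k: int = 1) -> list[str]:
    # Rank real clone methods by similarity to the predicted method body
    # (assumed ranking scheme, not the one from the paper).
    query = Counter(tokenize(predicted))
    ranked = sorted(
        corpus,
        key=lambda method: cosine(query, Counter(tokenize(method))),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    clone_methods = [
        "int add(int a, int b) { return a + b; }",
        "void log(String msg) { System.out.println(msg); }",
    ]
    # A predicted method with a minor syntax error (missing semicolon).
    predicted_method = "int add(int a, int b) { return a + b }"
    print(recommend(predicted_method, clone_methods))
```

In this sketch the recommended method is a real, syntactically valid method from the corpus, which is the sense in which retrieval can repair small errors in generated output.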

The Existence and Co-Modifications of Code Clones within or across Microservices

10.1145/3475716.3475784 ◽ 2021

Author(s): Ran Mo ◽ Yang Zhao ◽ Qiong Feng ◽ Zengyang Li

Keyword(s): Code Clones

Two-Pass Technique for Clone Detection and Type Classification Using Tree-Based Convolution Neural Network

Applied Sciences ◽ 10.3390/app11146613 ◽ 2021 ◽ Vol 11 (14) ◽ pp. 6613

Author(s): Young-Bin Jo ◽ Jihyun Lee ◽ Cheol-Jung Yoo

Keyword(s): Neural Network ◽ Average Rate ◽ Convolution Neural Network ◽ Clone Detection ◽ Code Clones ◽ Classification Technique ◽ Development Costs ◽ Code Quality ◽ Type Information ◽ Type Classification

Appropriate reliance on code clones significantly reduces development costs and hastens the development process. Reckless cloning, in contrast, reduces code quality and ultimately adds costs and time. To avoid this scenario, many researchers have proposed methods for clone detection and refactoring. The developed techniques, however, can reliably detect only clones that are entirely identical or that differ only in modified identifiers, and they do not provide clone-type information. This paper proposes a two-pass clone classification technique that uses a tree-based convolution neural network (TBCNN) to detect multiple clone types, including clones that are not wholly identical or to which only small changes have been made, and to automatically classify them by type. Our method was validated with BigCloneBench, a well-known and widely used dataset of cloned code. Our experimental results show that our technique detected clones with an average rate of 96% recall and precision, and classified clones with an average rate of 78% recall and precision.

Enhancing the Software Clone Detection in BigCloneBench

International Journal of Open Source Software and Processes ◽ 10.4018/ijossp.2021070102 ◽ 2021 ◽ Vol 12 (3) ◽ pp. 17-31

Author(s): Amandeep Kaur ◽ Munish Saini

Keyword(s): Structural Information ◽ Source Code ◽ Maintenance Cost ◽ Clone Detection ◽ Abstract Syntax ◽ Code Clones ◽ Major Drawback ◽ Abstract Syntax Tree ◽ Detection Techniques ◽ Code Clone

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraint or a language constraint. The major drawback of code clones is an increase in software maintenance cost, so clone detection techniques are required to remove or refactor code clones. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.

Analysis of Software Clones

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit217290 ◽ 2021 ◽ pp. 439-450

Author(s): Chavi Ralhan ◽ Rakesh Bishnoi ◽ Ankit ◽ Anjali ◽ Hitesh Kumar

Keyword(s): Entire Range ◽ Cost Benefit ◽ The State ◽ Future Research ◽ Code Clones ◽ Software Frameworks ◽ The Past ◽ Software Clones

Copied code, or code clones, are a kind of code that adversely affects the development and maintenance of software systems. Software clone research in the past generally centered on the detection and analysis of code clones, while recent research extends to the entire range of clone management. In the last decade, three surveys appeared in the literature, covering the detection, analysis, and evolutionary characteristics of code clones. This paper presents a comprehensive survey of the state of the art in clone management, with in-depth examination of clone management activities (e.g., tracking, refactoring, cost-benefit analysis) beyond detection and analysis. This is the first survey on clone management, in which we highlight the achievements so far and reveal avenues for further research needed towards an integrated clone management system. We believe that we have done a thorough job of surveying the area of clone management and that this work may serve as a guide for future research in the area.

Identifying High-Level Concept Clones in Software Programs Using Method’s Descriptive Documentation

Symmetry ◽ 10.3390/sym13030447 ◽ 2021 ◽ Vol 13 (3) ◽ pp. 447

Author(s): Aditi Gupta ◽ Rinkaj Goyal

Keyword(s): Similarity Measures ◽ Latent Semantic Indexing ◽ Coarse Grained ◽ Software Systems ◽ Abstract Data Type ◽ Code Clones ◽ Text Corpus ◽ Fine Grained ◽ Corpus Size ◽ High Level

Software clones are code fragments with similar or nearly similar functionality or structures. These clones are introduced in a project either accidentally or deliberately during the software development or maintenance process. The presence of clones poses a significant threat to the maintenance of software systems and is at the top of the list of code smell types. Clones can be simple (fine-grained) or high-level (coarse-grained), depending on the chosen granularity of code for the clone detection. Simple clones are generally viewed at the line/statement level, whereas high-level clones have the granularity of a block, method, class, or file. High-level clones are said to be composed of multiple simple clones. This study aims to detect high-level conceptual code clones (with Java methods as the granularity) in Java-based projects, and it is extendable to projects developed in other languages as well. Conceptual code clones are those implementing a similar higher-level abstraction, such as an Abstract Data Type (ADT) list. Based on the assumption that “similar documentation implies similar methods”, the proposed mechanism uses the “documentation” associated with methods to identify method-level concept clones. As the complete documentation does not contribute to the method’s semantics, we extracted only the description part of the method’s documentation, which led to two benefits: increased efficiency and reduced text corpus size. Further, we used Latent Semantic Indexing (LSI) with different combinations of weight and similarity measures to identify similar descriptions in the text corpus. To show the efficacy of the proposed approach, we validated it using three Java open source systems of sufficient length. The findings suggest that the proposed mechanism can detect methods implementing similar high-level concepts with improved recall values.

A comparative study of test code clones and production code clones

Journal of Systems and Software ◽ 10.1016/j.jss.2021.110940 ◽ 2021 ◽ pp. 110940

Author(s): Brent van Bladel ◽ Serge Demeyer

Keyword(s): Comparative Study ◽ Code Clones ◽ Production Code

Jupyter Notebooks on GitHub: Characteristics and Code Clones

The Art Science and Engineering of Programming ◽ 10.22152/programming-journal.org/2021/5/15 ◽ 2021 ◽ Vol 5 (3)

Author(s): Malin Källén ◽ Tobias Wrigstad

Keyword(s): Code Clones

Using Dynamic Time Warping to Detect Clones in Software Systems

International Journal of Software Innovation ◽ 10.4018/ijsi.2021010103 ◽ 2021 ◽ Vol 9 (1) ◽ pp. 20-36

Author(s): Mostefai Abdelkader

Keyword(s): Time Series ◽ Dynamic Time Warping ◽ Software Systems ◽ Clone Detection ◽ Code Clones ◽ Time Warping ◽ Code Clone ◽ Software Modules ◽ Software Clone Detection ◽ Dynamic Time

Software clone detection has been a widely researched area over the last two decades. Code clones are fragments of code judged similar by some metric of similarity. This paper proposes an approach for code clone detection using the dynamic time warping (DTW) technique. DTW is a well-known algorithm for aligning time series and measuring their similarity, and it has been found effective in many domains where similarity plays an important role, such as speech and gesture recognition. The proposed approach finds clones in three steps. First, software modules are extracted. Then, the extracted modules are turned into time series. Finally, the time series are compared using the DTW algorithm to find clones. The results of an experiment conducted on a well-known benchmark show that the approach can detect clones effectively in software systems.
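The three steps in this abstract can be sketched roughly as follows. `dtw_distance` is the classic dynamic-programming formulation of DTW. The abstract does not say how modules are turned into time series, so `module_to_series` (one value per non-blank line, equal to its stripped length) is a purely hypothetical encoding used for illustration, not the one from the paper.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def module_to_series(source: str) -> list[float]:
    # Hypothetical encoding (an assumption): stripped length of each
    # non-blank line of the module.
    return [float(len(line.strip())) for line in source.splitlines() if line.strip()]


if __name__ == "__main__":
    original = "total = 0\nfor x in xs:\n    total += x\nreturn total\n"
    # Same code with one statement duplicated; DTW can absorb the repetition.
    variant = "total = 0\nfor x in xs:\n    total += x\n    total += x\nreturn total\n"
    print(dtw_distance(module_to_series(original), module_to_series(variant)))
```

A small distance between two series would mark the corresponding modules as clone candidates; any threshold for that decision would have to be tuned and is not given in the abstract.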

Development of Porting Analyzer to Search Cross-Language Code Clones Using Levenshtein Distance

Smart Computing Techniques and Applications - Smart Innovation, Systems and Technologies ◽ 10.1007/978-981-16-0878-0_60 ◽ 2021 ◽ pp. 623-632

Author(s): Sanjay B. Ankali ◽ Latha Parthiban

Keyword(s): Levenshtein Distance ◽ Code Clones ◽ Cross Language

code clones
Recently Published Documents

Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval

The Existence and Co-Modifications of Code Clones within or across Microservices

Two-Pass Technique for Clone Detection and Type Classification Using Tree-Based Convolution Neural Network

Enhancing the Software Clone Detection in BigCloneBench

Analysis of Software Clones

Identifying High-Level Concept Clones in Software Programs Using Method’s Descriptive Documentation

A comparative study of test code clones and production code clones

Jupyter Notebooks on GitHub: Characteristics and Code Clones

Using Dynamic Time Warping to Detect Clones in Software Systems

Development of Porting Analyzer to Search Cross-Language Code Clones Using Levenshtein Distance
