code clone Latest Research Papers

What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3485135 ◽

2022 ◽

Vol 31 (2) ◽

pp. 1-34

Author(s):

Patrick Keller ◽

Abdoul Kader Kaboré ◽

Laura Plein ◽

Jacques Klein ◽

Yves Le Traon ◽

...

Keyword(s):

Transfer Learning ◽

Language Processing ◽

State Of The Art ◽

Semantic Representation ◽

Source Code ◽

Visual Representations ◽

Representation Learning ◽

Classification Problem ◽

Semantic Code ◽

Code Clone

Recent successes in training word embeddings for Natural Language Processing (NLP) tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is then to produce code embeddings that capture as much of the program semantics as possible. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WySiWiM ("What You See Is What It Means") approach, where visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on the task of vulnerable code prediction in source code and on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem) and code classification (a multi-class classification problem). We show with experiments on BigCloneBench (Java) and Open Judge (C) that, although simple, our WySiWiM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We also show with data from NVD and SARD that the WySiWiM representation can be used to learn a vulnerable code detector with reasonable performance (accuracy ∼90%). We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, and finally discuss the promises and limitations of this research direction.
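The idea of treating code as an image can be illustrated with a very small sketch. The function below is not the rendering used by WySiWiM; it is a simplified stand-in, assuming one grid cell per character and a hypothetical helper name `code_to_grid`, that shows how source text can be mapped onto a fixed-size 2D array of the kind an image model expects as input.

```python
def code_to_grid(code: str, height: int = 8, width: int = 16) -> list[list[int]]:
    """Map code text onto a fixed-size grid of 0-255 intensities.

    Illustrative assumption: one cell per character, whitespace mapped to 0.
    Lines and columns beyond the grid size are cropped; short input is padded.
    """
    lines = code.splitlines()[:height]
    grid = []
    for row in range(height):
        text = lines[row] if row < len(lines) else ""
        cells = [0 if ch.isspace() else min(ord(ch), 255) for ch in text[:width]]
        grid.append(cells + [0] * (width - len(cells)))
    return grid


snippet = "if x:\n    y = 1"
grid = code_to_grid(snippet)
```

Indentation and line layout survive in the grid as blocks of zero cells, which is the kind of visual pattern the abstract refers to. A real pipeline would render an actual image and pass it to a pre-trained network.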

SSA-HIAST: A Novel Framework for Code Clone Detection

Computers Materials & Continua ◽

10.32604/cmc.2022.022659 ◽

2022 ◽

Vol 71 (2) ◽

pp. 2999-3017

Author(s):

Neha Saini ◽

Sukhdip Singh

Keyword(s):

Clone Detection ◽

Code Clone

Clone-advisor: recommending code tokens and clone methods with deep learning and information retrieval

PeerJ Computer Science ◽

10.7717/peerj-cs.737 ◽

2021 ◽

Vol 7 ◽

pp. e737

Author(s):

Muhammad Hammad ◽

Önder Babur ◽

Hamid Abdul Basit ◽

Mark van den Brand

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Rapid Development ◽

Code Clones ◽

Code Clone ◽

Retrieval Technique ◽

Novel Approach ◽

Modeling Code ◽

Similar Code ◽

Probabilistic Nature

Software developers frequently reuse source code from repositories as it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories represent frequently repeated functionality and are candidates for reuse in exploratory or rapid development. To facilitate code clone reuse, we previously presented DeepClone, a novel deep learning approach for modeling code clones along with non-cloned code to predict the next set of tokens (possibly a complete clone method body) based on the code written so far. The probabilistic nature of language modeling, however, can lead to code output with minor syntax or logic errors. To resolve this, we propose a novel approach called Clone-Advisor. We apply an information retrieval technique on top of the DeepClone output to recommend real clone methods closely matching the predicted clone method, thus improving the original output of DeepClone. In this paper, we discuss and refine our previous work on DeepClone in much more detail. Moreover, we quantitatively evaluate the performance and effectiveness of Clone-Advisor in clone method recommendation.
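The retrieval step described above can be sketched as follows. This is a minimal illustration, not Clone-Advisor's implementation: it assumes a bag-of-tokens cosine similarity as the retrieval technique, and the names `tokenize`, `cosine`, `recommend`, and the two-method corpus are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(code: str) -> list[str]:
    """Split code into identifiers, numbers, and single punctuation symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(predicted: str, corpus: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k corpus methods most similar to the predicted method."""
    query = Counter(tokenize(predicted))
    ranked = sorted(
        corpus,
        key=lambda method: cosine(query, Counter(tokenize(method))),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical corpus of real clone methods (illustrative only).
corpus = [
    "int sum(int[] a) { int s = 0; for (int x : a) s += x; return s; }",
    "String read(File f) throws IOException { return Files.readString(f.toPath()); }",
]
# A predicted method with small slips: missing initialisation and semicolon.
predicted = "int sum(int[] a) { int s; for (int x : a) s += x; return s }"
best = recommend(predicted, corpus)[0]
```

Here the flawed prediction is replaced by the closest real method from the corpus, which is the effect the abstract describes: the recommendation is syntactically valid even when the language model's raw output is not.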

Comparison between Code Clone Detection and Model Clone Detection

10.1109/icrito51393.2021.9596454 ◽

2021 ◽

Author(s):

G Shobha ◽

Ajay Rana ◽

Vineet Kansal ◽

Sarvesh Tanwar

Keyword(s):

Clone Detection ◽

Code Clone ◽

Model Clone Detection

A Review on Distance Measure Formula for Enhancing Match Detection Process of Generic Code Clone Detection Model in Java Application

10.1109/icsecs52883.2021.00058 ◽

2021 ◽

Author(s):

Noormaizzattul Akmaliza Abdullah ◽

Mohd Azwan Mohamad Hamza ◽

Al-Fahim Mubarak Ali

Keyword(s):

Distance Measure ◽

Clone Detection ◽

Detection Process ◽

Detection Model ◽

Code Clone ◽

Java Application

A Review on Distance Measure Formula for Enhancing Match Detection Process of Generic Code Clone Detection Model in Java Application

10.1109/icsecs52883.2021.00132 ◽

2021 ◽

Author(s):

Noormaizzattul Akmaliza Abdullah ◽

Mohd Azwan Mohamad Hamza ◽

Al-Fahim Mubarak Ali

Keyword(s):

Distance Measure ◽

Clone Detection ◽

Detection Process ◽

Detection Model ◽

Code Clone ◽

Java Application

AST-path Based Compare-Aggregate Network for Code Clone Detection

10.1109/ijcnn52387.2021.9534099 ◽

2021 ◽

Author(s):

Hongliang Liang ◽

Lu Ai

Keyword(s):

Clone Detection ◽

Code Clone ◽

Aggregate Network

Finding Software License Violations Through Binary Code Clone Detection - A Retrospective

ACM SIGSOFT Software Engineering Notes ◽

10.1145/3468744.3468752 ◽

2021 ◽

Vol 46 (3) ◽

pp. 24-25

Author(s):

Armijn Hemel ◽

Karl Trygve Kalleberg ◽

Rob Vermaas ◽

Eelco Dolstra

Keyword(s):

Open Source ◽

Original Problem ◽

Binary Code ◽

Source Code ◽

Clone Detection ◽

Problem Statement ◽

Open Source Code ◽

Code Clone ◽

Software License ◽

The Impact

Ten years ago, we published the article Finding software license violations through binary code clone detection at the MSR 2011 conference. Our paper was motivated by the tendency of embedded hardware vendors to only release binary blobs of their firmware, often violating the licensing terms of open-source software present inside those blobs. The techniques presented in our paper were designed to accurately identify open-source code hidden inside binary blobs. Here, we give our perspectives on the impact of our work, both industrially and academically, and revisit the original problem statement to see what has happened in the field of open-source compliance in the intervening decade.

Leveraging Compiler Optimization for Code Clone Detection

10.18293/seke2021-032 ◽

2021 ◽

Author(s):

Shirish Singh

Keyword(s):

Compiler Optimization ◽

Clone Detection ◽

Code Clone

Enhancing the Software Clone Detection in BigCloneBench

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2021070102 ◽

2021 ◽

Vol 12 (3) ◽

pp. 17-31

Author(s):

Amandeep Kaur ◽

Munish Saini

Keyword(s):

Structural Information ◽

Source Code ◽

Maintenance Cost ◽

Clone Detection ◽

Abstract Syntax ◽

Code Clones ◽

Major Drawback ◽

Abstract Syntax Tree ◽

Detection Techniques ◽

Code Clone

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraints or language constraints. The major drawback of code clones is an increase in software maintenance cost, so clone detection techniques are required to remove or refactor code clones. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution for identifying clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.
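To illustrate why an AST captures structure independently of naming, here is a small sketch using Python's standard `ast` module. It is not the square-based convolution approach of the paper. It only compares sequences of node types, so it detects fragments that differ in identifiers and literals, not true semantic clones. The function names and sample snippets are hypothetical.

```python
import ast


def node_types(source: str) -> list[str]:
    """Return AST node type names in traversal order, ignoring names and literal values."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def structurally_equal(a: str, b: str) -> bool:
    """True when both snippets have the same sequence of AST node types."""
    return node_types(a) == node_types(b)


original = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
# Same structure with every identifier renamed.
renamed = (
    "def accumulate(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)
# Different structure: a single indexing expression.
different = "def first(xs):\n    return xs[0]\n"

is_clone = structurally_equal(original, renamed)
is_other_clone = structurally_equal(original, different)
```

A token-by-token text comparison would treat `original` and `renamed` as different, while the node-type sequence is identical. Learned approaches such as the one above go further by embedding the tree so that structurally different but functionally similar code can also be matched.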
