ENHANCING A HYBRID PRE-PROCESSING AND TRANSFORMATION PROCESS FOR CODE CLONE DETECTION IN .NET APPLICATION

Pre-processing and transformation are the first two processes in a typical code clone detection pipeline. Their purpose is to transform the source code into a more representable form that can later be used as input for clone detection. The main issue in both processes is that applying the pre-processing and transformation rules might cause loss of critical information, which affects the clone detection results. Therefore, this work proposes a combined pre-processing and transformation process that produces a better source unit representation of .NET platform source code, namely C#.NET and VB.NET, by enhancing existing work on the Java language without affecting the critical information in the source code. The proposed enhancement was tested, and the results showed that it was able to produce the expected source units for both .NET platform languages.
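The two stages described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' rule set: the keyword list, the placeholder names `ID`, `NUM`, and `STR`, and the function names are all assumptions made for illustration. Pre-processing strips comments, and transformation replaces identifiers and literals with placeholders while keeping keywords and operators, so that structural information survives.

```python
import re

# Hypothetical, deliberately tiny keyword set (assumption); a real tool
# would use the full C# or VB.NET keyword list.
KEYWORDS = {"int", "return", "if", "else", "for", "while",
            "class", "void", "public", "private", "static"}

# String literals, numbers, words, then any single non-space symbol.
TOKEN_RE = re.compile(
    r'"(?:\\.|[^"\\])*"|\d+(?:\.\d+)?|[A-Za-z_]\w*|[^\sA-Za-z_0-9]'
)

def preprocess(source):
    """Pre-processing: drop block and line comments.

    Naive by design: a comment marker inside a string literal
    would also be removed.
    """
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", " ", source)

def transform(source):
    """Transformation: map identifiers and literals to placeholders."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok[0] == '"':
            tokens.append("STR")
        elif tok[0].isdigit():
            tokens.append("NUM")
        elif tok[0].isalpha() or tok[0] == "_":
            # Keywords are kept so control structure is not lost.
            tokens.append(tok if tok in KEYWORDS else "ID")
        else:
            tokens.append(tok)
    return tokens

def normalize(source):
    return transform(preprocess(source))

a = "int sum(int a, int b) { return a + b; } // add"
b = "int total(int x, int y) { /* add */ return x + y; }"
print(normalize(a) == normalize(b))  # prints True
```

Two fragments that differ only in names and comments normalize to the same token sequence. Replacing keywords or operators as well would make more fragments look alike, which is the kind of information loss the abstract warns about.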

Integrated Reasoning Engine for Code Clone Detection

ABC Journal of Advanced Research ◽

10.18034/abcjar.v3i2.575 ◽

2014 ◽

Vol 3 (2) ◽

pp. 143-152 ◽

Cited By ~ 5

Author(s):

Naresh Babu Bynagari

Keyword(s):

Clone Detection ◽

Code Clones ◽

High Pitch ◽

Detection Process ◽

Code Clone ◽

Similar Code ◽

Reasoning Engine

This article examines the details of integrated reasoning for code clone detection and how it is carried out effectively, given the amount of analysis usually associated with such activities. Detecting clones requires close familiarity with cloning systems and how they work. Hence, discovering similar code segments, which are often regarded as code imitations (clones), is not an easy task. In particular, this detection process can serve key purposes in vulnerability discovery, refactoring, and plagiarism detection. The article shows that detecting identical code segments, more often than not described as code clones, is a demanding task, especially for large code bases [1; 2; 3; 4]. This kind of detection involves particular approaches and considerable technical depth. Still, drawing on the many resources that underlie this article, readers will find the simplest approach to handling these difficulties.

One pass preprocessing for token-based source code clone detection

2014 IEEE 6th International Conference on Awareness Science and Technology (iCAST) ◽

10.1109/icawst.2014.6981824 ◽

2014 ◽

Cited By ~ 1

Author(s):

Dingkun Li ◽

Minghao Piao ◽

Ho Sun Shon ◽

Keun Ho Ryu ◽

Incheon Paik

Keyword(s):

Source Code ◽

Clone Detection ◽

Code Clone

A Review on Distance Measure Formula for Enhancing Match Detection Process of Generic Code Clone Detection Model in Java Application

10.1109/icsecs52883.2021.00058 ◽

2021 ◽

Author(s):

Noormaizzattul Akmaliza Abdullah ◽

Mohd Azwan Mohamad Hamza ◽

Al-Fahim Mubarak Ali

Keyword(s):

Distance Measure ◽

Clone Detection ◽

Detection Process ◽

Detection Model ◽

Code Clone ◽

Java Application

Finding Software License Violations Through Binary Code Clone Detection - A Retrospective

ACM SIGSOFT Software Engineering Notes ◽

10.1145/3468744.3468752 ◽

2021 ◽

Vol 46 (3) ◽

pp. 24-25

Author(s):

Armijn Hemel ◽

Karl Trygve Kalleberg ◽

Rob Vermaas ◽

Eelco Dolstra

Keyword(s):

Open Source ◽

Original Problem ◽

Binary Code ◽

Source Code ◽

Clone Detection ◽

Problem Statement ◽

Open Source Code ◽

Code Clone ◽

Software License ◽

The Impact

Ten years ago, we published the article Finding software license violations through binary code clone detection at the MSR 2011 conference. Our paper was motivated by the tendency of embedded hardware vendors to only release binary blobs of their firmware, often violating the licensing terms of open-source software present inside those blobs. The techniques presented in our paper were designed to accurately identify open-source code hidden inside binary blobs. Here, we give our perspectives on the impact of our work, both industrially and academically, and revisit the original problem statement to see what has happened in the field of open-source compliance in the intervening decade.

Intelligent token-based code clone detection system for large scale source code

Proceedings of the Conference on Research in Adaptive and Convergent Systems - RACS '19 ◽

10.1145/3338840.3355654 ◽

2019 ◽

Cited By ~ 1

Author(s):

Abdulrahman Abu Elkhail ◽

Jan Svacina ◽

Tomas Cerny

Keyword(s):

Large Scale ◽

Detection System ◽

Source Code ◽

Clone Detection ◽

Code Clone

Gapped code clone detection with lightweight source code analysis

2013 21st International Conference on Program Comprehension (ICPC) ◽

10.1109/icpc.2013.6613837 ◽

2013 ◽

Cited By ~ 15

Author(s):

Hiroaki Murakami ◽

Keisuke Hotta ◽

Yoshiki Higo ◽

Hiroshi Igaki ◽

Shinji Kusumoto

Keyword(s):

Source Code ◽

Clone Detection ◽

Source Code Analysis ◽

Code Analysis ◽

Code Clone

A Review on Distance Measure Formula for Enhancing Match Detection Process of Generic Code Clone Detection Model in Java Application

10.1109/icsecs52883.2021.00132 ◽

2021 ◽

Author(s):

Noormaizzattul Akmaliza Abdullah ◽

Mohd Azwan Mohamad Hamza ◽

Al-Fahim Mubarak Ali

Keyword(s):

Distance Measure ◽

Clone Detection ◽

Detection Process ◽

Detection Model ◽

Code Clone ◽

Java Application

Shuffling and randomization for scalable source code clone detection

2012 6th International Workshop on Software Clones (IWSC) ◽

10.1109/iwsc.2012.6227875 ◽

2012 ◽

Cited By ~ 3

Author(s):

Iman Keivanloo ◽

Chanchal K. Roy ◽

Juergen Rilling ◽

Philippe Charland

Keyword(s):

Source Code ◽

Clone Detection ◽

Code Clone

Enhancing the Software Clone Detection in BigCloneBench

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2021070102 ◽

2021 ◽

Vol 12 (3) ◽

pp. 17-31

Author(s):

Amandeep Kaur ◽

Munish Saini

Keyword(s):

Structural Information ◽

Source Code ◽

Maintenance Cost ◽

Clone Detection ◽

Abstract Syntax ◽

Code Clones ◽

Major Drawback ◽

Abstract Syntax Tree ◽

Detection Techniques ◽

Code Clone

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraints or language constraints. The major drawback of code clones is an increase in software maintenance cost, so clone detection techniques are required to remove or refactor them. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.

Code Clone Detection with Hierarchical Attentive Graph Embedding

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s021819402150025x ◽

2021 ◽

Vol 31 (06) ◽

pp. 837-861

Author(s):

Xiujuan Ji ◽

Lei Liu ◽

Jingwen Zhu

Keyword(s):

Deep Learning ◽

Source Code ◽

Detection Performance ◽

Attention Mechanism ◽

Feature Representation ◽

Model Parameters ◽

Clone Detection ◽

Learning Approaches ◽

Convolutional Network ◽

Code Clone

Code cloning is a typical programming practice that reuses existing code to solve similar programming problems. It greatly facilitates software development but also propagates program bugs and increases maintenance costs. Recently, deep learning-based detection approaches have gradually shown their effectiveness in feature representation and detection performance. Among them, deep learning approaches based on the abstract syntax tree (AST) construct models that rely on node embedding techniques. In an AST, the semantics of nodes are clearly hierarchical, and nodes differ considerably in how important they are for deciding whether two code fragments are clones. However, firstly, some approaches do not fully consider the hierarchical structure information of source code. Secondly, some approaches ignore the different importance of nodes when generating the features of source code. Thirdly, when the tree is very large and deep, many approaches are vulnerable to the vanishing gradient problem during training. To address these challenges, we propose HAG, a hierarchical attentive graph neural network embedding model for code clone detection. Firstly, the attention mechanism is applied to nodes in the AST to distinguish the importance of different nodes during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate the code message on the AST graph and then exploits a hierarchical differential pooling GCN to sufficiently capture the code semantics at different structural levels. To evaluate the effectiveness of HAG, we conducted extensive experiments on a public clone dataset and compared it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, in the detection of Moderately Type-3 and Type-4 clones, HAG clearly outperforms the baselines, indicating its strong detection capability for semantic clones. Apart from that, the impacts of the hierarchical pooling, the attention mechanism, and critical model parameters are systematically discussed.
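Two of the building blocks named in this abstract, GCN message propagation and attention over nodes, can be sketched generically in plain Python. This is not the HAG architecture: the function names, the toy graph, the query vector, and the order of operations are assumptions for illustration, and hierarchical differential pooling is not shown. The propagation step uses the common rule H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops.

```python
import math

def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(a_hat @ h @ w)."""
    return [[max(0.0, v) for v in row]
            for row in matmul(matmul(a_hat, h), w)]

def attention_pool(h, query):
    """Softmax-weighted sum of node vectors; returns (pooled, weights)."""
    scores = [sum(x * q for x, q in zip(row, query)) for row in h]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(wt * row[k] for wt, row in zip(weights, h))
              for k in range(len(h[0]))]
    return pooled, weights

# Toy AST-like graph (assumption): node 0 is the parent of nodes 1 and 2.
a_hat = normalized_adjacency([(0, 1), (0, 2)], 3)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(a_hat, features, identity)
pooled, weights = attention_pool(hidden, [1.0, 0.5])
```

In a trained model the weight matrix and the query vector would be learned. Here they are fixed only to show how each node's vector mixes with its neighbours and how unequal node weights enter the fragment-level vector.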
