ENHANCING A HYBRID PRE-PROCESSING AND TRANSFORMATION PROCESS FOR CODE CLONE DETECTION IN .NET APPLICATION

2015 ◽  
Vol 77 (9) ◽  
Author(s):  
Al-Fahim Mubarak-Ali ◽  
Shahida Sulaiman

Pre-processing and transformation are the first two processes in a typical code clone detection pipeline. Their purpose is to transform the source code into a more representable form that can later serve as input for clone detection. The main issue in both processes is that applying the pre-processing and transformation rules may cause loss of critical information, which affects the clone detection results. This work therefore proposes a combined pre-processing and transformation process that produces a better source unit representation of .NET platform source code, namely C#.NET and VB.NET, by enhancing an existing approach developed for the Java language, without affecting the critical information in the source code. The proposed enhancement was tested, and the results showed that it was able to produce the expected source units for both .NET platform languages.
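To make the two steps concrete, here is a minimal sketch of a generic pre-processing and transformation pass over a C#-like statement. It is not the rule set proposed in the paper: the function names, the `KEYWORDS` subset, and the `ID`/`NUM` placeholders are all assumptions chosen for illustration, and the regular expressions do not handle comment markers that appear inside string literals.

```python
import re

# Hypothetical subset of C# keywords; a real tool would use the full grammar.
KEYWORDS = {"int", "string", "return", "if", "else", "for", "while",
            "void", "public", "private", "static", "class"}

def preprocess(source: str) -> str:
    """Pre-processing: drop comments and collapse all whitespace."""
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", " ", source)                    # line comments
    return " ".join(source.split())

def transform(source: str) -> str:
    """Transformation: map identifiers to ID and integer literals to NUM."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token.isdigit():
            return "NUM"
        return token if token in KEYWORDS else "ID"
    # A single pass, so the placeholders themselves are never rewritten.
    return re.sub(r"[A-Za-z_]\w*|\d+", replace, source)

def source_unit(source: str) -> str:
    """Run both steps and return the normalised source unit."""
    return transform(preprocess(source))

a = "int total = count + 1; // add one"
b = "int sum = n + 2; /* running sum */"
print(source_unit(a))  # int ID = ID + NUM;
print(source_unit(b))  # int ID = ID + NUM;
```

The two statements differ only in names, literals, and comments, so they normalise to the same source unit. The sketch also shows the risk the abstract describes: keeping the type keyword `int` preserves information that a more aggressive rule would discard.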

2014 ◽  
Vol 3 (2) ◽  
pp. 143-152 ◽  
Author(s):  
Naresh Babu Bynagari

This article examines integrated reasoning for code clone detection and how it can be carried out effectively, given the amount of analysis usually associated with such work. Detecting clones requires close familiarity with cloning systems and how they work, so discovering similar code segments, commonly regarded as code imitations (clones), is not an easy task. The detection process serves key purposes in vulnerability discovery, refactoring, and imitation detection. The article shows that identifying identical code segments, usually described as code clones, is a serious challenge, especially for large code bases [1; 2; 3; 4]. Such detection involves several approaches and deep technicalities; drawing on the resources that underpin this article, the reader can identify the simplest approach to adopt for these demanding problems.


2021 ◽  
Vol 46 (3) ◽  
pp. 24-25
Author(s):  
Armijn Hemel ◽  
Karl Trygve Kalleberg ◽  
Rob Vermaas ◽  
Eelco Dolstra

Ten years ago, we published the article "Finding software license violations through binary code clone detection" at the MSR 2011 conference. Our paper was motivated by the tendency of embedded hardware vendors to only release binary blobs of their firmware, often violating the licensing terms of open-source software present inside those blobs. The techniques presented in our paper were designed to accurately identify open-source code hidden inside binary blobs. Here, we give our perspectives on the impact of our work, both industrially and academically, and revisit the original problem statement to see what has happened in the field of open-source compliance in the intervening decade.


Author(s):  
Iman Keivanloo ◽  
Chanchal K. Roy ◽  
Juergen Rilling ◽  
Philippe Charland

2021 ◽  
Vol 12 (3) ◽  
pp. 17-31
Author(s):  
Amandeep Kaur ◽  
Munish Saini

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraints or language constraints. The major drawback of code clones is an increase in software maintenance cost, so clone detection techniques are required to remove or refactor them. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution over an abstract syntax representation of the source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.
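As a small illustration of why the AST captures structure independently of naming, the sketch below compares two fragments by the sequence of their AST node types. It is only an assumption-laden stand-in, not the square-based convolution the authors propose: it parses Python source with the standard `ast` module (the paper's benchmark is Java), and the helper names `node_types` and `structural_similarity` are hypothetical.

```python
import ast
from difflib import SequenceMatcher

def node_types(source: str) -> list[str]:
    """Return the AST node type names of a fragment in traversal order."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]

def structural_similarity(first: str, second: str) -> float:
    """Ratio in [0, 1] of matching node types between two fragments."""
    return SequenceMatcher(None, node_types(first), node_types(second)).ratio()

original = "def add(a, b):\n    return a + b\n"
renamed = "def plus(x, y):\n    return x + y\n"
different = "def show(n):\n    for i in range(n):\n        print(i)\n"

print(structural_similarity(original, renamed))
print(structural_similarity(original, different))
```

The renamed copy produces an identical node-type sequence, so its similarity is 1.0, while the structurally different function scores lower. A learned model replaces this hand-written comparison with features extracted from the tree.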


Author(s):  
Xiujuan Ji ◽  
Lei Liu ◽  
Jingwen Zhu

Code cloning is a common programming practice that reuses existing code to solve similar programming problems. It greatly facilitates software development, but it also propagates program bugs and increases maintenance costs. Recently, deep learning-based detection approaches have shown their effectiveness in feature representation and detection performance. Among them, approaches based on the abstract syntax tree (AST) construct models that rely on node embedding techniques. In an AST, the semantics of nodes are clearly hierarchical, and nodes differ considerably in how much they matter when deciding whether two code fragments are clones. Existing approaches face three challenges. First, some do not fully consider the hierarchical structure information of source code. Second, some ignore the differing importance of nodes when generating the features of source code. Third, when the tree is very large and deep, many approaches are vulnerable to the vanishing gradient problem during training. To address these challenges, we propose HAG, a hierarchical attentive graph neural network embedding model for code clone detection. First, an attention mechanism is applied to the nodes of the AST to distinguish the importance of different nodes during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate code messages over the AST graph and then uses a hierarchical differential pooling GCN to capture code semantics at different structural levels. To evaluate the effectiveness of HAG, we conducted extensive experiments on a public clone dataset and compared it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, HAG outperforms the baselines in detecting Moderately Type-3 and Type-4 clones, indicating its strong detection capability for semantic clones. The impacts of hierarchical pooling, the attention mechanism, and critical model parameters are also systematically discussed.
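The two ingredients named in this abstract, message propagation over the AST graph and attention over nodes, can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the HAG model: the propagation step uses simple mean aggregation with self-loops instead of a trained GCN layer, the attention scores are dot products with a fixed hypothetical `query` vector rather than learned weights, and hierarchical pooling is omitted.

```python
import math

def propagate(features, edges):
    """One message-passing step: each node averages itself and its neighbours."""
    neighbours = {i: {i} for i in range(len(features))}  # self-loops
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(len(features))
    ]

def attention_pool(features, query):
    """Softmax-weighted sum of node vectors, scored by dot product with query."""
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features))
              for d in range(dim)]
    return pooled, weights

# Toy AST: node 0 is the root, nodes 1 and 2 are its children.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
edges = [(0, 1), (0, 2)]

hidden = propagate(features, edges)
embedding, weights = attention_pool(hidden, query=[1.0, 0.0])
print(hidden)
print(embedding, weights)
```

After one step, each child vector mixes in the root's features, and the attention weights decide how much each node contributes to the fragment embedding. Two fragments could then be compared by the distance between their embeddings.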

