A Source Code Plagiarism Detecting Method Using Sequence Alignment with Abstract Syntax Tree Elements

2015 ◽  
Vol 3 (3) ◽  
pp. 41-56 ◽  
Author(s):  
Hiroshi Kikuchi ◽  
Takaaki Goto ◽  
Mitsuo Wakatsuki ◽  
Tetsuro Nishino

Learning to program is an important subject in computer science courses. During programming exercises, plagiarism by copying and pasting can lead to problems for fair evaluation. Some methods of plagiarism detection are currently available, such as sim. However, because sim is easily influenced by changing the identifier or program statement order, it fails to do enough to support plagiarism detection. In this paper, the authors propose a plagiarism detection method which is not influenced by changing the identifier or program statement order. The authors also explain our method's capabilities by comparing it to the sim plagiarism detector. Furthermore, the authors reveal how our method successfully detects the presence of plagiarism.

Author(s):  
Hiroshi Kikuchi ◽  
Takaaki Goto ◽  
Mitsuo Wakatsuki ◽  
Tetsuro Nishino

Learning to program is an important subject in computer science courses. During programming exercises, plagiarism by copying and pasting can lead to problems for fair evaluation. Some methods of plagiarism detection are currently available, such as sim. However, because sim is easily influenced by changing the identifier or program statement order, it fails to do enough to support plagiarism detection. In this paper, the authors propose a plagiarism detection method which is not influenced by changing the identifier or program statement order. The authors also explain our method's capabilities by comparing it to the sim plagiarism detector. Furthermore, the authors reveal how our method successfully detects the presence of plagiarism.


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Deqiang Fu ◽  
Yanyan Xu ◽  
Haoran Yu ◽  
Boyang Yang

In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education. Different from other plagiarism detection methods, WASTK takes some aspects other than the similarity between programs into account. WASTK firstly transfers the source code of a program to an abstract syntax tree and then gets the similarity by calculating the tree kernel of two abstract syntax trees. To avoid misjudgment caused by trivial code snippets or frameworks given by instructors, an idea similar to TF-IDF (Term Frequency-Inverse Document Frequency) in the field of information retrieval is applied. Each node in an abstract syntax tree is assigned a weight by TF-IDF. WASTK is evaluated on different datasets and, as a result, performs much better than other popular methods like Sim and JPlag.


2019 ◽  
Vol 1 (1) ◽  
pp. 46-56 ◽  
Author(s):  
Victor R. L. Shen

Those students who major in computer science and/or engineering are required to design program codes in a variety of programming languages. However, many students submit their source codes they get from the Internet or friends with no or few modifications. Detecting the code plagiarisms done by students is very time-consuming and leads to the problems of unfair learning performance evaluation. This paper proposes a novel method to detect the source code plagiarisms by using a high-level fuzzy Petri net (HLFPN) based on abstract syntax tree (AST). First, the AST of each source code is generated after the lexical and syntactic analyses have been done. Second, token sequence is generated based on the AST. Using the AST can effectively detect the code plagiarism by changing the identifier or program statement order. Finally, the generated token sequences are compared with one another using an HLFPN to determine the code plagiarism. Furthermore, the experimental results have indicated that we can make better determination to detect the code plagiarism.


Sign in / Sign up

Export Citation Format

Share Document