Batch source-code plagiarism detection using an algorithm for the bounded longest common subsequence problem

Author(s):  
R. A. Castro Campos ◽  
F. J. Zaragoza Martinez
2015 ◽  
Vol 13 (4) ◽  
pp. 29-39 ◽  
Author(s):  
Xin Liu ◽  
Chan Xu ◽  
Boyu Ouyang

Computer programming plays an increasingly important role in program design courses in college education. However, some students submit homework that is plagiarized and then lightly modified, and it is not easy for teachers to judge whether source code has been plagiarized. Traditional detection algorithms are not suited to this situation. The authors designed an effective and complete method for detecting source-code plagiarism, based on the ways students commonly plagiarize. The algorithm rests on two basic ideas. The first is to standardize the source code by filtering it, which removes most of the noise that plagiarists deliberately insert. The second is an improved Longest Common Subsequence algorithm for text matching that uses the statement as the unit of matching. The authors also designed a hash function to make matching more efficient. A system built on the algorithm proved practical and sufficient: it runs well and meets the practical requirements of its application.
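The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filtering rules, the improvement to the Longest Common Subsequence algorithm, or the hash function. The sketch assumes C-like source, strips only comments and whitespace, uses the textbook dynamic-programming LCS, and lets CRC32 stand in for the hash. The names `normalize`, `statement_hashes`, `lcs_length`, and `similarity` are hypothetical.

```python
import re
import zlib


def normalize(source):
    """Filter out comments and whitespace, then split into statements.

    Assumption: C-like syntax. The paper's actual filtering rules are not
    given in the abstract and are likely more thorough than this.
    """
    # Remove /* ... */ block comments, then // line comments.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", "", source)
    statements = []
    # Treat ';', '{' and '}' as statement boundaries.
    for piece in re.split(r"[;{}]", source):
        compact = re.sub(r"\s+", "", piece)  # drop all whitespace
        if compact:
            statements.append(compact)
    return statements


def statement_hashes(statements):
    """Map each statement to an integer so comparisons are cheap.

    CRC32 is a stand-in for the paper's unspecified hash function.
    """
    return [zlib.crc32(s.encode("utf-8")) for s in statements]


def lcs_length(a, b):
    """Textbook LCS length over two sequences, using two rolling rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(src_a, src_b):
    """Fraction of the shorter program's statements found, in order, in the other."""
    a = statement_hashes(normalize(src_a))
    b = statement_hashes(normalize(src_b))
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))
```

Because matching works on whole statements rather than characters, reformatting a statement or inserting unrelated statements between copied ones does not lower the score in this sketch. Renamed identifiers would still defeat it unless the filtering step also canonicalized names.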


2021 ◽  
Vol 166 ◽  
pp. 106041
Author(s):  
Marko Djukanovic ◽  
Christoph Berger ◽  
Günther R. Raidl ◽  
Christian Blum
