Automated Plagiarism Detection for Computer Programming Exercises Based on Patterns of Resubmission

Author(s):  
Narjes Tahaei ◽  
David C. Noelle
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Feng Zhang ◽  
Lulu Li ◽  
Cong Liu ◽  
Qingtian Zeng

Source code similarity detection has extensive applications in computer programming teaching and software intellectual property protection. In computer programming courses, students may use complex source code obfuscation techniques, e.g., opaque predicates, loop unrolling, and function inlining and outlining, to reduce the similarity between code fragments and avoid plagiarism detection. Existing source code similarity detection approaches consider only static features of source code, which makes it difficult for them to cope with these more complex obfuscation techniques. In this paper, we propose a novel source code similarity detection approach that considers the dynamic, runtime features of source code using process mining. More specifically, given two pieces of source code, their running logs are obtained by source code instrumentation and execution. Next, process mining is used to obtain the flow charts of the two pieces of source code by analyzing the collected running logs. Finally, the similarity of the two pieces of source code is measured by computing the similarity of these two flow charts. Experimental results show that the proposed approach can deal with more complex obfuscation techniques, including opaque predicates, loop unrolling, and function inlining and outlining, which existing work cannot handle properly. Therefore, we argue that our approach can defeat commonly used code obfuscation techniques more effectively than existing state-of-the-art approaches to source code similarity detection.
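The pipeline described in this abstract (instrument, execute, mine the logs, compare the resulting flow structures) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state which process mining algorithm or similarity measure the authors use, so this sketch substitutes a directly-follows relation and Jaccard similarity, and the function names and event labels are hypothetical.

```python
def directly_follows(trace):
    """Return the set of (a, b) pairs where event b directly follows event a."""
    return set(zip(trace, trace[1:]))


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sum_loop(values, log):
    # Hand-instrumented program: each step appends an event to the running log.
    log.append("init")
    total = 0
    for v in values:
        log.append("add")
        total += v
    log.append("return")
    return total


def sum_unrolled(values, log):
    # Same computation, obfuscated by unrolling the loop by a factor of two.
    log.append("init")
    total = 0
    i = 0
    while i + 1 < len(values):
        log.append("add")
        total += values[i]
        log.append("add")
        total += values[i + 1]
        i += 2
    if i < len(values):
        log.append("add")
        total += values[i]
    log.append("return")
    return total


def trace_similarity(f, g, data):
    """Run both programs on the same input and compare their log structure."""
    log_f, log_g = [], []
    f(data, log_f)
    g(data, log_g)
    return jaccard(directly_follows(log_f), directly_follows(log_g))


print(trace_similarity(sum_loop, sum_unrolled, [1, 2, 3, 4, 5]))
```

Because both variants produce the same sequence of runtime events, the unrolled version is indistinguishable from the original in this toy setting, even though the two source texts differ. A real system would need automatic instrumentation and a richer mined model than a directly-follows set.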


2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Feng Zhang ◽  
Guofan Li ◽  
Cong Liu ◽  
Qian Song

Source code similarity detection has various applications in code plagiarism detection and software intellectual property protection. In computer programming teaching, students may convert source code written in one programming language into another language for their code assignment submission. Existing similarity measures for source code written in the same language are not applicable to cross-language code similarity detection because of syntactic differences among programming languages. Meanwhile, existing cross-language source code similarity detection approaches are susceptible to complex code obfuscation techniques, such as replacing control structures with equivalent ones and adding redundant statements. To solve this problem, we propose a cross-language code similarity detection (CLCSD) approach based on code flowcharts. In general, two source code fragments written in different programming languages are transformed into standardized code flowcharts (SCFCs), and their similarity is obtained by measuring the similarity of the corresponding SCFCs. More specifically, we first introduce the SCFC model as a uniform flowchart representation of source code written in different languages. SCFC is language-independent, and therefore, it can be used as the intermediate structure for source code similarity detection. Meanwhile, transformation techniques are given to transform source code written in a specific programming language into an SCFC. Second, we propose the SCFC-SPGK algorithm, based on the shortest path graph kernel, to measure the similarity between two SCFCs. Thus, the similarity between two pieces of source code in different programming languages is given by the similarity between their SCFCs. Experimental results show that, compared with existing approaches, CLCSD has higher accuracy in cross-language source code similarity detection. Furthermore, CLCSD can not only handle common source code obfuscation techniques used by students in computer programming teaching but also obtain nearly 90% accuracy in dealing with some complex obfuscation techniques.
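To make the idea of comparing flowcharts with a shortest path graph kernel more concrete, here is a minimal sketch. It is not the SCFC-SPGK algorithm itself, whose details the abstract does not give. It assumes a flowchart is a directed graph with labeled nodes, treats all edges as unit length, and uses a generic shortest-path kernel that counts matching (source label, target label, distance) triples, normalized to the range [0, 1]. All names and the node labels are hypothetical.

```python
from collections import Counter, deque
from math import sqrt


def shortest_path_features(nodes, edges):
    """nodes: {id: label}; edges: {id: [successor ids]}.

    Count (source label, target label, distance) triples over every ordered
    pair of distinct nodes where the target is reachable from the source.
    """
    features = Counter()
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search gives unit-weight shortest paths
            u = queue.popleft()
            for v in edges.get(u, []):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if target != start:
                features[(nodes[start], nodes[target], d)] += 1
    return features


def kernel(fa, fb):
    """Dot product of two feature-count vectors."""
    return sum(count * fb[key] for key, count in fa.items())


def sp_similarity(graph_a, graph_b):
    """Normalized shortest-path kernel between two (nodes, edges) graphs."""
    fa = shortest_path_features(*graph_a)
    fb = shortest_path_features(*graph_b)
    denom = sqrt(kernel(fa, fa) * kernel(fb, fb))
    return kernel(fa, fb) / denom if denom else 0.0


# A while loop as it might be drawn from two different languages:
# same shape and labels, different node identifiers.
loop_a = ({0: "start", 1: "decision", 2: "process", 3: "end"},
          {0: [1], 1: [2, 3], 2: [1]})
loop_b = ({"s": "start", "c": "decision", "b": "process", "e": "end"},
          {"s": ["c"], "c": ["b", "e"], "b": ["c"]})
# A straight-line program with no loop.
straight = ({0: "start", 1: "process", 2: "end"}, {0: [1], 1: [2]})

print(sp_similarity(loop_a, loop_b))
print(sp_similarity(loop_a, straight))
```

The kernel depends only on node labels and path lengths, not on node identifiers or source syntax, which is why a language-independent flowchart representation is a natural fit for this kind of measure.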


Author(s):  
Woodrow Barfield ◽  
William K. LeBold ◽  
Gavriel Salvendy ◽  
Sogand Shodja

2015 ◽  
Vol 1 (3) ◽  
pp. 69
Author(s):  
Arjan Skuka

Although introductory programming courses (IPCs) have been taught at universities for more than thirty years, students still find computer programming very difficult to learn. Programming pedagogy deals with the methods and principles of teaching and learning computer programming. The pedagogical approaches that have been proposed to make teaching and learning computer programming more efficient mostly focus on the tools, paradigms, programming languages, and environments used in IPCs. To significantly increase students’ success rates in IPCs, these approaches should be complemented with pedagogical explanation (PE) methods. This research focuses on a PE method for teaching sequential search of a matrix row (SSMR). The research was designed as an experimental study with a pretest-posttest control group model, involving students of the Computer Engineering department at Izmir University. The experimental group was taught with a pedagogical explanation method, while the control group was taught with a traditional explanation method. To collect the research data, an achievement pretest, a posttest, and a questionnaire were developed and applied. The research findings showed the effectiveness of teaching SSMR using a PE method. This method positively influenced students’ level of topic comprehension, which consequently improved their achievement. To help students better understand the other matrix programming operations, similar PE methods should be developed and used in IPCs. More generally, the results of this research suggest that PE methods should be developed and used for other topics that students usually find difficult to understand in IPCs. Using these methods can be a very important factor in significantly increasing students’ success in IPCs.
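For readers unfamiliar with the topic taught in this study, a sequential search of a matrix row can be sketched as follows. The abstract does not define the exact operation, so this sketch assumes it means scanning one row from left to right for a target value; the function name and return convention are hypothetical.

```python
def search_row(matrix, row, target):
    """Return the column index of the first occurrence of target in the
    given row of matrix, or -1 if the row does not contain it."""
    for col, value in enumerate(matrix[row]):
        if value == target:
            return col
    return -1


grades = [[4, 7, 1],
          [9, 2, 7]]
print(search_row(grades, 1, 7))
print(search_row(grades, 0, 9))
```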

