An application for plagiarized source code detection based on a parse tree kernel

In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education. Unlike other plagiarism detection methods, WASTK takes into account aspects beyond the raw similarity between programs. WASTK first converts the source code of a program into an abstract syntax tree and then obtains the similarity by computing the tree kernel of two abstract syntax trees. To avoid misjudgments caused by trivial code snippets or frameworks given by instructors, an idea similar to TF-IDF (Term Frequency-Inverse Document Frequency) from information retrieval is applied: each node in an abstract syntax tree is assigned a weight by TF-IDF. WASTK is evaluated on different datasets and performs much better than other popular methods such as Sim and JPlag.
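
The weighting idea in this abstract can be illustrated with a much simpler stand-in than WASTK itself. The sketch below is not the authors' tree kernel: it assumes Python programs, reduces each abstract syntax tree to a bag of node types, and compares two programs by cosine similarity over TF-IDF weighted counts. The function names and the smoothed IDF formula are assumptions made for illustration.

```python
import ast
import math
from collections import Counter


def node_type_counts(source):
    """Count how often each AST node type occurs in a program (term frequency)."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def idf_weights(corpus_counts):
    """Smoothed inverse document frequency for every node type in the corpus.

    Node types that occur in every submission (for example, code from an
    instructor-provided framework) receive the lowest weight.
    """
    n_docs = len(corpus_counts)
    doc_freq = Counter()
    for counts in corpus_counts:
        doc_freq.update(counts.keys())
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0
            for t, df in doc_freq.items()}


def weighted_similarity(counts_a, counts_b, idf):
    """Cosine similarity between TF-IDF weighted node-type vectors."""
    vec_a = {t: c * idf.get(t, 0.0) for t, c in counts_a.items()}
    vec_b = {t: c * idf.get(t, 0.0) for t, c in counts_b.items()}
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because only node types are counted, renaming identifiers does not change the score. A real tree kernel would additionally compare shared subtrees, which this sketch leaves out.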

An Ontology Alignment Based on Parse Tree Kernel for Combining Structural and Semantic Information without Explicit Enumeration of Features

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology ◽

10.1109/wiiat.2008.239 ◽

2008 ◽

Author(s):

Jeong-Woo Son ◽

Seong-Bae Park ◽

Se-Young Park

Keyword(s):

Semantic Information ◽

Parse Tree ◽

Ontology Alignment ◽

Tree Kernel ◽

Explicit Enumeration

A Visualization Tool for Relationship between Source Code and Parse Tree Using VR

2020 9th International Congress on Advanced Applied Informatics (IIAI-AAI) ◽

10.1109/iiai-aai50415.2020.00047 ◽

2020 ◽

Author(s):

Masateru Kishikawa ◽

Tetsuro Kakeshita

Keyword(s):

Source Code ◽

Parse Tree ◽

Visualization Tool

A Study on the Identification and Classification of Relation Between Biotechnology Terms Using Semantic Parse Tree Kernel

Journal of the Korean Society for Library and Information Science ◽

10.4275/kslis.2011.45.2.251 ◽

2011 ◽

Vol 45 (2) ◽

pp. 251-275

Author(s):

Sung-Pil Choi ◽

Chang-Hoo Jeong ◽

Hong-Woo Chun ◽

Hyun-Yang Cho

Keyword(s):

Parse Tree ◽

Tree Kernel

Computation of Program Source Code Similarity by Composition of Parse Tree and Call Graph

Mathematical Problems in Engineering ◽

10.1155/2015/429807 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 5

Author(s):

Hyun-Je Song ◽

Seong-Bae Park ◽

Se Young Park

Keyword(s):

Structural Information ◽

Source Code ◽

Real Data ◽

Kernel Functions ◽

Parse Tree ◽

Plagiarism Detection ◽

Data Set ◽

Source Codes ◽

Function Calls ◽

Syntactic Information

This paper proposes a novel method to compute how similar two program source codes are. Since program source code has a structural form, the proposed method adopts convolution kernel functions as a similarity measure. A program's source code carries two kinds of structural information: syntactic information and the dependencies among the function calls in the program. Since the syntactic information of a program is expressed as its parse tree, the syntactic similarity between two programs is computed by a parse tree kernel. The function calls within a program provide its global structure and can be represented as a graph, so the similarity of function calls is computed with a graph kernel. Both structural similarities are then reflected simultaneously in the comparison of program source codes by composing the parse tree and graph kernels based on cyclomatic complexity. According to experimental results on a real data set for program plagiarism detection, the proposed method is effective in capturing the similarity between programs. The experiments show that plagiarized pairs of programs are found correctly and thoroughly by the proposed method.
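
The composition step described in this abstract can be sketched as follows. The abstract does not state the formula used to combine the two kernels, so the weighting rule below is purely an assumption for illustration, as are the function names. The complexity estimate is the common approximation of one plus the number of decision points, computed here for Python source, and the two kernel values are taken as already normalized inputs.

```python
import ast

# Node types treated as decision points (an approximation of McCabe's metric).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))


def combined_similarity(k_tree, k_graph, complexity_a, complexity_b):
    """Convex combination of a parse tree kernel and a call graph kernel value.

    Hypothetical weighting (not taken from the paper): the tree kernel
    weight is c / (c + 1), where c is the mean complexity of both programs.
    """
    mean_c = (complexity_a + complexity_b) / 2.0
    w_tree = mean_c / (mean_c + 1.0)
    return w_tree * k_tree + (1.0 - w_tree) * k_graph
```

Since the weights are non-negative and sum to one, the combined score always lies between the two input kernel values.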

Source code

TCP/IP Essentials ◽

10.1017/cbo9781139167246.016 ◽

2004 ◽

pp. 236-252

Keyword(s):

Source Code

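SISTEM INFORMASI MANAJEMEN PENGARSIPAN BERBASIS FRAMEWORK CODE IGNITER UNTUK MENTERTIBKAN PELAYANAN SURAT MENYURAT (Archive Management Information System Based on the CodeIgniter Framework for Orderly Correspondence Services)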

INTERNAL (Information System Journal) ◽

10.32627/internal.v2i1.70 ◽

2019 ◽

Vol 2 (1) ◽

pp. 1-16

Author(s):

Nana Suarna

Keyword(s):

Source Code ◽

Web Based

Over time, the volume of letters in a company keeps growing, which creates problems in managing correspondence administration: in recording letters, in the disposition process, and in searching the letter archive. Most letters in offices are still stored as manual files, so they tend to pile up and take a long time to find and process. The archive management system was built to address these problems. Programming today, both desktop and web based, is increasingly done with the PHP-based CodeIgniter framework. The CI framework was developed to ease application development, with a source code file structure that follows the Model-View-Controller (MVC) approach and object-oriented programming, which is why the author uses CI to develop this application. The correspondence and archive management application can be accessed on the company's internal web and is intended to make it easier for employees to manage and access correspondence. It also simplifies letter recording, disposition, and search, so the application performs reliably and is easy to maintain and extend as its users' needs evolve.

Impact of Clone Refactoring on External Quality Attributes of Open Source Softwares

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit183833 ◽

2018 ◽

pp. 86-94

Author(s):

Himanshi Vashisht ◽

Sanjay Bharadwaj ◽

Sushma Sharma

Keyword(s):

Open Source ◽

Internal Structure ◽

Software Quality ◽

Source Code ◽

Quality Attributes ◽

Software Component ◽

External Quality ◽

Code Refactoring ◽

Observable Behaviour

Code refactoring is a "process of restructuring an existing source code." It helps improve the internal structure of the code without affecting its external behaviour: it changes the source code in such a way that the external behaviour is not altered, yet the internal structure improves. It is a way to clean up code that minimizes the chances of introducing bugs. Refactoring is a change made to the internal structure of a software component to make it easier to understand and cheaper to modify, without changing the observable behaviour of that software component. Bad smells indicate that there is something wrong in the code that has to be refactored. Different tools are available to identify and remove these bad smells. Software has two types of quality attributes: internal and external. In this paper we study the effect of clone refactoring on software quality attributes.
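
As a small illustration of clone refactoring in the sense described above, the sketch below removes a duplicated loop by extracting a helper while keeping the observable behaviour unchanged. The functions and data are hypothetical and are not taken from the paper.

```python
def report_before(orders, refunds):
    """Original version: the filtering and summing loop is cloned twice."""
    order_total = 0.0
    for amount in orders:
        if amount > 0:
            order_total += amount
    refund_total = 0.0
    for amount in refunds:
        if amount > 0:
            refund_total += amount
    return order_total - refund_total


def positive_sum(amounts):
    """Extracted helper that replaces both cloned loops."""
    return sum((a for a in amounts if a > 0), 0.0)


def report_after(orders, refunds):
    """Refactored version: same inputs, same result, no duplicated code."""
    return positive_sum(orders) - positive_sum(refunds)
```

Checking that both versions return the same value for the same inputs is one simple way to confirm that the external behaviour was preserved.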

A practical approach on non-regular sampling and universal demosaicing of raw image sensor data

London Imaging Meeting ◽

10.2352/issn.2694-118x.2020.lim-17 ◽

2020 ◽

Vol 2020 (1) ◽

pp. 91-95

Author(s):

Philipp Backes ◽

Jan Fröhlich

Keyword(s):

Image Quality ◽

Source Code ◽

Image Sensor ◽

Sensor Data ◽

Color Filter ◽

Lowpass Filter ◽

Sampling Errors ◽

Regular Sampling ◽

Single Sensor ◽

Similar Image Quality

Non-regular sampling is a well-known method to avoid aliasing in digital images. However, the vast majority of single sensor cameras use regularly organized color filter arrays (CFAs), which require an optical lowpass filter (OLPF) and sophisticated demosaicing algorithms to suppress sampling errors. In this paper a variety of non-regular sampling patterns are evaluated, and a new universal demosaicing algorithm based on frequency selective reconstruction is presented. By simulating such sensors it is shown that images acquired with non-regular CFAs and no OLPF can reach an image quality similar to that of their filtered, regularly sampled counterparts. The MATLAB source code and results are available at: http://github.com/PhilippBackes/dFSR
