The effectiveness of source code obfuscation: An experimental assessment

Author(s):  
Mariano Ceccato ◽  
Massimiliano Di Penta ◽  
Jasvir Nagra ◽  
Paolo Falcarin ◽  
Filippo Ricca ◽  
...  
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Feng Zhang ◽  
Lulu Li ◽  
Cong Liu ◽  
Qingtian Zeng

Source code similarity detection has extensive applications in computer programming teaching and software intellectual property protection. In computer programming courses, students may use complex source code obfuscation techniques, e.g., opaque predicates, loop unrolling, and function inlining and outlining, to reduce the similarity between code fragments and evade plagiarism detection. Existing source code similarity detection approaches consider only static features of source code, making it difficult for them to cope with more complex code obfuscation techniques. In this paper, we propose a novel source code similarity detection approach that considers the dynamic, runtime features of source code using process mining. More specifically, given two pieces of source code, their running logs are obtained by source code instrumentation and execution. Next, process mining is used to obtain the flow charts of the two pieces of source code by analyzing their collected running logs. Finally, the similarity of the two pieces of source code is measured by computing the similarity of these two flow charts. Experimental results show that the proposed approach can deal with more complex obfuscation techniques, including opaque predicates, loop unrolling, and function inlining and outlining, which existing work cannot handle properly. Therefore, we argue that our approach can defeat commonly used code obfuscation techniques more effectively for source code similarity detection than the existing state-of-the-art approaches.
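
The pipeline described in this abstract (instrument, collect running logs, compare the recovered control flow) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hand-written log calls, the event names, and the use of a directly-follows edge set with Jaccard similarity are all assumptions chosen as a simple stand-in for the mined flow charts. The point is only that an opaque predicate and a rewritten loop change the static text while leaving the runtime trace unchanged.

```python
from collections import Counter


def trace_edges(events):
    """Count directly-follows pairs (a, b) in one running log."""
    return Counter(zip(events, events[1:]))


def edge_similarity(log_a, log_b):
    """Jaccard similarity over the sets of directly-follows edges (assumed measure)."""
    a, b = set(trace_edges(log_a)), set(trace_edges(log_b))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sum_original(values, log):
    """Plain summation, instrumented by hand with hypothetical event names."""
    log.append("init")
    total = 0
    for v in values:
        log.append("loop_body")
        total += v
    log.append("return")
    return total


def sum_obfuscated(values, log):
    """Same behaviour, hidden behind an opaque predicate and an index loop."""
    log.append("init")
    total = 0
    i = 0
    while i < len(values):
        # Opaque predicate: i * i >= 0 holds for every integer i,
        # so the else branch is dead code that never runs.
        if i * i >= 0:
            log.append("loop_body")
            total += values[i]
        else:
            log.append("dead_branch")
            total -= 1
        i += 1
    log.append("return")
    return total


log_a, log_b = [], []
sum_original([1, 2, 3], log_a)
sum_obfuscated([1, 2, 3], log_b)
print(edge_similarity(log_a, log_b))
```

Because the dead branch never executes, it never appears in the log, so both versions produce the same directly-follows edges even though their source text differs.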


2014 ◽  
Vol 519-520 ◽  
pp. 391-394
Author(s):  
Zhi Yue Wang ◽  
Wei Min Wu

With the rapid development of scripting languages, JavaScript has been widely used in the application and web development industry. One of the main disadvantages of JavaScript for developers of commercial applications is the ease with which customers and competitors can study, analyze, and reuse the source code. Therefore, this paper proposes a JavaScript code obfuscation technique based on control flow transformations. The experimental results show that it effectively protects JavaScript source code.
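
The abstract does not say which control flow transformations the paper applies. As one well-known example of the category, the sketch below shows control-flow flattening, written in Python rather than JavaScript purely for illustration. The function names and state numbering are hypothetical. The loop structure of the original is replaced by a dispatcher that jumps between numbered states, which obscures the order of execution without changing the result.

```python
def gcd_plain(a, b):
    """Euclid's algorithm with its natural loop structure."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a, b):
    """Same algorithm after flattening: one dispatcher loop, numbered states."""
    state = 0
    while True:
        if state == 0:
            # Former loop condition: choose the next block to run.
            state = 1 if b != 0 else 2
        elif state == 1:
            # Former loop body.
            a, b = b, a % b
            state = 0
        elif state == 2:
            # Former code after the loop.
            return a


print(gcd_plain(48, 18), gcd_flattened(48, 18))
```

Both functions compute the same value for every input pair; only the shape of the control flow differs.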


Author(s):  
Krishna Yadav ◽  
Rutuja Kamble ◽  
Sumit Kale ◽  
Yash Pazare ◽  
Pallavi Ahire

2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Feng Zhang ◽  
Guofan Li ◽  
Cong Liu ◽  
Qian Song

Source code similarity detection has various applications in code plagiarism detection and software intellectual property protection. In computer programming teaching, students may convert the source code written in one programming language into another language for their code assignment submission. Existing similarity measures of source code written in the same language are not applicable for the cross-language code similarity detection because of syntactic differences among different programming languages. Meanwhile, existing cross-language source similarity detection approaches are susceptible to complex code obfuscation techniques, such as replacing equivalent control structure and adding redundant statements. To solve this problem, we propose a cross-language code similarity detection (CLCSD) approach based on code flowcharts. In general, two source code fragments written in different programming languages are transformed into standardized code flowcharts (SCFC), and their similarity is obtained by measuring their corresponding SCFC. More specifically, we first introduce the standardized code flowchart (SCFC) model to be the uniform flowcharts representation of source code written in different languages. SCFC is language-independent, and therefore, it can be used as the intermediate structure for source code similarity detection. Meanwhile, transformation techniques are given to transform source code written in a specific programming language into an SCFC. Second, we propose the SCFC-SPGK algorithm based on the shortest path graph kernel to measure the similarity between two SCFCs. Thus, the similarity between two pieces of source code in different programming languages is given by the similarity between SCFCs. Experimental results show that compared with existing approaches, CLCSD has higher accuracy in cross-language source code similarity detection. 
Furthermore, CLCSD can not only handle common source code obfuscation techniques used by students in computer programming teaching but also achieve nearly 90% accuracy in dealing with some complex obfuscation techniques.
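
The abstract names a shortest path graph kernel as the basis of SCFC-SPGK but gives no details of the SCFC model or of the algorithm itself. The following Python sketch shows only the generic shortest-path kernel idea on small labeled directed graphs. The graph encoding (a label dictionary plus an edge list), the node labels, and the cosine-style normalization are assumptions made for illustration, not the paper's definitions.

```python
from collections import Counter, deque
from math import sqrt


def shortest_path_features(labels, edges):
    """Count (source label, target label, shortest distance) triples via BFS."""
    adj = {node: [] for node in labels}
    for src, dst in edges:
        adj[src].append(dst)
    feats = Counter()
    for start in labels:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for end, d in dist.items():
            if end != start:
                feats[(labels[start], labels[end], d)] += 1
    return feats


def sp_kernel(fa, fb):
    """Dot product of two feature multisets."""
    return sum(count * fb[key] for key, count in fa.items())


def sp_similarity(graph_a, graph_b):
    """Kernel value normalized to [0, 1] (assumed normalization)."""
    fa = shortest_path_features(*graph_a)
    fb = shortest_path_features(*graph_b)
    denom = sqrt(sp_kernel(fa, fa) * sp_kernel(fb, fb))
    return sp_kernel(fa, fb) / denom if denom else 0.0


# Two hypothetical flowcharts of the same loop, e.g. from different languages.
loop_a = ({1: "start", 2: "decision", 3: "process", 4: "end"},
          [(1, 2), (2, 3), (3, 2), (2, 4)])
loop_b = ({"s": "start", "c": "decision", "p": "process", "e": "end"},
          [("s", "c"), ("c", "p"), ("p", "c"), ("c", "e")])
print(sp_similarity(loop_a, loop_b))
```

Because the features depend only on node labels and path lengths, two flowcharts with the same shape and labels compare as identical regardless of how their nodes are named, which is the property that makes a language-independent intermediate form useful for this kind of comparison.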


2013 ◽  
Vol 61 (9) ◽  
pp. 46-50
Author(s):  
Veena Garg ◽  
Atul Srivastava ◽  
Atul Mishra

2019 ◽  
Vol 62 (11) ◽  
pp. 1671-1683
Author(s):  
Roger Santos Ferreira ◽  
Denilson Alves Pereira

Sentiment analysis has been the main focus of plenty of research efforts, particularly justified by its commercial significance, both for consumers and businesses. Thus, many methods have been proposed so far, and the most prominent have been compared in terms of effectiveness. Nonetheless, the literature is deficient when it comes to assessing the efficiency of these methods for processing large volumes of data. In this study, we performed an experimental assessment of the efficiency of 22 methods in total, whose implementations were available. We also proposed and assessed an environment for distributed processing methods for sentiment analysis, using the Apache Spark platform, named BigFeel. In this environment, the existing methods, outlined to run in a non-distributed way, can be adapted, without altering their source code, to run in a distributed manner. The experimental results reveal that (i) few methods are efficient in their native form, (ii) the methods improve their efficiency after having been integrated into BigFeel, (iii) some of them, which were unfeasible to process a large dataset, became viable when deployed in a computer cluster and (iv) some methods can only handle small datasets, even in a distributed manner.


Author(s):  
Alessio Viticchie ◽  
Leonardo Regano ◽  
Marco Torchiano ◽  
Cataldo Basile ◽  
Mariano Ceccato ◽  
...  