Topology-Inspired Method Recovers Obfuscated Term Information From Induced Software Call-Stacks

Author(s):  
Kelly Maggs ◽  
Vanessa Robins

Fuzzing is a systematic, large-scale search for software vulnerabilities that feeds a sequence of randomly mutated input files to the program of interest with the goal of inducing a crash. The information about inputs, software execution traces, and induced call stacks (crashes) can be used to pinpoint and fix errors in the code, or exploited to damage an adversary’s software. In black-box fuzzing, the primary unit of information is the call stack: a list of nested function calls and line numbers that reports what the code was executing at the time it crashed. The source code is not always available in practice, and in some situations even the function names are deliberately obfuscated (i.e., removed or given generic names). We define a topological object called the call-stack topology to capture the relationships between module names, function names, and line numbers in a set of call stacks obtained via black-box fuzzing. In a proof-of-concept study, we show that structural properties of this object, combined with two elementary heuristics, allow us to build a logistic regression model that predicts the locations of distinct function names over a set of call stacks. We show that this model can extract function name locations with around 80% precision on data obtained from fuzzing studies of various Linux programs. This has the potential to benefit software vulnerability experts by helping them read and compare call stacks more efficiently.
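
A minimal sketch of the kind of classifier described above, assuming hypothetical frame-level features (a recurrence score derived from the call-stack topology plus two simple heuristics such as frame depth and line number) and scikit-learn; it illustrates the idea rather than reproducing the authors’ implementation.

```python
# Illustrative only: logistic regression predicting whether a call-stack frame
# position carries a distinct function name. Feature names are assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score

# Each row is one frame: [recurrence score from the call-stack topology,
#                         frame depth, line number]
X = [
    [0.92, 2, 118],
    [0.08, 7, 15],
    [0.85, 3, 301],
    [0.15, 9, 42],
    [0.77, 1, 210],
    [0.21, 6, 9],
]
y = [1, 0, 1, 0, 1, 0]  # 1 = position holds a distinct (non-obfuscated) function name

model = LogisticRegression().fit(X, y)
print("precision on the toy data:", precision_score(y, model.predict(X)))
```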

2020 ◽  
Author(s):  
Cut Nabilah Damni

Computer software is a collection of instructions (programs or procedures) that allows work to be carried out automatically by processing a given set of instructions (data) (Yahfizham, 2019: 19). Most computer software is written by programmers using a programming language. Programmers write commands in a programming language much as people use ordinary language in conversation; these commands are called source code. Another computer program, called a compiler, is applied to the source code and translates those commands into a language the computer understands; the result is called an executable program (EXE). Fundamentally, a computer always has software, consisting of an operating system, application systems, and programming languages.


Author(s):  
Ruiyang Song ◽  
Kuang Xu

We propose and analyze a temporal concatenation heuristic for solving large-scale finite-horizon Markov decision processes (MDPs), which divides the MDP into smaller sub-problems along the time horizon and generates an overall solution by simply concatenating the optimal solutions from these sub-problems. As a “black box” architecture, temporal concatenation works with a wide range of existing MDP algorithms. Our main results characterize the regret of temporal concatenation compared to the optimal solution. We provide upper bounds for general MDP instances, as well as a family of MDP instances in which the upper bounds are shown to be tight. Together, our results demonstrate temporal concatenation’s potential for substantial speed-up at the expense of some performance degradation.
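
To make the heuristic concrete, here is a hedged sketch under the simplifying assumption of time-invariant transition and reward matrices (so every sub-problem uses the same data): each block of the horizon is solved by backward induction and the per-block optimal policies are concatenated. The function names and the toy MDP are illustrative, not taken from the paper.

```python
import numpy as np

def backward_induction(P, R, horizon):
    """Solve a finite-horizon MDP by backward induction.
    P[a] is an (S, S) transition matrix for action a; R is an (S, A) reward matrix."""
    S, A = R.shape
    V = np.zeros(S)
    policy = []
    for _ in range(horizon):
        Q = np.stack([R[:, a] + P[a] @ V for a in range(A)], axis=1)  # (S, A)
        policy.append(Q.argmax(axis=1))
        V = Q.max(axis=1)
    policy.reverse()  # policy[t][s] = optimal action at stage t in state s
    return policy

def temporal_concatenation(P, R, horizon, num_blocks):
    """Split the horizon into equal blocks, solve each block independently,
    and concatenate the per-block optimal policies."""
    block = horizon // num_blocks
    full_policy = []
    for _ in range(num_blocks):
        full_policy.extend(backward_induction(P, R, block))
    return full_policy

# Toy 2-state, 2-action MDP.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
R = np.array([[1.0, 0.0], [0.0, 2.0]])
pi = temporal_concatenation(P, R, horizon=8, num_blocks=4)
```

With time-varying data, each block would instead be solved with its own slice of rewards and dynamics; the regret bounds described in the abstract quantify what this concatenation gives up relative to solving the full horizon at once.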


Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 3
Author(s):  
Gábor Antal ◽  
Zoltán Tóth ◽  
Péter Hegedűs ◽  
Rudolf Ferenc

Bug prediction aims at finding source code elements in a software system that are likely to contain defects. Being aware of the most error-prone parts of the program, one can efficiently allocate the limited testing and code review resources. Therefore, bug prediction can support software maintenance and evolution to a great extent. In this paper, we propose a function-level JavaScript bug prediction model based on static source code metrics, with the addition of hybrid (static and dynamic) code analysis based metrics of the number of incoming and outgoing function calls (HNII and HNOI). Our motivation for this is that JavaScript is a highly dynamic scripting language for which static code analysis can be very imprecise; therefore, using purely static source code features for bug prediction might not be enough. Based on a study in which we extracted 824 buggy and 1943 non-buggy functions from the publicly available BugsJS dataset for the ESLint JavaScript project, we can confirm the positive impact of hybrid code metrics on the prediction performance of the ML models. Depending on the ML algorithm, applied hyper-parameters, and target measures we consider, hybrid invocation metrics bring a 2–10% increase in model performance (i.e., precision, recall, F-measure). Interestingly, replacing the static NOI and NII metrics with their hybrid counterparts HNOI and HNII in itself improves model performance; however, using them all together yields the best results.
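
A hedged sketch of the experimental idea, assuming a per-function metric table with placeholder column names (LOC, McCC, NOI/NII and their hybrid counterparts HNOI/HNII) and a generic scikit-learn classifier; it is not the paper’s actual pipeline.

```python
# Compare a static feature set against one that swaps in the hybrid
# invocation metrics (HNOI, HNII). "functions.csv" and the column names
# are placeholders for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("functions.csv")
feature_sets = {
    "static": ["LOC", "McCC", "NOI", "NII"],
    "hybrid": ["LOC", "McCC", "HNOI", "HNII"],
}
y = df["buggy"]  # 1 = buggy function, 0 = non-buggy

for name, cols in feature_sets.items():
    clf = RandomForestClassifier(random_state=42)
    f1 = cross_val_score(clf, df[cols], y, cv=10, scoring="f1").mean()
    print(f"{name} features: mean F1 = {f1:.3f}")
```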


Author(s):  
Willem Vos ◽  
Petter Norli ◽  
Emilie Vallee

This paper describes a novel technique for the detection of cracks in pipelines. The proposed in-line inspection (ILI) technique can detect crack features at any angle in the pipeline: axial, circumferential, and any angle in between. This ability is novel to the current ILI technology offering and adds value by detecting cracks in deformed pipes (i.e., in dents) and cracks associated with the girth weld (mid-weld cracks, rapid-cooling cracks, and cracks parallel to the weld). Furthermore, the technology is suitable for detecting cracks in spiral-welded pipes, both parallel and perpendicular to the spiral weld. Integrity issues around most of the features described above are not addressed by current ILI tools, often forcing operators to perform hydrostatic tests to ensure pipeline safety. The technology described here is based on wideband ultrasound in-line inspection tools that are already in operation. They are designed for the inspection of structures operating in challenging environments such as offshore pipelines. Adjustments to the front-end analog system and data collection from a grid of transducers allow the tools to detect cracks in any orientation in the line. Changes to the test set-up are described, together with the theoretical background behind crack detection. The historical development of the technology is presented, including early laboratory testing and proof of concept, and the proof-of-concept data are compared with the theoretical predictions. A detailed set of results is presented from tests performed on samples sourced from North America and Europe that contain SCC features. Results from ongoing testing, involving large-scale tests of SCC features in gas-filled pipe spools, are also presented.


2021 ◽  
Author(s):  
Aleksandar Kovačević ◽  
Jelena Slivka ◽  
Dragan Vidaković ◽  
Katarina-Glorija Grujić ◽  
Nikola Luburić ◽  
...  

Code smells are structures in code that often have a negative impact on its quality. Manually detecting code smells is challenging, so researchers have proposed many automatic code smell detectors. Most of the studies propose detectors based on code metrics and heuristics. However, these studies have several limitations, including evaluating the detectors on small-scale case studies and with inconsistent experimental settings. Furthermore, heuristic-based detectors suffer from limitations that hinder their adoption in practice. Thus, researchers have recently started experimenting with machine learning (ML) based code smell detection.

This paper compares the performance of multiple ML-based code smell detection models against multiple traditionally employed metric-based heuristics for detecting the God Class and Long Method code smells. We evaluate the effectiveness of different source code representations for machine learning: traditionally used code metrics and code embeddings (code2vec, code2seq, and CuBERT).

We perform our experiments on the large-scale, manually labeled MLCQ dataset. We consider the binary classification problem: we classify the code samples as smelly or non-smelly and use the F1-measure of the minority (smell) class as the measure of performance. In our experiments, the ML classifier trained using CuBERT source code embeddings achieved the best performance for both God Class (F-measure of 0.53) and Long Method detection (F-measure of 0.75). With the help of a domain expert, we perform an error analysis to discuss the advantages of the CuBERT approach.

To the best of our knowledge, this study is the first to evaluate the effectiveness of pre-trained neural source code embeddings for code smell detection. A secondary contribution of our study is the systematic evaluation of the effectiveness of multiple heuristic-based approaches on the same large-scale, manually labeled MLCQ dataset.
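
As an illustration of the evaluation setup (binary smelly/non-smelly classification scored by the F1 of the minority class), here is a minimal sketch assuming pre-computed code embeddings stored in placeholder .npy files; the classifier choice is arbitrary and not the one reported in the paper.

```python
# Minimal sketch: classify code samples as smelly / non-smelly from
# pre-computed embeddings and report the F1 of the minority (smell) class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X = np.load("cubert_embeddings.npy")  # placeholder: one embedding per code sample
y = np.load("labels.npy")             # 1 = smelly (e.g., God Class), 0 = non-smelly

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
print("F1 (smell class):", f1_score(y_te, clf.predict(X_te), pos_label=1))
```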


Author(s):  
Emanuele Iannone ◽  
Roberta Guadagni ◽  
Filomena Ferrucci ◽  
Andrea De Lucia ◽  
Fabio Palomba

Author(s):  
Manjula Peiris ◽  
James H. Hill

This chapter discusses how to adapt system execution traces to support analysis of software system performance properties, such as end-to-end response time, throughput, and service time. This is important because system execution traces contain complete snapshots of a system’s execution, making them useful artifacts for analyzing software system performance properties. Unfortunately, if system execution traces do not contain the required properties, then analysis of performance properties is hard. In this chapter, the authors discuss: (1) what properties are required to analyze performance properties in a system execution trace; (2) different approaches for injecting the required properties into a system execution trace to support performance analysis; and (3) a worked example of one approach that does not require modifying the original source code of the system that produced the execution trace.
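
A minimal sketch of the underlying idea, assuming a hypothetical trace format in which each event already carries the required properties (a correlation id and a timestamp); with those in place, end-to-end response time, service time, and throughput can be derived directly from the trace. The event names are illustrative, not the chapter’s.

```python
from collections import defaultdict

# Hypothetical trace entries: (correlation_id, event_name, timestamp_in_seconds)
trace = [
    ("req-1", "request_start", 0.000),
    ("req-1", "service_start", 0.010),
    ("req-1", "service_end",   0.045),
    ("req-1", "request_end",   0.050),
    ("req-2", "request_start", 0.020),
    ("req-2", "service_start", 0.030),
    ("req-2", "service_end",   0.080),
    ("req-2", "request_end",   0.090),
]

# Group events by correlation id so each request's lifecycle can be measured.
events = defaultdict(dict)
for cid, name, ts in trace:
    events[cid][name] = ts

response = [e["request_end"] - e["request_start"] for e in events.values()]
service = [e["service_end"] - e["service_start"] for e in events.values()]
span = (max(e["request_end"] for e in events.values())
        - min(e["request_start"] for e in events.values()))

print("mean end-to-end response time (s):", sum(response) / len(response))
print("mean service time (s):", sum(service) / len(service))
print("throughput (requests/s):", len(events) / span)
```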

