Recovering Runtime Structures of Software Systems from Static Source Code

Bug prediction aims at finding source code elements in a software system that are likely to contain defects. Being aware of the most error-prone parts of the program, one can efficiently allocate the limited amount of testing and code review resources. Therefore, bug prediction can support software maintenance and evolution to a great extent. In this paper, we propose a function level JavaScript bug prediction model based on static source code metrics with the addition of a hybrid (static and dynamic) code analysis based metric of the number of incoming and outgoing function calls (HNII and HNOI). Our motivation for this is that JavaScript is a highly dynamic scripting language for which static code analysis might be very imprecise; therefore, using a purely static source code features for bug prediction might not be enough. Based on a study where we extracted 824 buggy and 1943 non-buggy functions from the publicly available BugsJS dataset for the ESLint JavaScript project, we can confirm the positive impact of hybrid code metrics on the prediction performance of the ML models. Depending on the ML algorithm, applied hyper-parameters, and target measures we consider, hybrid invocation metrics bring a 2–10% increase in model performances (i.e., precision, recall, F-measure). Interestingly, replacing static NOI and NII metrics with their hybrid counterparts HNOI and HNII in itself improves model performances; however, using them all together yields the best results.

Download Full-text

Means of integrating static source security analysis technology into the software development environment

Modern information security ◽

10.31673/2409-7292.2020.035499 ◽

2020 ◽

Author(s):

N. V. Goryuk ◽

Keyword(s):

Software Development ◽

Static Analysis ◽

Software Security ◽

Source Code ◽

Security Analysis ◽

Development Environment ◽

Static Source ◽

Software Development Environment ◽

Further Development ◽

Code Development

The article investigates automation methods and means of integration of static source security analysis technology. The process of software security analysis, which is implemented by the technology of static analysis of the source code, is studied, and the methods of solving the problem of automation and integration of the technology into the source code development environment are offered. The perspective direction of further development of the technology of static analysis of the source code is established.

Download Full-text

Synthesis of Code Anomalies: Revealing Design Problems in the Source Code

10.5753/ctd.2016.9131 ◽

2020 ◽

Author(s):

Willian N. Oizumi ◽

Alessandro F. Garcia

Keyword(s):

Software Engineering ◽

Source Code ◽

New Technique ◽

Software Systems ◽

Identification Task ◽

Software Projects ◽

Design Problems ◽

Engineering Community ◽

Synthesis Technique ◽

A New Technique

Design problems affect most software projects and make their maintenance expensive and impeditive. Thus, the identification of potential design problems in the source code – which is very often the only available and upto-date artifact in a project – becomes essential in long-living software systems. This identification task is challenging as the reification of design problems in the source code tend to be scattered through several code elements. However, stateof-the-art techniques do not provide enough information to effectively help developers in this task. In this work, we address this challenge by proposing a new technique to support developers in revealing design problems. This technique synthesizes information about potential design problems, which are materialized in the implementation under the form of syntactic and semantic anomaly agglomerations. Our evaluation shows that the proposed synthesis technique helps to reveal more than 1200 design problems across 7 industry-strength systems, with a median precision of 71% and a median recall of 78%. The relevance of our work has been widely recognized by the software engineering community through 2 awards and 7 publications in international and national venues.

Download Full-text

Non-Intrusive Adaptation of System Execution Traces for Performance Analysis of Software Systems

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Handbook of Research on Emerging Advancements and Technologies in Software Engineering ◽

10.4018/978-1-4666-6026-7.ch021 ◽

2014 ◽

pp. 473-493

Author(s):

Manjula Peiris ◽

James H. Hill

Keyword(s):

Performance Analysis ◽

System Performance ◽

Source Code ◽

Software Systems ◽

Software System ◽

Original Source ◽

Performance Properties ◽

Execution Traces ◽

Execution Trace ◽

Analysis Of Performance

This chapter discusses how to adapt system execution traces to support analysis of software system performance properties, such as end-to-end response time, throughput, and service time. This is important because system execution traces contain complete snapshots of a systems execution—making them useful artifacts for analyzing software system performance properties. Unfortunately, if system execution traces do not contain the required properties, then analysis of performance properties is hard. In this chapter, the authors discuss: (1) what properties are required to analysis performance properties in a system execution trace; (2) different approaches for injecting the required properties into a system execution trace to support performance analysis; and (3) show, by example, the solution for one approach that does not require modifying the original source code of the system that produced the system execution.

Download Full-text

DebCheck: Efficient Checking for Open Source Code Clones in Software Systems

2011 IEEE 19th International Conference on Program Comprehension ◽

10.1109/icpc.2011.27 ◽

2011 ◽

Cited By ~ 7

Author(s):

James R. Cordy ◽

Chanchal K. Roy

Keyword(s):

Open Source ◽

Source Code ◽

Software Systems ◽

Code Clones ◽

Open Source Code

Download Full-text

Rank-based quality measurement of software systems in standardized source code

2007 10th International Conference on Computer and Information Technology ◽

10.1109/iccitechn.2007.4579434 ◽

2007 ◽

Author(s):

Md. Raihan Masud ◽

Md. Abul Khaer ◽

M. M. A. Hashem

Keyword(s):

Source Code ◽

Quality Measurement ◽

Software Systems

Download Full-text

Development in Vulkan: a domain-specific approach

Proceedings of the Institute for System Programming of RAS ◽

10.15514/ispras-2021-33(5)-11 ◽

2021 ◽

Vol 33 (5) ◽

pp. 181-204

Author(s):

Vladimir Frolov ◽

Vadim Sanzharov ◽

Vladimir Galaktionov ◽

Alexander Shcherbakov

Keyword(s):

Image Processing ◽

Ray Tracing ◽

Code Generation ◽

Source Code ◽

General Purpose ◽

Software Systems ◽

Specific Class ◽

Domain Specific ◽

Architectural Patterns ◽

C Program

In this paper we propose a high-level approach to developing GPU applications based on the Vulkan API. The purpose of the work is to reduce the complexity of developing and debugging applications that implement complex algorithms on the GPU using Vulkan. The proposed approach uses the technology of code generation by translating a C++ program into an optimized implementation in Vulkan, which includes automatic shader generation, resource binding, and the use of synchronization mechanisms (Vulkan barriers). The proposed solution is not a general-purpose programming technology, but specializes in specific tasks. At the same time, it has extensibility, which allows to adapt the solution to new problems. For single input C++ program, we can generate several implementations for different cases (via translator options) or different hardware. For example, a call to virtual functions can be implemented either through a switch construct in a kernel, or through sorting threads and an indirect dispatching via different kernels, or through the so-called callable shaders in Vulkan. Instead of creating a universal programming technology for building various software systems, we offer an extensible technology that can be customized for a specific class of applications. Unlike, for example, Halide, we do not use a domain-specific language, and the necessary knowledge is extracted from ordinary C++ code. Therefore, we do not extend with any new language constructs or directives and the input source code is assumed to be normal C++ source code (albeit with some restrictions) that can be compiled by any C++ compiler. We use pattern matching to find specific patterns (or patterns) in C++ code and convert them to GPU efficient code using Vulkan. Pattern are expressed through classes, member functions, and the relationship between them. Thus, the proposed technology makes it possible to ensure a cross-platform solution by generating different implementations of the same algorithm for different GPUs. At the same time, due to this, it allows you to provide access to specific hardware functionality required in computer graphics applications. Patterns are divided into architectural and algorithmic. The architectural pattern defines the domain and behavior of the translator as a whole (for example, image processing, ray tracing, neural networks, computational fluid dynamics and etc.). Algorithmic pattern express knowledge of data flow and control and define a narrower class of algorithms that can be efficiently implemented in hardware. Algorithmic patterns can occur within architectural patterns. For example, parallel reduction, compaction (parallel append), sorting, prefix sum, histogram calculation, map-reduce, etc. The proposed generator works on the principle of code morphing. The essence of this approach is that, having a certain class in the program and transformation rules, one can automatically generate another class with the desired properties (for example, the implementation of the algorithm on the GPU). The generated class inherits from the input class and thus has access to all data and functions of the input class. Overriding virtual functions in generated class helps user to carefully connect generated code to the other Vulkan code written by hand. Shaders can be generated in two variants: OpenCL shaders for google “clspv” compiler and GLSL shaders for an arbitrary GLSL compiler. Clspv variant is better for code which intensively uses pointers and the GLSL generator is better if specific HW features are used (like hardware ray tracing acceleration). We have demonstrated our technology on several examples related to image processing and ray tracing on which we get 30-100 times acceleration over multithreaded CPU implementation.

Download Full-text

A Context-Aware Neural Embedding for Function-Level Vulnerability Detection

Algorithms ◽

10.3390/a14110335 ◽

2021 ◽

Vol 14 (11) ◽

pp. 335

Author(s):

Hongwei Wei ◽

Guanjun Lin ◽

Lin Li ◽

Heming Jia

Keyword(s):

Data Flow ◽

Short Term Memory ◽

Source Code ◽

Control Flow ◽

Language Models ◽

Software Systems ◽

Support Vector ◽

Context Aware ◽

Vulnerability Detection ◽

Feature Representations

Exploitable vulnerabilities in software systems are major security concerns. To date, machine learning (ML) based solutions have been proposed to automate and accelerate the detection of vulnerabilities. Most ML techniques aim to isolate a unit of source code, be it a line or a function, as being vulnerable. We argue that a code segment is vulnerable if it exists in certain semantic contexts, such as the control flow and data flow; therefore, it is important for the detection to be context aware. In this paper, we evaluate the performance of mainstream word embedding techniques in the scenario of software vulnerability detection. Based on the evaluation, we propose a supervised framework leveraging pre-trained context-aware embeddings from language models (ELMo) to capture deep contextual representations, further summarized by a bidirectional long short-term memory (Bi-LSTM) layer for learning long-range code dependency. The framework takes directly a source code function as an input and produces corresponding function embeddings, which can be treated as feature sets for conventional ML classifiers. Experimental results showed that the proposed framework yielded the best performance in its downstream detection tasks. Using the feature representations generated by our framework, random forest and support vector machine outperformed four baseline systems on our data sets, demonstrating that the framework incorporated with ELMo can effectively capture the vulnerable data flow patterns and facilitate the vulnerability detection task.

Download Full-text

COMPARISON OF SOFTWARE COMPLEXITY OF SEARCH ALGORITHM USING CODE BASED COMPLEXITY METRICS

International Journal of Engineering Applied Sciences and Technology ◽

10.33564/ijeast.2021.v06i05.003 ◽

2021 ◽

Vol 6 (5) ◽

Author(s):

Bello Muriana ◽

Ogba Paul Onuh

Keyword(s):

Software Engineering ◽

Programming Languages ◽

Search Algorithm ◽

Source Code ◽

Search Algorithms ◽

Binary Search ◽

Software Systems ◽

Software Complexity ◽

Complexity Metrics ◽

Binary Search Algorithm

Measures of software complexity are essential part of software engineering. Complexity metrics can be used to forecast key information regarding the testability, reliability, and manageability of software systems from study of the source code. This paper presents the results of three distinct software complexity metrics that were applied to two searching algorithms (Linear and Binary search algorithm). The goal is to compare the complexity of linear and binary search algorithms implemented in (Python, Java, and C++ languages) and measure the sample algorithms using line of code, McCabe and Halstead metrics. The findings indicate that the program difficulty of Halstead metrics has minimal value for both linear and binary search when implemented in python. Analysis of Variance (ANOVA) was adopted to determine whether there is any statistically significant differences between the search algorithms when implemented in the three programming languages and it was revealed that the three (3) programming languages do not vary considerably for both linear and binary search techniques which implies that any of the (3) programming languages is suitable for coding linear and binary search algorithms.

Download Full-text

Một công cụ phát sinh mã giao diện người dùng từ lược đồ lớp trong UML

Tạp chí Khoa học ◽

10.54607/hcmue.js.14.12.317(2017) ◽

2019 ◽

Vol 14 (12) ◽

pp. 66

Author(s):

Trần Anh Thi ◽

Vũ Thanh Nguyên

Keyword(s):

Formal Specification ◽

Development Process ◽

Source Code ◽

Work Load ◽

Software Systems ◽

Interactive Software ◽

Design Concepts ◽

Class Diagrams ◽

And Behavior ◽

Behavior Models

Producing source code that implements the GUI takes a great deal of effort in software development, especially for interactive software systems. This work load, generally considered tedious and burdensome, is inadequately automated given the richness of conceptual design and behavior models generated in earlier stages of the development process. A few frameworks have been proposed for generating GUI code based on formal specification or code annotation, requiring extra work to be done in addition to conceptually designing the software system in question. We propose a mechanism that generates GUI code from UML class diagrams expressed in XMI. Our approach takes into account the associations between design concepts and their composition hierarchy that is explicitly expressed in the UML language.

Download Full-text