Symbolic value-flow static analysis: deep, precise, complete modeling of Ethereum smart contracts

We present a static analysis approach that combines concrete values and symbolic expressions. This symbolic value-flow (“symvalic”) analysis models program behavior with high precision, e.g., full path sensitivity. To achieve deep modeling of program semantics, the analysis relies on a symbiotic relationship between a traditional static analysis fixpoint computation and a symbolic solver: the solver does not merely receive a complex “path condition” to solve, but is instead invoked repeatedly (often tens or hundreds of thousands of times), in close cooperation with the flow computation of the analysis. The result of the symvalic analysis architecture is a static modeling of program behavior that is much more complete than symbolic execution, much more precise than conventional static analysis, and domain-agnostic: no special-purpose definition of anti-patterns is necessary in order to compute violations of safety conditions with high precision. We apply the analysis to the domain of Ethereum smart contracts. This domain represents a fundamental challenge for program analysis approaches: despite numerous publications, research work has not been effective at uncovering vulnerabilities of high real-world value. In a systematic comparison of symvalic analysis with past tools, we find significantly increased completeness (shown as 83-96% statement coverage and more true error reports) combined with much higher precision, as measured by the rate of true positive reports. In terms of real-world impact, since the beginning of 2021 the analysis has resulted in the discovery and disclosure of several critical vulnerabilities affecting funds in the many millions of dollars. Six separate bug bounties totaling over $350K have been awarded for these disclosures.


Obtaining Real-World Benchmark Programs from Open-Source Repositories Through Abstract-Semantics Preserving Transformations

10.18122/td/1644/boisestate ◽

2020 ◽

Author(s):

Maria Paquin

Keyword(s):

Open Source ◽

Static Analysis ◽

Real World ◽

Program Analysis ◽

Symbolic Execution ◽

Analysis Techniques ◽

Second Stage ◽

The Third ◽

Third Stage ◽

Transformation Algorithms

Benchmark programs are an integral part of program analysis research. Researchers use benchmark programs to evaluate existing techniques and test the feasibility of new approaches. The larger and more realistic the set of benchmarks, the more confident a researcher can be about the correctness and reproducibility of their results. However, obtaining an adequate set of benchmark programs has been a long-standing challenge in the program analysis community. In this thesis, we present the APT tool, a framework we designed and implemented to automate the generation of realistic benchmark programs suitable for program analysis evaluations. Our tool targets intra-procedural analyses that operate on an integer domain, specifically symbolic execution. The framework is composed of three main stages. In the first stage, the tool extracts from open-source repositories potential benchmark programs that are suitable for symbolic execution. In the second stage, the tool transforms the extracted programs into compilable, stand-alone benchmarks by removing external dependencies and nonlinear expressions. In the third stage, the benchmarks are verified and made available to the user. We have designed our transformation algorithms to remove program dependencies and nonlinear expressions while preserving semantic equivalence under the abstraction of the symbolic analysis. That is, we want the information the analysis computes on the original program and on its transformed version to be equivalent. Our work provides static analysis researchers with concise, compilable benchmark programs that are relevant to symbolic execution, allowing them to focus their efforts on advancing analysis techniques. Furthermore, our work benefits the software engineering community by enabling static analysis researchers to perform benchmarking with a large, realistic set of programs, thus strengthening the empirical evidence for advancements in static program analysis.
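The abstract does not describe how the second-stage transformation works, so the following is only a rough sketch of one way a nonlinear expression could be removed: each product of two non-constant operands is replaced by a fresh symbolic variable. Python's `ast` module stands in for whatever language and tooling APT actually uses, and the names `NonlinearRemover`, `linearize`, and `sym_N` are hypothetical.

```python
import ast


class NonlinearRemover(ast.NodeTransformer):
    """Replace products of two non-constant operands with fresh variables.

    Assumption: a fresh, unconstrained variable is an acceptable stand-in
    for a nonlinear term. This is an illustration, not APT's algorithm.
    """

    def __init__(self):
        self.fresh = []  # names of the fresh variables introduced so far

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        nonlinear = (
            isinstance(node.op, ast.Mult)
            and not isinstance(node.left, ast.Constant)
            and not isinstance(node.right, ast.Constant)
        )
        if not nonlinear:
            return node
        name = f"sym_{len(self.fresh)}"
        self.fresh.append(name)
        return ast.copy_location(ast.Name(id=name, ctx=ast.Load()), node)


def linearize(source):
    """Return the rewritten source and the list of fresh variable names."""
    remover = NonlinearRemover()
    tree = ast.fix_missing_locations(remover.visit(ast.parse(source)))
    return ast.unparse(tree), remover.fresh


rewritten, fresh_vars = linearize("z = 2 * x + x * y")
print(rewritten)   # the linear term 2 * x is kept, x * y becomes sym_0
print(fresh_vars)
```

In this sketch the linear term `2 * x` is left alone because one operand is a constant, while `x * y` is replaced. `ast.unparse` requires Python 3.9 or later.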


Detection of the Hardcoded Login Information from Socket and String Compare Symbols

Annals of Emerging Technologies in Computing ◽

10.33166/aetic.2021.01.003 ◽

2021 ◽

Vol 5 (1) ◽

pp. 28-39

Author(s):

Minami Yoda ◽

Shuji Sakuraba ◽

Yuichi Sei ◽

Yasuyuki Tahara ◽

Akihiko Ohsuga

Keyword(s):

Internet Of Things ◽

Static Analysis ◽

Real World ◽

Symbolic Execution ◽

The Internet ◽

User Input ◽

Network Function ◽

Private Data ◽

String Search ◽

Iot Devices

Internet of Things (IoT) devices for smart homes enhance convenience; however, they also introduce the risk of leaking private data. The OWASP IoT Top 10 (2018) lists “Weak, easy to predict, or embedded passwords” as its first vulnerability. This problem poses a risk because a user cannot fix, change, or detect a password that is embedded in firmware; only the developer of the firmware can control an update. In this study, we propose a lightweight static analysis method, based on Socket Search and String Search, that detects hardcoded usernames and passwords in IoT devices and thereby guards against this first vulnerability. Hardcoded login information can be found where user input is compared using strcmp or strncmp. Previous studies analyzed the strcmp and strncmp symbols to detect hardcoded login information. However, those studies required a lot of time because they used complicated algorithms such as symbolic execution. To develop a lightweight algorithm, we focus on network functions, such as the socket symbol in firmware, because an IoT device is compromised when someone intrudes via the Internet. We propose two methods to detect hardcoded login information: string search and socket search. In string search, the algorithm finds functions that use the strcmp or strncmp symbol. In socket search, the algorithm finds functions that are referenced by the socket symbol. In our experiment, we evaluated the proposed methods on six real-world firmware images that contain a backdoor. We ran three methods (string search, socket search, and whole search) to compare the two proposed methods. As a result, all methods found login information in five of the six firmware images, as well as one unexpected password. Our methods reduce the analysis time: the whole search generally takes 38 min to complete, whereas our methods finish in 4-6 min.
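The two searches can be illustrated with a small sketch. It assumes the firmware has already been disassembled into a map from each function to the symbols it references; that map, the function names in it, and the reading of socket search as "comparison functions reachable from socket-using code" are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical cross-reference data: function name -> symbols it references.
# A real analysis would recover this from the firmware binary.
XREFS = {
    "main": {"socket", "bind", "listen", "handle_client"},
    "handle_client": {"recv", "check_login"},
    "check_login": {"strcmp", "strncmp"},
    "cli_compare": {"strcmp"},          # never reached from network code
    "log_message": {"fprintf", "strlen"},
}

COMPARE_SYMBOLS = {"strcmp", "strncmp"}


def string_search(xrefs):
    """Return every function that references strcmp or strncmp."""
    return sorted(f for f, syms in xrefs.items() if syms & COMPARE_SYMBOLS)


def socket_search(xrefs):
    """Return comparison functions reachable from functions that use `socket`."""
    stack = [f for f, syms in xrefs.items() if "socket" in syms]
    seen = set()
    while stack:
        func = stack.pop()
        if func in seen:
            continue
        seen.add(func)
        # Follow only references that are themselves known functions.
        stack.extend(s for s in xrefs[func] if s in xrefs)
    return sorted(f for f in seen if xrefs[f] & COMPARE_SYMBOLS)


print(string_search(XREFS))
print(socket_search(XREFS))
```

In this toy data, string search reports both comparison functions, while socket search narrows the candidates to the one reachable from network-facing code.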


Heuristic Guided Selective Path Exploration for Loop Structure in Coverage Testing

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2017040104 ◽

2017 ◽

Vol 8 (2) ◽

pp. 59-75

Author(s):

Xu-zhou Zhang ◽

Yun-zhan Gong ◽

Ya-Wen Wang

Keyword(s):

Program Analysis ◽

Symbolic Execution ◽

Loop Structure ◽

Program Behavior ◽

Function Calls ◽

Heuristic Strategy ◽

Coverage Testing ◽

Dynamic Execution ◽

And Function ◽

Combinatorial Strategy

Static program analysis is a strong technique for analyzing program behavior, but it suffers from scalability problems such as path explosion, which is caused by the presence of loops and function calls. This article applies a selective execution mechanism and a heuristic strategy to the exploration of paths through loops. This combinatorial strategy tries to alleviate the path explosion problem in three ways: 1) exploring loops with different approaches according to their position relative to a specific target; 2) combining static analysis, dynamic execution, and symbolic execution to deal with the separated program; 3) applying a heuristic strategy to guide the path exploration. These approaches are integrated to automatically generate paths for specified targets in loop structures. Experimental results show that the authors' proposed strategy works for combinations of different loops. It achieves better coverage than some existing techniques for programs containing loops, and is applicable in engineering practice.
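As a rough illustration of heuristic guidance toward a target inside a loop, the sketch below runs a best-first search over a toy control-flow graph. The graph, the distance-to-target heuristic, and the `max_visits` bound on loop unrolling are all assumptions chosen for the example; the article's actual heuristic and selective execution mechanism are not specified in the abstract.

```python
import heapq
from collections import deque

# Hypothetical control-flow graph with one loop (head -> body -> head).
CFG = {
    "entry": ["head"],
    "head": ["body", "exit"],
    "body": ["head", "target"],
    "target": [],
    "exit": [],
}


def distances_to(cfg, target):
    """Breadth-first search on the reversed graph: hops from each node to target."""
    reverse = {node: [] for node in cfg}
    for node, succs in cfg.items():
        for succ in succs:
            reverse[succ].append(node)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for pred in reverse[node]:
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist


def guided_path(cfg, start, target, max_visits=2):
    """Best-first path search that expands nodes closest to the target first.

    max_visits bounds how often one path may revisit a node, which caps
    loop unrolling and keeps the search finite.
    """
    dist = distances_to(cfg, target)
    if start not in dist:
        return None  # the target is unreachable from start
    frontier = [(dist[start], [start])]
    while frontier:
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == target:
            return path
        for succ in cfg[node]:
            # Prune successors that cannot reach the target at all.
            if succ in dist and path.count(succ) < max_visits:
                heapq.heappush(frontier, (dist[succ], path + [succ]))
    return None


print(guided_path(CFG, "entry", "target"))
```

Successors that cannot reach the target (here, `exit`) are never expanded, which is the pruning effect the heuristic is meant to show.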


Model checking driven static analysis for the real world: designing and tuning large scale bug detection

Innovations in Systems and Software Engineering ◽

10.1007/s11334-012-0192-5 ◽

2012 ◽

Vol 9 (1) ◽

pp. 45-56 ◽

Cited By ~ 5

Author(s):

Ansgar Fehnker ◽

Ralf Huuck

Keyword(s):

Model Checking ◽

Static Analysis ◽

Real World ◽

Large Scale ◽

Bug Detection ◽

The Real


Visual-Inertial Odometer-Based Global High Precision Indoor Human Navigation in a University Library

Abstracts of the ICA ◽

10.5194/ica-abs-1-142-2019 ◽

2019 ◽

Vol 1 ◽

pp. 1-2

Author(s):

Shinpei Ito ◽

Akinori Takahashi ◽

Ruochen Si ◽

Masatoshi Arikawa

Keyword(s):

Coordinate System ◽

High Precision ◽

Real World ◽

Indoor Navigation ◽

Experimental Result ◽

Coordinate Space ◽

Central Position ◽

Global Coordinate System ◽

Prototype System ◽

3D Environments

AR (Augmented Reality) is now available as a basic, high-level function on recent, reasonably priced smartphones. AR lets users experience consistent three-dimensional (3D) spaces in which real and virtual 3D objects co-exist, by sensing real 3D environments through a camera and reconstructing them in the virtual world. The accuracy of sensing real 3D environments with a smartphone's AR function, that is, its visual-inertial odometer, is far higher than that of the smartphone's GPS receiver, and the error can be less than one centimeter. However, current AR applications generally focus on “small” real 3D spaces, not large ones. In other words, most current AR applications are not designed for uses based on a geographic coordinate system.
We proposed a global extension of the visual-inertial odometer with an image recognition function for geo-referenced image markers installed in real 3D spaces. Geo-referenced image markers can be generated from, for example, analog guide boards that already exist in the real world. We tested this framework, embedded in a smartphone, on the first floor of the central library of Akita University. Geo-referenced image markers such as floor map boards and book category sign boards were registered in a database of 3D geo-referenced real-world scene images. Our prototype system, developed on a smartphone (iPhone XS, Apple Inc.), could first recognize a floor map board (Fig. 1) and determine the precise 3D distance and direction of the smartphone from the central position of the floor map board, in a local 3D coordinate space whose origin is the central position of the board. The system could then convert the precise relative position and direction of the smartphone’s camera in the local coordinate space into a precise global location and orientation.
A subject walked around the first floor of the library building with the smartphone's world tracking function running. The experimental result shows that error accumulated when tracking a real 3D space in a global coordinate system, but remained small: the accumulated error was only about 30 centimeters after the subject had walked about 30 meters (Fig. 2). We are now planning to improve the indoor navigation accuracy of our prototype system by calibrating the location and orientation of the smartphone through sequential recognition of multiple geo-referenced scene image markers that already existed for the library's general user services before this proposed new service was developed. In conclusion, the experimental result of testing our prototype system was impressive, and we are now preparing a more practical high-precision LBS that navigates a user to the exact location of a book of interest on a bookshelf, using AR and floor map interfaces.
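The conversion from marker-relative to global coordinates can be sketched as a rigid transform. The version below is simplified to the 2D floor plane and assumes a metric global coordinate system with headings measured counterclockwise in degrees; the function name, the marker pose, and the sample numbers are hypothetical and not taken from the prototype.

```python
import math


def local_to_global(marker_xy, marker_heading_deg, local_xy, local_heading_deg):
    """Convert a pose relative to a geo-referenced marker into global coordinates.

    marker_xy: assumed global position of the marker's center, in meters.
    marker_heading_deg: assumed global orientation of the marker's local x axis.
    local_xy, local_heading_deg: the phone's pose in the marker's local frame.
    """
    theta = math.radians(marker_heading_deg)
    lx, ly = local_xy
    # Rotate the local offset into the global frame, then translate.
    gx = marker_xy[0] + lx * math.cos(theta) - ly * math.sin(theta)
    gy = marker_xy[1] + lx * math.sin(theta) + ly * math.cos(theta)
    heading = (marker_heading_deg + local_heading_deg) % 360.0
    return (gx, gy), heading


# A phone 2 m along the local x axis of a marker whose x axis points along global +y.
position, heading = local_to_global((100.0, 200.0), 90.0, (2.0, 0.0), 45.0)
print(position, heading)

# Drift reported in the abstract: about 0.30 m of error over about 30 m walked.
drift_ratio = 0.30 / 30.0
print(f"{drift_ratio:.1%}")
```

The reported drift of about 30 centimeters over about 30 meters corresponds to roughly 1% of the distance walked, which is why periodic re-calibration against further markers is a natural next step.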


A Runtime System for XML Transformations in Java

BRICS Report Series ◽

10.7146/brics.v11i33.21858 ◽

2004 ◽

Vol 11 (33) ◽

Author(s):

Aske Simon Christensen ◽

Christian Kirkegaard ◽

Anders Møller

Keyword(s):

Static Analysis ◽

Programming Language ◽

Program Analysis ◽

Companion Paper ◽

General Purpose ◽

Runtime System ◽

Xml Documents ◽

Level Data ◽

High Level ◽

The Given

We show that it is possible to extend a general-purpose programming language with a convenient high-level data-type for manipulating XML documents while permitting (1) precise static analysis for guaranteeing validity of the constructed XML documents relative to the given DTD schemas, and (2) a runtime system where the operations can be performed efficiently. The system, named Xact, is based on a notion of immutable XML templates and uses XPath for deconstructing documents. A companion paper presents the program analysis; this paper focuses on the efficient runtime representation.


CONFUZZIUS: A Data Dependency-Aware Hybrid Fuzzer for Smart Contracts

10.36227/techrxiv.14192459.v1 ◽

2021 ◽

Author(s):

Christof Ferreira Torres ◽

Antonio Ken Iannillo ◽

Arthur Gervais ◽

Radu State

Keyword(s):

State Of The Art ◽

Symbolic Execution ◽

Hybrid Approach ◽

Data Dependency ◽

Smart Contracts ◽

Dependency Analysis ◽

Bug Detection ◽

Code Coverage ◽

Traditional Programs ◽

Turing Complete

Smart contracts are Turing-complete programs that are executed across a blockchain. Unlike traditional programs, once deployed, they cannot be modified. As smart contracts carry more value, they become a more attractive target for attackers. In recent years, they have suffered from exploits costing millions of dollars due to simple programming mistakes. As a result, a variety of tools for detecting bugs have been proposed. Most of these tools rely on symbolic execution, which may yield false positives due to over-approximation. Recently, many fuzzers have been proposed to detect bugs in smart contracts. However, these tend to be more effective at finding shallow bugs and less effective at finding bugs that lie deep in the execution, and therefore achieve low code coverage and produce many false negatives. An alternative that has proven to achieve good results in traditional programs is hybrid fuzzing, a combination of symbolic execution and fuzzing. In this work, we study hybrid fuzzing on smart contracts and present ConFuzzius, the first hybrid fuzzer for smart contracts. ConFuzzius uses evolutionary fuzzing to exercise shallow parts of a smart contract and constraint solving to generate inputs that satisfy complex conditions that prevent evolutionary fuzzing from exploring deeper parts. Moreover, ConFuzzius leverages dynamic data dependency analysis to efficiently generate sequences of transactions that are more likely to result in contract states in which bugs may be hidden. We evaluate the effectiveness of ConFuzzius by comparing it with state-of-the-art symbolic execution tools and fuzzers for smart contracts. Our evaluation on a curated dataset of 128 contracts and a dataset of 21K real-world contracts shows that our hybrid approach detects more bugs than state-of-the-art tools (up to 23%) and that it outperforms existing tools in terms of code coverage (up to 69%). We also demonstrate that data dependency analysis can boost bug detection by up to 18%.
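The division of labor in hybrid fuzzing can be shown with a toy example. This is not ConFuzzius: a simple coverage-guided bit-flip loop stands in for evolutionary fuzzing, a closed-form linear solver stands in for a real constraint solver, and the `run` function is a hypothetical stand-in for a contract. All names and constants are assumptions for illustration.

```python
import random

# Toy target: the "deep" branch needs one exact value, which random
# mutation is very unlikely to guess within a small budget.
MAGIC = 0x5EED1234


def run(x):
    """Return the set of branch labels covered by input x."""
    covered = {"entry"}
    if x % 2 == 0:
        covered.add("even")
    if x * 3 + 7 == MAGIC * 3 + 7:
        covered.add("deep")
    return covered


# Branch conditions of the form a*x + b == c, recorded as (a, b, c).
HARD_BRANCHES = {"deep": (3, 7, MAGIC * 3 + 7)}


def solve_linear(a, b, c):
    """Solve a*x + b == c over the integers, or return None if impossible."""
    if a == 0:
        return None
    quotient, remainder = divmod(c - b, a)
    return quotient if remainder == 0 else None


def hybrid_fuzz(seed=0, budget=200):
    rng = random.Random(seed)
    population = [rng.randrange(0, 2**16) for _ in range(4)]
    covered = set()
    # Phase 1: mutation-based fuzzing keeps inputs that add coverage.
    for _ in range(budget):
        child = rng.choice(population) ^ (1 << rng.randrange(32))
        new_branches = run(child) - covered
        if new_branches:
            covered |= new_branches
            population.append(child)
    # Phase 2: for branches fuzzing did not reach, ask the solver for an input.
    for label, constraint in HARD_BRANCHES.items():
        if label not in covered:
            candidate = solve_linear(*constraint)
            if candidate is not None:
                covered |= run(candidate)
    return covered


print(sorted(hybrid_fuzz()))
```

The fuzzing phase cheaply covers the shallow branches, and the solver is consulted only for the branch whose condition mutation fails to satisfy, which mirrors the split the abstract describes.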


What Do We Know About Buffer Overflow Detection?

International Journal of Systems and Software Security and Protection ◽

10.4018/ijsssp.2018070101 ◽

2018 ◽

Vol 9 (3) ◽

pp. 1-33 ◽

Cited By ~ 1

Author(s):

Marcos Lordello Chaim ◽

Daniel Soares Santos ◽

Daniela Soares Cruzes

Keyword(s):

Program Analysis ◽

Ad Hoc ◽

Symbolic Execution ◽

Buffer Overflow ◽

False Alarms ◽

Memory Errors ◽

Number Of False Alarms ◽

Extensive Body ◽

Inspection Techniques ◽

Execution Models

Buffer overflow (BO) is a well-known and widely exploited security vulnerability. Despite the extensive body of research, BO is still a threat to security-critical applications. The authors present a comprehensive systematic review of techniques intended to detect BO vulnerabilities before software is released to production. They found that most of the studies address several vulnerabilities or memory errors and are not specific to BO detection. The authors organized them in seven categories: program analysis, testing, computational intelligence, symbolic execution, models, and code inspection. Program analysis, testing, and code inspection techniques are available for use by the practitioner. However, program analysis adoption is hindered by the high number of false alarms; testing is broadly used but in an ad hoc manner; and code inspection can be used in practice provided it is added as a task of the software development process. New techniques combining object code analysis with techniques from different categories seem a promising research avenue towards practical BO detection.


Code, and Other Laws of Blockchain†

Oxford Journal of Legal Studies ◽

10.1093/ojls/gqaa018 ◽

2020 ◽

Vol 40 (3) ◽

pp. 645-665

Author(s):

Mimi Zou

Keyword(s):

Social Norms ◽

Legal System ◽

Real World ◽

Smart Contracts ◽

Focal Points ◽

Relational Analysis ◽

Blockchain Technology ◽

The Law

There has been burgeoning interest among legal scholars in recent years regarding the implications of blockchain technology for the law. Two thoughtful monographs that go beyond the hyped claims of enthusiasts and cynics are Primavera De Filippi and Aaron Wright’s Blockchain and the Law: The Rule of Code and Kevin Werbach’s Blockchain and the New Architecture of Trust. While the two books have different focal points, both contain a common Lawrence Lessig-inspired theme of ‘code as law’ in which decentralised blockchain networks are viewed as a regulatory ‘modality’ or ‘architecture’ with its own system of rules. However, as this article argues, blockchain is not outside the law or the existing legal system. Code necessarily interacts with other modes of regulation, namely the market, social norms and law, in constraining the operation of blockchain applications such as smart contracts. This argument also situates smart contracts in a relational analysis of real-world contracting practices.


CODE-CHANGE IMPACT ANALYSIS USING COUNTERFACTUALS: THEORY AND IMPLEMENTATION

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194013500460 ◽

2013 ◽

Vol 23 (10) ◽

pp. 1459-1486

Author(s):

MANUEL PERALTA ◽

SUPRATIK MUKHOPADHYAY

Keyword(s):

Static Analysis ◽

Theorem Proving ◽

Program Analysis ◽

Impact Analysis ◽

Source Code ◽

Analysis Framework ◽

Change Impact Analysis ◽

Code Change ◽

Change Impact ◽

Automated Tool

This article presents a novel program analysis framework based on Lewis' theory of counterfactuals. Using this framework, we can perform change-impact static analysis on a program's source code. In other words, we are able to prove the properties induced by changes to a given program before applying these changes. Our contribution is two-fold: first, we show how to use Lewis' logic of counterfactuals to prove that proposed changes to a program preserve its correctness; second, we report the development of an automated tool, based on resolution and theorem proving, for performing code change-impact analysis.
