Towards Extracting Control Flow Abstraction with Static Disassembly for Binary Code

Author(s):  
Jinxin Ma ◽  
Zhoujun Li ◽  
Chaojian Hu
Keyword(s):  
2020 ◽  
Vol 34 (01) ◽  
pp. 1145-1152 ◽  
Author(s):  
Zeping Yu ◽  
Rui Cao ◽  
Qiyi Tang ◽  
Sen Nie ◽  
Junzhou Huang ◽  
...  

Binary code similarity detection, whose goal is to detect similar binary functions without having access to the source code, is an essential task in computer security. Traditional methods usually use graph matching algorithms, which are slow and inaccurate. Recently, neural network-based approaches have made great achievements. A binary function is first represented as an control-flow graph (CFG) with manually selected block features, and then graph neural network (GNN) is adopted to compute the graph embedding. While these methods are effective and efficient, they could not capture enough semantic information of the binary code. In this paper we propose semantic-aware neural networks to extract the semantic information of the binary code. Specially, we use BERT to pre-train the binary code on one token-level task, one block-level task, and two graph-level tasks. Moreover, we find that the order of the CFG's nodes is important for graph similarity detection, so we adopt convolutional neural network (CNN) on adjacency matrices to extract the order information. We conduct experiments on two tasks with four datasets. The results demonstrate that our method outperforms the state-of-art models.


2015 ◽  
Vol 27 (10) ◽  
pp. 793-820
Author(s):  
Jing Qiu ◽  
Xiaohong Su ◽  
Peijun Ma
Keyword(s):  

2012 ◽  
Vol 198-199 ◽  
pp. 374-379
Author(s):  
Wei Gao ◽  
Jing He ◽  
Ke An ◽  
Hong Xia Gao ◽  
Cheng Liu

The paper presents a reverse analysis and crack-found methodology with the functional model as the centre for OS trusted mechanism, pays attention to the functional model establishing of binary code and the description methodology. And it designs general XML-based storage structure to serve the storage and transformation of information cross levels in the level functional model of trusted mechanism, develops the plugin of IDA, which takes and stores routine Control Flow Graph and level functional model automatically from the binary code.


Author(s):  
Vaibhav Sharma ◽  
Soha Hussein ◽  
Michael W. Whalen ◽  
Stephen McCamant ◽  
Willem Visser

Abstract Path-merging is a known technique for accelerating symbolic execution. One technique, named “veritesting” by Avgerinos et al. uses summaries of bounded control-flow regions and has been shown to accelerate symbolic execution of binary code. But, when applied to symbolic execution of Java code, veritesting needs to be extended to summarize dynamically dispatched methods and exceptional control-flow. Such an extension of veritesting has been implemented in Java Ranger by implementing as an extension of Symbolic PathFinder, a symbolic executor for Java bytecode. In this paper, we briefly describe the architecture of Java Ranger and describe its setup for SV-COMP 2020.


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Yan Wang ◽  
Peng Jia ◽  
Cheng Huang ◽  
Jiayong Liu ◽  
Peisong He

Binary code similarity comparison is the technique that determines if two functions are similar by only considering their compiled form, which has many applications, including clone detection, malware classification, and vulnerability discovery. However, it is challenging to design a robust code similarity comparison engine since different compilation settings that make logically similar assembly functions appear to be very different. Moreover, existing approaches suffer from high-performance overheads, lower robustness, or poor scalability. In this paper, a novel solution HBinSim is proposed by employing the multiview features of the function to address these challenges. It first extracts the syntactic and semantic features of each basic block by static analysis. HBinSim further analyzes the function and constructs a syntactic attribute control flow graph and a semantic attribute control flow graph for each function. Then, a hierarchical attention graph embedding network is designed for graph-structured data processing. The network model has a hierarchical structure that mirrors the hierarchical structure of the function. It has three levels of attention mechanisms applied at the instruction, basic block, and function level, enabling it to attend differentially to more and less critical content when constructing the function representation. We conduct extensive experiments to evaluate its effectiveness and efficiency. The results show that our tool outperforms the state-of-the-art binary code similarity comparison tools by a large margin against compilation diversity clone searching. A real-world vulnerabilities search case further demonstrates the usefulness of our system.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Shen Wang ◽  
Xunzhi Jiang ◽  
Xiangzhan Yu ◽  
Xiaohui Su

Binary code homology analysis refers to detecting whether two pieces of binary code are compiled from the same piece of source code, which is a fundamental technique for many security applications, such as vulnerability search, plagiarism detection, and malware detection. With the increase in critical vulnerabilities in IoT devices, homology analysis is increasingly needed to perform cross-platform vulnerability searches. Existing methods for cross-platform binary code homology detection usually convert binary code to instruction sequences and do semantic embedding of the sequences as if they were natural language. However, the gap between natural language and binary code is large, and the spatial features of the binary code are easily lost by directly comparing the semantics. In this paper, we propose a GRU-based graph embedding method to compare the homology of binary functions. First, the attribute control flow graph (ACFG) is built for the assembly function, then the GRU-based graph embedding neural network is used to generate the embedding vector for the ACFG, and finally the homology of the binary code is determined by calculating the distance between the embedding vectors. The experimental results show that our method greatly improves the detection accuracy of negative samples compared with Gemini, the latest method based on graph embedding binary code similarity detection.


2020 ◽  
Vol 16 (2) ◽  
pp. 214
Author(s):  
Wang Yong ◽  
Liu SanMing ◽  
Li Jun ◽  
Cheng Xiangyu ◽  
Zhou Wan

Sign in / Sign up

Export Citation Format

Share Document