Asteria: Deep Learning-based AST-Encoding for Cross-platform Binary Code Similarity Detection

Author(s):  
Shouguo Yang ◽  
Long Cheng ◽  
Yicheng Zeng ◽  
Zhe Lang ◽  
Hongsong Zhu ◽  
...  
2021 ◽  
Vol 18 (4) ◽  
pp. 4528-4551
Author(s):  
Xiaodong Zhu ◽  
◽  
Liehui Jiang ◽  
Zeng Chen ◽  

2021 ◽  
Vol 168 ◽  
pp. 114348
Author(s):  
Donghai Tian ◽  
Xiaoqi Jia ◽  
Rui Ma ◽  
Shuke Liu ◽  
Wenjing Liu ◽  
...  

2019 ◽  
Vol 11 (1) ◽  
pp. 1-1
Author(s):  
Sabrina Kletz ◽  
Marco Bertini ◽  
Mathias Lux

Having already discussed MatConvNet and Keras, let us continue with an open source framework for deep learning that takes a new and interesting approach. TensorFlow.js not only provides deep learning for JavaScript developers, it also makes deep learning applications available in WebGL-enabled web browsers, specifically Chrome, Chromium-based browsers, Safari and Firefox. Recently, node.js support has been added, so TensorFlow.js can be used to directly control TensorFlow without the browser. TensorFlow.js is easy to install: as soon as a browser is installed, one is ready to go. Browser-based, cross-platform applications, e.g. those running with Electron, can also make use of TensorFlow.js without an additional install. The performance, however, depends on the browser the client is running, and on the memory and GPU of the client device. More specifically, one cannot expect to analyze 4K videos on a mobile phone in real time. While it is easy to install and easy to develop with, TensorFlow.js has drawbacks: (i) developers have less control over where the machine learning actually takes place (e.g. on CPU or GPU), and it runs in the same sandbox as all web pages in the browser, and (ii) the current release still has rough edges and is not considered stable enough to use in production.


2020 ◽  
Vol 34 (01) ◽  
pp. 1145-1152 ◽  
Author(s):  
Zeping Yu ◽  
Rui Cao ◽  
Qiyi Tang ◽  
Sen Nie ◽  
Junzhou Huang ◽  
...  

Binary code similarity detection, whose goal is to detect similar binary functions without having access to the source code, is an essential task in computer security. Traditional methods usually use graph matching algorithms, which are slow and inaccurate. Recently, neural network-based approaches have made great achievements. A binary function is first represented as a control-flow graph (CFG) with manually selected block features, and then a graph neural network (GNN) is adopted to compute the graph embedding. While these methods are effective and efficient, they cannot capture enough semantic information of the binary code. In this paper we propose semantic-aware neural networks to extract the semantic information of the binary code. Specifically, we use BERT to pre-train the binary code on one token-level task, one block-level task, and two graph-level tasks. Moreover, we find that the order of the CFG's nodes is important for graph similarity detection, so we adopt a convolutional neural network (CNN) on adjacency matrices to extract the order information. We conduct experiments on two tasks with four datasets. The results demonstrate that our method outperforms the state-of-the-art models.
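The remark about node order can be illustrated with a small sketch. It is not the authors' model: the toy CFG, the fixed 2x2 kernel, and the helper names `adjacency_matrix`, `relabel`, and `conv2d_valid` are all assumptions made for illustration. It only shows that a convolution over an adjacency matrix produces different feature maps when the same graph is stored under a different node ordering, which is why such a layer can pick up order information.

```python
from typing import List, Tuple

Matrix = List[List[int]]
Edge = Tuple[int, int]


def adjacency_matrix(num_nodes: int, edges: List[Edge]) -> Matrix:
    """Build a dense 0/1 adjacency matrix for a directed graph."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for src, dst in edges:
        matrix[src][dst] = 1
    return matrix


def relabel(edges: List[Edge], new_index: List[int]) -> List[Edge]:
    """Rename nodes: old node i becomes new_index[i]; the graph itself is unchanged."""
    return [(new_index[src], new_index[dst]) for src, dst in edges]


def conv2d_valid(matrix: Matrix, kernel: Matrix) -> Matrix:
    """Plain 2D cross-correlation without padding (what CNN layers compute)."""
    k = len(kernel)
    size = len(matrix) - k + 1
    return [
        [
            sum(matrix[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(size)
        ]
        for i in range(size)
    ]


# Hypothetical toy CFG (a diamond): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
original = adjacency_matrix(4, edges)
shuffled = adjacency_matrix(4, relabel(edges, [3, 1, 0, 2]))

# A fixed kernel stands in for a learned filter (assumption for illustration).
kernel = [[1, 0], [0, 1]]
features_original = conv2d_valid(original, kernel)
features_shuffled = conv2d_valid(shuffled, kernel)

print(features_original)
print(features_shuffled)
```

Both matrices describe the same four-edge graph, yet the two feature maps differ, so the convolution is sensitive to how the nodes are ordered.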


Author(s):  
Zhengping Luo ◽  
Tao Hou ◽  
Xiangrong Zhou ◽  
Hui Zeng ◽  
Zhuo Lu

2020 ◽  
Vol 49 (4) ◽  
pp. 495-510
Author(s):  
Muhammad Mansoor ◽  
Zahoor ur Rehman ◽  
Muhammad Shaheen ◽  
Muhammad Attique Khan ◽  
Mohamed Habib

Similarity detection in text is the main task for a number of Natural Language Processing (NLP) applications. Because textual data is comparatively larger in quantity and volume than numeric data, measuring textual similarity is an important problem. Most similarity detection algorithms are based on word-to-word matching, sentence/paragraph matching, or matching of the whole document. In this research, a novel approach is proposed using deep learning models, combining a Long Short-Term Memory (LSTM) network with a Convolutional Neural Network (CNN) to measure semantic similarity between two questions. The proposed model takes sentence pairs as input and measures the similarity between them. The model is tested on the publicly available Quora dataset. It achieved 87.50% accuracy, which is better than the previous approaches.
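For contrast with the proposed LSTM and CNN model, the word-to-word matching family mentioned in the abstract can be sketched as a Jaccard overlap between two questions. This is only a baseline illustration under stated assumptions: the tokenizer, the function names `tokenize` and `jaccard_similarity`, and the example questions are hypothetical and do not come from the paper or the Quora dataset.

```python
import re
from typing import Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(question_a: str, question_b: str) -> float:
    """Word-overlap similarity: |shared words| / |all distinct words|."""
    words_a, words_b = tokenize(question_a), tokenize(question_b)
    if not words_a and not words_b:
        return 1.0  # two empty questions are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical question pairs, taken as sentence pairs like the model's input.
overlap_score = jaccard_similarity(
    "How do I learn Python?", "How can I learn Python quickly?"
)
paraphrase_score = jaccard_similarity(
    "What is the capital of France?", "Paris belongs to which country?"
)

print(overlap_score, paraphrase_score)
```

The second pair shares no words and scores zero even though the questions are related, which is the kind of gap that a semantic model is meant to close.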


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 120501-120512
Author(s):  
Hui Guo ◽  
Shuguang Huang ◽  
Cheng Huang ◽  
Min Zhang ◽  
Zulie Pan ◽  
...  
