Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection

2020 ◽  
Vol 34 (01) ◽  
pp. 1145-1152 ◽  
Author(s):  
Zeping Yu ◽  
Rui Cao ◽  
Qiyi Tang ◽  
Sen Nie ◽  
Junzhou Huang ◽  
...  

Binary code similarity detection, whose goal is to detect similar binary functions without access to the source code, is an essential task in computer security. Traditional methods usually use graph matching algorithms, which are slow and inaccurate. Recently, neural network-based approaches have made great progress. A binary function is first represented as a control-flow graph (CFG) with manually selected block features, and then a graph neural network (GNN) is adopted to compute the graph embedding. While these methods are effective and efficient, they cannot capture enough semantic information of the binary code. In this paper, we propose semantic-aware neural networks to extract the semantic information of the binary code. Specifically, we use BERT to pre-train the binary code on one token-level task, one block-level task, and two graph-level tasks. Moreover, we find that the order of the CFG's nodes is important for graph similarity detection, so we apply a convolutional neural network (CNN) to adjacency matrices to extract the order information. We conduct experiments on two tasks with four datasets. The results demonstrate that our method outperforms the state-of-the-art models.
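To make the point about node order concrete, the sketch below builds adjacency matrices for one small graph under two different node orderings and slides a single 2x2 kernel over each. It is only an illustration of why a convolution over an adjacency matrix is sensitive to ordering; the toy edge list, the kernel, and the helper names `adjacency_matrix` and `cross_correlate` are assumptions made here, not the architecture or data from the paper.

```python
def adjacency_matrix(node_order, edges):
    # Row/column positions follow node_order, so the same graph can
    # produce different matrices under different orderings.
    index = {node: i for i, node in enumerate(node_order)}
    size = len(node_order)
    matrix = [[0] * size for _ in range(size)]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix


def cross_correlate(matrix, kernel):
    # "Valid" 2D cross-correlation, the core operation of a CNN layer
    # (no padding, stride 1, single channel).
    k = len(kernel)
    out_size = len(matrix) - k + 1
    return [
        [
            sum(kernel[a][b] * matrix[i + a][j + b]
                for a in range(k) for b in range(k))
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]


# Hypothetical control-flow edges between four basic blocks.
edges = [(0, 1), (1, 2), (1, 3), (2, 3)]
# Illustrative fixed kernel; a trained CNN would learn its kernels.
kernel = [[1, 0], [0, 1]]

mat_a = adjacency_matrix([0, 1, 2, 3], edges)
mat_b = adjacency_matrix([3, 1, 0, 2], edges)
map_a = cross_correlate(mat_a, kernel)
map_b = cross_correlate(mat_b, kernel)
```

Both matrices describe the same four edges, yet the feature maps differ, which is the kind of ordering signal a convolution can pick up and a permutation-invariant aggregation would discard.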

2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Zhenyu Yang ◽  
Mingge Zhang ◽  
Guojing Liu ◽  
Mingyu Li

Session-based recommendation methods mainly model sessions as sequences, under the assumption that user behaviors are independent and identically distributed, and then mine deep semantic information with Deep Neural Networks. Nevertheless, user behaviors may reflect nonindependent intentions at irregular points in time. For example, users may buy painkillers, books, or clothes for different reasons at different times. However, this has not been taken seriously in previous studies. Therefore, we propose a session recommendation method based on Neural Differential Equations in an attempt to predict user behavior forward or backward from any point in time. We used Ordinary Differential Equations to train the Graph Neural Network and could predict forward or backward at any point in time to model the user's nonindependent sessions. We tested on four real datasets and found that our model achieved the expected results and was superior to the existing session-based recommendation methods.


Author(s):  
Zhengping Luo ◽  
Tao Hou ◽  
Xiangrong Zhou ◽  
Hui Zeng ◽  
Zhuo Lu

2021 ◽  
Vol 7 (8) ◽  
pp. 146
Author(s):  
Joshua Ganter ◽  
Simon Löffler ◽  
Ron Metzger ◽  
Katharina Ußling ◽  
Christoph Müller

Collecting real-world data for the training of neural networks is enormously time-consuming and expensive. As such, the concept of virtualizing the domain and creating synthetic data has been analyzed in many instances. This virtualization offers many possibilities of changing the domain, and with that, enabling the relatively fast creation of data. It also offers the chance to enhance necessary augmentations with additional semantic information when compared with conventional augmentation methods. This raises the question of whether such semantic changes, which can be seen as augmentations of the virtual domain, contribute to better results for neural networks, when trained with data augmented this way. In this paper, a virtual dataset is presented, including semantic augmentations and automatically generated annotations, as well as a comparison between semantic and conventional augmentation for image data. It is determined that the results differ only marginally for neural network models trained with the two augmentation approaches.


2021 ◽  
Vol 16 (1) ◽  
pp. 1-23
Author(s):  
Keyu Yang ◽  
Yunjun Gao ◽  
Lei Liang ◽  
Song Bian ◽  
Lu Chen ◽  
...  

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost none of the existing models take advantage of the wisdom of human beings to help text classification. Human beings are more intelligent and capable than machine learning models in terms of understanding and capturing the implicit semantic information from text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design and post questions on a crowdsourcing platform to extract keywords in text. Sampling and clustering techniques are utilized to reduce the cost of crowdsourcing. Also, we present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.


2019 ◽  
Vol 8 (2) ◽  
pp. 98-104
Author(s):  
Mohd Ashraf ◽  
Md. Zair Hussain

Image analysis and understanding stands tall amongst all the technologies, and face recognition is an eminent part of it. A face database is maintained as a logbook to identify an input face. This is accomplished by comparing the input against the face database. There are several face recognition techniques, of which symmetry, Elastic Bunch Graph Matching (EBGM), and analytic-to-holistic recognition have been explored in this research paper. Other approaches, such as image-based face recognition techniques including MLP, convolutional neural networks, eigenfaces, associative neural networks, recirculation neural networks, and independent component analysis, have been thoroughly discussed. Two face recognition databases, UMIST and ORL, have proved to be extremely important in analyzing the results of face recognition. The Eigen Face value approach has been anticipated with the associated analysis of face recognition results. Another approach in face recognition is the optimized multiperceptron, which acts as the reference for the optimized eigenfaces approach in this research paper, making the study more efficient through comparison.


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV ◽  

The problem of applying neural networks to calculate ratings used in banking in the decision-making process on granting or not granting loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, it is necessary to know its general form. If so, the task is to calculate the parameters that are included in the expression for the rating function. In contrast to this approach, when neural networks are used, there is no need to specify the general form of the rating function. Instead, a certain neural network architecture is chosen, and its parameters are calculated on the basis of statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered, in which the borrower's rating is determined by a known non-analytical rating function. A neural network with two inner layers, which contain, respectively, three and two neurons and have a sigmoid activation function, is used for modeling. It is shown that the use of the neural network allows restoring the borrower's rating function with quite acceptable accuracy.
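The architecture described above (two inner layers of three and two sigmoid neurons) is small enough to write out as a forward pass. The sketch below is a minimal illustration only: the number of input features (two), the single sigmoid output neuron, the random untrained weights, and all function names are assumptions introduced here, since the abstract does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    # One fully connected sigmoid layer; each row of `weights` feeds one neuron.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def make_network(n_features, rng):
    # Inner layer sizes 3 and 2 follow the abstract; the single output
    # neuron is an assumption made for this sketch.
    sizes = [n_features, 3, 2, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def rating(network, features):
    activations = features
    for weights, biases in network:
        activations = dense(activations, weights, biases)
    return activations[0]


def count_parameters(network):
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


rng = random.Random(0)
net = make_network(n_features=2, rng=rng)  # two hypothetical borrower features
score = rating(net, [0.4, 0.7])            # untrained, so the value is arbitrary
n_params = count_parameters(net)
```

With two assumed inputs this network has 20 trainable parameters (9 + 8 + 3), which illustrates the trade-off the abstract mentions: no functional form has to be specified, but even a tiny architecture carries more parameters than a simple regression model.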


2019 ◽  
Vol 2019 (1) ◽  
pp. 153-158
Author(s):  
Lindsay MacDonald

We investigated how well a multilayer neural network could implement the mapping between two trichromatic color spaces, specifically from camera R,G,B to tristimulus X,Y,Z. For training the network, a set of 800,000 synthetic reflectance spectra was generated. For testing the network, a set of 8,714 real reflectance spectra was collated from instrumental measurements on textiles, paints and natural materials. Various network architectures were tested, with both linear and sigmoidal activations. Results show that over 85% of all test samples had color errors of less than 1.0 ΔE2000 units, much more accurate than could be achieved by regression.
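For context on the regression baseline mentioned above, the sketch below fits a 3x3 linear map from R,G,B to X,Y,Z by ordinary least squares. The abstract does not say which regression form was used, so the linear model is an assumption, as are the helper names and the synthetic samples. The reference matrix is the standard linear-sRGB-to-XYZ (D65) matrix, used here only as a stand-in for an unknown camera response.

```python
import random


def solve_linear_system(a, b):
    # Gaussian elimination with partial pivoting for a small square system.
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def apply_map(matrix, rgb):
    return [sum(m * v for m, v in zip(row, rgb)) for row in matrix]


def fit_linear_map(rgb_samples, xyz_samples):
    # Normal equations: (R^T R) m_j = R^T y_j, solved once per output channel j.
    gram = [
        [sum(s[i] * s[k] for s in rgb_samples) for k in range(3)]
        for i in range(3)
    ]
    matrix = []
    for j in range(3):
        rhs = [
            sum(s[i] * t[j] for s, t in zip(rgb_samples, xyz_samples))
            for i in range(3)
        ]
        matrix.append(solve_linear_system(gram, rhs))
    return matrix


# Stand-in "ground truth": linear sRGB to XYZ (D65), not a real camera model.
TRUE_MAP = [
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
]

rng = random.Random(1)
rgb_samples = [[rng.random() for _ in range(3)] for _ in range(50)]
xyz_samples = [apply_map(TRUE_MAP, rgb) for rgb in rgb_samples]
fitted = fit_linear_map(rgb_samples, xyz_samples)
```

On noise-free data generated by a linear map, the fit recovers that map. A real camera response is generally not an exact linear function of X,Y,Z, and that residual is what a nonlinear network has room to reduce.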


2020 ◽  
Vol 64 (3) ◽  
pp. 30502-1-30502-15
Author(s):  
Kensuke Fukumoto ◽  
Norimichi Tsumura ◽  
Roy Berns

A method is proposed to estimate the concentration of pigments mixed in a painting, using the encoder‐decoder model of neural networks. The model is trained to output a value that is the same as its input, and its middle output extracts a certain feature as compressed information about the input. In this instance, the input and output are spectral data of a painting. The model is trained with pigment concentration as the middle output. A dataset containing the scattering coefficient and absorption coefficient of each of 19 pigments was used. The Kubelka‐Munk theory was applied to the coefficients to obtain many patterns of synthetic spectral data, which were used for training. The proposed method was tested using spectral images of 33 paintings, showing that the method estimates, with high accuracy, concentrations whose spectrum is similar to that of the target pigments.
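As a rough illustration of how synthetic spectra can be generated from absorption and scattering coefficients, the sketch below uses the Kubelka-Munk relation for an opaque layer, K/S = (1 - R)^2 / (2R), together with concentration-weighted mixing of K and S. The abstract does not state which form of the theory or which mixing rule was used, so both are assumptions here, and the coefficient values are made up for two hypothetical pigments rather than taken from the 19-pigment dataset.

```python
import math


def km_reflectance(k, s):
    # Reflectance of an opaque layer from absorption k and scattering s:
    # R = 1 + K/S - sqrt((K/S)^2 + 2 K/S)
    ratio = k / s
    return 1.0 + ratio - math.sqrt(ratio * ratio + 2.0 * ratio)


def km_ratio(r):
    # Inverse relation: K/S = (1 - R)^2 / (2 R)
    return (1.0 - r) ** 2 / (2.0 * r)


def mix_spectrum(concentrations, k_spectra, s_spectra):
    # Assumed mixing rule: K and S of the mixture are concentration-weighted
    # sums of the per-pigment coefficients, evaluated band by band.
    n_bands = len(k_spectra[0])
    spectrum = []
    for band in range(n_bands):
        k = sum(c * ks[band] for c, ks in zip(concentrations, k_spectra))
        s = sum(c * ss[band] for c, ss in zip(concentrations, s_spectra))
        spectrum.append(km_reflectance(k, s))
    return spectrum


# Made-up coefficients for two hypothetical pigments over three wavelength bands.
k_spectra = [[0.9, 0.2, 0.1], [0.1, 0.3, 0.8]]
s_spectra = [[1.0, 1.0, 1.0], [0.5, 0.6, 0.7]]
spectrum = mix_spectrum([0.3, 0.7], k_spectra, s_spectra)
```

Sampling many concentration vectors and calling `mix_spectrum` on each would give (spectrum, concentration) pairs of the kind an encoder-decoder could be trained on, with the concentrations serving as the target for the middle output.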


Author(s):  
Muhammad Faheem Mushtaq ◽  
Urooj Akram ◽  
Muhammad Aamir ◽  
Haseeb Ali ◽  
Muhammad Zulqarnain

Predicting time series is important because many prediction problems, such as health prediction, climate change prediction, and weather prediction, include a time component. To solve the time series prediction problem, various techniques have been developed over many years to enhance forecasting accuracy. This paper presents a review of the prediction of physical time series applications using neural network models. Neural Networks (NN) have emerged as an effective tool for time series forecasting. Moreover, to resolve the problems related to time series data, there is a need for a network with a single layer of trainable weights, namely the Higher Order Neural Network (HONN), which can perform nonlinear input-output mapping. Developers are therefore focusing on HONN, which has recently been considered for broadly expanding the input representation space. The HONN model has a functional mapping ability, which has been demonstrated on several time series problems, and it shows more benefits than conventional Artificial Neural Networks (ANN). The goal of this research is to make the reader aware of HONN for physical time series prediction and to highlight some benefits and challenges of using HONN.

