Detection of Variable Misuse Using Static Analysis Combined with Machine Learning

Author(s):  
Gleb Morgachev ◽  
Valery Ignatyev ◽  
Andrey Belevantsev
Symmetry ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 35
Author(s):  
Sungjoong Kim ◽  
Seongkyu Yeom ◽  
Haengrok Oh ◽  
Dongil Shin ◽  
Dongkyoo Shin

The development of information and communication technology (ICT) is making daily life more convenient by allowing access to information at anytime and anywhere and by improving the efficiency of organizations. Unfortunately, malicious code is also proliferating and becoming increasingly complex and sophisticated. In fact, even novices can now easily create it using hacking tools, which is causing it to increase and spread exponentially. It has become difficult for humans to respond to such a surge. As a result, many studies have pursued methods to automatically analyze and classify malicious code. There are currently two methods for analyzing it: a dynamic analysis method that executes the program directly and confirms the execution result, and a static analysis method that analyzes the program without executing it. This paper proposes a static analysis automation technique for malicious code that uses machine learning. This classification system was designed by combining a method for classifying malicious code using a portable executable (PE) structure and a method for classifying it using a PE structure. The system has 98.77% accuracy when classifying normal and malicious files. The proposed system can be used to classify various types of malware from PE files to shell code.


2021 ◽  
Vol 58 ◽  
pp. 102735
Author(s):  
Wadi’ Hijawi ◽  
Ja’far Alqatawna ◽  
Ala’ M. Al-Zoubi ◽  
Mohammad A. Hassonah ◽  
Hossam Faris

2018 ◽  
Vol 7 (4.6) ◽  
pp. 410
Author(s):  
Hetal Suresh ◽  
Joseph Raymond V

Mobile phones has become very integral part in our day to day life. In the digitalized world most of our day to day activities rely on mobile phone like banking activities, wallet payments, credentials, social accounts etc. Our system works in such a way that if there is an advantage to a technology there also exists a disadvantage. Every users have all their private and sensitive data in their mobile phones and download random applications from different platforms like play store, App store etc. There is a huge possibility that the applications downloaded are malicious applications. The existing system provides a solution for detection of such applications with the help of antivirus which has pre-built signatures that can be used to obtain an already existing malware which can be modified and manipulated by the hacker if they tend to do so. In this project, our purpose is to identify the malicious applications using Machine learning. By combining both static analysis and dynamic analysis we can use a Hybrid approach for analysing and detecting malware threats in android applications using Recurrent Neural Network (RNN). The main aim of this project will be to ensure that the application installed is benign, if it is not, it should block such applications and notify the user. 


2022 ◽  
Vol 24 (3) ◽  
pp. 1-25
Author(s):  
Nishtha Paul ◽  
Arpita Jadhav Bhatt ◽  
Sakeena Rizvi ◽  
Shubhangi

Frequency of malware attacks because Android apps are increasing day by day. Current studies have revealed startling facts about data harvesting incidents, where user’s personal data is at stake. To preserve privacy of users, a permission induced risk interface MalApp to identify privacy violations rising from granting permissions during app installation is proposed. It comprises of multi-fold process that performs static analysis based on app’s category. First, concept of reverse engineering is applied to extract app permissions to construct a Boolean-valued permission matrix. Second, ranking of permissions is done to identify the risky permissions across category. Third, machine learning and ensembling techniques have been incorporated to test the efficacy of the proposed approach on a data set of 404 benign and 409 malicious apps. The empirical studies have identified that our proposed algorithm gives a best case malware detection rate of 98.33%. The highlight of interface is that any app can be classified as benign or malicious even before running it using static analysis.


Author(s):  
Syed Khurram Jah Rizvi ◽  
Warda Aslam ◽  
Muhammad Shahzad ◽  
Shahzad Saleem ◽  
Muhammad Moazam Fraz

AbstractEnterprises are striving to remain protected against malware-based cyber-attacks on their infrastructure, facilities, networks and systems. Static analysis is an effective approach to detect the malware, i.e., malicious Portable Executable (PE). It performs an in-depth analysis of PE files without executing, which is highly useful to minimize the risk of malicious PE contaminating the system. Yet, instant detection using static analysis has become very difficult due to the exponential rise in volume and variety of malware. The compelling need of early stage detection of malware-based attacks significantly motivates research inclination towards automated malware detection. The recent machine learning aided malware detection approaches using static analysis are mostly supervised. Supervised malware detection using static analysis requires manual labelling and human feedback; therefore, it is less effective in rapidly evolutionary and dynamic threat space. To this end, we propose a progressive deep unsupervised framework with feature attention block for static analysis-based malware detection (PROUD-MAL). The framework is based on cascading blocks of unsupervised clustering and features attention-based deep neural network. The proposed deep neural network embedded with feature attention block is trained on the pseudo labels. To evaluate the proposed unsupervised framework, we collected a real-time malware dataset by deploying low and high interaction honeypots on an enterprise organizational network. Moreover, endpoint security solution is also deployed on an enterprise organizational network to collect malware samples. After post processing and cleaning, the novel dataset consists of 15,457 PE samples comprising 8775 malicious and 6681 benign ones. The proposed PROUD-MAL framework achieved an accuracy of more than 98.09% with better quantitative performance in standard evaluation parameters on collected dataset and outperformed other conventional machine learning algorithms. The implementation and dataset are available at https://bit.ly/35Sne3a.


2021 ◽  
Author(s):  
Recep Sinan ARSLAN

Abstract The number of applications prepared for use on mobile devices has increased rapidly with the widespread use of the Android OS. This has resulted in the undesired installation of Android apks that violate user privacy or malicious. The increasing similarity between Android malware and benign applications makes it difficult to distinguish them from each other and causes a situation of concern for users. In this study, FG-Droid, a machine-learning based classifier with an efficient working system, using the method of grouping the features obtained by static analysis, was proposed. It was created as a result of experiments with Machine learning (ML), DNN, RNN, LSTM and GRU based models using Drebin, Genome and Arslan datasets. Experimental results reveal that FG-Droid has achieved 97.7% AUC score with a vector includes only 11 static features, and ExtraTree algorithm. FG-Droid analyze the applications with using the least number of features compare to previous studies, and required the least processing time for training and prediction. As a result, it has been shown that Android malware can be detected in high accuracy rate with an effective feature set and there is no need to use a large number of features extracted with different techniques (static, dynamic or hybrid).


2021 ◽  
Author(s):  
Zhenshuo Chen ◽  
Eoin Brophy ◽  
Tomas Ward

<div>Network and system security are incredibly critical issues now. Due to the rapid proliferation of malware, traditional analysis methods struggle with enormous samples.</div><div>In this paper, we propose four easy-to-extract and small-scale features, including sizes and permissions of Windows PE sections, content complexity, and import libraries, to classify malware families, and use automatic machine learning to search for the best model and hyper-parameters for each feature and their combinations. Compared with detailed behavior-related features like API sequences, proposed features provide macroscopic information about malware. The analysis is based on static disassembly scripts and hexadecimal machine code. Unlike dynamic behavior analysis, static analysis is resource-efficient and offers complete code coverage, but is vulnerable to code obfuscation and encryption.<br></div><div>The results demonstrate that features which work well in dynamic analysis are not necessarily effective when applied to static analysis. For instance, API 4-grams only achieve 57.96% accuracy and involve a relatively high dimensional feature set (5000 dimensions). In contrast, the novel proposed features together with a classical machine learning algorithm (Random Forest) presents very good accuracy at 99.40% and the feature vector is of much smaller dimension (40 dimensions). We demonstrate the effectiveness of this approach through integration in IDA Pro, which also facilitates the collection of new training samples and subsequent model retraining.<br></div>


Author(s):  
Marco Pistoia ◽  
Omer Tripp ◽  
David Lubensky

Mobile devices have revolutionized many aspects of our lives. Without realizing it, we often run on them programs that access and transmit private information over the network. Integrity concerns arise when mobile applications use untrusted data as input to security-sensitive computations. Program-analysis tools for integrity and confidentiality enforcement have become a necessity. Static-analysis tools are particularly attractive because they do not require installing and executing the program, and have the potential of never missing any vulnerability. Nevertheless, such tools often have high false-positive rates. In order to reduce the number of false positives, static analysis has to be very precise, but this is in conflict with the analysis' performance and scalability, requiring a more refined model of the application. This chapter proposes Phoenix, a novel solution that combines static analysis with machine learning to identify programs exhibiting suspicious operations. This approach has been widely applied to mobile applications obtaining impressive results.


Author(s):  
Marco Pistoia ◽  
Omer Tripp ◽  
David Lubensky

Mobile devices have revolutionized many aspects of our lives. Without realizing it, we often run on them programs that access and transmit private information over the network. Integrity concerns arise when mobile applications use untrusted data as input to security-sensitive computations. Program-analysis tools for integrity and confidentiality enforcement have become a necessity. Static-analysis tools are particularly attractive because they do not require installing and executing the program, and have the potential of never missing any vulnerability. Nevertheless, such tools often have high false-positive rates. In order to reduce the number of false positives, static analysis has to be very precise, but this is in conflict with the analysis' performance and scalability, requiring a more refined model of the application. This chapter proposes Phoenix, a novel solution that combines static analysis with machine learning to identify programs exhibiting suspicious operations. This approach has been widely applied to mobile applications obtaining impressive results.


Sign in / Sign up

Export Citation Format

Share Document