Improving the effectiveness and efficiency of dynamic malware analysis with machine learning

A Personalized Machine-Learning-Enabled Method for Efficient Research in Ethnopharmacology. The Case of the Southern Balkans and the Coastal Zone of Asia Minor

Applied Sciences ◽

10.3390/app11135826 ◽

2021 ◽

Vol 11 (13) ◽

pp. 5826

Author(s):

Evangelos Axiotis ◽

Andreas Kontogiannis ◽

Eleftherios Kalpoutzakis ◽

George Giannakopoulos

Keyword(s):

Machine Learning ◽

Coastal Zone ◽

Extraction Process ◽

Asia Minor ◽

Efficiency And Effectiveness ◽

Effectiveness And Efficiency ◽

Intelligent Tools ◽

Southern Balkans

Ethnopharmacology experts face several challenges when identifying and retrieving documents and resources related to their scientific focus. The volume of sources that need to be monitored, the variety of formats utilized, and the different quality of language use across sources present some of what we call “big data” challenges in the analysis of this data. This study aims to understand if and how experts can be supported effectively through intelligent tools in the task of ethnopharmacological literature research. To this end, we utilize a real case study of ethnopharmacology research aimed at the southern Balkans and the coastal zone of Asia Minor. Thus, we propose a methodology for more efficient research in ethnopharmacology. Our work follows an “expert–apprentice” paradigm in an automatic URL extraction process, through crawling, where the apprentice is a machine learning (ML) algorithm, utilizing a combination of active learning (AL) and reinforcement learning (RL), and the expert is the human researcher. ML-powered research improved the effectiveness and efficiency of the domain expert by 3.1 and 5.14 times, respectively, fetching a total number of 420 relevant ethnopharmacological documents in only 7 h versus an estimated 36 h of human-expert effort. Therefore, utilizing artificial intelligence (AI) tools to support the researcher can boost the efficiency and effectiveness of the identification and retrieval of appropriate documents.
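The reported 5.14-fold efficiency gain is consistent with the ratio of the two time figures quoted in the abstract (36 h of estimated human effort versus 7 h with ML support). The short check below assumes that this ratio is how the figure was obtained, which the abstract does not state. The variable names are illustrative only, and the 3.1-fold effectiveness figure cannot be reproduced from the numbers given.

```python
# Figures quoted in the abstract.
human_hours = 36.0   # estimated human-expert effort
ml_hours = 7.0       # time taken by the ML-supported crawl
documents = 420      # relevant documents fetched

# Assumption: "efficiency" is the ratio of time spent without and with ML support.
speedup = human_hours / ml_hours

# Retrieval rate of the ML-supported process, derived from the same figures.
docs_per_hour = documents / ml_hours

print(f"speedup: {speedup:.2f}x, rate: {docs_per_hour:.0f} documents/h")
```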

Malware Analysis using Machine Learning and Deep Learning techniques

2020 SoutheastCon ◽

10.1109/southeastcon44009.2020.9368268 ◽

2020 ◽

Author(s):

Rajvardhan Patil ◽

Wei Deng

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Malware Analysis ◽

Learning Techniques

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world, since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are becoming increasingly necessary for computer systems connected to the Internet. Malware exploits a system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale to detecting obfuscated and packed malware. Because the cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls, in this paper we investigate the relevance of these features of unpacked malicious and benign executables to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning and deep learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that achieved F1-scores of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.
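As a rough illustration of the ANOVA step named in this abstract, the sketch below ranks features by their one-way ANOVA F-statistic across the malicious and benign classes. It is not the authors' pipeline: the mRMR stage is omitted, the API names are used only as column labels, and the call counts are invented toy values.

```python
from statistics import mean

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # Perfectly separated (or constant) feature: avoid division by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(rows, labels, names):
    """Return feature names sorted from most to least discriminative."""
    scores = {
        name: anova_f([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (hypothetical): per-sample API call counts, label 1 = malicious.
names = ["CreateRemoteThread", "ReadFile", "WriteProcessMemory"]
rows = [
    [3, 5, 4], [4, 6, 5], [5, 4, 6],   # malicious samples
    [0, 5, 0], [1, 6, 1], [0, 4, 0],   # benign samples
]
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(rows, labels, names)
print(ranking)
```

In this toy data `ReadFile` is distributed identically in both classes, so it ranks last. A selection step would keep only the top-ranked features before training a classifier.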

Malware Analysis with Machine Learning for Evaluating the Integrity of Mission Critical Devices

Advances in Intelligent Systems and Computing - Intelligent Computing ◽

10.1007/978-3-030-52243-8_18 ◽

2020 ◽

pp. 224-243

Author(s):

Robert Heras ◽

Alexander Perez-Pons

Keyword(s):

Machine Learning ◽

Malware Analysis ◽

Mission Critical

Evolutionary Machine Learning for Classification with Incomplete Data

10.26686/wgtn.17072123 ◽

2021 ◽

Author(s):

Cao Truong Tran

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Genetic Programming ◽

Incomplete Data ◽

Missing Values ◽

Machine Learning Techniques ◽

Feature Construction ◽

Classification Algorithms ◽

Learning Techniques ◽

Effectiveness And Efficiency

Classification is a major task in machine learning and data mining. Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be carefully handled because inadequate treatment of missing values will cause large classification errors. Most existing research on classification with incomplete data has focused on improving effectiveness, but has not adequately addressed the efficiency of applying the classifiers to classify unseen instances, which is much more important than the act of creating classifiers. A common approach to classification with incomplete data is to use imputation methods to replace missing values with plausible values before building classifiers and classifying unseen instances. This approach provides complete data which can then be used by any classification algorithm, but sophisticated imputation methods are usually computationally intensive, especially for the application process of classification. Another approach to classification with incomplete data is to build a classifier that can directly work with missing values. This approach does not require time for estimating missing values, but it often generates inaccurate and complex classifiers when faced with numerous missing values. A recent approach to classification with incomplete data, which also avoids estimating missing values, is to build a set of classifiers from which applicable classifiers are then selected for classifying unseen instances. However, this approach is also often inaccurate and takes a long time to find applicable classifiers when faced with numerous missing values. The overall goal of the thesis is to simultaneously improve the effectiveness and efficiency of classification with incomplete data by using evolutionary machine learning techniques for feature selection, clustering, ensemble learning, feature construction and constructing classifiers.
The thesis develops approaches for improving imputation for classification with incomplete data by integrating clustering and feature selection with imputation. The approaches improve both the effectiveness and the efficiency of using imputation for classification with incomplete data. The thesis develops wrapper-based feature selection methods to improve the input space for classification algorithms that are able to work directly with incomplete data. The methods not only improve the classification accuracy, but also reduce the complexity of classifiers able to work directly with incomplete data. The thesis develops a feature construction method to improve the input space for classification algorithms with incomplete data by proposing interval genetic programming, that is, genetic programming with a set of interval functions. The method improves the classification accuracy and reduces the complexity of classifiers. The thesis develops an ensemble approach to classification with incomplete data by integrating imputation, feature selection, and ensemble learning. The results show that the approach is more accurate and faster than previous common methods for classification with incomplete data. The thesis develops interval genetic programming to directly evolve classifiers for incomplete data. The results show that classifiers generated by interval genetic programming can be more effective and efficient than classifiers generated by the combination of imputation and traditional genetic programming. Interval genetic programming is also more effective than common classification algorithms able to work directly with incomplete data. In summary, the thesis develops a range of approaches for simultaneously improving the effectiveness and efficiency of classification with incomplete data by using a range of evolutionary machine learning techniques.
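The "imputation first" approach described in this abstract can be sketched minimally as follows. The sketch assumes plain column-mean imputation and a nearest-centroid classifier, both chosen for brevity rather than taken from the thesis. Missing values are represented as `None`, and all function names and data are hypothetical.

```python
from statistics import mean

def fit_column_means(rows):
    """Mean of each column, ignoring missing entries (None)."""
    return [mean(v for v in col if v is not None) for col in zip(*rows)]

def impute(row, means):
    """Replace each missing entry with the training mean of its column."""
    return [m if v is None else v for v, m in zip(row, means)]

def fit_centroids(rows, labels):
    """Per-class centroid of the (already imputed) training rows."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return {c: [mean(col) for col in zip(*rs)] for c, rs in by_class.items()}

def predict(row, means, centroids):
    """Impute the unseen instance, then pick the class with the closest centroid."""
    x = impute(row, means)
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda c: sq_dist(centroids[c]))

# Toy incomplete training data (hypothetical values).
train_X = [[1.0, None], [2.0, 1.0], [8.0, 9.0], [None, 7.0]]
train_y = ["a", "a", "b", "b"]

means = fit_column_means(train_X)
centroids = fit_centroids([impute(r, means) for r in train_X], train_y)

print(predict([None, 9.0], means, centroids))
print(predict([1.0, None], means, centroids))
```

Imputation runs again for every unseen instance inside `predict`. With mean imputation that cost is negligible, but with a more sophisticated imputer it becomes the application-time overhead that the abstract identifies as the efficiency problem.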

Leveraging ontologies and machine-learning techniques for malware analysis into Android permissions ecosystems

Computers & Security ◽

10.1016/j.cose.2018.07.013 ◽

2018 ◽

Vol 78 ◽

pp. 429-453 ◽

Cited By ~ 9

Author(s):

Luiz C. Navarro ◽

Alexandre K.W. Navarro ◽

André Grégio ◽

Anderson Rocha ◽

Ricardo Dahab

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Malware Analysis ◽

Learning Techniques

Static Malware Analysis Using Machine Learning Methods

Communications in Computer and Information Science - Recent Trends in Computer Networks and Distributed Systems Security ◽

10.1007/978-3-642-54525-2_39 ◽

2014 ◽

pp. 440-450 ◽

Cited By ~ 26

Author(s):

Hiran V. Nath ◽

Babu M. Mehtre

Keyword(s):

Machine Learning ◽

Malware Analysis ◽

Learning Methods ◽

Machine Learning Methods

Static and Dynamic Malware Analysis Using Machine Learning

First International Conference on Sustainable Technologies for Computational Intelligence - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-15-0029-9_62 ◽

2019 ◽

pp. 793-806 ◽

Cited By ~ 4

Author(s):

Chandni Raghuraman ◽

Sandhya Suresh ◽

Suraj Shivshankar ◽

Radhika Chapaneri

Keyword(s):

Machine Learning ◽

Malware Analysis

Android Malware Analysis Using Machine Learning Classifiers

Algorithms for Intelligent Systems - Proceedings of International Conference on Computational Intelligence and Emerging Power System ◽

10.1007/978-981-16-4103-9_15 ◽

2021 ◽

pp. 171-179

Author(s):

Sakshi Jain ◽

Tarul Khandelwal ◽

Yash Jain ◽

Jyoti Gajrani

Keyword(s):

Machine Learning ◽

Malware Analysis ◽

Android Malware ◽

Machine Learning Classifiers ◽

Learning Classifiers

A Novel Malware Analysis Framework for Malware Detection and Classification using Machine Learning Approach

Proceedings of the 19th International Conference on Distributed Computing and Networking - ICDCN '18 ◽

10.1145/3154273.3154326 ◽

2018 ◽

Cited By ~ 6

Author(s):

Kamalakanta Sethi ◽

Shankar Kumar Chaudhary ◽

Bata Krishan Tripathy ◽

Padmalochan Bera

Keyword(s):

Machine Learning ◽

Malware Detection ◽

Learning Approach ◽

Malware Analysis ◽

Analysis Framework ◽

Machine Learning Approach ◽

Malware Detection And Classification