Estimation of similarity between functions extracted from x86 executable files

Katarina Berta; Sasa Stojanovic; Milos Cvetanovic; Zaharije Radivojevic

doi:10.2298/sjee1502253b

Serbian Journal of Electrical Engineering ◽

10.2298/sjee1502253b ◽

2015 ◽

Vol 12 (2) ◽

pp. 253-262

Author(s):

Katarina Berta ◽

Sasa Stojanovic ◽

Milos Cvetanovic ◽

Zaharije Radivojevic

Keyword(s):

Machine Learning ◽

Software Engineering ◽

Binary Code ◽

Source Code ◽

Future Research ◽

Malware Analysis ◽

Previous Knowledge ◽

The Third

Comparison of functions is required in various domains of software engineering. In most domains, comparison is done using source code, but in some domains, such as license violation or malware analysis, only binary code is available. The goal of this paper is to evaluate whether an existing solution designed for the ARM architecture can be applied to the x86 architecture. The existing solution encompasses multiple approaches, but for the purpose of this paper three representative approaches are implemented: two are based on machine learning, and the third does not require previous knowledge. Results show that the best recalls obtained for the first ten positions on both architectures are comparable and do not differ significantly. The results confirm that adapting all approaches of the existing solution is not only possible but also promising, and that it represents an adequate basis for future research.
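The abstract reports recall "for the first ten positions" without defining it. One common reading, assumed here, is recall at k: the fraction of query functions whose true match appears among the top k ranked candidates. The sketch below illustrates that reading only; the names `recall_at_k`, `rankings`, and `truth` are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def recall_at_k(rankings: Dict[str, List[str]],
                truth: Dict[str, str],
                k: int = 10) -> float:
    """Fraction of queries whose true match is within the top k candidates.

    rankings maps each query function to its candidates, best match first.
    truth maps each query function to its single correct match (assumed
    to be exactly one per query for this illustration).
    """
    if not rankings:
        return 0.0
    hits = sum(
        1 for query, ranked in rankings.items()
        if truth.get(query) in ranked[:k]
    )
    return hits / len(rankings)


# Toy data: "f1" is matched at position 2, "f2" is never matched.
example_rankings = {"f1": ["a", "b", "c"], "f2": ["x", "y", "z"]}
example_truth = {"f1": "b", "f2": "q"}
print(recall_at_k(example_rankings, example_truth, k=2))
```

Under this definition, comparing the two architectures amounts to computing the same quantity over the ARM and x86 rankings with k = 10.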

Osiris: A Malware Behavior Capturing System Implemented at Virtual Machine Monitor Layer

Mathematical Problems in Engineering ◽

10.1155/2013/402438 ◽

2013 ◽

Vol 2013 ◽

pp. 1-11 ◽

Cited By ~ 4

Author(s):

Ying Cao ◽

Qiguang Miao ◽

Jiachen Liu ◽

Weisheng Li

Keyword(s):

Virtual Machine ◽

Binary Code ◽

Source Code ◽

Detection Algorithm ◽

Main Process ◽

Malware Analysis ◽

Network Environment ◽

Virtual Machine Monitor ◽

Analysis Process ◽

Malicious Behaviors

Behavior capturing is an important prerequisite for behavior-based malware analysis. In this paper, we present Osiris, a tool that captures the behaviors of executable files on Windows systems. It collects API calls invoked not only by the main process of the analyzed file, but also by child processes created by the main process, by injected processes if process injection happens, and by service processes if the main process creates services. Osiris is implemented at the virtual machine monitor layer by modifying the source code of Qemu, which gives it the following advantages. First, it does not rewrite the binary code of the analyzed file or interfere with its normal execution, so behavior data are obtained more stealthily and transparently. Second, it employs a multi-virtual-machine framework to simulate the network environment for malware analysis, so the network behaviors of malware are stimulated to a large extent. Third, besides the network environment, it also simulates the most common host events to stimulate potential malicious behaviors of malware. The experimental results show that Osiris automates the malware analysis process and provides good behavior data for the subsequent detection algorithm.

An Introduction to Reinforcement Learning

Decision Theory Models for Applications in Artificial Intelligence ◽

10.4018/978-1-60960-165-2.ch004 ◽

2012 ◽

pp. 63-80

Author(s):

Eduardo F. Morales ◽

Julio H. Zaragoza

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Conceptual Framework ◽

Research Area ◽

Previous Knowledge ◽

Mathematical Discussion ◽

Search Spaces ◽

Common Solution ◽

The Third ◽

Research Challenges

This chapter provides a concise introduction to Reinforcement Learning (RL) from a machine learning perspective. It provides the background required to understand the chapters related to RL in this book. It assumes no previous knowledge of this research area and includes short descriptions of some of the latest trends, which are normally excluded from other introductions or overviews of RL. The chapter places more emphasis on the general conceptual framework and ideas of RL than on a rigorous mathematical discussion that may require a great deal of effort from the reader. The first section provides a general introduction to the area. The following section describes the most common solution techniques. In the third section, some of the most recent techniques proposed to deal with large search spaces are described. Finally, the last section provides some final remarks and current research challenges in RL.

Applying Rule Induction in Software Prediction

Advances in Machine Learning Applications in Software Engineering ◽

10.4018/978-1-59140-941-1.ch011 ◽

2011 ◽

pp. 265-286

Author(s):

Bhekisipho Twala ◽

Michelle Cartwright ◽

Martin Shepperd

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Software Engineering ◽

Principal Components ◽

Research Work ◽

Rule Induction ◽

Future Research ◽

Human Beings ◽

Data Application ◽

Key Issues

Recently, the use of machine learning (ML) algorithms has proven to be of great practical value in solving a variety of software engineering problems including software prediction, for example, cost and defect processes. An important advantage of machine learning over statistical analysis as a modelling technique lies in the fact that the interpretation of production rules is more straightforward and intelligible to human beings than, say, principal components and patterns with numbers that represent their meaning. The main focus of this chapter is upon rule induction (RI): providing some background and key issues on RI and further examining how RI has been utilised to handle uncertainties in data. Application of RI in prediction and other software engineering tasks is considered. The chapter concludes by identifying future research work when applying rule induction in software prediction. Such future research work might also help solve new problems related to rule induction and prediction.

COVID-19 Outbreak Prediction with Machine Learning

10.34055/osf.io/xr4js ◽

2020 ◽

Author(s):

Sina Faizollahzadeh Ardabili ◽

Amir Mosavi ◽

Pedram Ghamisi ◽

Filip Ferdinand ◽

Annamaria R. Varkonyi-Koczy ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Fuzzy Inference ◽

Control Measures ◽

Future Research ◽

Complex Nature ◽

Inference System ◽

Wide Range ◽

Standard Models ◽

High Level

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities, and they are popular in the media. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the essential generalization and robustness abilities of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models to predict the COVID-19 outbreak as an alternative to SIR and SEIR models. Among a wide range of machine learning models investigated, two models showed promising results: the multi-layered perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). Based on the results reported here, and due to the highly complex nature of the COVID-19 outbreak and the variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. This paper provides an initial benchmarking to demonstrate the potential of machine learning for future research. The paper further suggests that real novelty in outbreak prediction can be realized through integrating machine learning and SEIR models.
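For readers unfamiliar with the SEIR baseline named in the abstract, the sketch below shows the standard compartmental model (susceptible, exposed, infectious, recovered) integrated with a simple forward Euler step. It is not taken from the paper: the function names, the integration scheme, and every parameter value are assumptions chosen purely for illustration.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def seir_step(state: State, beta: float, sigma: float,
              gamma: float, dt: float) -> State:
    """Advance the standard SEIR equations by one forward Euler step."""
    s, e, i, r = state
    n = s + e + i + r
    if n <= 0:
        raise ValueError("total population must be positive")
    new_exposed = beta * s * i / n * dt   # S -> E
    new_infectious = sigma * e * dt       # E -> I
    new_recovered = gamma * i * dt        # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def simulate_seir(initial: State, beta: float, sigma: float,
                  gamma: float, dt: float, steps: int) -> List[State]:
    """Return the trajectory, including the initial state."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(seir_step(trajectory[-1], beta, sigma, gamma, dt))
    return trajectory


# Illustrative values only (not fitted to any COVID-19 data).
trajectory = simulate_seir((990.0, 0.0, 10.0, 0.0),
                           beta=0.3, sigma=0.2, gamma=0.1,
                           dt=0.1, steps=100)
print(trajectory[-1])
```

A model of this kind has only a handful of rates to fit, which is one reason the abstract contrasts it with more flexible learners such as MLP and ANFIS.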

Android Malware Detection Techniques: A Literature Review

Recent Patents on Engineering ◽

10.2174/1872212114999200710143847 ◽

2020 ◽

Vol 14 ◽

Author(s):

Meghna Dhalaria ◽

Ekta Gandotra

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Malware Detection ◽

Future Research ◽

Android Malware ◽

Detection Techniques ◽

Android Malware Detection ◽

Future Research Directions ◽

To Come ◽

Tools And Techniques

Purpose: This paper provides the basics of Android malware, its evolution, and the tools and techniques for malware analysis. Its main aim is to present a review of the literature on Android malware detection using machine learning and deep learning and to identify the research gaps. It provides the insights obtained through the literature and future research directions that could help researchers come up with robust and accurate techniques for the classification of Android malware. Design/Methodology/Approach: This paper provides a review of the basics of Android malware, its evolution timeline, and detection techniques. It includes the tools and techniques for analyzing Android malware statically and dynamically in order to extract features and finally classify them using machine learning and deep learning algorithms. Findings: The number of Android users is expanding very fast due to the popularity of Android devices. As a result, there are more risks to Android users due to the exponential growth of Android malware. Ongoing research aims to overcome the constraints of earlier approaches to malware detection. As evolving malware is complex and sophisticated, earlier signature-based and machine-learning-based approaches are not able to identify it in a timely and accurate manner. The findings of the review show various limitations of earlier techniques: longer detection time, high false positive and false negative rates, low accuracy in detecting sophisticated malware, and limited flexibility. Originality/value: This paper provides a systematic and comprehensive review of the tools and techniques being employed for the analysis, classification, and identification of malicious Android applications. It includes the timeline of Android malware evolution and the tools and techniques for analyzing malware statically and dynamically in order to extract features, which are then used for detection and classification with machine learning and deep learning algorithms. On the basis of the detailed literature review, various research gaps are listed. The paper also provides future research directions and insights that could help researchers come up with innovative and robust techniques for detecting and classifying Android malware.

Recommendation Systems for Education: Systematic Review

Electronics ◽

10.3390/electronics10141611 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1611

Author(s):

María Cora Urdaneta-Ponte ◽

Amaia Mendez-Zorrilla ◽

Ibon Oleagordia-Ruiz

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Formal Education ◽

Research Work ◽

Hybrid Approach ◽

Recommendation Systems ◽

Educational Resources ◽

Future Research ◽

Collaborative Approach ◽

Developmental Approach

Recommendation systems have emerged as a response to the overload caused by the growing amount of information online, which has become a problem for users in terms of the time spent searching and the amount of information retrieved. In the field of recommendation systems in education, the relevance of recommended educational resources improves the student’s learning process, hence the importance of being able to suitably and reliably provide relevant, useful information. The purpose of this systematic review is to analyze the work undertaken on recommendation systems that support educational practices, with a view to acquiring information on the type of education and areas dealt with, the developmental approach used, and the elements recommended, as well as to detect any gaps in this area for future research work. The systematic review included 98 articles from a total of 2937 found in the main databases (IEEE, ACM, Scopus and WoS). It established that most systems are geared towards recommending educational resources for users of formal education, and that the main approaches used in recommendation systems are the collaborative approach, the content-based approach, and the hybrid approach, with a tendency to use machine learning in the last two years. Finally, possible future areas of research and development in this field are presented.

A review of infant cry analysis and classification

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-021-00197-5 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Chunyan Ji ◽

Thosini Bamunu Mudiyanselage ◽

Yutong Gao ◽

Yi Pan

Keyword(s):

Neural Network ◽

Machine Learning ◽

Signal Analysis ◽

Future Research ◽

Prosodic Features ◽

Infant Cry ◽

Machine Learning Classification ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Processing Techniques

This paper reviews recent research in infant cry signal analysis and classification tasks. A broad range of literature is reviewed, mainly from the aspects of data acquisition, cross-domain signal processing techniques, and machine learning classification methods. We introduce pre-processing approaches and describe a variety of features such as MFCC, spectrogram, and fundamental frequency. Both acoustic features and prosodic features extracted from different domains can discriminate frame-based signals from one another and can be used to train machine learning classifiers. Together with traditional machine learning classifiers such as KNN, SVM, and GMM, newly developed neural network architectures such as CNN and RNN are applied in infant cry research. We present some significant experimental results on pathological cry identification, cry reason classification, and cry sound detection with some typical databases. This survey systematically studies the previous research in all relevant areas of infant cry and provides insight into the current cutting-edge work in infant cry signal analysis and classification. We also propose future research directions in data processing, feature extraction, and neural network classification to better understand, interpret, and process infant cry signals.

A Review on Human–AI Interaction in Machine Learning and Insights for Medical Applications

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18042121 ◽

2021 ◽

Vol 18 (4) ◽

pp. 2121

Author(s):

Mansoureh Maadi ◽

Hadi Akbarzadeh Khorshidi ◽

Uwe Aickelin

Keyword(s):

Machine Learning ◽

Future Research ◽

Computational Power ◽

Medical Field ◽

Interactive Machine Learning ◽

Human In The Loop ◽

Human Interactions ◽

Scoping Literature Review ◽

Domain Expertise ◽

Expertise Level

Objective: To provide a human–Artificial Intelligence (AI) interaction review for Machine Learning (ML) applications to inform how to best combine both human domain expertise and computational power of ML methods. The review focuses on the medical field, as the medical ML application literature highlights a special necessity of medical experts collaborating with ML approaches. Methods: A scoping literature review is performed on Scopus and Google Scholar using the terms “human in the loop”, “human in the loop machine learning”, and “interactive machine learning”. Peer-reviewed papers published from 2015 to 2020 are included in our review. Results: We design four questions to investigate and describe human–AI interaction in ML applications. These questions are “Why should humans be in the loop?”, “Where does human–AI interaction occur in the ML processes?”, “Who are the humans in the loop?”, and “How do humans interact with ML in Human-In-the-Loop ML (HILML)?”. To answer the first question, we describe three main reasons regarding the importance of human involvement in ML applications. To address the second question, human–AI interaction is investigated in three main algorithmic stages: 1. data producing and pre-processing; 2. ML modelling; and 3. ML evaluation and refinement. The importance of the expertise level of the humans in human–AI interaction is described to answer the third question. The number of human interactions in HILML is grouped into three categories to address the fourth question. We conclude the paper by offering a discussion on open opportunities for future research in HILML.

Combining Virtual Reality and Organizational Neuroscience for Leadership Assessment

Applied Sciences ◽

10.3390/app11135956 ◽

2021 ◽

Vol 11 (13) ◽

pp. 5956

Author(s):

Elena Parra ◽

Irene Alice Chicchi Giglioli ◽

Jestine Philip ◽

Lucia Amalia Carrasco-Ribelles ◽

Javier Marín-Morales ◽

...

Keyword(s):

Machine Learning ◽

Eye Tracking ◽

Ecological Validity ◽

Objective Assessment ◽

Leadership Skills ◽

Three Dimensional ◽

Machine Learning Techniques ◽

Future Research ◽

Promising Tool ◽

Future Research Agenda

In this article, we introduce three-dimensional Serious Games (3DSGs) under an evidence-centered design (ECD) framework and use an organizational neuroscience-based eye-tracking measure to capture implicit behavioral signals associated with leadership skills. While ECD is a well-established framework used in the design and development of assessments, it has rarely been utilized in organizational research. The study proposes a novel 3DSG combined with organizational neuroscience methods as a promising tool to assess and recognize leadership-related behavioral patterns that manifest during complex and realistic social situations. We offer a research protocol for assessing task- and relationship-oriented leadership skills that uses ECD, eye-tracking measures, and machine learning. Seamlessly embedding biological measures into 3DSGs enables objective assessment methods that are based on machine learning techniques to achieve high ecological validity. We conclude by describing a future research agenda for the combined use of 3DSGs and organizational neuroscience methods for leadership and human resources.

Federated Quantum Machine Learning

Entropy ◽

10.3390/e23040460 ◽

2021 ◽

Vol 23 (4) ◽

pp. 460

Author(s):

Samuel Yen-Chi Chen ◽

Shinjae Yoo

Keyword(s):

Machine Learning ◽

Data Privacy ◽

Research Direction ◽

Future Research ◽

Quantum Computers ◽

Training Time ◽

Quantum Neural Network ◽

Distributed Training ◽

Machine Learning Model ◽

Quantum Machine Learning

Distributed training across several quantum computers could significantly reduce training time, and sharing the learned model rather than the data could improve data privacy, because training would happen where the data is located. One potential scheme for achieving this is federated learning (FL), in which several clients or local nodes learn on their own data and a central node aggregates the models collected from those local nodes. However, to the best of our knowledge, no work has yet been done on quantum machine learning (QML) in a federated setting. In this work, we present federated training of hybrid quantum-classical machine learning models, although our framework could be generalized to pure quantum machine learning models. Specifically, we consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. Our distributed federated learning scheme achieved almost the same trained model accuracy with significantly faster distributed training. It demonstrates a promising future research direction for scaling and privacy.
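The aggregation step described above, where a central node combines the models collected from local nodes, can be sketched as a weighted average of client parameters. The abstract does not state which aggregation rule the authors use, so the rule below (averaging weighted by each client's sample count, in the style of federated averaging) is an assumption, and the names `federated_average`, `client_params`, and `client_sizes` are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_params: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Only parameters are exchanged; the clients' raw data never leaves them.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share one parameter shape")
        weight = size / total
        for index, value in enumerate(params):
            averaged[index] += weight * value
    return averaged


# Two clients; the second holds three times as much data as the first.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

In a hybrid setup such as the one the abstract describes, the vectors would hold the trainable circuit parameters of each client's model; flat lists of floats stand in for them here.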
