OpCode-Level Function Call Graph Based Android Malware Classification Using Deep Learning

Due to the openness of an Android system, many Internet of Things (IoT) devices are running the Android system and Android devices have become a common control terminal for IoT devices because of various sensors on them. With the popularity of IoT devices, malware on Android-based IoT devices is also increasing. People’s lives and privacy security are threatened. To reduce such threat, many researchers have proposed new methods to detect Android malware. Currently, most malware detection products on the market are based on malware signatures, which have a fast detection speed and normally a low false alarm rate for known malware families. However, they cannot detect unknown malware and are easily evaded by malware that is confused or packaged. Many new solutions use syntactic features and machine learning techniques to classify Android malware. It has been known that analysis of the Function Call Graph (FCG) can capture behavioral features of malware well. This paper presents a new approach to classifying Android malware based on deep learning and OpCode-level FCG. The FCG is obtained through static analysis of Operation Code (OpCode), and the deep learning model we used is the Long Short-Term Memory (LSTM). We conducted experiments on a dataset with 1796 Android malware samples classified into two categories (obtained from Virusshare and AndroZoo) and 1000 benign Android apps. Our experimental results showed that our proposed approach with an accuracy of 97 % outperforms the state-of-the-art methods such as those proposed by Nikola et al. and Hou et al. (IJCAI-18) with the accuracy of 97 % and 91 % , respectively. The time consumption of our proposed approach is less than the other two methods.

Download Full-text

Android Malware Detection Based on Structural Features of the Function Call Graph

Electronics ◽

10.3390/electronics10020186 ◽

2021 ◽

Vol 10 (2) ◽

pp. 186

Author(s):

Yang Yang ◽

Xuehui Du ◽

Zhi Yang ◽

Xing Liu

Keyword(s):

Malware Detection ◽

Structural Features ◽

Coarse Grained ◽

Detection Methods ◽

Convolutional Network ◽

Android Malware ◽

Call Graph ◽

Android Apps ◽

Android Malware Detection ◽

Function Call

The openness of Android operating system not only brings convenience to users, but also leads to the attack threat from a large number of malicious applications (apps). Thus malware detection has become the research focus in the field of mobile security. In order to solve the problem of more coarse-grained feature selection and larger feature loss of graph structure existing in the current detection methods, we put forward a method named DGCNDroid for Android malware detection, which is based on the deep graph convolutional network. Our method starts by generating a function call graph for the decompiled Android application. Then the function call subgraph containing the sensitive application programming interface (API) is extracted. Finally, the function call subgraphs with structural features are trained as the input of the deep graph convolutional network. Thus the detection and classification of malicious apps can be realized. Through experimentation on a dataset containing 11,120 Android apps, the method proposed in this paper can achieve detection accuracy of 98.2%, which is higher than other existing detection methods.

Download Full-text

AFCGDroid: Deep Learning Based Android Malware Detection Using Attributed Function Call Graphs

Journal of Physics Conference Series ◽

10.1088/1742-6596/1693/1/012080 ◽

2020 ◽

Vol 1693 ◽

pp. 012080

Author(s):

Tong Lu ◽

Xiaoyuan Liu ◽

Jingwei Chen ◽

Naitian Hu ◽

Bo Liu

Keyword(s):

Deep Learning ◽

Malware Detection ◽

Android Malware ◽

Android Malware Detection ◽

Function Call ◽

Call Graphs

Download Full-text

AOHDL: Archimedes Optimization Based Hybrid Deep Learning Model for Soybean Plant Disease Classification

10.21203/rs.3.rs-302084/v1 ◽

2021 ◽

Author(s):

J. Annrose ◽

N. Herald Anantha Rufus ◽

C. R. Edwin Selva Rex ◽

D. Godwin Immanuel

Keyword(s):

Deep Learning ◽

Optimization Algorithm ◽

Short Term Memory ◽

Learning Model ◽

Disease Classification ◽

Training Data ◽

Machine Learning Techniques ◽

Soybean Plant ◽

Lower Accuracy ◽

Deep Learning Model

Abstract Bean which is botanically called Phaseolus vulgaris L belongs to the Fabaceae family.During bean disease identification, unnecessary economical losses occur due to the delay of the treatment period, incorrect treatment, and lack of knowledge. The existing deep learning and machine learning techniques met few issues such as high computational complexity, higher cost associated with the training data, more execution time, noise, feature dimensionality, lower accuracy, low speed, etc. To tackle these problems, we have proposed a hybrid deep learning model with an Archimedes optimization algorithm (HDL-AOA) for bean disease classification. In this work, there are five bean classes of which one is a healthy class whereas the remaining four classes indicate different diseases such as Bean halo blight, Pythium diseases, Rhizoctonia root rot, and Anthracnose abnormalities acquired from the Soybean (Large) Data Set.The hybrid deep learning technique is the combination of wavelet packet decomposition (WPD) and long short term memory (LSTM). Initially, the WPD decomposes the input images into four sub-series. For these sub-series, four LSTM networks were developed. During bean disease classification, an Archimedes optimization algorithm (AOA) enhances the classification accuracy for multiple single LSTM networks. MATLAB software implements the HDL-AOA model for bean disease classification. The proposed model accomplishes lower MAPE than other exiting methods. Finally, the proposed HDL-AOA model outperforms excellent classification results using different evaluation measures such as accuracy, specificity, sensitivity, precision, recall, and F-score.

Download Full-text

A Hybrid Optimized LSTM Models for Human Activity Recognition with IOT Devices

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-2326 ◽

2021 ◽

pp. 182-189

Author(s):

S. Arokiaraj ◽

Dr. N. Viswanathan

Keyword(s):

Deep Learning ◽

Internet Of Things ◽

Short Term Memory ◽

The Other ◽

Learning Models ◽

Computational Overhead ◽

Temporal Features ◽

Human Movements ◽

Proposed Model ◽

Iot Devices

With the advent of Internet of things(IoT),HA (HA) recognition has contributed the more application in health care in terms of diagnosis and Clinical process. These devices must be aware of human movements to provide better aid in the clinical applications as well as user’s daily activity.Also , In addition to machine and deep learning algorithms, HA recognition systems has significantly improved in terms of high accurate recognition. However, the most of the existing models designed needs improvisation in terms of accuracy and computational overhead. In this research paper, we proposed a BAT optimized Long Short term Memory (BAT-LSTM) for an effective recognition of human activities using real time IoT systems. The data are collected by implanting the Internet of things) devices invasively. Then, proposed BAT-LSTM is deployed to extract the temporal features which are then used for classification to HA. Nearly 10,0000 dataset were collected and used for evaluating the proposed model. For the validation of proposed framework, accuracy, precision, recall, specificity and F1-score parameters are chosen and comparison is done with the other state-of-art deep learning models. The finding shows the proposed model outperforms the other learning models and finds its suitability for the HA recognition.

Download Full-text

Soybean Plant Disease Classification using Archimedes Optimization Algorithm based Hybrid Deep Learning Model

10.21203/rs.3.rs-281525/v1 ◽

2021 ◽

Author(s):

J. Annrose ◽

N. Herald Anantha Rufus ◽

C. R. Edwin Selva Rex ◽

D. Godwin Immanuel

Keyword(s):

Deep Learning ◽

Optimization Algorithm ◽

Short Term Memory ◽

Learning Model ◽

Disease Classification ◽

Training Data ◽

Machine Learning Techniques ◽

Soybean Plant ◽

Lower Accuracy ◽

Deep Learning Model

Download Full-text

Deploying Machine and Deep Learning Models for Efficient Data-Augmented Detection of COVID-19 Infections

Viruses ◽

10.3390/v12070769 ◽

2020 ◽

Vol 12 (7) ◽

pp. 769 ◽

Cited By ~ 12

Author(s):

Ahmed Sedik ◽

Abdullah M Iliyasu ◽

Basma Abd El-Rahiem ◽

Mohammed E. Abdel Samea ◽

Asmaa Abdel-Raheem ◽

...

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Detection Accuracy ◽

Testing Time ◽

Learning Models ◽

Accurate Identification ◽

Infected People ◽

Corona Virus

This generation faces existential threats because of the global assault of the novel Corona virus 2019 (i.e., COVID-19). With more than thirteen million infected and nearly 600000 fatalities in 188 countries/regions, COVID-19 is the worst calamity since the World War II. These misfortunes are traced to various reasons, including late detection of latent or asymptomatic carriers, migration, and inadequate isolation of infected people. This makes detection, containment, and mitigation global priorities to contain exposure via quarantine, lockdowns, work/stay at home, and social distancing that are focused on “flattening the curve”. While medical and healthcare givers are at the frontline in the battle against COVID-19, it is a crusade for all of humanity. Meanwhile, machine and deep learning models have been revolutionary across numerous domains and applications whose potency have been exploited to birth numerous state-of-the-art technologies utilised in disease detection, diagnoses, and treatment. Despite these potentials, machine and, particularly, deep learning models are data sensitive, because their effectiveness depends on availability and reliability of data. The unavailability of such data hinders efforts of engineers and computer scientists to fully contribute to the ongoing assault against COVID-19. Faced with a calamity on one side and absence of reliable data on the other, this study presents two data-augmentation models to enhance learnability of the Convolutional Neural Network (CNN) and the Convolutional Long Short-Term Memory (ConvLSTM)-based deep learning models (DADLMs) and, by doing so, boost the accuracy of COVID-19 detection. Experimental results reveal improvement in terms of accuracy of detection, logarithmic loss, and testing time relative to DLMs devoid of such data augmentation. Furthermore, average increases of 4% to 11% in COVID-19 detection accuracy are reported in favour of the proposed data-augmented deep learning models relative to the machine learning techniques. Therefore, the proposed algorithm is effective in performing a rapid and consistent Corona virus diagnosis that is primarily aimed at assisting clinicians in making accurate identification of the virus.

Download Full-text

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

Computers ◽

10.3390/computers8010004 ◽

2019 ◽

Vol 8 (1) ◽

pp. 4 ◽

Cited By ~ 4

Author(s):

Jurgita Kapočiūtė-Dzikienė ◽

Robertas Damaševičius ◽

Marcin Woźniak

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Full Dataset ◽

Learning Techniques ◽

Long Short Term Memory

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial—NBM and Support Vector Machine—SVM) and deep learning (Long Short-Term Memory—LSTM and Convolutional Neural Network—CNN) approaches. The traditional machine learning techniques were used with the features based on the lexical, morphological, and character information. The deep learning approaches were applied on the top of two types of word embeddings (Vord2Vec continuous bag-of-words with negative sampling and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (reaching 0.706 of accuracy) were achieved on the full dataset with CNN applied on top of the FastText embeddings, replaced emoticons, and eliminated diacritics. The traditional machine learning approaches demonstrated the best performance (0.735 of accuracy) on the full dataset with the NBM method, replaced emoticons, restored diacritics, and lemma unigrams as features. Although traditional machine learning approaches were superior when compared to the deep learning methods; deep learning demonstrated good results when applied on the small datasets.

Download Full-text

COMPARATIVE ANALYSIS AND EVALUATION OF THE APPLICATION OF DEEP LEARNING TECHNIQUES TO CYBERSECURITY DATASETS

DYNA INGENIERIA E INDUSTRIA ◽

10.6036/10007 ◽

2021 ◽

Vol 96 (5) ◽

pp. 528-533

Author(s):

XAVIER LARRIVA NOVO ◽

MARIO VEGA BARBAS ◽

VICTOR VILLAGRA ◽

JULIO BERROCAL

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Performance ◽

New Technologies ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Long Short Term Memory

Cybersecurity has stood out in recent years with the aim of protecting information systems. Different methods, techniques and tools have been used to make the most of the existing vulnerabilities in these systems. Therefore, it is essential to develop and improve new technologies, as well as intrusion detection systems that allow detecting possible threats. However, the use of these technologies requires highly qualified cybersecurity personnel to analyze the results and reduce the large number of false positives that these technologies presents in their results. Therefore, this generates the need to research and develop new high-performance cybersecurity systems that allow efficient analysis and resolution of these results. This research presents the application of machine learning techniques to classify real traffic, in order to identify possible attacks. The study has been carried out using machine learning tools applying deep learning algorithms such as multi-layer perceptron and long-short-term-memory. Additionally, this document presents a comparison between the results obtained by applying the aforementioned algorithms and algorithms that are not deep learning, such as: random forest and decision tree. Finally, the results obtained are presented, showing that the long-short-term-memory algorithm is the one that provides the best results in relation to precision and logarithmic loss.

Download Full-text

Combining Deep Learning Models for Enhancing the Detection of Botnet Attacks in Multiple Sensors Internet of Things Networks

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.5.4.733 ◽

2021 ◽

Vol 5 (4) ◽

pp. 380

Author(s):

Abdulkareem A. Hezam ◽

Salama A. Mostafa ◽

Zirawani Baharum ◽

Alde Alanda ◽

Mohd Zaki Salikon

Keyword(s):

Neural Network ◽

Deep Learning ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Denial Of Service ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Iot Devices

Distributed-Denial-of-Service impacts are undeniably significant, and because of the development of IoT devices, they are expected to continue to rise in the future. Even though many solutions have been developed to identify and prevent this assault, which is mainly targeted at IoT devices, the danger continues to exist and is now larger than ever. It is common practice to launch denial of service attacks in order to prevent legitimate requests from being completed. This is accomplished by swamping the targeted machines or resources with false requests in an attempt to overpower systems and prevent many or all legitimate requests from being completed. There have been many efforts to use machine learning to tackle puzzle-like middle-box problems and other Artificial Intelligence (AI) problems in the last few years. The modern botnets are so sophisticated that they may evolve daily, as in the case of the Mirai botnet, for example. This research presents a deep learning method based on a real-world dataset gathered by infecting nine Internet of Things devices with two of the most destructive DDoS botnets, Mirai and Bashlite, and then analyzing the results. This paper proposes the BiLSTM-CNN model that combines Bidirectional Long-Short Term Memory Recurrent Neural Network and Convolutional Neural Network (CNN). This model employs CNN for data processing and feature optimization, and the BiLSTM is used for classification. This model is evaluated by comparing its results with three standard deep learning models of CNN, Recurrent Neural Network (RNN), and long-Short Term Memory Recurrent Neural Network (LSTM–RNN). There is a huge need for more realistic datasets to fully test such models' capabilities, and where N-BaIoT comes, it also includes multi-device IoT data. The N-BaIoT dataset contains DDoS attacks with the two of the most used types of botnets: Bashlite and Mirai. The 10-fold cross-validation technique tests the four models. The obtained results show that the BiLSTM-CNN outperforms all other individual classifiers in every aspect in which it achieves an accuracy of 89.79% and an error rate of 0.1546 with a very high precision of 93.92% with an f1-score and recall of 85.73% and 89.11%, respectively. The RNN achieves the highest accuracy among the three individual models, with an accuracy of 89.77%, followed by LSTM, which achieves the second-highest accuracy of 89.71%. CNN, on the other hand, achieves the lowest accuracy among all classifiers of 89.50%.

Download Full-text

Deep Learning with a Long Short-Term Memory Networks Approach for Rainfall-Runoff Simulation

Water ◽

10.3390/w10111543 ◽

2018 ◽

Vol 10 (11) ◽

pp. 1543 ◽

Cited By ~ 59

Author(s):

Caihong Hu ◽

Qiang Wu ◽

Hui Li ◽

Shengqi Jian ◽

Nan Li ◽

...

Keyword(s):

Deep Learning ◽

Short Term Memory ◽

Network Models ◽

Machine Learning Techniques ◽

Rainfall Runoff ◽

Ann Model ◽

Runoff Simulation ◽

Simulation Performance ◽

Artificial Neural Network Ann ◽

Runoff Events

Considering the high random and non-static property of the rainfall-runoff process, lots of models are being developed in order to learn about such a complex phenomenon. Recently, Machine learning techniques such as the Artificial Neural Network (ANN) and other networks have been extensively used by hydrologists for rainfall-runoff modelling as well as for other fields of hydrology. However, deep learning methods such as the state-of-the-art for LSTM networks are little studied in hydrological sequence time-series predictions. We deployed ANN and LSTM network models for simulating the rainfall-runoff process based on flood events from 1971 to 2013 in Fen River basin monitored through 14 rainfall stations and one hydrologic station in the catchment. The experimental data were from 98 rainfall-runoff events in this period. In between 86 rainfall-runoff events were used as training set, and the rest were used as test set. The results show that the two networks are all suitable for rainfall-runoff models and better than conceptual and physical based models. LSTM models outperform the ANN models with the values of R 2 and N S E beyond 0.9, respectively. Considering different lead time modelling the LSTM model is also more stable than ANN model holding better simulation performance. The special units of forget gate makes LSTM model better simulation and more intelligent than ANN model. In this study, we want to propose new data-driven methods for flood forecasting.

Download Full-text