A Multi-Tier Streaming Analytics Model of 0-Day Ransomware Detection Using Machine Learning

2020 ◽  
Vol 10 (9) ◽  
pp. 3210 ◽  
Author(s):  
Hiba Zuhair ◽  
Ali Selamat ◽  
Ondrej Krejcar

Over the last decades, information systems on desktop and portable platforms have become the most tempting targets of crypto and locker ransomware attacks. Hence, researchers have developed anti-ransomware tools to help the Windows platform thwart ransomware attacks, protect information, preserve users’ privacy, and secure inter-related information systems across the Internet. Furthermore, they have used machine learning to devise anti-ransomware tools that detect sophisticated versions. However, such tools remain sub-optimal in efficacy, analyze ransomware traits only partially, do not learn from significant and imbalanced data streams, are limited in attributing versions to their ancestor families, and are indecisive about fusing multi-descent versions. In this paper, we propose a hybrid machine learner: a multi-tiered streaming analytics model that classifies ransomware versions of 14 families by learning 24 static and dynamic traits. The proposed model classifies ransomware versions into their ancestor families numerically and fuses those of multi-descent families statistically. It classifies ransomware versions within a corpus of 40K ransomware, malware, and goodware versions in both semi-realistic and realistic environments. Experiments and critical comparison show that the model outperforms competing anti-ransomware technologies, with an average of 97% classification accuracy, a 2.4% mistake rate, and a 0.34% miss rate under comparative and realistic tests.
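
The abstract reports accuracy, mistake rate, and miss rate without defining them. As a minimal sketch, assuming "mistake rate" means the false-positive rate and "miss rate" means the false-negative rate (an interpretation, not something the abstract confirms), the three figures could be derived from binary confusion-matrix counts as follows. The function name and the sample counts are hypothetical.

```python
def detection_rates(tp, fp, tn, fn):
    """Return (accuracy, mistake_rate, miss_rate) from confusion-matrix counts.

    Assumed definitions (not taken from the paper):
      mistake_rate = false positives / all actual negatives
      miss_rate    = false negatives / all actual positives
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Guard against empty classes so the ratios stay defined.
    mistake_rate = fp / (fp + tn) if (fp + tn) else 0.0
    miss_rate = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, mistake_rate, miss_rate


# Hypothetical counts, for illustration only.
accuracy, mistake_rate, miss_rate = detection_rates(tp=90, fp=5, tn=95, fn=10)
```

Under these definitions, a low miss rate matters most for ransomware detection, since a missed sample is an undetected attack.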

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
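
The abstract does not give the details of its multiple K-means-based undersampling method. The sketch below shows only a simpler, generic cluster-based variant: cluster the majority class into as many groups as there are minority samples, then keep the real majority sample nearest each centroid. All function names are hypothetical, and the plain K-means loop is a stand-in rather than the authors' procedure.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns k centroids as tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: _dist2(p, centroids[i]))
            clusters[idx].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def kmeans_undersample(majority, minority, seed=0):
    """Reduce the majority class to at most len(minority) representative samples."""
    k = min(len(minority), len(majority))
    if k == 0:
        return []
    centroids = kmeans(majority, k, seed=seed)
    remaining = list(majority)
    kept = []
    for c in centroids:
        # Keep an actual sample (not the synthetic centroid) for each cluster.
        nearest = min(remaining, key=lambda p: _dist2(p, c))
        kept.append(nearest)
        remaining.remove(nearest)
    return kept


# Hypothetical toy data: 20 majority samples, 4 minority samples.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(100.0, 100.0), (101.0, 99.0), (99.0, 101.0), (100.0, 98.0)]
balanced_majority = kmeans_undersample(majority, minority)
```

Keeping real samples rather than centroids means the classifier trains only on observed records, at the cost of discarding most of the majority class.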


2021 ◽  
Vol 11 (21) ◽  
pp. 9797
Author(s):  
Solaf A. Hussain ◽  
Nadire Cavus ◽  
Boran Sekeroglu

Obesity, or excessive body fat, causes multiple health problems and diseases. However, obesity treatment and control require an accurate determination of body fat percentage (BFP). The existing methods for BFP estimation require several procedures, which reduces their cost-effectiveness and generalizability. Therefore, developing cost-effective models for BFP estimation is vital for obesity treatment. Machine learning models, particularly hybrid models, have a strong ability to analyze challenging data and perform predictions by combining different characteristics of the models. This study proposed a hybrid machine learning model based on support vector regression and emotional artificial neural networks (SVR-EANNs) for accurate recent BFP prediction using a primary BFP dataset. SVR was applied as a consistent attribute selection model on seven properties and measurements, using the left-out sensitivity analysis, and the regression ability of the EANN was considered in the prediction phase. The proposed model was compared to seven benchmark machine learning models. The results show that the proposed hybrid model (SVR-EANN) outperformed the other machine learning models on all three evaluation metrics considered. Furthermore, the proposed model suggested that abdominal circumference is a significant factor in BFP prediction, while age has a minor effect.


2020 ◽  
Vol 44 (7-8) ◽  
pp. 499-514
Author(s):  
Yi Zheng ◽  
Hyunjung Cheon ◽  
Charles M. Katz

This study explores advanced techniques in machine learning to develop a short tree-based adaptive classification test based on an existing lengthy instrument. A case study was carried out for an assessment of risk for juvenile delinquency. Two unique facts of this case are that (a) the items in the original instrument measure a large number of distinctive constructs, and (b) the target outcomes are of low prevalence, which yields imbalanced training data. Due to the high dimensionality of the items, traditional item response theory (IRT)-based adaptive testing approaches may not work well, whereas decision trees, which were developed in the machine learning discipline, present a promising alternative for adaptive tests. A cross-validation study was carried out to compare eight tree-based adaptive test constructions with five benchmark methods using data from a sample of 3,975 subjects. The findings reveal that the best-performing tree-based adaptive tests yielded better classification accuracy than the benchmark method IRT scoring with optimal cutpoints, and yielded comparable or better classification accuracy than the best benchmark method, random forest with balanced sampling. The competitive classification accuracy of the tree-based adaptive tests also comes with an over 30-fold reduction in the length of the instrument, administering only 3 to 6 items to any individual. This study suggests that tree-based adaptive tests have enormous potential when used to shorten instruments that measure a large variety of constructs.
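
The idea of a tree-based adaptive test can be pictured with a small sketch: each internal node of a decision tree administers one item, the response selects the next branch, and the number of items asked equals the depth of the path taken. The tree, item names, and labels below are entirely hypothetical and are not taken from the study's instrument.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # final classification


@dataclass
class Node:
    item: str  # identifier of the item administered at this node
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def administer(tree, answers):
    """Walk the tree, asking only the items on the path taken.

    `answers` maps item identifiers to True/False responses. It may hold
    responses to items that are never asked.
    Returns (label, list of items actually administered).
    """
    asked = []
    node = tree
    while isinstance(node, Node):
        asked.append(node.item)
        node = node.yes if answers[node.item] else node.no
    return node.label, asked


# Hypothetical two-level tree; a real tree would be learned from data.
tree = Node(
    item="item_a",
    yes=Node(item="item_b", yes=Leaf("high risk"), no=Leaf("low risk")),
    no=Leaf("low risk"),
)

label, asked = administer(tree, {"item_a": True, "item_b": False, "item_c": True})
```

Here `item_c` is never administered, which is the source of the length reduction: a respondent answers only the items on one root-to-leaf path.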


2019 ◽  
Vol 141 (8) ◽  
Author(s):  
Tae Hyong Kim ◽  
Ahnryul Choi ◽  
Hyun Mu Heo ◽  
Kyungran Kim ◽  
Kyungsuk Lee ◽  
...  

Pre-impact fall detection can trigger alarm services sooner, reducing long-lie conditions and the risk of hospitalization. Detecting the type of fall, and thus the impact site or direction, prior to impact is important because it increases the chance of reducing the incidence or severity of fall-related injuries. In this study, a robust pre-impact fall detection model was developed to classify various activities and falls as multiple classes, and its performance was compared with that of previously developed models. Twelve healthy subjects participated in this study. Each subject wore an inertial measurement unit module fixed on a belt near the left iliac crest to collect accelerometer data for each activity. The proposed model consists of feature calculation and the infinite latent feature selection (ILFS) algorithm, automatic labeling of activities, and the application of machine learning classifiers to discrete and continuous time series data. Nine machine-learning classifiers were applied to detect falls prior to impact and derive final detection results by sorting the classifier. Our model showed the highest classification accuracy. The proposed multiclass model showed significantly higher average classification accuracy, 99.57 ± 0.01% for discrete data-based classifiers and 99.84 ± 0.02% for continuous time series-based classifiers, than previous models (p < 0.01). In the future, multiclass pre-impact fall detection models can be applied to fall protector devices by detecting various activities for sending alerts or immediate feedback reactions to prevent falls.


Author(s):  
Deepali R. Vora ◽  
Kamatchi R. Iyer

The quality of an institute is measured by how well it minimises dropouts and achieves good placements. Predicting students' performance is therefore an interesting and important task for educational information systems. Machine learning and deep learning are emerging areas that attract growing research interest. This research focuses on applying deep learning methods to educational data for classification and prediction. The educational data of students from the engineering domain, with cognitive and non-cognitive parameters, is considered. A hybrid model combining a support vector machine (SVM) and a deep belief network (DBN) is devised. The SVM predicts class labels from preprocessed data. These predicted class labels and the actual class labels act as input to the DBN, which performs the final classification. The hybrid model is further optimised using cuckoo search with Levy flight. The results clearly show that the proposed model, SVM-LCDBN, performs better than the simple hybrid model and the hybrid model with traditional cuckoo search.


Energies ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 2235 ◽  
Author(s):  
Zigui Jiang ◽  
Rongheng Lin ◽  
Fangchun Yang

Time-series smart meter data can precisely record the electricity consumption behaviors of every consumer in the smart grid system. A better understanding of consumption behaviors and an effective consumer categorization based on the similarity of these behaviors can be helpful for flexible demand management and effective energy control. In this paper, we propose a hybrid machine learning model including both unsupervised clustering and supervised classification for categorizing consumers based on the similarity of their typical electricity consumption behaviors. An unsupervised clustering algorithm is used to extract the typical electricity consumption behaviors and perform fuzzy consumer categorization, followed by a proposed novel algorithm to identify distinct consumer categories and their consumption characteristics. A supervised classification algorithm is used to classify new consumers and evaluate the validity of the identified categories. The proposed model is applied to a real dataset of U.S. non-residential consumers collected by smart meters over one year. The results indicate that large or special institutions usually have their own distinct consumption characteristics, while others, such as some medium and small institutions or similar building types, may share the same characteristics. Moreover, comparison with other methods shows the improved performance of the proposed model in terms of category identification and classification accuracy.


Author(s):  
Mohannad Elhamod ◽  
Kelly M. Diamond ◽  
A. Murat Maga ◽  
Yasin Bakis ◽  
Henry L. Bart ◽  
...  

Fish species classification is an important task that is the foundation of many industrial, commercial, ecological, and scientific applications involving the study of fish distributions, dynamics, and evolution. While conventional approaches for this task use off-the-shelf machine learning (ML) methods such as existing Convolutional Neural Network (ConvNet) architectures, there is an opportunity to inform the ConvNet architecture using our knowledge of biological hierarchies among taxonomic classes. In this work, we propose infusing phylogenetic information into the model’s training to guide its structure and relationships among the extracted features. In our extensive experimental analyses, the proposed model, named Hierarchy-Guided Neural Network (HGNN), outperforms conventional ConvNet models in terms of classification accuracy under scarce training data conditions. We also observe that HGNN shows better resilience to adversarial occlusions, when some of the most informative patch regions of the image are intentionally blocked and their effect on classification accuracy is studied.


Author(s):  
Susheelamma K. H. ◽  
K. M. Ravikumar

Several challenges are associated with online learning systems, the most important of which is the lack of student motivation across course materials and course activities. Further, it is important to identify students who are at risk of failing to complete the course on time. Existing models have applied machine learning approaches to this problem. However, these models are not efficient because they are trained on legacy data and fail to address imbalanced data in both training and testing of the classification approach. Further, they are not efficient at classifying new courses. To overcome these research challenges, this work presents a novel design that trains the learning model to identify risk using current courses. Further, we present an XGBoost classification algorithm that can classify risk for new courses. Experiments are conducted to evaluate the performance of the proposed model. The outcome shows that the proposed model attains significant performance gains over the state-of-the-art model in terms of ROC, F-measure, precision, and recall.


Author(s):  
Bharath Sudharsan ◽  
John G. Breslin ◽  
Muhammad Intizar Ali

2020 ◽  
Vol 12 (01) ◽  
pp. 2050003
Author(s):  
Ahmed Lasisi ◽  
Pengyu Li ◽  
Jian Chen

Highway-rail grade crossing (HRGC) accidents continue to be a major source of transportation casualties in the United States. This can be attributed to increased road and rail operations and/or a lack of adequate safety programs based on comprehensive HRGC accident analysis, among other reasons. The focus of this study is to predict HRGC accidents in a given rail network based on a machine learning analysis of a similar network with cognate attributes. This study is an improvement on past studies that either attempt to predict accidents at a given HRGC or spatially analyze HRGC accidents for a particular rail line. In this study, a case for a hybrid machine learning and geographic information systems (GIS) approach is presented for a large rail network. The study involves collection and wrangling of relevant data from various sources, exploratory analysis, and supervised machine learning (classification and regression) of HRGC data from 2008 to 2017 in California. The models developed from this analysis were used to make binary predictions [98.9% accuracy & 0.9838 Receiver Operating Characteristic (ROC) score] and quantitative estimations of HRGC casualties in a similar network over the next 10 years. While results are spatially presented in GIS, this novel hybrid application of machine learning and GIS in HRGC accident analysis will help stakeholders to proactively reduce casualties by addressing the major accident causes identified in this study. The paper concludes with a Systems-Action-Management (SAM) approach based on text analysis of HRGC accident risk reports from the Federal Railroad Administration.

