Accurate classification of pediatric colonic IBD subtype using a random forest machine learning classifier

Ransomware (RW) is a distinctive variety of malware that encrypts the files or locks the user’s system by keeping and taking their files hostage, which leads to huge financial losses to users. In this article, we propose a new model that extracts the novel features from the RW dataset and performs classification of the RW and benign files. The proposed model can detect a large number of RW from various families at runtime and scan the network, registry activities, and file system throughout the execution. API-call series was reutilized to represent the behavior-based features of RW. The technique extracts fourteen-feature vector at runtime and analyzes it by applying online machine learning algorithms to predict the RW. To validate the effectiveness and scalability, we test 78550 recent malign and benign RW and compare with the random forest and AdaBoost, and the testing accuracy is extended at 99.56%.

Download Full-text

Phybrata Sensors and Machine Learning for Enhanced Neurophysiological Diagnosis and Treatment

Sensors ◽

10.3390/s21217417 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7417

Author(s):

Alex J. Hope ◽

Utkarsh Vashisth ◽

Matthew J. Parker ◽

Andreas B. Ralston ◽

Joshua M. Roper ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Random Forest ◽

Binary Classification ◽

Classification Performance ◽

Support Vector ◽

Use Case ◽

Signal Features ◽

Test Population

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.

Download Full-text

Damage Classification of Composites Using Machine Learning

Volume 13: Safety Engineering, Risk, and Reliability Analysis ◽

10.1115/imece2019-11851 ◽

2019 ◽

Author(s):

Shweta Dabetwar ◽

Stephen Ekwaro-Osire ◽

João Paulo Dias

Keyword(s):

Machine Learning ◽

Composite Materials ◽

Random Forest ◽

Condition Monitoring ◽

Machine Learning Algorithms ◽

Support Vector ◽

Damage Classification ◽

Combining Data ◽

Ultrasonic Measurements

Abstract Composite materials have tremendous and ever-increasing applications in complex engineering systems; thus, it is important to develop non-destructive and efficient condition monitoring methods to improve damage prediction, thereby avoiding catastrophic failures and reducing standby time. Nondestructive condition monitoring techniques when combined with machine learning applications can contribute towards the stated improvements. Thus, the research question taken into consideration for this paper is “Can machine learning techniques provide efficient damage classification of composite materials to improve condition monitoring using features extracted from acousto-ultrasonic measurements?” In order to answer this question, acoustic-ultrasonic signals in Carbon Fiber Reinforced Polymer (CFRP) composites for distinct damage levels were taken from NASA Ames prognostics data repository. Statistical condition indicators of the signals were used as features to train and test four traditional machine learning algorithms such as K-nearest neighbors, support vector machine, Decision Tree and Random Forest, and their performance was compared and discussed. Results showed higher accuracy for Random Forest with a strong dependency on the feature extraction/selection techniques employed. By combining data analysis from acoustic-ultrasonic measurements in composite materials with machine learning tools, this work contributes to the development of intelligent damage classification algorithms that can be applied to advanced online diagnostics and health management strategies of composite materials, operating under more complex working conditions.

Download Full-text

Classification of EEG Signals Using a Genetic-Based Machine Learning Classifier

2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society ◽

10.1109/iembs.2007.4352990 ◽

2007 ◽

Cited By ~ 12

Author(s):

B. T. Skinner ◽

H. T. Nguyen ◽

D. K. Liu

Keyword(s):

Machine Learning ◽

Eeg Signals ◽

Learning Classifier

Download Full-text

Forest Fire Prediction using Machine Learning Models based on DC, Wind and RH

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f1026.0386s20 ◽

2020 ◽

Vol 8 (6S) ◽

pp. 142-143

Keyword(s):

Machine Learning ◽

Random Forest ◽

Forest Fire ◽

Random Forest Algorithm ◽

Learning Models ◽

Learning Classifier ◽

Machine Learning Models ◽

Classifier Algorithms

The paper points out forest fire prediction using machine learning models on the basis of viz. DC, Wind, RH out of the several machine learning classifier algorithms, It is relevant that random forest algorithm generates optimum accuracy(99.61%).

Download Full-text

Enhanced Changeover Detection in Industry 4.0 Environments with Machine Learning

Sensors ◽

10.3390/s21175896 ◽

2021 ◽

Vol 21 (17) ◽

pp. 5896

Author(s):

Eddi Miller ◽

Vladyslav Borysenko ◽

Moritz Heusinger ◽

Niklas Niedner ◽

Bastian Engelmann ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Binary Classification ◽

Model Performance ◽

Support Vector ◽

Milling Machine ◽

Vector Machines ◽

Changeover Times ◽

Flow Power

Changeover times are an important element when evaluating the Overall Equipment Effectiveness (OEE) of a production machine. The article presents a machine learning (ML) approach that is based on an external sensor setup to automatically detect changeovers in a shopfloor environment. The door statuses, coolant flow, power consumption, and operator indoor GPS data of a milling machine were used in the ML approach. As ML methods, Decision Trees, Support Vector Machines, (Balanced) Random Forest algorithms, and Neural Networks were chosen, and their performance was compared. The best results were achieved with the Random Forest ML model (97% F1 score, 99.72% AUC score). It was also carried out that model performance is optimal when only a binary classification of a changeover phase and a production phase is considered and less subphases of the changeover process are applied.

Download Full-text

Optimizing taxonomic classification of marker gene amplicon sequences

10.7287/peerj.preprints.3208v2 ◽

2018 ◽

Cited By ~ 4

Author(s):

Nicholas A Bokulich ◽

Benjamin D Kaehler ◽

Jai Ram Rideout ◽

Matthew Dillon ◽

Evan Bolyen ◽

...

Keyword(s):

Machine Learning ◽

Sequence Data ◽

Marker Gene ◽

Parameter Tuning ◽

Operating Conditions ◽

Evaluation Framework ◽

Taxonomic Classification ◽

Consensus Methods ◽

Learning Classifier

Background: Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. Results: We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based taxonomy classifiers that meet or exceed the accuracy of existing methods for marker-gene amplicon sequence classification. We evaluated and optimized several commonly used taxonomic classification methods (RDP, BLAST, UCLUST) and several new methods (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods of VSEARCH, BLAST+, and SortMeRNA) for classification of marker-gene amplicon sequence data. Conclusions: Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for a range of standard operating conditions. q2-feature-classifier and our evaluation framework, tax-credit, are both free, open-source, BSD-licensed packages available on GitHub.

Download Full-text

Optimizing taxonomic classification of marker gene sequences

10.7287/peerj.preprints.3208v1 ◽

2017 ◽

Cited By ~ 4

Author(s):

Nicholas A Bokulich ◽

Benjamin D Kaehler ◽

Jai Ram Rideout ◽

Matthew Dillon ◽

Evan Bolyen ◽

...

Keyword(s):

Machine Learning ◽

Marker Gene ◽

Parameter Tuning ◽

Operating Conditions ◽

Evaluation Framework ◽

Taxonomic Classification ◽

Gene Sequences ◽

Learning Classifier ◽

Classifier Performance

Background. Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. Results. We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based taxonomy classifiers that meet or exceed classification accuracy of existing methods. We evaluated and optimized several commonly used taxonomic classification methods (RDP, BLAST, BLAST+, UCLUST) and several new methods (a scikit-learn naive Bayes machine-learning classifier, and VSEARCH and SortMeRNA alignment-based methods). Conclusions. Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make explicit recommendations regarding parameter choices for a range of standard operating conditions. q2-feature-classifier and our evaluation framework, tax-credit, are both free, open-source, BSD-licensed packages available on GitHub.

Download Full-text

Reliable Crops Classification Using Limited Number of Sentinel-2 and Sentinel-1 Images

Remote Sensing ◽

10.3390/rs13163176 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3176

Author(s):

Beata Hejmanowska ◽

Piotr Kramarczyk ◽

Ewa Głowienka ◽

Sławomir Mikrut

Keyword(s):

Machine Learning ◽

Random Forest ◽

Confusion Matrix ◽

High Accuracy ◽

Training Set ◽

Validation Data ◽

Comparative Accuracy ◽

The Difference ◽

Sentinel 2

The study presents the analysis of the possible use of limited number of the Sentinel-2 and Sentinel-1 to check if crop declarations that the EU farmers submit to receive subsidies are true. The declarations used in the research were randomly divided into two independent sets (training and test). Based on the training set, supervised classification of both single images and their combinations was performed using random forest algorithm in SNAP (ESA) and our own Python scripts. A comparative accuracy analysis was performed on the basis of two forms of confusion matrix (full confusion matrix commonly used in remote sensing and binary confusion matrix used in machine learning) and various accuracy metrics (overall accuracy, accuracy, specificity, sensitivity, etc.). The highest overall accuracy (81%) was obtained in the simultaneous classification of multitemporal images (three Sentinel-2 and one Sentinel-1). An unexpectedly high accuracy (79%) was achieved in the classification of one Sentinel-2 image at the end of May 2018. Noteworthy is the fact that the accuracy of the random forest method trained on the entire training set is equal 80% while using the sampling method ca. 50%. Based on the analysis of various accuracy metrics, it can be concluded that the metrics used in machine learning, for example: specificity and accuracy, are always higher then the overall accuracy. These metrics should be used with caution, because unlike the overall accuracy, to calculate these metrics, not only true positives but also false positives are used as positive results, giving the impression of higher accuracy. Correct calculation of overall accuracy values is essential for comparative analyzes. Reporting the mean accuracy value for the classes as overall accuracy gives a false impression of high accuracy. In our case, the difference was 10–16% for the validation data, and 25–45% for the test data.

Download Full-text