Correction to: Garnet major-element composition as an indicator of host-rock type: a machine learning approach using the random forest classifier

Oceanic and coastal ecosystems have undergone complex environmental changes in recent years, amid a context of climate change. These changes are also reflected in the dynamics of water-borne diseases as some of the causative agents of these illnesses are ubiquitous in the aquatic environment and their survival rates are impacted by changes in climatic conditions. Previous studies have established strong relationships between essential climate variables and the coastal distribution and seasonal dynamics of the bacterium Vibrio cholerae, pathogenic types of which are responsible for human cholera disease. In this study we provide a novel exploration of the potential of a machine learning approach to forecast environmental cholera risk in coastal India, home to more than 200 million inhabitants, utilising atmospheric, terrestrial and oceanic satellite-derived essential climate variables. A Random Forest classifier model is developed, trained and tested on a cholera outbreak dataset over the period 2010–2018 for districts along coastal India. The random forest classifier model has an Accuracy of 0.99, an F1 Score of 0.942 and a Sensitivity score of 0.895, meaning that 89.5% of outbreaks are correctly identified. Spatio-temporal patterns emerged in terms of the model’s performance based on seasons and coastal locations. Further analysis of the specific contribution of each Essential Climate Variable to the model outputs shows that chlorophyll-a concentration, sea surface salinity and land surface temperature are the strongest predictors of the cholera outbreaks in the dataset used. The study reveals promising potential of the use of random forest classifiers and remotely-sensed essential climate variables for the development of environmental cholera-risk applications.
Further exploration of the present random forest model and associated essential climate variables is encouraged on cholera surveillance datasets in other coastal areas affected by the disease to determine the model’s transferability potential and applicative value for cholera forecasting systems.
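
The accuracy, F1 and sensitivity figures quoted above are all derived from the same confusion matrix. The sketch below shows one way to compute them; the counts are hypothetical and are not the study's confusion matrix, and the function name `classification_metrics` is an assumption made for illustration. It also shows why accuracy can sit near 1 on imbalanced outbreak data while sensitivity is noticeably lower.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, sensitivity (recall), precision and F1 from confusion-matrix counts.

    Assumes at least one actual positive and one predicted positive,
    so that no denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of real outbreaks that were flagged
    precision = tp / (tp + fp)    # share of flagged outbreaks that were real
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (NOT taken from the study): outbreaks are rare,
# so the many true negatives dominate the accuracy.
metrics = classification_metrics(tp=17, fp=0, fn=2, tn=981)
```

With these illustrative counts, two missed outbreaks out of nineteen barely move the accuracy but pull the sensitivity down to 17/19, which is why sensitivity is usually the more informative figure for rare events.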

Machine Learning Approach for Malware Detection Using Random Forest Classifier on Process List Data Structure

Proceedings of the 2nd International Conference on Information System and Data Mining - ICISDM '18 ◽

10.1145/3206098.3206113 ◽

2018 ◽

Cited By ~ 1

Author(s):

Santosh Joshi ◽

Himanshu Upadhyay ◽

Leonel Lagos ◽

Naga Suryamitra Akkipeddi ◽

Valerie Guerra

Keyword(s):

Machine Learning ◽

Data Structure ◽

Random Forest ◽

Malware Detection ◽

Random Forest Classifier ◽

Learning Approach ◽

Machine Learning Approach

Arabic tweeps dialect prediction based on machine learning approach

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i2.pp1627-1633 ◽

2021 ◽

Vol 11 (2) ◽

pp. 1627

Author(s):

Khaled Alrifai ◽

Ghaida Rebdawi ◽

Nada Ghneim

Keyword(s):

Machine Learning ◽

Random Forest ◽

Prediction Model ◽

Feature Vector ◽

Random Forest Classifier ◽

Learning Approach ◽

Feature Vectors ◽

Machine Learning Approach ◽

Important Trait

In this paper, we present our approach for profiling Arabic authors on Twitter, based on their tweets. We consider here the dialect of an Arabic author as an important trait to be predicted. For this purpose, many indicators, feature vectors and machine learning-based classifiers were implemented. The results of these classifiers were compared to find the best dialect prediction model. The best dialect prediction model was obtained using a random forest classifier with full forms and their stems as the feature vector.

Garnet major-element composition as an indicator of host-rock type: a machine learning approach using the random forest classifier

Contributions to Mineralogy and Petrology ◽

10.1007/s00410-021-01854-w ◽

2021 ◽

Vol 176 (12) ◽

Author(s):

Jan Schönig ◽

Hilmar von Eynatten ◽

Raimon Tolosana-Delgado ◽

Guido Meinhold

Keyword(s):

Machine Learning ◽

Random Forest ◽

Host Rock ◽

Major Element ◽

Learning Algorithm ◽

Bulk Composition ◽

Ultrahigh Pressure ◽

Major Element Composition ◽

Rock Composition ◽

High Discrimination

The major-element chemical composition of garnet provides valuable petrogenetic information, particularly in metamorphic rocks. When facing detrital garnet, information about the bulk-rock composition and mineral paragenesis of the initial garnet-bearing host-rock is absent. This prevents the application of chemical thermo-barometric techniques and calls for quantitative empirical approaches. Here we present a garnet host-rock discrimination scheme that is based on a random forest machine-learning algorithm trained on a large dataset of 13,615 chemical analyses of garnet that covers a wide variety of garnet-bearing lithologies. Considering the out-of-bag error, the scheme correctly predicts the original garnet host-rock in (i) > 95% concerning the setting, that is either mantle, metamorphic, igneous, or metasomatic; (ii) > 84% concerning the metamorphic facies, that is either blueschist/greenschist, amphibolite, granulite, or eclogite/ultrahigh-pressure; and (iii) > 93% concerning the host-rock bulk composition, that is either intermediate–felsic/metasedimentary, mafic, ultramafic, alkaline, or calc–silicate. The wide coverage of potential host rocks, the detailed prediction classes, the high discrimination rates, and the successfully tested real-case applications demonstrate that the introduced scheme overcomes many issues related to previous schemes. This highlights the potential of transferring the applied discrimination strategy to the broad range of detrital minerals beyond garnet. For easy and quick usage, a freely accessible web app is provided that guides the user in five steps from garnet composition to prediction results including data visualization.
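
The out-of-bag error mentioned above relies on the fact that each tree in a random forest is trained on a bootstrap sample, so some analyses are left out of every tree and can serve as test data for it. The sketch below only illustrates that sampling step; it is not the authors' implementation, and the helper name `bootstrap_split` and the random seed are assumptions.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in_bag, out_of_bag) index lists.

    in_bag holds n_samples indices drawn with replacement (duplicates allowed);
    out_of_bag holds every index that was never drawn.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed chosen only for reproducibility
n = 13615               # dataset size quoted in the abstract
in_bag, out_of_bag = bootstrap_split(n, rng)

# For large n this fraction approaches 1/e, i.e. roughly 37% of the data.
oob_fraction = len(out_of_bag) / n
```

The out-of-bag error is then obtained by predicting each sample using only the trees for which it was out of bag, which gives an error estimate without setting aside a separate test set.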

SMO-RF: A machine learning approach by random forest for predicting class imbalancing followed by SMOTE

Materials Today Proceedings ◽

10.1016/j.matpr.2020.12.891 ◽

2021 ◽

Author(s):

Ankur Goyal ◽

Likhita Rathore ◽

Avinash Sharma

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Approach ◽

Machine Learning Approach

A machine learning approach using random forest and LASSO to predict wine quality

International Journal of Sustainable Agricultural Management and Informatics ◽

10.1504/ijsami.2021.10040429 ◽

2021 ◽

Vol 7 (3) ◽

pp. 1

Author(s):

Dimitris Ioannidis ◽

Ioannis Athanasiadis

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Approach ◽

Wine Quality ◽

Machine Learning Approach

Machine-learning and statistical methods for DDoS attack detection and defense system in software defined networks

10.32920/ryerson.14657556 ◽

2021 ◽

Author(s):

Merlin James Rukshan Dennis

Keyword(s):

Machine Learning ◽

Random Forest ◽

Statistical Approach ◽

Denial Of Service ◽

Attack Detection ◽

Learning Approach ◽

Ddos Attack ◽

Machine Learning Approach ◽

Ddos Detection ◽

Ddos Attack Detection

A Distributed Denial of Service (DDoS) attack is a serious threat on today’s Internet. As the traffic across the Internet increases day by day, it is a challenge to distinguish between legitimate and malicious traffic. This thesis proposes two different approaches to build an efficient DDoS attack detection system in the Software Defined Networking (SDN) environment. SDN is the latest networking approach, which implements a centralized, programmable controller. The central control and the programming capability of the controller are used in this thesis to implement the detection and mitigation mechanisms. Two approaches, a statistical approach and a machine-learning approach, are proposed for DDoS detection. The statistical approach implements entropy computation and flow statistics analysis. It uses the mean and standard deviation of destination entropy, new flow arrival rate, packets per flow and flow duration to compute various thresholds. These thresholds are then used to distinguish normal and attack traffic. The machine-learning approach uses a Random Forest classifier to detect the DDoS attack. We fine-tune the Random Forest algorithm to make it more accurate in DDoS detection. In particular, we introduce weighted voting instead of the standard majority voting to improve the accuracy. Our results show that the proposed machine-learning approach outperforms the statistical approach. Furthermore, it also outperforms other machine-learning approaches found in the literature.
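
The two ideas in this abstract, an entropy threshold built from a mean and standard deviation and weighted voting across trees, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the thesis's method: the multiplier `k`, the "entropy drops under attack" decision rule, the per-tree weights and all sample data are hypothetical, since the abstract specifies none of them.

```python
import math
from collections import Counter
from statistics import mean, stdev


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lower_threshold(baseline_values, k=3.0):
    """Mean minus k standard deviations of values seen under normal traffic.

    k=3.0 is an assumed multiplier, not a value from the thesis.
    """
    return mean(baseline_values) - k * stdev(baseline_values)


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across trees."""
    totals = Counter()
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical windows of destination addresses seen by the controller.
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
attack_window = ["10.0.0.1"] * 4  # traffic concentrated on one victim

# Hypothetical entropies measured during normal operation.
threshold = lower_threshold([2.0, 1.9, 2.1, 2.0])
is_attack = shannon_entropy(attack_window) < threshold

# Hypothetical tree outputs: one confident tree outweighs two weak ones.
label = weighted_vote(["attack", "normal", "normal"], [0.9, 0.3, 0.4])
```

In this sketch a plain majority vote would return "normal", while the weighted vote returns "attack" because the single dissenting tree carries more weight than the other two combined. How the weights are derived is left open here.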

Modeling of apartment prices in a Colombian context from a machine learning approach with stable-important attributes

DYNA ◽

10.15446/dyna.v87n212.80202 ◽

2020 ◽

Vol 87 (212) ◽

pp. 63-72

Author(s):

Jorge Iván Pérez Rave ◽

Favián González Echavarría ◽

Juan Carlos Correa Morales

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Approach ◽

Predictive Capability ◽

Predictive Capacity ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Property Price ◽

Object Of Study ◽

Online Pricing

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. This article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest and bagging; ii) it studies the effect of a group of text attributes on the predictive capability of the models; and iii) it identifies the more stable-important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 observations of real estate. The ensemble methods (random forest and bagging) show predictive superiority over the others. The attributes derived from the text had a significant relationship with the property price (on a log scale). However, their contribution to the predictive capacity was almost nil, since four different attributes achieved highly accurate predictions and remained stable when the sample changed.

Metabolomic Approach for Discrimination of Cultivation Age and Ripening Stage in Ginseng Berry Using Gas Chromatography-Mass Spectrometry

Molecules ◽

10.3390/molecules24213837 ◽

2019 ◽

Vol 24 (21) ◽

pp. 3837 ◽

Cited By ~ 1

Author(s):

Seong-Eun Park ◽

Seung-Ho Seo ◽

Eun-Ju Kim ◽

Dae-Hun Park ◽

Kyung-Mok Park ◽

...

Keyword(s):

Machine Learning ◽

Mass Spectrometry ◽

Gas Chromatography ◽

Random Forest ◽

Gas Chromatography Mass Spectrometry ◽

Learning Approach ◽

Ripening Stage ◽

Ripening Stages ◽

Machine Learning Approach ◽

Metabolomic Approach

The purpose of this study was to analyze metabolic differences of ginseng berries according to cultivation age and ripening stage using a gas chromatography-mass spectrometry (GC-MS)-based metabolomics method. Ginseng berries were harvested every week during five different ripening stages of three-year-old and four-year-old ginseng. Using identified metabolites, a random forest machine learning approach was applied to obtain predictive models for the classification of cultivation age or ripening stage. The principal component analysis (PCA) score plot showed a clear separation by ripening stage, indicating that continuous metabolic changes occurred until the fifth ripening stage. Three-year-old ginseng berries had higher levels of valine, glutamic acid, and tryptophan, but lower levels of lactic acid and galactose than four-year-old ginseng berries at the fully ripened stage. Metabolic pathways affected by different cultivation ages were involved in amino acid metabolism pathways. A random forest machine learning approach extracted some important metabolites for predicting cultivation age or ripening stage with a low error rate. This study demonstrates that different cultivation ages or ripening stages of ginseng berry can be successfully discriminated using a GC-MS-based metabolomic approach together with random forest analysis.

An Analytical Model for Prediction of Heart Disease using Machine Learning Classifiers

10.36227/techrxiv.14867175 ◽

2021 ◽

Author(s):

Diti Roy ◽

Md. Ashiq Mahmood ◽

Tamal Joyti Roy

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Random Forest ◽

Learning Algorithm ◽

Modern Technology ◽

Learning Approach ◽

Data Sets ◽

Machine Learning Classifiers ◽

Machine Learning Approach ◽

Day By Day

Heart disease is the most prevalent disease and causes a large number of deaths every year. A 2016 WHO report stated that at least 17 million people die of heart disease every year. This number is increasing day by day, and WHO estimated that the death toll will reach 75 million by 2030. Despite modern technology and health care systems, predicting heart disease remains difficult. Because machine learning algorithms are a vital tool for making predictions from available data sets, we used a machine learning approach to predict heart disease. We collected data from the UCI repository. In our study, we used the Random Forest, Zero R, Voted Perceptron and K star classifiers. The Random Forest classifier gave the best result, with an accuracy of 97.69.
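
Of the classifiers listed above, Zero R is the simplest: it ignores every feature and always predicts the most frequent class in the training data, which makes it a useful floor against which to read any reported accuracy. The sketch below is a plain-Python illustration of that baseline, not the implementation used in the study; the function names and the label lists are hypothetical.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent training label; features are ignored entirely."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels (1 = heart disease, 0 = no heart disease).
train_labels = [1, 1, 1, 0, 0]
test_labels = [1, 0, 1, 1]

majority = zero_r_fit(train_labels)
baseline_accuracy = accuracy(test_labels, [majority] * len(test_labels))
```

A model is only informative to the extent that it beats this baseline, so the baseline figure is worth reporting alongside any headline accuracy.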
