Ranger Random Forest-Based Efficient Ensemble Learning Approach for Detecting Malicious URLs

Plant phenotyping requires rapid, accurate and inexpensive methods for analyzing plant traits throughout all crop growth stages. Few studies have comprehensively evaluated plant traits using multispectral cameras onboard unmanned aerial vehicle (UAV) platforms. Additionally, machine learning algorithms tend to over- or underfit data, and limited attention has been paid to optimizing their performance through an ensemble learning approach. This study aims to (1) comprehensively evaluate twelve rice plant traits estimated from UAV-based multispectral images and (2) introduce Random Forest-AdaBoost (RFA) algorithms as an optimization approach for estimating plant traits. The approach was tested in a farmer’s field in Terengganu, Malaysia, during the off-season from February to June 2018, involving five rice cultivars and three nitrogen (N) rates. Four bands, thirteen indices and RFA regression models were evaluated against the twelve plant traits according to the growth stages. Among the plant traits, plant height, green leaf and storage organ biomass, and foliar N content were estimated well, with a coefficient of determination (R²) above 0.80. Among the bands and indices, the red band, Normalized Difference Vegetation Index (NDVI), Ratio Vegetation Index (RVI), Red-Edge Wide Dynamic Range Vegetation Index (REWDRVI) and Red-Edge Soil Adjusted Vegetation Index (RESAVI) performed notably well in estimating all plant traits at the tillering, booting and milking stages, with R² values ranging from 0.80 to 0.99 and root mean square error (RMSE) values ranging from 0.04 to 0.22. Milking was found to be the best growth stage for estimating plant traits. In summary, our findings demonstrate that an ensemble learning approach can improve accuracy and reduce under- and overfitting in plant phenotyping algorithms.
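To make the quantities named in this abstract concrete, the following minimal Python sketch computes two of the listed indices (NDVI and RVI, using their standard definitions) and the two evaluation metrics (R² and RMSE). The reflectance and trait values are hypothetical illustrations, not data from the study, and the function names are chosen here for readability only.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio Vegetation Index: NIR / Red."""
    return nir / red


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical band reflectances (assumed values, not from the study).
nir_band, red_band = 0.50, 0.10
print("NDVI:", ndvi(nir_band, red_band))
print("RVI:", rvi(nir_band, red_band))

# Hypothetical observed vs. predicted plant heights (assumed values).
observed_height = [60.0, 75.0, 90.0, 105.0]
predicted_height = [62.0, 74.0, 88.0, 107.0]
print("R2:", r_squared(observed_height, predicted_height))
print("RMSE:", rmse(observed_height, predicted_height))
```

An R² close to 1 together with a small RMSE corresponds to the "estimated well" cases reported above.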

Predicting transcription factor binding using ensemble random forest models

F1000Research ◽

10.12688/f1000research.16200.1 ◽

2018 ◽

Vol 7 ◽

pp. 1603 ◽

Cited By ~ 1

Author(s):

Fatemeh Behjati Ardakani ◽

Florian Schmidt ◽

Marcel H. Schulz

Keyword(s):

Random Forest ◽

Ensemble Learning ◽

Specific Binding ◽

False Positive Rate ◽

Computational Prediction ◽

Cell Types ◽

Learning Approach ◽

Cell Type ◽

Cell Type Specific

Background: Understanding the location and cell-type specific binding of Transcription Factors (TFs) is important in the study of gene regulation. Computational prediction of TF binding sites is challenging, because TFs often bind only to short DNA motifs and cell-type specific co-factors may work together with the same TF to determine binding. Here, we consider the problem of learning a general model for the prediction of TF binding using DNase1-seq data and TF motif descriptions in the form of position specific energy matrices (PSEMs). Methods: We use TF ChIP-seq data as a gold standard for model training and evaluation. Our contribution is a novel ensemble learning approach using random forest classifiers. In the context of the ENCODE-DREAM in vivo TF binding site prediction challenge, we consider different learning setups. Results: Our results indicate that the ensemble learning approach generalizes better across tissues and cell types than individual tissue-specific classifiers or a classifier applied to the data aggregated across tissues. Furthermore, we show that incorporating DNase1-seq peaks is essential to reduce the false positive rate of TF binding predictions compared to considering the raw DNase1 signal. Conclusions: Analysis of important features reveals that the models preferentially select motifs of other TFs that are close interaction partners in existing protein-protein interaction networks. Code generated in the scope of this project is available on GitHub: https://github.com/SchulzLab/TFAnalysis (DOI: 10.5281/zenodo.1409697).
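The idea of combining tissue-specific classifiers into one ensemble can be illustrated with a small Python sketch. This is not the authors' method: the abstract does not state how member outputs are combined, so plain averaging is assumed here, and a simple threshold classifier on a single hypothetical score stands in for each random forest. All names and data values below are illustrative assumptions.

```python
def train_threshold_classifier(samples):
    """Stand-in for one tissue-specific classifier (NOT a random forest).

    samples: list of (score, label) pairs, label 1 = bound, 0 = unbound.
    The cutoff is the midpoint between the two class means.
    """
    bound = [s for s, y in samples if y == 1]
    unbound = [s for s, y in samples if y == 0]
    cutoff = (sum(bound) / len(bound) + sum(unbound) / len(unbound)) / 2.0
    return lambda score: 1.0 if score > cutoff else 0.0


def ensemble_predict(classifiers, score):
    """Average the member outputs (assumed combination rule).

    The result is the fraction of members that predict 'bound'.
    """
    return sum(clf(score) for clf in classifiers) / len(classifiers)


# Hypothetical toy training data for three tissues (assumed values).
tissue_data = {
    "tissue_a": [(0.9, 1), (0.8, 1), (0.2, 0), (0.1, 0)],
    "tissue_b": [(0.7, 1), (0.6, 1), (0.3, 0), (0.2, 0)],
    "tissue_c": [(1.0, 1), (0.9, 1), (0.5, 0), (0.4, 0)],
}

# One classifier per tissue, then a single ensemble over all of them.
classifiers = [train_threshold_classifier(d) for d in tissue_data.values()]

# Score a site from an unseen tissue with the whole ensemble.
print(ensemble_predict(classifiers, 0.6))
```

Because each member has learned a different cutoff from its own tissue, the averaged vote is less tied to any single tissue than an individual classifier would be, which is the intuition behind the generalization result reported above.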

Predicting transcription factor binding using ensemble random forest models

F1000Research ◽

10.12688/f1000research.16200.2 ◽

2019 ◽

Vol 7 ◽

pp. 1603 ◽

Cited By ~ 1

Author(s):

Fatemeh Behjati Ardakani ◽

Florian Schmidt ◽

Marcel H. Schulz

Keyword(s):

Random Forest ◽

Ensemble Learning ◽

Specific Binding ◽

False Positive Rate ◽

Computational Prediction ◽

Cell Types ◽

Learning Approach ◽

Cell Type ◽

Cell Type Specific

Background: Understanding the location and cell-type specific binding of Transcription Factors (TFs) is important in the study of gene regulation. Computational prediction of TF binding sites is challenging, because TFs often bind only to short DNA motifs and cell-type specific co-factors may work together with the same TF to determine binding. Here, we consider the problem of learning a general model for the prediction of TF binding using DNase1-seq data and TF motif descriptions in the form of position specific energy matrices (PSEMs). Methods: We use TF ChIP-seq data as a gold standard for model training and evaluation. Our contribution is a novel ensemble learning approach using random forest classifiers. In the context of the ENCODE-DREAM in vivo TF binding site prediction challenge, we consider different learning setups. Results: Our results indicate that the ensemble learning approach generalizes better across tissues and cell types than individual tissue-specific classifiers or a classifier built on data aggregated across tissues. Furthermore, we show that incorporating DNase1-seq peaks is essential to reduce the false positive rate of TF binding predictions compared to considering the raw DNase1 signal. Conclusions: Analysis of important features reveals that the models preferentially select motifs of other TFs that are close interaction partners in existing protein-protein interaction networks. Code generated in the scope of this project is available on GitHub: https://github.com/SchulzLab/TFAnalysis (DOI: 10.5281/zenodo.1409697).

Impute, Select, Decision Tree and Naïve Bayes (ISE-DNC): An Ensemble Learning Approach to Classify the Lung Cancer

SSRN Electronic Journal ◽

10.2139/ssrn.3667438 ◽

2020 ◽

Author(s):

Bhanumathi S ◽

Dr. Chandrashekara S N

Keyword(s):

Lung Cancer ◽

Decision Tree ◽

Ensemble Learning ◽

Naïve Bayes ◽

Learning Approach

A Novel Ensemble Learning Approach of Deep Learning Techniques to Monitor Distracted Driver Behaviour in Real Time

2021 1st International Conference on Artificial Intelligence and Data Analytics (CAIDA) ◽

10.1109/caida51941.2021.9425243 ◽

2021 ◽

Author(s):

Hafiz Umer Draz ◽

Muhammad Zeeshan Khan ◽

Muhammad Usman Ghani Khan ◽

Amjad Rehman ◽

Ibrahim Abunadi

Keyword(s):

Deep Learning ◽

Real Time ◽

Ensemble Learning ◽

Learning Approach ◽

Driver Behaviour ◽

Learning Techniques

SMO-RF: A machine learning approach by random forest for predicting class imbalancing followed by SMOTE

Materials Today Proceedings ◽

10.1016/j.matpr.2020.12.891 ◽

2021 ◽

Author(s):

Ankur Goyal ◽

Likhita Rathore ◽

Avinash Sharma

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Approach ◽

Machine Learning Approach

Predicting purchase probability of retail items using an ensemble learning approach and historical data

2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA) ◽

10.1109/icmla51294.2020.00118 ◽

2020 ◽

Author(s):

Archika Sharma ◽

M. Omair Shafiq

Keyword(s):

Ensemble Learning ◽

Historical Data ◽

Learning Approach

Ensemble Learning Approach with LASSO for Predicting Catalytic Reaction Rates

Synlett ◽

10.1055/a-1304-4878 ◽

2020 ◽

Author(s):

Akira Yada ◽

Kazuhiko Sato ◽

Tarojiro Matsumura ◽

Yasunobu Ando ◽

Kenji Nagata ◽

...

Keyword(s):

Ensemble Learning ◽

Reaction Rates ◽

Initial Reaction Rate ◽

Training Dataset ◽

Initial Reaction ◽

Learning Approach ◽

Learning Framework ◽

Machine Learning Approach ◽

Reasonable Prediction ◽

Epoxidation Of Alkenes

Abstract: The prediction of the initial reaction rate in the tungsten-catalyzed epoxidation of alkenes by using a machine learning approach is demonstrated. The ensemble learning framework used in this study consists of random sampling with replacement from the training dataset, the construction of several predictive models (weak learners), and the combination of their outputs. This approach enables us to obtain a reasonable prediction model that avoids the problem of overfitting, even when analyzing a small dataset.
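The three steps named in this abstract (sampling with replacement, fitting several weak learners, combining their outputs) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the title indicates that LASSO is used, whereas this sketch substitutes an ordinary least-squares line as the weak learner for brevity, and the data points, function names and parameter defaults are hypothetical.

```python
import random


def fit_line(points):
    """Weak learner (assumed stand-in for LASSO): least-squares line y = a*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        # Every resampled x is identical: fall back to a constant model.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bagging_fit(points, n_models=25, seed=0):
    """Fit one weak learner per bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Random sampling with replacement, same size as the original set.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_line(sample))
    return models


def bagging_predict(models, x):
    """Combine the weak learners by averaging their predictions."""
    return sum(a * x + b for a, b in models) / len(models)


# Hypothetical small dataset (assumed values, not from the study).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
models = bagging_fit(data)
print(bagging_predict(models, 5.0))
```

Averaging over many resampled fits damps the influence of any single data point, which is why this kind of ensemble is attractive for the small datasets mentioned in the abstract.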
