Comparison of boosted regression tree and random forest models for mapping topsoil organic carbon concentration in an alpine ecosystem

Over the years, rampant wildfires have plagued the state of California, creating economic and environmental loss. In 2018, wildfires cost nearly 800 million dollars in economic loss and claimed more than 100 lives in California. Over 1.6 million acres of land has burned and caused large sums of environmental damage. Although, recently, researchers have introduced machine learning models and algorithms in predicting the wildfire risks, these results focused on special perspectives and were restricted to a limited number of data parameters. In this paper, we have proposed two data-driven machine learning approaches based on random forest models to predict the wildfire risk at areas near Monticello and Winters, California. This study demonstrated how the models were developed and applied with comprehensive data parameters such as powerlines, terrain, and vegetation in different perspectives that improved the spatial and temporal accuracy in predicting the risk of wildfire including fire ignition. The combined model uses the spatial and the temporal parameters as a single combined dataset to train and predict the fire risk, whereas the ensemble model was fed separate parameters that were later stacked to work as a single model. Our experiment shows that the combined model produced better results compared to the ensemble of random forest models on separate spatial data in terms of accuracy. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as: accuracy, confusion matrices, and classification report. The study results showed and achieved cutting-edge accuracy of 92% in predicting the wildfire risks, including ignition by utilizing the regional spatial and temporal data along with standard data parameters in Northern California.

Download Full-text

Incorporating space and time into random forest models for analyzing geospatial patterns of drug-related crime incidents in a major U.S. metropolitan area

Computers Environment and Urban Systems ◽

10.1016/j.compenvurbsys.2021.101599 ◽

2021 ◽

Vol 87 ◽

pp. 101599

Author(s):

Zhiyue Xia ◽

Kathleen Stewart ◽

Junchuan Fan

Keyword(s):

Random Forest ◽

Metropolitan Area ◽

Space And Time ◽

Forest Models ◽

Random Forest Models

Download Full-text

Landslide susceptibility assessment for a transmission line in Gansu Province, China by using a hybrid approach of fractal theory, information value, and random forest models

Environmental Earth Sciences ◽

10.1007/s12665-021-09737-w ◽

2021 ◽

Vol 80 (12) ◽

Author(s):

Binbin Zhao ◽

Yunfeng Ge ◽

Hongzhi Chen

Keyword(s):

Random Forest ◽

Landslide Susceptibility ◽

Fractal Theory ◽

Hybrid Approach ◽

Gansu Province ◽

Information Value ◽

Susceptibility Assessment ◽

Landslide Susceptibility Assessment ◽

Forest Models ◽

Random Forest Models

Download Full-text

Random forest models of 305-days milk yield for Holstein cows in Bulgaria

10.1063/5.0034778 ◽

2020 ◽

Author(s):

A. Yordanova ◽

H. Kulina

Keyword(s):

Random Forest ◽

Milk Yield ◽

Holstein Cows ◽

Forest Models ◽

Random Forest Models

Download Full-text

Classifying Very High-Dimensional Data with Random Forests Built from Small Subspaces

International Journal of Data Warehousing and Mining ◽

10.4018/jdwm.2012040103 ◽

2012 ◽

Vol 8 (2) ◽

pp. 44-63 ◽

Cited By ~ 30

Author(s):

Baoxun Xu ◽

Joshua Zhexue Huang ◽

Graham Williams ◽

Qiang Wang ◽

Yunming Ye

Keyword(s):

Random Forest ◽

High Dimensional Data ◽

Real Life ◽

Classification Performance ◽

Feature Weighting ◽

Random Forest Model ◽

High Dimensional ◽

Forest Model ◽

Forest Models ◽

Random Forest Models

The selection of feature subspaces for growing decision trees is a key step in building random forest models. However, the common approach using randomly sampling a few features in the subspace is not suitable for high dimensional data consisting of thousands of features, because such data often contains many features which are uninformative to classification, and the random sampling often doesn’t include informative features in the selected subspaces. Consequently, classification performance of the random forest model is significantly affected. In this paper, the authors propose an improved random forest method which uses a novel feature weighting method for subspace selection and therefore enhances classification performance over high-dimensional data. A series of experiments on 9 real life high dimensional datasets demonstrated that using a subspace size of features where M is the total number of features in the dataset, our random forest model significantly outperforms existing random forest models.

Download Full-text