A Comparative Assessment of Random Forest and k-Nearest Neighbor Classifiers for Gully Erosion Susceptibility Mapping

This research was conducted to determine which areas in the Robat Turk watershed in Iran are sensitive to gully erosion, and to define the relationship between gully erosion and geo-environmental factors using two data mining techniques, namely Random Forest (RF) and k-Nearest Neighbors (KNN). First, 242 gully locations were determined in field surveys and mapped in ArcGIS software. Then, twelve gully-related conditioning factors were selected. Our results showed that, for both the RF and KNN models, altitude, distance to roads, and distance from the river had the highest influence on gully erosion sensitivity. We assessed the gully erosion susceptibility maps using the Receiver Operating Characteristic (ROC) curve. Validation results showed that the RF and KNN models had Area Under the Curve (AUC) values of 87.4% and 80.9%, respectively. As a result, the RF method performs better than the KNN method for mapping gully erosion susceptibility. Rainfall, altitude, and distance from a river were identified as the most important factors affecting gully erosion in this area. The methodology used in this research is transferable to other regions to determine which areas are prone to gully erosion and to explicitly delineate high-risk zones within these areas.
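The AUC comparison reported in this abstract can be illustrated with a minimal sketch. It assumes the rank-based definition of AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one). The function name `roc_auc`, the labels, and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = gully present, 0 = gully absent.
labels = [1, 1, 1, 0, 0, 0]
model_a_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # made-up susceptibility scores
model_b_scores = [0.9, 0.3, 0.4, 0.5, 0.6, 0.1]  # made-up susceptibility scores

auc_a = roc_auc(labels, model_a_scores)
auc_b = roc_auc(labels, model_b_scores)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for an illustration; a sort-based computation would be preferable for large validation sets.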

Implementation of Artificial Intelligence Based Ensemble Models for Gully Erosion Susceptibility Assessment

Remote Sensing ◽

10.3390/rs12213620 ◽

2020 ◽

Vol 12 (21) ◽

pp. 3620

Author(s):

Indrajit Chowdhuri ◽

Subodh Chandra Pal ◽

Alireza Arabameri ◽

Asish Saha ◽

Rabin Chakrabortty ◽

...

Keyword(s):

Land Use Changes ◽

Area Under The Curve ◽

Regression Tree ◽

Gully Erosion ◽

Support Vector ◽

Boosted Regression Tree ◽

Conditioning Factors ◽

Susceptibility Maps ◽

Chotanagpur Plateau ◽

Additive Regression

The Rarh Bengal region in West Bengal, particularly the eastern fringe area of the Chotanagpur plateau, is highly prone to water-induced gully erosion. In this study, we analyzed the spatial patterns of potential gully erosion in the Gandheswari watershed. This area is highly affected by monsoon rainfall and ongoing land-use changes, a combination that causes intensive gully erosion and land degradation. Therefore, we developed gully erosion susceptibility maps (GESMs) using the machine learning (ML) algorithms boosted regression tree (BRT), Bayesian additive regression tree (BART), support vector regression (SVR), and the ensemble of the SVR-Bee algorithm. The gully erosion inventory maps are based on a total of 178 gully head-cutting points, taken as the dependent factor, and gully erosion conditioning factors, which serve as the independent factors. We validated the ML model results using the area under the curve (AUC), accuracy (ACC), true skill statistic (TSS), and Kappa coefficient index. The AUC values of the BRT, BART, SVR, and SVR-Bee models are 0.895, 0.902, 0.927, and 0.960, respectively, which indicate very good GESM accuracies. The ensemble model provides more accurate prediction results than any single ML model used in this study.
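Three of the validation measures named in this abstract (ACC, TSS, and Cohen's Kappa) can be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `validation_metrics` and the counts are hypothetical and do not come from the study.

```python
def validation_metrics(tp, fn, fp, tn):
    """Return accuracy, true skill statistic, and Cohen's kappa.

    tp/fn: gully cells predicted as gully / non-gully.
    fp/tn: non-gully cells predicted as gully / non-gully.
    """
    total = tp + fn + fp + tn
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present")
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    tss = sensitivity + specificity - 1.0
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (accuracy - expected) / (1.0 - expected)
    return {"ACC": accuracy, "TSS": tss, "Kappa": kappa}


# Hypothetical confusion-matrix counts for one model.
metrics = validation_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

TSS and Kappa both correct for agreement that could arise by chance, which is why they are often reported alongside plain accuracy.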

Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

Remote Sensing ◽

10.3390/rs13051021 ◽

2021 ◽

Vol 13 (5) ◽

pp. 1021

Author(s):

Hu Ding ◽

Jiaming Na ◽

Shangjing Jiang ◽

Jie Zhu ◽

Kai Liu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Loess Plateau ◽

Water Conservation ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

The Loess Plateau ◽

Object Based ◽

Extreme Gradient Boosting

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies is widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.

Tropical Balls and Its Applications to K Nearest Neighbor over the Space of Phylogenetic Trees

Mathematics ◽

10.3390/math9070779 ◽

2021 ◽

Vol 9 (7) ◽

pp. 779

Author(s):

Ruriko Yoshida

Keyword(s):

Supervised Learning ◽

Phylogenetic Trees ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

High Dimensional ◽

Learning Method ◽

Dimensional Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper, we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss their application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
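As a rough illustration of KNN under the tropical metric, the sketch below uses the usual definition on the tropical projective torus, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), and a plain majority vote. The function names, the example vectors, and the class labels are hypothetical; the paper's own construction over tree space may differ.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def knn_classify(query, points, labels, k):
    """Label the query by majority vote among its k tropically nearest points."""
    ranked = sorted(
        zip(points, labels),
        key=lambda pair: tropical_distance(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training vectors with two made-up classes.
points = [(0, 0, 0), (0, 1, 0), (1, 1, 1), (0, 5, 9), (0, 6, 8), (0, 4, 9)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((0, 5, 8), points, labels, k=3))
```

Adding the same constant to every coordinate of a vector leaves the distance unchanged, which reflects the quotient structure of the tropical projective torus: `(1, 1, 1)` and `(0, 0, 0)` represent the same point.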

Quantifying the Influence of Achievement Emotions for Student Learning in MOOCs

Journal of Educational Computing Research ◽

10.1177/0735633120967318 ◽

2020 ◽

pp. 073563312096731

Author(s):

Bowen Liu ◽

Wanli Xing ◽

Yifang Zeng ◽

Yonghe Wu

Keyword(s):

Random Forest ◽

Nearest Neighbor ◽

Online Courses ◽

Learning Performance ◽

Support Vector ◽

K Nearest Neighbor ◽

Achievement Emotions ◽

Integrative Framework ◽

Emotional Interaction ◽

Performance Results

Massive Open Online Courses (MOOCs) have become a popular tool for learners worldwide. However, a lack of emotional interaction and support is an important reason why learners abandon their learning, and it eventually results in poor learning performance. This study applied an integrative framework of achievement emotions to uncover their holistic influence on students’ learning by analyzing more than 400,000 forum posts from 13 MOOCs. Six machine-learning models were first built to automatically identify achievement emotions, including K-Nearest Neighbor, Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, and Support Vector Machines. Results showed that Random Forest performed the best with a kappa of 0.83 and an ROC_AUC of 0.97. Then, multilevel modeling with the “Stepwise Build-up” strategy was used to quantify the effect of achievement emotions on students’ academic performance. Results showed that different achievement emotions influenced students’ learning differently. These findings allow MOOC platforms and instructors to provide relevant emotional feedback to students automatically or manually, thereby improving their learning in MOOCs.

Spatial modeling and susceptibility zonation of landslides using random forest, naïve bayes and K-nearest neighbor in a complicated terrain

Earth Science Informatics ◽

10.1007/s12145-021-00653-y ◽

2021 ◽

Author(s):

Sherif Ahmed Abu El-Magd ◽

Sk Ajim Ali ◽

Quoc Bao Pham

Keyword(s):

Random Forest ◽

Spatial Modeling ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

K Nearest Neighbor

Ensembling evidential k-nearest neighbor classifiers through multi-modal perturbation

Applied Soft Computing ◽

10.1016/j.asoc.2006.10.002 ◽

2007 ◽

Vol 7 (3) ◽

pp. 1072-1083 ◽

Cited By ~ 29

Author(s):

Hakan Altınçay

Keyword(s):

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Classifiers

Precision Pig Farming Image Analysis Using Random Forest and Boruta Predictive Big Data Analysis Using Neural Network and K- Nearest Neighbor

2021 2nd International Conference on Intelligent Engineering and Management (ICIEM) ◽

10.1109/iciem51511.2021.9445328 ◽

2021 ◽

Author(s):

S. A. Shaik Mazhar ◽

G. Suseendran

Keyword(s):

Neural Network ◽

Image Analysis ◽

Big Data ◽

Data Analysis ◽

Random Forest ◽

Nearest Neighbor ◽

Big Data Analysis ◽

K Nearest Neighbor ◽

Pig Farming

Parallel kNN Queries for Big Data Based on Voronoi Diagram Using MapReduce

Advances in Data Mining and Database Management - Handbook of Research on Innovative Database Query Processing Techniques ◽

10.4018/978-1-4666-8767-7.ch014 ◽

2015 ◽

pp. 392-414

Author(s):

Wei Yan

Keyword(s):

Big Data ◽

Voronoi Diagram ◽

Spatial Databases ◽

Nearest Neighbor ◽

Programming Model ◽

Dimensional Space ◽

Data Sets ◽

Two Dimensional ◽

K Nearest Neighbor ◽

K Nearest Neighbors

In cloud computing environments, parallel kNN queries for big data are an important issue. The k nearest neighbor query (kNN query), designed to find the k nearest neighbors from a dataset S for every object in another dataset R, is a primitive operator widely adopted by many applications including knowledge discovery, data mining, and spatial databases. This chapter proposes a parallel method of kNN queries for big data using the MapReduce programming model. First, it proposes an approximate algorithm that is based on mapping multi-dimensional data sets into two-dimensional data sets and transforming kNN queries into a sequence of two-dimensional point searches. Then, in two-dimensional space, it proposes a partitioning method using a Voronoi diagram, which incorporates the Voronoi diagram into an R-tree. Furthermore, it proposes an efficient algorithm for processing kNN queries based on the R-tree using the MapReduce programming model. Finally, it presents the results of extensive experimental evaluations, which indicate the efficiency of the proposed approach.
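The kNN query described here (for every object in R, find its k nearest neighbors in S) can be written down as a brute-force baseline. This is only a single-machine reference sketch, not the chapter's Voronoi and R-tree method; the function name `knn_join`, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
import heapq
import math


def knn_join(R, S, k):
    """For each point r in R, return the k points of S closest to r.

    Uses Euclidean distance (an assumption) and scans all of S per query,
    so the cost is O(|R| * |S|). Neighbors are listed nearest first.
    """
    return {
        r: heapq.nsmallest(k, S, key=lambda s: math.dist(r, s))
        for r in R
    }


# Hypothetical two-dimensional datasets; points are tuples so they can be dict keys.
R = [(0, 0), (5, 5)]
S = [(1, 0), (0, 2), (4, 4), (6, 5), (9, 9)]

for r, neighbors in knn_join(R, S, k=2).items():
    print(r, "->", neighbors)
```

The quadratic scan is what partitioning and indexing schemes aim to avoid, by limiting each query to the parts of S that can contain its nearest neighbors.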

Parallel Queries of Cluster-Based k Nearest Neighbor in MapReduce

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Managing Big Data in Cloud Computing Environments ◽

10.4018/978-1-4666-9834-5.ch007 ◽

2016 ◽

pp. 163-182

Author(s):

Wei Yan

Keyword(s):

Spatial Data ◽

Spatial Databases ◽

Nearest Neighbor ◽

Programming Model ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Data Intensive ◽

Parallel Queries ◽

Massive Spatial Data ◽

Nearest Neighbor Queries

Parallel k nearest neighbor queries over massive spatial data are an important issue. The k nearest neighbor query (kNN query), designed to find the k nearest neighbors from a dataset S for every point in another dataset R, is a useful tool widely adopted by many applications including knowledge discovery, data mining, and spatial databases. In cloud computing environments, the MapReduce programming model is a well-accepted framework for data-intensive applications over clusters of computers. This chapter proposes a parallel method of kNN queries based on clusters in the MapReduce programming model. First, it proposes a partitioning method for spatial data using a Voronoi diagram. Then, it clusters the data points after partitioning using the k-means method. Furthermore, it proposes an efficient algorithm for processing kNN queries based on k-means clusters using the MapReduce programming model. Finally, extensive experiments evaluate the efficiency of the proposed approach.
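The general shape of a MapReduce-style kNN query can be sketched in a few lines. Note that this is a deliberately simpler scheme than the one the chapter proposes: S is split into arbitrary blocks rather than Voronoi cells or k-means clusters, each mapper computes local top-k candidates, and a reducer merges them. All function names and data below are hypothetical, and the phases run sequentially in one process.

```python
import heapq
import math
from collections import defaultdict


def map_phase(R, s_partition, k):
    """Mapper: for one partition of S, emit (r, local top-k candidates)."""
    for r in R:
        # Candidates are (distance, point) pairs so they can be merged later.
        local = heapq.nsmallest(k, ((math.dist(r, s), s) for s in s_partition))
        yield r, local


def reduce_phase(emitted, k):
    """Reducer: merge per-partition candidates into a global top-k per r."""
    grouped = defaultdict(list)
    for r, candidates in emitted:
        grouped[r].extend(candidates)
    return {
        r: [s for _, s in heapq.nsmallest(k, cands)]
        for r, cands in grouped.items()
    }


def partitioned_knn_join(R, partitions, k):
    """Run the map phase over every partition of S, then reduce."""
    emitted = []
    for part in partitions:
        emitted.extend(map_phase(R, part, k))
    return reduce_phase(emitted, k)


# Hypothetical data: S is pre-split into two arbitrary blocks.
R = [(0, 0), (5, 5)]
partitions = [[(1, 0), (0, 2)], [(4, 4), (6, 5), (9, 9)]]
print(partitioned_knn_join(R, partitions, k=2))
```

Because every mapper keeps its own k best candidates, the merged result is exact. A spatially aware partitioning would additionally let each query skip partitions that cannot contain a close neighbor, which this block-based sketch does not attempt.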
