The Predictive Capability of a Novel Ensemble Tree-Based Algorithm for Assessing Groundwater Potential

2021, Vol 13 (5), pp. 2459
Author(s): Soyoung Park, Jinsoo Kim

Understanding the potential groundwater resource distribution is critical for sustainable groundwater development, conservation, and management strategies. This study analyzes and maps the groundwater potential in Busan Metropolitan City, South Korea, using random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB) methods. Fourteen groundwater conditioning factors were evaluated for their contribution to groundwater potential assessment using an elastic net. Curvature, the stream power index, the distance from drainage, lineament density, and fault density were excluded from the subsequent analysis, while the nine other factors were used to create groundwater potential maps (GPMs) using the RF, GBM, and XGB models. The accuracy of the resultant GPMs was tested using receiver operating characteristic curves and the seed cell area index, and the results were compared. The analysis showed that the three models used in this study satisfactorily predicted the spatial distribution of groundwater in the study area. In particular, the XGB model showed the highest prediction accuracy (0.818), followed by the GBM (0.802) and the RF models (0.794). The XGB model, which is the most recently developed technique, was found to best contribute to improving the accuracy of the GPMs. These results contribute to the establishment of a sustainable management plan for groundwater resources in the study area.
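The model comparison above rests on the area under the receiver operating characteristic curve. As a minimal sketch of how such a score can be computed, the following uses the rank-sum identity (the fraction of positive/negative pairs that the model orders correctly). The function name, the validation labels, and both sets of model scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 (1 = groundwater present at the cell)
    scores: iterable of model outputs, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation cells and scores from two hypothetical models.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
model_b = [0.9, 0.8, 0.7, 0.5, 0.3, 0.1]

auc_a = roc_auc(y_true, model_a)
auc_b = roc_auc(y_true, model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a sort-based ranking would be the usual choice for large validation sets.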

Sensors, 2019, Vol 19 (16), pp. 3590
Author(s): Bui, Moayedi, Kalantar, Osouli, Gör, ...

In this research, the novel metaheuristic algorithm Harris hawks optimization (HHO) is applied to landslide susceptibility analysis in Western Iran. To this end, the HHO is synthesized with an artificial neural network (ANN) to optimize its performance. A spatial database comprising 208 historical landslides and 14 landslide conditioning factors is prepared to develop the ANN and HHO–ANN predictive tools. The factors are elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall. Mean square error and mean absolute error criteria are defined to measure the performance error of the models, and the area under the receiver operating characteristic curve (AUROC) is used to evaluate the accuracy of the generated susceptibility maps. The findings showed that the HHO algorithm effectively improved the performance of the ANN in both recognizing the landslide pattern (AUROC = 0.731 for the ANN and 0.777 for the HHO–ANN) and predicting it (AUROC = 0.720 for the ANN and 0.773 for the HHO–ANN).


2021, Vol 13 (18), pp. 3663
Author(s): Shenzhou Liu, Wenzhi Zeng, Lifeng Wu, Guoqing Lei, Haorui Chen, ...

Accurate estimation of the leaf area index (LAI) is essential for crop growth simulations and agricultural management. This study conducted a field experiment with rice and measured the LAI in different rice growth periods. The multispectral bands (B) including red edge (RE, 730 nm ± 16 nm), near-infrared (NIR, 840 nm ± 26 nm), green (560 nm ± 16 nm), red (650 nm ± 16 nm), blue (450 nm ± 16 nm), and visible light (RGB) were also obtained by an unmanned aerial vehicle (UAV) with multispectral sensors (DJI-P4M, SZ DJI Technology Co., Ltd.). Based on the bands, five vegetation indices (VIs) including Green Normalized Difference Vegetation Index (GNDVI), Leaf Chlorophyll Index (LCI), Normalized Difference Red Edge Index (NDRE), Normalized Difference Vegetation Index (NDVI), and Optimized Soil-Adjusted Vegetation Index (OSAVI) were calculated. The semi-empirical model (SEM), the random forest model (RF), and the Extreme Gradient Boosting model (XGBoost) were used to estimate rice LAI based on multispectral bands, VIs, and their combinations, respectively. The results indicated that the GNDVI had the highest accuracy in the SEM (R² = 0.78, RMSE = 0.77). For the single band, NIR had the highest accuracy in both RF (R² = 0.73, RMSE = 0.98) and XGBoost (R² = 0.77, RMSE = 0.88). The band combination of NIR + red improved the estimation accuracy in both RF (R² = 0.87, RMSE = 0.65) and XGBoost (R² = 0.88, RMSE = 0.63). NDRE and LCI were the top two single VIs for LAI estimation using both RF and XGBoost. However, putting more than one VI together could only increase the LAI estimation accuracy slightly. Meanwhile, the bands + VIs combinations could improve the accuracy in both RF and XGBoost. Our study recommended estimating rice LAI by a combination of red + NIR + OSAVI + NDVI + GNDVI + LCI + NDRE (2B + 5V) with XGBoost to obtain high accuracy and overcome the potential over-fitting issue (R² = 0.91, RMSE = 0.54).
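Three of the vegetation indices named above share the same normalized-difference form, which the following minimal sketch illustrates. The formulas are the standard textbook definitions of NDVI, GNDVI, and NDRE; the abstract does not state the exact formulations the authors used, and OSAVI and LCI are omitted here because their published forms vary. The reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b) of two band reflectances."""
    total = a + b
    if total == 0:
        raise ValueError("band sum is zero; index undefined")
    return (a - b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return normalized_difference(nir, green)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return normalized_difference(nir, red_edge)


# Hypothetical surface reflectances for one canopy pixel (unitless, 0..1).
pixel = {"nir": 0.5, "red": 0.1, "green": 0.15, "red_edge": 0.3}

features = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "GNDVI": gndvi(pixel["nir"], pixel["green"]),
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
}
for name, value in features.items():
    print(f"{name} = {value:.3f}")
```

A feature vector of this kind, optionally joined with the raw band values, is what a regression model such as RF or XGBoost would take as input for LAI estimation.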


2010, Vol 2010, pp. 1-15
Author(s): H. A. Nefeslioglu, E. Sezer, C. Gokceoglu, A. S. Bozkir, T. Y. Duman

The main purpose of the present study is to investigate the possible application of decision trees in landslide susceptibility assessment. The study area, which has a surface area of 174.8 km², is located on the northern coast of the Sea of Marmara and in the western part of the Istanbul metropolitan area. When applying data mining and extracting the decision tree, geological formations, altitude, slope, plan curvature, profile curvature, heat load, and stream power index are considered as landslide conditioning factors. Using the predicted values, the landslide susceptibility map of the study area is produced. The AUC value of the produced landslide susceptibility map is 89.6%. According to the AUC evaluation, the produced map exhibits acceptable performance.


2020, Vol 12 (7), pp. 2622
Author(s): Phong Tung Nguyen, Duong Hai Ha, Huu Duy Nguyen, Tran Van Phong, Phan Trong Trinh, ...

Groundwater is one of the most important sources of fresh water all over the world, especially in countries where rainfall is erratic, such as Vietnam. Machine learning (ML) models are now widely used to assess the groundwater potential of a region. The credal decision tree (CDT) is one ML model that has been used in such studies. In the present study, the performance of the CDT has been improved using various ensemble frameworks such as Bagging, Dagging, Decorate, Multiboost, and Random SubSpace. Based on these methods, five hybrid models, namely BCDT, Dagging-CDT, Decorate-CDT, MBCDT, and RSSCDT, were developed and applied for groundwater potential mapping of DakLak province of Vietnam. Data of 227 groundwater wells of the study area were utilized for the construction and validation of the models. Twelve groundwater potential conditioning factors, namely rainfall, slope, elevation, river density, Sediment Transport Index (STI), curvature, flow direction, aspect, soil, land use, Topographic Wetness Index (TWI), and geology, were considered for the model studies. Various statistical measures, including the area under the receiver operating characteristic curve (AUC), were applied to validate and compare the performance of the models. The results show that the performance of the hybrid CDT ensemble models MBCDT (AUC = 0.770), BCDT (AUC = 0.731), Dagging-CDT (AUC = 0.763), Decorate-CDT (AUC = 0.750), and RSSCDT (AUC = 0.766) improved significantly in comparison to the single CDT (AUC = 0.722) model. Therefore, these hybrid models can be applied for better groundwater potential mapping and groundwater resources management in the study area as well as in other regions of the world.


2018, Vol 2 (1), pp. 16-27
Author(s): Vaishnavi Mundalik, Clinton Fernandes, Ajaykumar Kadam, Bhavana Umrikar

Groundwater is an important source of drinking water in rural parts of India. Because of the increasing demand for water, it is essential to identify new sources for the sustainable development of this resource. The potential mapping and exploration of groundwater resources have become a breakthrough in the field of hydrogeological research. In the present paper, a groundwater prospects map is delineated to assess groundwater availability in the Kar basin on basaltic terrain, using remote sensing and Geographic Information System (GIS) techniques. Thematic layers such as geology, slope, soil, geomorphology, drainage density, and rainfall are prepared using satellite data, topographic maps, and field data. Ranks and weights were assigned to each thematic layer and to the categories within each layer using the AHP technique. The layers were then reclassified and combined by weighted overlay analysis in the GIS environment to prepare the groundwater potential map of the study area. The results show that the groundwater prospects map is classified into three classes, low, moderate, and high, covering 17.12%, 38.26%, and 44.62% of the area, respectively. The overlay map with the groundwater potential zones has been found helpful for better planning and management of the resource.
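The AHP weighting and weighted overlay described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' workflow: the pairwise comparison matrix, the choice of three layers, and the per-cell ranks are all hypothetical, and the weights are derived with the row geometric-mean approximation rather than the principal eigenvector (the two agree when the matrix is perfectly consistent, as it is here).

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def overlay_score(ranks, weights):
    """Weighted overlay for one cell: sum of layer rank times layer weight."""
    return sum(r * w for r, w in zip(ranks, weights))


# Hypothetical pairwise judgments: entry [i][j] states how much more
# important layer i is than layer j (reciprocals below the diagonal).
layers = ["geology", "slope", "rainfall"]
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical category ranks (1 = poor, 3 = good) of one raster cell
# in the geology, slope, and rainfall layers.
cell_ranks = [3, 2, 1]
score = overlay_score(cell_ranks, weights)

for name, w in zip(layers, weights):
    print(f"{name}: weight = {w:.3f}")
print(f"cell score = {score:.3f}")
```

In a GIS workflow the same weighted sum is applied to every raster cell, and the resulting scores are then binned into classes such as low, moderate, and high.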


2019
Author(s): Kasper Van Mens, Joran Lokkerbol, Richard Janssen, Robert de Lange, Bea Tiemens

BACKGROUND It remains a challenge to predict which treatment will work for which patient in mental healthcare. OBJECTIVE In this study we compare machine learning algorithms to predict, during treatment, which patients will not benefit from brief mental health treatment, and we present trade-offs that must be considered before an algorithm can be used in clinical practice. METHODS Using an anonymized dataset containing routine outcome monitoring data from a mental healthcare organization in the Netherlands (n = 2,655), we applied three machine learning algorithms to predict treatment outcome. The algorithms were internally validated with cross-validation on a training sample (n = 1,860) and externally validated on an unseen test sample (n = 795). RESULTS The performance of the three algorithms did not significantly differ on the test set. With a default classification cut-off at 0.5 predicted probability, the extreme gradient boosting algorithm showed the highest positive predictive value (ppv) of 0.71 (0.61–0.77) with a sensitivity of 0.35 (0.29–0.41) and an area under the curve of 0.78. A trade-off can be made between ppv and sensitivity by choosing different cut-off probabilities. With a cut-off at 0.63, the ppv increased to 0.87 and the sensitivity dropped to 0.17. With a cut-off at 0.38, the ppv decreased to 0.61 and the sensitivity increased to 0.57. CONCLUSIONS Machine learning can be used to predict treatment outcomes based on routine monitoring data. This allows practitioners to choose their own trade-off between being selective and more certain versus inclusive and less certain.
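The cut-off trade-off reported above can be illustrated with a small sketch that recomputes ppv and sensitivity at several thresholds. The labels and predicted probabilities are hypothetical (1 marks a patient who does not benefit, the class the study predicts), and the cut-offs are chosen only to show the pattern, so the numbers do not reproduce the study's results.

```python
def ppv_and_sensitivity(labels, probs, cutoff):
    """Positive predictive value and sensitivity at a probability cut-off."""
    tp = fp = fn = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical outcomes (1 = did not benefit) and predicted probabilities.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.55, 0.4, 0.6, 0.45, 0.3, 0.2]

default = ppv_and_sensitivity(labels, probs, 0.5)
strict = ppv_and_sensitivity(labels, probs, 0.65)
lenient = ppv_and_sensitivity(labels, probs, 0.35)

for name, (ppv, sens) in [("0.50", default), ("0.65", strict), ("0.35", lenient)]:
    print(f"cut-off {name}: ppv = {ppv:.2f}, sensitivity = {sens:.2f}")
```

Raising the cut-off flags fewer patients but with more certainty, while lowering it catches more of the true cases at the cost of more false alarms, which is the selective-versus-inclusive choice the abstract leaves to practitioners.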


Author(s): Mohammad Hamim Zajuli Al Faroby, Mohammad Isa Irawan, Ni Nyoman Tri Puspaningsih

Protein-protein interaction (PPI) analysis can be used to identify proteins that have a supporting function for the main protein, especially in the synthesis process. Insulin is synthesized by proteins that have the same molecular function covering different but mutually supportive roles. To identify this function, the translation of Gene Ontology (GO) gives certain characteristics to each protein. This study aims to predict proteins that interact with insulin using the centrality method as a feature extractor and extreme gradient boosting as a classification algorithm. Characterization using the centrality method produces  features as a central function of the protein. Classification results are evaluated using precision, recall, and ROC scores. Optimizing the model by finding the right parameters produces an accuracy of  and a ROC score of . The prediction model produced by XGBoost performs above the average of other machine learning methods.
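The abstract does not say which centrality measures were used as features. As a minimal sketch of the general idea, the following computes degree centrality, one common choice, for each node of a small undirected interaction network. The edge list and the partner names A, B, and C are hypothetical placeholders, not real interaction data.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality per node: neighbor count divided by (n - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical undirected interaction network around insulin.
edges = [
    ("insulin", "A"),
    ("insulin", "B"),
    ("insulin", "C"),
    ("A", "B"),
]

centrality = degree_centrality(edges)
for protein, value in sorted(centrality.items()):
    print(f"{protein}: degree centrality = {value:.3f}")
```

Each protein's centrality values would form one row of the feature matrix passed to a classifier such as XGBoost.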


2021, Vol 13 (5), pp. 1021
Author(s): Hu Ding, Jiaming Na, Shangjing Jiang, Jie Zhu, Kai Liu, ...

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. As a result of the importance of the contextual information for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies are widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three different commonly-used ML classifiers, namely, extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.


Author(s): Irfan Ullah Khan, Nida Aslam, Malak Aljabri, Sumayh S. Aljameel, Mariam Moataz Aly Kamaleldin, ...

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, namely Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), and a DL model (containing six layers with ReLU activation and an output layer with sigmoid activation) to predict the mortality rate in COVID-19 cases. Models were trained using confirmed COVID-19 patients from 146 countries. Comparative analysis was performed among the ML and DL models using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results reveal the significance of the proposed model over the baseline study in the literature with the reduced feature set.

