Using machine learning to generate high-resolution soil wetness maps for planning forest management

Author(s):  
William Lidberg ◽  
Johannes Larson ◽  
Siddhartho Paul ◽  
Hjalmar Laudon ◽  
Anneli Ågren

<p>Open peatlands are a recognizable feature of the boreal landscape and are commonly mapped from aerial photographs. However, wet soils also occur on tree-covered peatlands and in the riparian zones of forest streams and around lakes. Comparisons between field data and available maps show that only 36% of wet soils in the boreal landscape are marked on maps, making them difficult to manage. Wet soils have lower bearing capacity than dry soils and are more susceptible to soil disturbance from land-use management with heavy machinery. Topographical modelling of wet-area indices has been suggested as a solution to this problem, and high-resolution digital elevation models (DEMs) derived from airborne LiDAR are becoming accessible in many countries. However, most of these topographical methods rely on the user to define appropriate threshold values for wet areas, and differences in soil texture, topography, and climate make any large-scale application difficult. This complex landscape variability can be captured with machine learning, which uses automated data-mining methods to discover patterns in large data sets. Using soil moisture data from 20,000 field plots of the Swedish National Forest Inventory, we combined information from 24 indices and ancillary environmental features with a machine learning algorithm known as extreme gradient boosting. Extreme gradient boosting learned to classify soil moisture from the field data and delivered high performance compared with many traditional single-algorithm methods. With this method we mapped soil moisture at 2 m spatial resolution across the Swedish forest landscape in five days on a workstation with 32 cores. The new map captured 79% (kappa 0.69) of all wet soils, compared with only 36% (kappa 0.39) captured by current maps. In addition to open wetlands, the new map also captures riparian zones and previously unmapped cryptic wetlands underneath the forest canopy. 
The new maps can, for example, be used to plan hydrologically adapted buffer zones and to suggest machine-free zones near streams and lakes, preventing rutting from forestry machines and reducing sediment, mercury, and nutrient loads to downstream streams, lakes, and the sea.</p>
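The kappa values quoted above (0.69 vs. 0.39) measure map-versus-field agreement beyond what chance alone would produce. A minimal stdlib-only sketch of Cohen's kappa for a binary wet/dry confusion matrix; the counts in the usage example are illustrative, not the study's data:

```python
def cohens_kappa(tp, fp, fn, tn):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement is computed from the marginal totals of both "raters"
    # (here: the map and the field classification).
    p_wet = ((tp + fp) / n) * ((tp + fn) / n)
    p_dry = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_wet + p_dry
    return (p_observed - p_chance) / (1 - p_chance)

# Illustrative matrix: 40 wet plots mapped wet, 10 missed, and vice versa.
kappa = cohens_kappa(tp=40, fp=10, fn=10, tn=40)
```

With balanced classes, 80% raw agreement corresponds to kappa 0.6, which is why kappa reads lower than percent agreement.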

2020 ◽  
Vol 9 (10) ◽  
pp. 569
Author(s):  
Ananta Man Singh Pradhan ◽  
Yun-Tae Kim

Landslides impact human activities and socio-economic development, especially in mountainous areas. This study compares the prediction capability of advanced machine learning techniques for rainfall-induced shallow landslide susceptibility in the Deokjeokri and Karisanri catchments in South Korea. The influencing factors for landslides, i.e., topographic, hydrologic, soil, forest, and geologic factors, are prepared from various sources based on availability, and a multicollinearity test is performed to select relevant causative factors. The landslide inventory maps of both catchments are obtained from historical information, aerial photographs, and field surveys. In this study, the Deokjeokri catchment is used as a training area and the Karisanri catchment as a testing area. The landslide inventories contain 748 landslide points in the training area and 219 points in the testing area. Three landslide susceptibility maps using machine learning models, i.e., Random Forest (RF), Extreme Gradient Boosting (XGBoost) and Deep Neural Network (DNN), are prepared and compared. The outcomes of the analyses are validated using the landslide inventory data. A receiver operating characteristic (ROC) curve method is used to verify the results of the models. The results of this study show that the training accuracy of RF is 0.756 and the testing accuracy is 0.703. Similarly, the training accuracy of XGBoost is 0.757 and the testing accuracy is 0.74. The prediction of DNN revealed acceptable agreement between the susceptibility map and the existing landslides, with a training accuracy of 0.855 and testing accuracy of 0.802. The DNN model achieved lower prediction error and higher accuracy than the other models for shallow landslide modeling in the study area.
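The ROC verification used above is commonly summarized by the area under the curve (AUC). A minimal stdlib-only sketch of AUC via its rank interpretation: the probability that a randomly chosen landslide point receives a higher susceptibility score than a randomly chosen non-landslide point (ties count half). The labels and scores are illustrative, not the study's data:

```python
def roc_auc(labels, scores):
    """AUC as the Mann-Whitney rank statistic over positive/negative score pairs."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive correctly outranks negative
            elif p == q:
                wins += 0.5      # tie counts half
    return wins / (len(pos) * len(neg))

# Illustrative: two landslide points (1) and two stable points (0).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.2])
```

This O(n²) pairwise form is only for illustration; practical implementations sort the scores once instead.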




Water ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 713 ◽  
Author(s):  
Aliva Nanda ◽  
Sumit Sen ◽  
Awshesh Nath Sharma ◽  
K. P. Sudheer

Soil temperature plays an important role in understanding hydrological, ecological, meteorological, and land surface processes. However, studies of soil temperature variability are scarce in many parts of the world, especially in the Indian Himalayan Region (IHR). This study therefore analyzes the spatio-temporal variability of soil temperature in two nested hillslopes of the lesser Himalaya and tests the efficiency of different machine learning algorithms for estimating soil temperature in this data-scarce region. To accomplish this goal, grassed (GA) and agro-forested (AgF) hillslopes were instrumented with Odyssey water level sensors and Decagon soil moisture and temperature sensors. The average soil temperature of the south-aspect hillslope (i.e., the GA hillslope) was higher than that of the north-aspect hillslope (i.e., the AgF hillslope). Analysis of 40 rainfall events from both hillslopes showed that a rainfall duration greater than 7.5 h, or an event with an average rainfall intensity greater than 7.5 mm/h, results in a soil temperature drop of more than 2 °C. A drop of less than 1 °C was also observed during very high-intensity rainfall events of very short duration. During the rainy season, the soil temperature drop on the GA hillslope is larger than on the AgF hillslope because the former infiltrates more water; this indicates a significant correlation between soil moisture rise and soil temperature drop. The potential of four machine learning algorithms was also explored for predicting soil temperature under data-scarce conditions. Among the four, extreme gradient boosting (XGBoost) performed best for both hillslopes, followed by random forest (RF), multilayer perceptron (MLP), and support vector machine (SVM). Adding rainfall to the meteorological and meteorological + soil moisture datasets did not improve the models considerably; however, adding soil moisture to the meteorological parameters improved the models significantly.
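Gradient boosting, the best performer here, builds an additive ensemble of small trees, each fitted to the residuals of the ensemble so far. A hedged stdlib-only sketch of that core idea using one-dimensional regression stumps on toy data; real XGBoost adds second-order gradients, regularization, and much richer trees, none of which is shown:

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of a 1-D feature, minimizing squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x, t=t, l=lmean, r=rmean: l if x <= t else r

def boost(xs, ys, rounds=40, lr=0.5):
    """Additive boosting: each stump fits the current residuals."""
    pred = [0.0] * len(ys)
    stumps = []
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, resid)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy step-shaped target: the ensemble converges to the two plateau values.
model = boost([1, 2, 3, 4], [1.0, 1.0, 3.0, 3.0])
```

The learning rate shrinks each stump's contribution, so many weak stumps accumulate into a smooth fit, the same mechanism that lets XGBoost combine thousands of shallow trees.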


Heart ◽  
2021 ◽  
pp. heartjnl-2020-318726 ◽  
Author(s):  
Takahiro Nakashima ◽  
Soshiro Ogata ◽  
Teruo Noguchi ◽  
Yoshio Tahara ◽  
Daisuke Onozuka ◽  
...  

Objectives: To evaluate a predictive model for robust estimation of daily out-of-hospital cardiac arrest (OHCA) incidence using a suite of machine learning (ML) approaches and high-resolution meteorological and chronological data. Methods: In this population-based study, we combined a nationwide OHCA registry with high-resolution meteorological and chronological datasets from Japan. We developed a model to predict daily OHCA incidence with a training dataset for 2005–2013 using the eXtreme Gradient Boosting algorithm. A dataset for 2014–2015 was used to test the predictive model. The main outcome was the accuracy of the predictive model for the number of daily OHCA events, based on mean absolute error (MAE) and mean absolute percentage error (MAPE). In general, a model with a MAPE of less than 10% is considered highly accurate. Results: Among the 1 299 784 OHCA cases, 661 052 cases of cardiac origin (525 374 in the training dataset, on which fourfold cross-validation was performed, and 135 678 in the testing dataset) were included in the analysis. Compared with ML models using meteorological or chronological variables alone, the ML model combining meteorological and chronological variables had the highest predictive accuracy in the training (MAE 1.314 and MAPE 7.007%) and testing datasets (MAE 1.547 and MAPE 7.788%). Sunday, Monday, holidays, winter, low ambient temperature, and large interday or intraday temperature differences were more strongly associated with OHCA incidence than the other meteorological and chronological variables. Conclusions: An ML predictive model using comprehensive daily meteorological and chronological data allows for highly precise estimates of OHCA incidence.
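The accuracy criteria above, MAE and MAPE, can be sketched in a few lines of stdlib Python; the daily counts in the usage line are illustrative, not registry data:

```python
def mae(obs, pred):
    """Mean absolute error in the same units as the observations."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error. Undefined when an observed value is
    zero, which daily event counts can be; real pipelines must handle that."""
    return 100.0 * sum(abs(o - p) / o for o, p in zip(obs, pred)) / len(obs)

# Illustrative: two days of observed vs. predicted event counts.
daily_mae = mae([10, 20], [12, 18])
daily_mape = mape([10, 20], [12, 18])
```

MAPE weights each day by its relative error, so the same absolute miss counts more on a low-incidence day, one reason both metrics are reported together.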


2021 ◽  
Author(s):  
Anneli M. Ågren ◽  
Johannes Larson ◽  
Siddhartho S. Paul ◽  
Hjalmar Laudon ◽  
William Lidberg

<p>To meet the sustainable development goals and enable protection of surface waters, there is a strong need to plan and align forest management with the needs of the environment. Accurate and detailed maps are the foremost tool for sustainable spatial planning. High-resolution soil moisture mapping over large spatial extents remains a persistent challenge despite its substantial value in practical forestry and land management. Here we present a novel technique combining LiDAR-derived terrain indices and machine learning to model soil moisture at 2 m spatial resolution across the Swedish forest landscape with high accuracy. We used field data from about 20,000 sites across Sweden to train and evaluate multiple machine learning (ML) models. The predictor features included a suite of terrain indices generated from the national LiDAR digital elevation model and other ancillary environmental features, including surficial geology, climate, and land use information, allowing the soil moisture maps to be adjusted to regional and local conditions. In our analysis, extreme gradient boosting (XGBoost) outperformed the other tested ML methods (Kappa = 0.69, MCC = 0.68), namely Artificial Neural Network, Random Forest, Support Vector Machine, and Naïve Bayes classification. The depth-to-water index, topographic wetness index, and wetlands derived from Swedish property maps were the most important predictors for all models. With the presented technique, it was also possible to generate a three-class model with Kappa and MCC of 0.58. Besides the classified moisture maps, we investigated the potential of producing a continuous map from dry to wet soils. 
We argue that the probability of a pixel being classified as wet by the two-class model can be used as an index of soil moisture from 0% (dry) to 100% (wet), and that such maps hold more valuable information for practical forest management than classified maps.</p><p>The soil moisture map was developed to support the need for land use management optimization by incorporating landscape sensitivity and hydrological connectivity into a framework that promotes the protection of soil and water quality. The soil moisture map can be used to address fundamental considerations, such as:</p><ul><li>(i) locating areas where different land use practices can be conducted with minimal impacts on water quality;</li> <li>(ii) guiding the construction of vital infrastructure in high flood risk areas;</li> <li>(iii) designing riparian protection zones to optimize the protection of water quality and biodiversity.</li> </ul>
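The Matthews correlation coefficient (MCC) reported alongside kappa above can be computed directly from a binary confusion matrix. A stdlib-only sketch with illustrative counts, not the study's data:

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient: +1 perfect, 0 random, -1 inverted.
    Unlike raw accuracy, it stays informative under class imbalance."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Illustrative: a perfectly separating wet/dry classifier.
perfect = mcc(tp=50, fp=0, fn=0, tn=50)
```

Because wet soils are a minority class in most forest landscapes, an imbalance-robust score like MCC is a sensible companion to kappa here.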


2019 ◽  
Author(s):  
Kasper Van Mens ◽  
Joran Lokkerbol ◽  
Richard Janssen ◽  
Robert de Lange ◽  
Bea Tiemens

BACKGROUND It remains a challenge to predict which treatment will work for which patient in mental healthcare. OBJECTIVE In this study we compare machine learning algorithms that predict, during treatment, which patients will not benefit from brief mental health treatment, and we present trade-offs that must be considered before an algorithm can be used in clinical practice. METHODS Using an anonymized dataset containing routine outcome monitoring data from a mental healthcare organization in the Netherlands (n = 2,655), we applied three machine learning algorithms to predict treatment outcome. The algorithms were internally validated with cross-validation on a training sample (n = 1,860) and externally validated on an unseen test sample (n = 795). RESULTS The performance of the three algorithms did not differ significantly on the test set. With a default classification cut-off at 0.5 predicted probability, the extreme gradient boosting algorithm showed the highest positive predictive value (PPV) of 0.71 (0.61–0.77), with a sensitivity of 0.35 (0.29–0.41) and an area under the curve of 0.78. A trade-off can be made between PPV and sensitivity by choosing different cut-off probabilities. With a cut-off at 0.63, the PPV increased to 0.87 and the sensitivity dropped to 0.17. With a cut-off at 0.38, the PPV decreased to 0.61 and the sensitivity increased to 0.57. CONCLUSIONS Machine learning can be used to predict treatment outcomes based on routine monitoring data. This allows practitioners to choose their own trade-off between being selective and more certain versus inclusive and less certain.
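The PPV/sensitivity trade-off described above comes from sweeping the classification cut-off over the same predicted probabilities. A stdlib-only sketch with illustrative labels and probabilities, not the study's data:

```python
def ppv_sensitivity(labels, probs, cutoff):
    """PPV and sensitivity of the rule 'positive if prob >= cutoff'."""
    tp = sum(1 for lab, p in zip(labels, probs) if lab == 1 and p >= cutoff)
    fp = sum(1 for lab, p in zip(labels, probs) if lab == 0 and p >= cutoff)
    fn = sum(1 for lab, p in zip(labels, probs) if lab == 1 and p < cutoff)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sensitivity

# Illustrative: raising the cut-off makes predictions more selective
# (higher PPV) at the cost of missing more true positives (lower sensitivity).
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
at_default = ppv_sensitivity(labels, probs, 0.5)
at_strict = ppv_sensitivity(labels, probs, 0.7)
```

Tabulating this pair over many cut-offs is exactly how a practitioner would pick their own operating point, as the conclusion suggests.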


2021 ◽  
Vol 13 (5) ◽  
pp. 1021
Author(s):  
Hu Ding ◽  
Jiaming Na ◽  
Shangjing Jiang ◽  
Jie Zhu ◽  
Kai Liu ◽  
...  

Artificial terraces are of great importance for agricultural production and for soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies is widely used. However, the selection of an appropriate classifier is critical for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. Comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.
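Of the three classifiers compared, k-nearest neighbor is simple enough to sketch fully in stdlib Python: each image object is assigned the majority class among its k closest training samples in feature space. The features and labels below are illustrative placeholders, not the study's segmentation features:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest neighbors (Euclidean)."""
    # Sort all training samples by distance to x, then vote over the top k.
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Illustrative 2-D features (e.g., a slope index and a texture index).
train_X = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ["non-terrace", "non-terrace", "terrace", "terrace"]
label = knn_predict(train_X, train_y, (5, 6), k=3)
```

In practice features must be scaled to comparable ranges first, since Euclidean distance otherwise lets the largest-valued feature dominate the vote.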


2021 ◽  
Vol 13 (6) ◽  
pp. 1147
Author(s):  
Xiangqian Li ◽  
Wenping Yuan ◽  
Wenjie Dong

To forecast the terrestrial carbon cycle and monitor food security, vegetation growth must be accurately predicted; however, current process-based ecosystem and crop-growth models are limited in their effectiveness. This study developed a machine learning model using the extreme gradient boosting method to predict vegetation growth throughout the growing season in China from 2001 to 2018. The model used satellite-derived vegetation data for the first month of each growing season, CO2 concentration, and several meteorological factors as data sources for the explanatory variables. Results showed that the model could reproduce the spatiotemporal distribution of vegetation growth as represented by the satellite-derived normalized difference vegetation index (NDVI). The predictive error for the growing season NDVI was less than 5% for more than 98% of vegetated areas in China; the model represented seasonal variations in NDVI well. The coefficient of determination (R2) between the monthly observed and predicted NDVI was 0.83, and more than 69% of vegetated areas had an R2 > 0.8. The effectiveness of the model was examined for a severe drought year (2009), and results showed that the model could reproduce the spatiotemporal distribution of NDVI even under extreme conditions. This model provides an alternative method for predicting vegetation growth and has great potential for monitoring vegetation dynamics and crop growth.
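The R² reported above is the coefficient of determination between observed and predicted NDVI: the fraction of observed variance the predictions explain. A stdlib-only sketch with illustrative values, not the study's NDVI series:

```python
def r_squared(obs, pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Illustrative monthly values: one prediction misses by 1 unit.
score = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

An R² of 0.83, as reported for the monthly NDVI, means the model explains 83% of the observed month-to-month variance.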


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Moojung Kim ◽  
Young Jae Kim ◽  
Sung Jin Park ◽  
Kwang Gi Kim ◽  
Pyung Chun Oh ◽  
...  

Abstract Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially during the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study was to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. Classification was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared. For the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracy, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.
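Logistic regression, the simplest of the four classifiers compared, can be sketched end to end in stdlib Python with batch gradient descent on the log loss. The one-feature toy dataset below is illustrative, not KNHANES data:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on log loss; w[0] is the bias term."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            err = p - yi                      # gradient of log loss wrt z
            grads[0] += err
            for j, xj in enumerate(xi, start=1):
                grads[j] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grads)]
    return w

def predict(w, xi):
    """Class 1 when the linear score is non-negative (probability >= 0.5)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if z >= 0 else 0

# Illustrative: a single feature that separates the two classes.
w = train_logistic([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
```

The same loop generalizes to 16 features by lengthening the weight vector, though a library implementation with regularization would be used in practice.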


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Satoko Hiura ◽  
Shige Koseki ◽  
Kento Koyama

Abstract: In predictive microbiology, statistical models are employed to predict bacterial population behavior in food using environmental factors such as temperature, pH, and water activity. As the amount and complexity of data increase, handling all data with high-dimensional variables becomes difficult. We propose a data mining approach to predict bacterial behavior using a database of microbial responses to food environments. Population growth and inactivation data for Listeria monocytogenes, a foodborne pathogen, under 1,007 environmental conditions, covering five food categories (beef, culture medium, pork, seafood, and vegetables) and temperatures ranging from 0 to 25 °C, were obtained from the ComBase database (www.combase.cc). We used an eXtreme gradient boosting tree, a machine learning algorithm, to predict bacterial population behavior from eight explanatory variables: ‘time’, ‘temperature’, ‘pH’, ‘water activity’, ‘initial cell counts’, ‘whether the viable count is the initial cell number’, and two food-related categories. The root mean square error of the observed and predicted values was approximately 1.0 log CFU regardless of food category, suggesting the possibility of predicting viable bacterial counts in various foods. The data mining approach examined here will enable the prediction of bacterial population behavior in food by identifying hidden patterns within a large amount of data.
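The error criterion above, root mean square error in log CFU, is straightforward to sketch; the observed and predicted counts below are illustrative, not ComBase records:

```python
import math

def rmse(obs, pred):
    """Root mean square error, here in log10 CFU units."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Illustrative log10 CFU counts: each prediction misses by 1 log unit,
# giving an RMSE of 1.0 log CFU, the level reported in the abstract.
error = rmse([5.0, 6.0], [4.0, 7.0])
```

Because the counts are on a log scale, an RMSE of 1.0 log CFU corresponds to predictions within roughly a factor of ten of the observed population.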

