EVALUATION OF RANDOM FOREST–BASED ANALYSIS FOR THE GYPSUM DISTRIBUTION IN THE ATACAMA DESERT

Author(s):  
D. Hoffmeister ◽  
M. Herbrecht ◽  
T. Kramm ◽  
P. Schulte

Abstract. Gypsum-rich material covers the hillslopes of the Atacama above ∼ 1000 m and shapes its distinctive landscape. In this contribution, we evaluate random forest-based analysis for predicting the gypsum distribution in a specific area of ∼ 3000 km², located in the hyperarid core of the Atacama. For this purpose, three different sets of input variables were chosen. These variables reflect the different factors forming soil properties, according to digital soil mapping. The variables are derived from indices based on imagery of the ASTER and Landsat-8 satellites, geomorphometric parameters based on the Tandem-X World DEM™, as well as selected climate variables and geologic units. The three resulting models were evaluated against the Ca content derived from soil surface samples, which reflects gypsum content. All three models achieved high values of explained variation (r² > 0.886), with an RMSE of ∼ 4500 mg∙kg⁻¹ and an NRMSE of ∼ 6%. Overall, this approach shows promising results for deriving a gypsum content prediction for the whole Atacama. However, further investigation of the independent variables needs to be conducted. In this case, the ferric oxides index (representing magnetite content), slope and a temperature gradient are the most important factors for predicting gypsum content.
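
The RMSE and NRMSE reported above can be illustrated with a minimal sketch. The abstract does not say how the NRMSE was normalised; dividing by the observed range is assumed here as one common convention, and the function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_range(observed, predicted):
    """RMSE divided by the observed range (assumed normalisation)."""
    spread = max(observed) - min(observed)
    if spread == 0:
        raise ValueError("observed values have zero range")
    return rmse(observed, predicted) / spread


# Made-up values for illustration only, not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]
print(rmse(observed, predicted))
print(nrmse_range(observed, predicted))
```

Normalising by the mean of the observations instead of the range is equally common, so an NRMSE is only comparable between studies that use the same definition.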

2020 ◽  
Vol 38 (4A) ◽  
pp. 510-514
Author(s):  
Tay H. Shihab ◽  
Amjed N. Al-Hameedawi ◽  
Ammar M. Hamza

To make use of complementary potential in land use/land cover (LULC) mapping, this paper uses spatial data acquired from Landsat 8 OLI sensor images taken in 2019. The images were rectified, enhanced, and then classified using random forest (RF) and artificial neural network (ANN) methods. Optical remote sensing images were used to obtain information on the status of LULC classification and to extract details. The classification of both satellite image types is used to extract features and to analyse the LULC of the study area. The classification results showed that the ANN method outperforms the RF method. The image processing required to use the optical remote sensing data in LULC mapping included geometric correction and image enhancement. With the ANN method, the overall accuracy was 0.91 and the kappa coefficient 0.89 for the training dataset, while the overall accuracy and kappa coefficient for the test dataset were 0.89 and 0.87, respectively.
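
Overall accuracy and the kappa coefficient are both derived from a confusion matrix. The sketch below shows the standard formulas in plain Python; the function names and the 2×2 matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed_agreement = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected if predictions were independent of the reference.
    chance_agreement = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Made-up confusion matrix (rows: reference class, columns: predicted class).
confusion = [
    [45, 5],
    [5, 45],
]
print(overall_accuracy(confusion))
print(cohen_kappa(confusion))
```

Because kappa subtracts the agreement expected by chance, it is lower than the overall accuracy for the same matrix, which matches the pattern of the values reported above.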


2021 ◽  
Vol 20 ◽  
pp. 153303382110246
Author(s):  
Jihwan Park ◽  
Mi Jung Rho ◽  
Hyong Woo Moon ◽  
Jaewon Kim ◽  
Chanjung Lee ◽  
...  

Objectives: To develop a model to predict biochemical recurrence (BCR) after radical prostatectomy (RP), using artificial intelligence (AI) techniques. Patients and Methods: This study collected data from 7,128 patients with prostate cancer (PCa) who received RP at 3 tertiary hospitals. After preprocessing, we used the data of 6,755 cases to generate the BCR prediction model. There were 16 input variables with BCR as the outcome variable. We used a random forest to develop the model. Several sampling techniques were used to address class imbalances. Results: We achieved good performance using a random forest with synthetic minority oversampling technique (SMOTE) using Tomek links, edited nearest neighbors (ENN), and random oversampling: accuracy = 96.59%, recall = 95.49%, precision = 97.66%, F1 score = 96.59%, and ROC AUC = 98.83%. Conclusion: We developed a BCR prediction model for RP. The Dr. Answer AI project, which was developed based on our BCR prediction model, helps physicians and patients to make treatment decisions in the clinical follow-up process as a clinical decision support system.
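
Of the sampling techniques named above, random oversampling is the simplest to sketch. The version below is a plain-Python illustration under the assumption that minority-class rows are duplicated until every class matches the largest one; the function name, seed, and toy data are hypothetical, and SMOTE, Tomek links, and ENN are not shown.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: six rows of class 0 and two rows of class 1.
rows = [[i] for i in range(8)]
labels = [0] * 6 + [1] * 2
new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows cannot leak into the data used for evaluation.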


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4523 ◽  
Author(s):  
Carlos Cabo ◽  
Celestino Ordóñez ◽  
Fernando Sánchez-Lasheras ◽  
Javier Roca-Pardiñas ◽  
Javier de Cos-Juez

We analyze the utility of multiscale supervised classification algorithms for object detection and extraction from laser scanning or photogrammetric point clouds. Only the geometric information (the point coordinates) was considered, thus making the method independent of the systems used to collect the data. A maximum of five features (input variables) was used, four of them related to the eigenvalues obtained from a principal component analysis (PCA). PCA was carried out at six scales, defined by the diameter of a sphere around each observation. Four multiclass supervised classification models were tested (linear discriminant analysis, logistic regression, support vector machines, and random forest) in two different scenarios, urban and forest, formed by artificial and natural objects, respectively. The results obtained were accurate (overall accuracy over 80% for the urban dataset, and over 93% for the forest dataset), in the range of the best results found in the literature, regardless of the classification method. For both datasets, the random forest algorithm provided the best solution/results when discrimination capacity, computing time, and the ability to estimate the relative importance of each variable are considered together.


2021 ◽  
Vol 13 (12) ◽  
pp. 2339
Author(s):  
Haibo Yang ◽  
Fei Li ◽  
Wei Wang ◽  
Kang Yu

Spectral indices rarely show consistency in estimating crop traits across growth stages; thus, it is critical to simultaneously evaluate a group of spectral variables and select the most informative spectral indices for retrieving crop traits. The objective of this study was to explore the optimal spectral predictors for above-ground biomass (AGB) by applying Random Forest (RF) to three types of spectral predictors: the full spectrum, published spectral indices (Pub-SIs), and optimized spectral indices (Opt-SIs). Canopy hyperspectral reflectance of potato plants, treated with seven nitrogen (N) rates, was obtained during the tuber formation and tuber bulking stages from 2015 to 2016. Twelve Pub-SIs were selected, and their spectral bands were optimized using band optimization algorithms. Results showed that the Opt-SIs were the best input variables for RF models. Compared to the best empirical model based on Opt-SIs, the Opt-SIs-based RF model improved the prediction of AGB, with R² increased by 6%, 10%, and 16% at tuber formation, at tuber bulking, and across the two growth stages, respectively. The Opt-SIs can significantly reduce the number of input variables. The optimized Blue nitrogen index (Opt-BNI) and Modified red-edge normalized difference vegetation index (Opt-mND705) combined with an RF model showed the best performance in estimating potato AGB at the tuber formation stage (R² = 0.88). In the tuber bulking stage, using only the optimized Nitrogen planar domain index (Opt-NPDI) as the input variable of the RF model produced satisfactory accuracy in training and testing datasets, with the R², RMSE, and RE being 0.92, 208.6 kg/ha, and 10.3%, respectively. The Opt-BNI and Double-peak nitrogen index (Opt-NDDA) coupled with an RF model explained 86% of the variations in potato AGB, with the lowest RMSE (262.9 kg/ha) and RE (14.8%) across the two growth stages. This study shows that combining the Opt-SIs and RF can greatly enhance the prediction accuracy for crop AGB while significantly reducing collinearity and redundancies of spectral data.


2018 ◽  
Vol 10 (6) ◽  
pp. 946 ◽  
Author(s):  
Yanan Liu ◽  
Weishu Gong ◽  
Xiangyun Hu ◽  
Jianya Gong

2019 ◽  
Vol 71 (3) ◽  
pp. 702-725
Author(s):  
Nayara Vasconcelos Estrabis ◽  
José Marcato Junior ◽  
Hemerson Pistori

The Cerrado is one of Brazil's biomes and the second largest in South America. It is highly important because of its biodiversity and ecosystem, and mainly because it serves as a reservoir, or "sponge", that distributes water to the other biomes, in addition to being the source of springs that feed some of the largest river basins in South America. However, owing to anthropic activities (notably livestock farming and silviculture) and the reduction of native vegetation, this biome is threatened. Considered a biodiversity hotspot, the Cerrado may no longer exist by 2050. Given the need for its preservation, the aim of this work was to investigate the use of machine learning algorithms to map the native vegetation in the region of the municipality of Três Lagoas, using the Google Earth Engine cloud platform. The process was carried out with a Landsat-8 OLI image dated 10 October 2018 and with the Random Forest (RF) and Support Vector Machine (SVM) algorithms. In the validation of the classification, RF and SVM yielded kappa indices of 0.94 and 0.97, respectively. Compared with SVM, RF produced a noisier classification. Finally, approximately 2556 km² of native vegetation was found when using RF and 2873 km² when using SVM.


2021 ◽  
Author(s):  
Zitian Gao ◽  
Danlu Guo ◽  
Dongryeol Ryu ◽  
Andrew Western

Timely classification of crop types is critical for agronomic planning in water use and crop production. However, crop type mapping is typically undertaken only after the cropping season, which precludes its use in later-season water use planning and yield estimation. This study aims 1) to understand how the accuracy of crop type classification changes within the cropping season and 2) to suggest the earliest time at which reliable crop classification can be achieved. We focused on three main summer crops (corn/maize, cotton and rice) in the Coleambally Irrigation Area (CIA), a major irrigation district in south-eastern Australia consisting of over 4000 fields, for the period of 2013 to 2019. The summer irrigation season in the CIA is from mid-August to mid-May and most farms use surface irrigation to support the growth of summer crops. We developed models that combine satellite data and farmer-reported information for in-season crop type classification. Monthly-averaged Landsat spectral bands were used as input to a Random Forest algorithm. We developed multiple models trained with data initially available at the start of the cropping season, then later using all the antecedent images up to different stages within the season. We evaluated the model performance and uncertainty using a two-fold cross validation by randomly choosing training vs. validation periods. Results show that the classification accuracy increases rapidly during the first three months, followed by a marginal improvement afterwards. Crops can be classified with a User's accuracy above 70% based on the first 2-3 months after the start of the season. Cotton and rice have higher in-season accuracy than corn/maize. The resulting crop maps can be used to support activities such as later-season system-scale irrigation decision-making or yield estimation at a regional scale.

Keywords: Landsat 8 OLI, in-season, multi-year, crop type, Random Forest
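
User's accuracy, the metric quoted above, is the share of samples mapped as a class that truly belong to it. A minimal sketch follows, assuming a confusion matrix with reference classes in rows and mapped classes in columns; the function name and the counts are hypothetical and are not results from the study.

```python
def users_accuracy(matrix, class_names):
    """Per-class user's accuracy: diagonal count divided by column total.

    Assumes rows hold reference classes and columns hold mapped classes.
    Returns None for a class that was never mapped.
    """
    col_totals = [sum(col) for col in zip(*matrix)]
    result = {}
    for i, name in enumerate(class_names):
        result[name] = matrix[i][i] / col_totals[i] if col_totals[i] else None
    return result


# Made-up counts for illustration only.
classes = ["corn/maize", "cotton", "rice"]
confusion = [
    [30, 5, 5],   # reference corn/maize
    [10, 45, 0],  # reference cotton
    [10, 0, 45],  # reference rice
]
for name, value in users_accuracy(confusion, classes).items():
    print(name, value)
```

Dividing the diagonal by row totals instead would give the producer's accuracy, so the orientation of the matrix should be checked before the two are compared.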

