A COMPARISON OF MACHINE LEARNING MODELS FOR SOIL SALINITY ESTIMATION USING MULTI-SPECTRAL EARTH OBSERVATION DATA

Abstract. Soil salinity, a significant environmental indicator, is considered one of the leading causes of land degradation, especially in arid and semi-arid regions. In many cases, this major threat leads to the loss of arable land and groundwater resources, reduces crop productivity, increases the economic cost of soil management, and ultimately increases the probability of soil erosion. Monitoring the distribution and degree of soil salinity and mapping electrical conductivity (EC) with remote sensing techniques are crucial for land use management. Salt-affected soil is a predominant phenomenon around the Eshtehard Salt Lake in Alborz, Iran. In this study, the potential of Sentinel-2 imagery was investigated for mapping and monitoring soil salinity. Timed to coincide with the satellite's pass, different salt properties were measured for 197 soil samples in the field. Several spectral features, such as satellite band reflectance, salinity indices, and vegetation indices, were then extracted from the Sentinel-2 imagery. To build an optimum machine learning regression model for soil salinity estimation, three regression models were used: Gradient Boost Machine (GBM), Extreme Gradient Boost (XGBoost), and Random Forest (RF). The XGBoost method outperformed GBM and RF, with a coefficient of determination (R2) above 76%, a Root Mean Square Error (RMSE) of about 0.84 dS m−1, and a Normalized Root Mean Square Error (NRMSE) of about 0.33 dS m−1. The results demonstrated that integrating remote sensing data and field data with an appropriate machine learning model could provide high-precision salinity maps for monitoring soil salinity as an environmental problem.
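
The three scores reported in this abstract (R2, RMSE, NRMSE) can be computed from paired field measurements and model predictions. The following is a minimal sketch, not the authors' code: the function name `regression_metrics` and the EC values are hypothetical, and NRMSE is assumed here to be RMSE divided by the observed mean (other conventions divide by the observed range).

```python
import math

def regression_metrics(observed, predicted):
    """Return (R2, RMSE, NRMSE) for paired observations and predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    nrmse = rmse / mean_obs  # assumption: normalized by the observed mean
    return r2, rmse, nrmse

# Hypothetical EC values in dS/m, for illustration only
ec_observed = [1.0, 2.0, 3.0, 4.0]
ec_predicted = [1.5, 2.0, 2.5, 4.0]
r2, rmse, nrmse = regression_metrics(ec_observed, ec_predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} NRMSE={nrmse:.3f}")
```

Scoring each candidate model (for example GBM, XGBoost, and RF) with the same function on the same held-out samples keeps the comparison consistent.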


Phycocyanin Monitoring in Some Spanish Water Bodies with Sentinel-2 Imagery

Water, 2021, Vol 13 (20), pp. 2866. DOI: 10.3390/w13202866

Author(s): Rebeca Pérez-González, Xavier Sòria-Perpinyà, Juan Miguel Soria, Jesús Delegido, Patricia Urrego, ...

Keyword(s): Remote Sensing, Root Mean Square Error, Mean Square Error, Root Mean Square, Water Bodies, World Health, Mean Square, Monitoring Method, And Control, Sentinel 2

Remote sensing is an appropriate tool for water management. It allows the study of some of the main sources of pollution, such as cyanobacterial harmful algal blooms. These species are increasing due to eutrophication and the adverse effects of climate change. This leads to water quality loss, which has a major impact on the environment, including human water supplies, which consequently require more expensive purification processes. The application of satellite remote sensing images as bio-optical tools is an effective way to monitor and control phycocyanin concentrations, which indicate the presence of cyanobacteria. For this study, 90 geo-referenced phycocyanin measurements were performed in situ in water bodies of the Iberian Peninsula, using a Turner C3 Submersible Fluorometer and a laboratory spectrofluorometer, both calibrated with a phycocyanin standard. The sampling was synchronized with the Sentinel-2 satellite orbit. The images were processed using the Sentinel Application Platform (SNAP) software and corrected with the Case 2 Regional Coast color-extended atmospheric correction tool. To produce algorithms for obtaining the phycocyanin concentration from the reflectance measured by the satellite's multispectral instrument sensor, the following band combinations were tested, among others: band 665 nm, band 705 nm, and band 740 nm. The samples were divided equally: half were used for the algorithm's calibration, and the other half for its validation. With the best adjustment, the algorithm was made more robust and accurate through a recalculation, obtaining a determination coefficient of 0.7, a Root Mean Square Error of 8.1 µg L−1, and a Relative Root Mean Square Error of 19%. In several reservoirs, we observed alarming phycocyanin concentrations that may trigger many environmental health problems, as established by the World Health Organization. Remote sensing provides a rapid method for monitoring the temporal and spatial distribution of these cyanobacteria blooms, supporting good preventive management and control to improve the environmental quality of inland waters.
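
The half-calibration, half-validation procedure described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published algorithm: the band index values, the phycocyanin concentrations, the alternating split, and the helper names `fit_line` and `rmse_and_rrmse` are all hypothetical, and the relative RMSE is assumed to be RMSE as a percentage of the observed mean.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse_and_rrmse(observed, predicted):
    """Return RMSE and RMSE as a percentage of the observed mean."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse, 100.0 * rmse / (sum(observed) / n)

# Hypothetical band index values and phycocyanin concentrations (µg/L)
index = [0.8, 1.0, 1.2, 1.4, 1.6, 1.8]
pc = [10.0, 22.0, 30.0, 38.0, 50.0, 60.0]

# Alternate samples: even positions calibrate, odd positions validate
cal_x, cal_y = index[0::2], pc[0::2]
val_x, val_y = index[1::2], pc[1::2]

slope, intercept = fit_line(cal_x, cal_y)
val_pred = [slope * x + intercept for x in val_x]
rmse, rrmse = rmse_and_rrmse(val_y, val_pred)
print(f"slope={slope:.2f} intercept={intercept:.2f} RMSE={rmse:.2f} RRMSE={rrmse:.1f}%")
```

Reporting the error on the validation half only, as done here, avoids scoring the algorithm on the same samples used to fit it.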


Estimation of Apple Flowering Frost Loss for Fruit Yield Based on Gridded Meteorological and Remote Sensing Data in Luochuan, Shaanxi Province, China

Remote Sensing, 2021, Vol 13 (9), pp. 1630. DOI: 10.3390/rs13091630

Author(s): Yaohui Zhu, Guijun Yang, Hao Yang, Fa Zhao, Shaoyu Han, ...

Keyword(s): Remote Sensing, Root Mean Square Error, Mean Square Error, Root Mean Square, Minimum Temperature, Price Regulation, Flowering Period, Mean Square, Prevention Measures, Agricultural Insurance

With the increase in the frequency of extreme weather events in recent years, apple growing areas in the Loess Plateau frequently encounter frost during flowering. Accurately assessing the frost loss in orchards during the flowering period is of great significance for optimizing disaster prevention measures, market apple price regulation, agricultural insurance, and government subsidy programs. The previous research on orchard frost disasters is mainly focused on early risk warning. Therefore, to effectively quantify orchard frost loss, this paper proposes a frost loss assessment model constructed using meteorological and remote sensing information and applies this model to the regional-scale assessment of orchard fruit loss after frost. As an example, this article examines a frost event that occurred during the apple flowering period in Luochuan County, Northwestern China, on 17 April 2020. A multivariable linear regression (MLR) model was constructed based on the orchard planting years, the number of flowering days, and the chill accumulation before frost, as well as the minimum temperature and daily temperature difference on the day of frost. Then, the model simulation accuracy was verified using the leave-one-out cross-validation (LOOCV) method, and the coefficient of determination (R2), the root mean square error (RMSE), and the normalized root mean square error (NRMSE) were 0.69, 18.76%, and 18.76%, respectively. Additionally, the extended Fourier amplitude sensitivity test (EFAST) method was used for the sensitivity analysis of the model parameters. The results show that the simulated apple orchard fruit number reduction ratio is highly sensitive to the minimum temperature on the day of frost, and the chill accumulation and planting years before the frost, with sensitivity values of ≥0.74, ≥0.25, and ≥0.15, respectively. This research can not only assist governments in optimizing traditional orchard frost prevention measures and market price regulation but can also provide a reference for agricultural insurance companies to formulate plans for compensation after frost.
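
Leave-one-out cross-validation, as used in this abstract, refits the model once per sample and scores each held-out prediction. The sketch below is a simplification under stated assumptions: it uses a single predictor (minimum temperature) instead of the paper's multivariable regression, and the data values and the helper names `fit_line` and `loocv_rmse` are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def loocv_rmse(x, y):
    """Leave each sample out once, refit, and predict the held-out sample."""
    sq_errors = []
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        sq_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical data: minimum temperature (°C) vs fruit number reduction ratio (%)
tmin = [-4.0, -3.0, -2.0, -1.0, 0.0]
loss = [80.0, 62.0, 40.0, 22.0, 0.0]
cv_rmse = loocv_rmse(tmin, loss)
print(f"LOOCV RMSE = {cv_rmse:.2f} percentage points")
```

Because every prediction comes from a model that never saw that sample, the resulting RMSE is a less optimistic accuracy estimate than the in-sample fit error, which matters for small field datasets.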


Using Multi-Temporal MODIS NDVI Data to Monitor Tea Status and Forecast Yield: A Case Study at Tanuyen, Laichau, Vietnam

Remote Sensing, 2020, Vol 12 (11), pp. 1814. DOI: 10.3390/rs12111814

Author(s): Phamchimai Phan, Nengcheng Chen, Lei Xu, Zeqiang Chen

Keyword(s): Remote Sensing, Root Mean Square Error, Mean Square Error, Root Mean Square, Vegetation Index, Coefficient Of Determination, Percentage Error, Support Vector, Mean Square, Multi Temporal

Tea is a cash crop that improves the quality of life for people in the Tanuyen District of Laichau Province, Vietnam. Tea yield, however, has stagnated in recent years, due to changes in temperature, precipitation, the age of the tea bushes, and diseases. Developing an approach for monitoring tea bushes by remote sensing and Geographic Information Systems (GIS) might be a way to alleviate this problem. Using multi-temporal remote sensing data, the paper details an investigation of the changes in tea health and yield forecasting through the normalized difference vegetation index (NDVI). In this study, we used NDVI as a support tool to demonstrate temporal and spatial changes by extracting the tea NDVI values and calculating the mean NDVI value. The results of the study showed that the minimum NDVI value was 0.42, during January 2013 and February 2015 and 2016. The maximum NDVI value was in August 2015 and June 2017. The linear relationship between NDVI value and mean temperature was strong, with R2 = 0.79. Our results confirm that the combination of meteorological data and NDVI data can achieve high performance in yield prediction. Three models were used to predict tea yield: support vector machine (SVM), random forest (RF), and the traditional linear regression model (TLRM). For the period 2009 to 2018, tea yield prediction by the RF model was the best, with R2 = 0.73, compared with 0.66 for SVM and 0.57 for the TLRM. Three evaluation indicators were used to assess accuracy: the coefficient of determination (R2), root-mean-square error (RMSE), and percentage error of tea yield (PETY). The highest accuracy for the three models was in 2015, with R2 ≥ 0.87, RMSE < 50 kg/ha, and PETY below 3%. In the other years, the prediction accuracy was higher in the SVM and RF models. The RF algorithm performed better on PETY (≤10%), and its root mean square error was significantly lower (≤80 kg/ha). RMSE and PETY showed relatively good values in the TLRM model, with an RMSE from 80 to 100 kg/ha and a PETY from 8 to 15%.
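
The NDVI and percentage-error quantities used in this abstract can be illustrated with a short sketch. NDVI is the standard ratio (NIR − Red) / (NIR + Red); the reflectance pairs, the yield values, and the reading of PETY as an absolute percentage error of predicted against observed yield are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def percentage_error(observed, predicted):
    """Absolute percentage error of a predicted yield against the observed yield."""
    return 100.0 * abs(predicted - observed) / observed

# Hypothetical (NIR, red) reflectance pairs for pixels inside a tea plot
pixels = [(0.50, 0.10), (0.45, 0.15), (0.40, 0.20)]
values = [ndvi(nir, red) for nir, red in pixels]
mean_ndvi = sum(values) / len(values)

# Hypothetical observed and predicted yields in kg/ha
pety = percentage_error(observed=1000.0, predicted=970.0)
print(f"mean NDVI = {mean_ndvi:.2f}, PETY = {pety:.1f}%")
```

Averaging per-pixel NDVI over the tea area for each date gives one value per time step, which is the kind of series that can then be related to temperature or fed to a yield model.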


Machine Learning Based Prediction of Suicide Probability

International Journal of Engineering and Advanced Technology - Regular Issue, 2020, Vol 10 (1), pp. 94-97. DOI: 10.35940/ijeat.a1701.1010120

Keyword(s): Machine Learning, Root Mean Square Error, Mean Square Error, Root Mean Square, Regression Models, Mean Square, Spline Regression, Proposed Model, Bayesian Machine Learning

Many factors have led to the increase of suicide-proneness in the present era. As a consequence, many novel methods have been proposed in recent times for prediction of the probability of suicides, using different metrics. The current work reviews a number of models and techniques proposed recently, and offers a novel Bayesian machine learning (ML) model for prediction of suicides, involving classification of the data into separate categories. The proposed model is contrasted against similar computationally-inexpensive techniques such as spline regression. The model is found to generate appreciably accurate results for the dataset considered in this work. The application of Bayesian estimation allows the prediction of causation to a greater degree than the standard spline regression models, which is reflected by the comparatively low root mean square error (RMSE) for all estimates obtained by the proposed model.


PREDICTIVE MODELING OF COMMERCIAL BUILDING ENERGY CONSUMPTION WITH THE SUPPORT VECTOR MACHINE ALGORITHM

Jurnal Pilar Nusa Mandiri, 2018, Vol 14 (2), pp. 225. DOI: 10.33480/pilar.v14i2.882

Author(s): Indriyanti Indriyanti, Agus Subekti

Keyword(s): Machine Learning, Support Vector Machine, Root Mean Square Error, Mean Square Error, Root Mean Square, Mean Absolute Error, Absolute Error, Support Vector, Mean Square

Rising building energy consumption has prompted researchers to build prediction models using machine learning methods, but which model is the most accurate is still unknown. Predictive models of commercial building energy consumption are important for energy conservation. With the right model, buildings can be designed to use energy more efficiently. In this paper, we propose a predictive model based on machine learning methods to find the best model for predicting total energy consumption. The algorithms used are SMOreg and LibSVM from the Support Vector Machine class, and the models are evaluated by their Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Using a publicly available dataset, we developed models based on support vector machines for regression. Testing both algorithms showed that SMOreg is more accurate, with an MAE of 4.70 and an RMSE of 10.15, whereas the LibSVM model had an MAE of 9.37 and an RMSE of 14.45. We propose the method based on the SMOreg algorithm because it performs better.


A Study on Time Series Forecasting using Hybridization of Time Series Models and Neural Networks

Recent Advances in Computer Science and Communications, 2020, Vol 13 (5), pp. 827-832. DOI: 10.2174/1573401315666190619112842

Author(s): Iflah Aijaz, Parul Agarwal

Keyword(s): Machine Learning, Neural Networks, Time Series, Root Mean Square Error, Mean Square Error, Root Mean Square, Hybrid Models, Time Series Forecasting, Percentage Error, Mean Square

Introduction: Auto-Regressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN) are, respectively, leading linear and non-linear models in machine learning for time series forecasting. Objective: This survey paper presents a review of recent advances in machine learning and artificial intelligence techniques used for forecasting different events. Methods: This paper presents an extensive survey of work in the field of machine learning in which hybrid models are compared with the basic models for forecasting, on the basis of error parameters such as Mean Absolute Deviation (MAD), Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Normalized Root Mean Square Error (NRMSE). Results: Table 1 summarizes the important papers discussed, on the basis of parameters that describe the efficiency of hybrid models and of models used in isolation. Conclusion: The hybrid models have produced more accurate results than the models used in isolation, yet some research papers argue that hybrids cannot always outperform individual models.
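
A comparison of forecasts by error parameters such as MAE and MAPE, as surveyed in this abstract, can be sketched in a few lines. The series, both forecasts, and the labels "stand-alone" and "hybrid" are hypothetical; the numbers are chosen only to show the mechanics and say nothing about which model family is better in practice.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical observed series and two hypothetical forecasts
actual = [100.0, 110.0, 120.0, 130.0]
forecasts = {
    "stand-alone": [98.0, 112.0, 118.0, 133.0],
    "hybrid": [99.0, 111.0, 119.0, 131.0],
}

scores = {name: (mae(actual, f), mape(actual, f)) for name, f in forecasts.items()}
best = min(scores, key=lambda name: scores[name][0])  # rank by MAE
for name, (m, p) in scores.items():
    print(f"{name}: MAE={m:.2f} MAPE={p:.2f}%")
```

Ranking by a single metric is a design choice; different error parameters can order the same models differently, which is one reason surveys report several of them side by side.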


Time Series Prediction of Bank BRI Stock Data with LSTM (Long Short-Term Memory) Machine Learning

Journal of Informatic and Information Security, 2020, Vol 1 (1), pp. 1-8. DOI: 10.31599/jiforty.v1i1.133

Author(s): Adhitio Satyo Bayangkari Karno

Keyword(s): Machine Learning, Time Series, Root Mean Square Error, Mean Square Error, Root Mean Square, Short Term Memory, Mean Square, Short Term, Term Memory, Long Short Term Memory

Abstract. This study aims to measure the accuracy of time series prediction using the LSTM (Long Short-Term Memory) machine learning method, and to determine the number of epochs needed to produce a small RMSE (Root Mean Square Error) value. The study finds that the RMSE value varies strongly with the number of epochs used in data processing, which makes it difficult to choose the right epoch value. By iterating the LSTM process over different numbers of epochs (visualized in a graph), the number of epochs with the minimum RMSE value becomes easier to find. For the prediction of BBRI's stock data, a good RMSE value was obtained (RMSE = 227.470333244533). Keywords: long short-term memory, machine learning, epoch, root mean square error, mean square error.
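
The epoch search described in this abstract amounts to training once per candidate epoch count and keeping the count with the lowest RMSE. The sketch below shows only that selection loop: `fake_train_and_score` is a hypothetical stand-in with a made-up RMSE curve, not an LSTM, and in practice it would be replaced by a function that trains the model for the given number of epochs and returns its validation RMSE.

```python
def best_epoch(epoch_candidates, train_and_score):
    """Score every candidate epoch count and return the one with the lowest RMSE."""
    results = {epochs: train_and_score(epochs) for epochs in epoch_candidates}
    best = min(results, key=results.get)
    return best, results[best], results

def fake_train_and_score(epochs):
    """Hypothetical stand-in for LSTM training: a made-up RMSE curve."""
    return abs(epochs - 60) * 2.0 + 230.0

best, rmse, curve = best_epoch([10, 20, 40, 60, 80, 100], fake_train_and_score)
print(f"best epochs = {best}, RMSE = {rmse:.1f}")
for epochs, score in curve.items():
    print(epochs, score)
```

Keeping the full `curve` dictionary makes it straightforward to plot RMSE against the number of epochs, which is how the abstract describes locating the minimum.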


Low Cost Automatic Reconstruction of Tree Structure by AdQSM with Terrestrial Close-Range Photogrammetry

Forests, 2021, Vol 12 (8), pp. 1020. DOI: 10.3390/f12081020

Author(s): Yanqi Dong, Guangpeng Fan, Zhiwu Zhou, Jincheng Liu, Yongguo Wang, ...

Keyword(s): Root Mean Square Error, Mean Square Error, Root Mean Square, Low Cost, Reference Value, Tree Height, Point Clouds, Tree Structure, Mean Square, Tree Model

The quantitative structure model (QSM) contains the branch geometry and attributes of the tree. AdQSM is a new, accurate, and detailed tree QSM. In this paper, an automatic modeling method based on AdQSM is developed, and a low-cost technical scheme for tree structure modeling is provided, so that AdQSM can be freely used by more people. First, we used two digital cameras to collect two-dimensional (2D) photos of trees, generated three-dimensional (3D) point clouds of the plot, and segmented individual trees from the plot point clouds. Then the new QSM, AdQSM, was used to construct tree models from the point clouds of 44 trees. Finally, to verify the effectiveness of our method, the diameter at breast height (DBH), tree height, and trunk volume were derived from the reconstructed tree models. These parameters extracted from AdQSM were compared with the reference values from forest inventory. For the DBH, the relative bias (rBias), root mean square error (RMSE), and coefficient of variation of root mean square error (rRMSE) were 4.26%, 1.93 cm, and 6.60%. For the tree height, the rBias, RMSE, and rRMSE were −10.86%, 1.67 m, and 12.34%. The determination coefficients (R2) between the AdQSM estimates and the reference values were 0.94 for DBH and 0.86 for tree height. We used the trunk volume calculated by the allometric equation as a reference value to test the accuracy of AdQSM. For the trunk volume estimated with AdQSM, the bias was 0.07066 m3, the rBias was 18.73%, the RMSE was 0.12369 m3, and the rRMSE was 32.78%. To better evaluate the accuracy of QSM reconstruction of the trunk volume, we compared AdQSM and TreeQSM on the same dataset. For the trunk volume estimated with TreeQSM, the bias was −0.05071 m3, the rBias was −13.44%, the RMSE was 0.13267 m3, and the rRMSE was 35.16%. At the 95% confidence interval level, the concordance correlation coefficient of the agreement between the estimated tree trunk volume and the reference value was greater for AdQSM (CCC = 0.77) than for TreeQSM (CCC = 0.60). The significance of this research is as follows: (1) it develops an automatic modeling method based on AdQSM, which expands the application scope of AdQSM; (2) it provides a low-cost photogrammetric point cloud as the input data of AdQSM; (3) it explores the potential of AdQSM to reconstruct forest terrestrial photogrammetric point clouds.
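
The agreement statistics reported in this abstract (bias, rBias, RMSE, rRMSE, and the concordance correlation coefficient) can be computed from paired reference and estimated values. This is a minimal sketch under stated assumptions: the DBH values and the function name `agreement_stats` are hypothetical, relative quantities are taken as percentages of the reference mean, and CCC follows Lin's formula with population (divide by n) variances.

```python
import math

def agreement_stats(reference, estimate):
    """Bias, relative bias, RMSE, relative RMSE, and Lin's CCC for paired values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimate) / n
    bias = sum(e - r for r, e in zip(reference, estimate)) / n
    rmse = math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)) / n)
    var_ref = sum((r - mean_ref) ** 2 for r in reference) / n
    var_est = sum((e - mean_est) ** 2 for e in estimate) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimate)) / n
    ccc = 2.0 * cov / (var_ref + var_est + (mean_ref - mean_est) ** 2)
    return {
        "bias": bias,
        "rBias": 100.0 * bias / mean_ref,   # assumption: relative to reference mean
        "RMSE": rmse,
        "rRMSE": 100.0 * rmse / mean_ref,   # assumption: relative to reference mean
        "CCC": ccc,
    }

# Hypothetical DBH values in cm: field inventory vs model-derived estimates
dbh_reference = [20.0, 25.0, 30.0, 35.0]
dbh_estimate = [21.0, 26.0, 31.0, 36.0]
stats = agreement_stats(dbh_reference, dbh_estimate)
for name, value in stats.items():
    print(f"{name} = {value:.4f}")
```

In this toy case the estimates are perfectly correlated with the reference but offset by 1 cm, so CCC falls below 1; that sensitivity to systematic offset is what distinguishes CCC from a plain correlation coefficient.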
