Performance of a Physically Based Gap-Filling Technique of in-situ Soil Moisture, in Comparison with Machine Learning

Author(s):  
Seulchan Lee ◽  
Hyunho Jeon ◽  
Jongmin Park ◽  
Minha Choi

<p>As the importance of soil moisture (SM) has been recognized in various fields, including agricultural practices, natural hazards, and climate predictions, ground-based SM sensors such as Frequency Domain Reflectometry (FDR) and Time Domain Reflectometry (TDR) are being widely used. However, gaps in in-situ SM data are still unavoidable, due not only to sensor failure or low voltage supply but also to environmental conditions. Since accurate and continuous SM data are essential for its applications, the gaps in the data should be handled properly. In this study, we propose a physically based gap-filling method for a mountainous region in which in-situ SM sensors and a flux tower are located. The method is developed only with in-situ SM and precipitation data, by considering the characteristic variation of SM: it increases rapidly with precipitation and decreases asymptotically afterward. Past SM data are used to build Look-Up-Tables (LUTs) that contain the amount and speed of SM increase with precipitation and of SM decrease without precipitation. Based on the developed LUTs, the gaps are filled successively from where each gap starts. We also introduce a machine learning-based gap-filling framework for comparison. Ancillary data from the flux tower (e.g. net radiation, relative humidity) were used as input for training, over the same period as in the physically based method. The trained models are then used to fill the gaps. We found that both proposed methods are able to fill the gaps of in-situ SM reasonably, with capabilities to capture the characteristics of SM variation. Results from the comparison indicate that the physically based gap-filling method is very accurate and efficient when only limited information is available, and is also suitable for prediction purposes.</p>
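The LUT idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout (one wet and one dry table keyed by the SM bin of the previous step), the bin width, the nearest-bin lookup, and the function names `build_luts`, `lookup`, and `fill_gaps` are all assumptions made for illustration. Missing values are represented as `None`, and SM is given in % vol.

```python
from statistics import mean

BIN_WIDTH = 5.0  # assumed SM bin width in % vol; not taken from the source


def build_luts(sm, precip, bin_width=BIN_WIDTH):
    """Build wet and dry look-up tables from a historical record.

    wet[bin]: mean SM rise per unit of precipitation, keyed by the SM bin
              of the previous step (hypothetical table layout).
    dry[bin]: mean SM change per step without precipitation (negative).
    """
    wet, dry = {}, {}
    for prev, cur, rain in zip(sm, sm[1:], precip[1:]):
        if prev is None or cur is None:
            continue  # skip pairs that touch a gap
        bin_id = int(prev // bin_width)
        if rain > 0:
            wet.setdefault(bin_id, []).append((cur - prev) / rain)
        else:
            dry.setdefault(bin_id, []).append(cur - prev)
    return ({k: mean(v) for k, v in wet.items()},
            {k: mean(v) for k, v in dry.items()})


def lookup(lut, bin_id):
    """Return the entry of the nearest populated bin, or 0.0 for an empty table."""
    if not lut:
        return 0.0
    return lut[min(lut, key=lambda k: abs(k - bin_id))]


def fill_gaps(sm, precip, wet, dry, bin_width=BIN_WIDTH):
    """Fill None entries step by step, starting from where each gap begins."""
    filled = list(sm)
    for i in range(1, len(filled)):
        if filled[i] is None and filled[i - 1] is not None:
            bin_id = int(filled[i - 1] // bin_width)
            if precip[i] > 0:
                # Wet step: rise scales with the precipitation amount.
                filled[i] = filled[i - 1] + lookup(wet, bin_id) * precip[i]
            else:
                # Dry step: apply the mean recession for this SM level.
                filled[i] = filled[i - 1] + lookup(dry, bin_id)
    return filled


# Illustrative numbers only (not data from the study).
history_sm = [20.0, 30.0, 28.0, 26.5, 25.5, 25.0]
history_rain = [0.0, 10.0, 0.0, 0.0, 0.0, 0.0]
wet_lut, dry_lut = build_luts(history_sm, history_rain)
print(fill_gaps([22.0, None, None, None], [0.0, 5.0, 0.0, 0.0], wet_lut, dry_lut))
```

Because the dry table is keyed by the current SM level, drier bins can carry smaller decrements, which is one simple way to approximate the asymptotic recession described in the abstract. A gap at the very start of the record stays unfilled in this sketch, since there is no preceding value to propagate from.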

2021 ◽  
Vol 13 (14) ◽  
pp. 2848
Author(s):  
Hao Sun ◽  
Qian Xu

Obtaining large-scale, long-term, and spatially continuous soil moisture (SM) data is crucial for studies of climate change, hydrology, and water resource management. ESA CCI SM is such a large-scale, long-term SM product (covering more than 40 years to date). However, data gaps exist, especially over China, owing to limitations in the remote sensing of SM such as complex topography, human-induced radio frequency interference (RFI), and vegetation disturbances. These gaps prevent the CCI SM data from achieving spatial continuity, which motivates the study of gap-filling methods. To develop suitable methods for filling the gaps of CCI SM over the whole of China, we compared typical Machine Learning (ML) methods, including Random Forest (RF), Feedforward Neural Network (FNN), and Generalized Linear Model (GLM), with a geostatistical method, Ordinary Kriging (OK). More than 30 years of passive–active combined CCI SM from 1982 to 2018 and other biophysical variables such as Normalized Difference Vegetation Index (NDVI), precipitation, air temperature, Digital Elevation Model (DEM), soil type, and in situ SM from the International Soil Moisture Network (ISMN) were utilized. Results indicated that: 1) data gaps in CCI SM are frequent in China and are found not only in cold seasons and areas but also in warm seasons and areas. The ratio of gap pixels to total pixels can exceed 80%, and its average is around 40%. 2) ML methods can fill all the gaps of CCI SM. Among the ML methods, RF performed best in fitting the relationship between CCI SM and biophysical variables. 3) Over simulated gap areas, RF performed comparably to OK, and both greatly outperformed the FNN and GLM methods. 4) Over in situ SM networks, RF performed better than OK. 5) We also explored various strategies for gap-filling CCI SM. 
Results demonstrated that the strategy of constructing a monthly model, with one RF simulating monthly average SM and another RF simulating monthly SM disturbance, achieved the best performance. This strategy, combined with an ML method such as RF, is suggested for filling the gaps of CCI SM in China.
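The gap-ratio statistic quoted in the abstract above (gap pixels divided by total pixels, summarised by its maximum and average) can be computed along the following lines. This is only a sketch under stated assumptions: gaps are marked with `None`, every pixel in each grid is counted as part of the domain, and the names `gap_ratio` and `gap_summary` are hypothetical rather than taken from the study.

```python
def gap_ratio(grid):
    """Fraction of pixels in one 2-D grid that are missing (None)."""
    pixels = [value for row in grid for value in row]
    return sum(value is None for value in pixels) / len(pixels)


def gap_summary(stack):
    """Return (maximum, mean) gap ratio over a time stack of 2-D grids."""
    ratios = [gap_ratio(grid) for grid in stack]
    return max(ratios), sum(ratios) / len(ratios)


# Illustrative 2 x 2 grids for three time steps (not data from the study).
stack = [
    [[0.20, None], [None, None]],
    [[0.20, 0.30], [0.10, 0.25]],
    [[None, 0.30], [0.10, None]],
]
worst, average = gap_summary(stack)
print(f"max gap ratio = {worst:.2f}, mean gap ratio = {average:.2f}")
```

In practice the denominator would usually be restricted to land pixels inside the study area; that masking step is omitted here to keep the sketch short.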


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Sungmin O. ◽  
Rene Orth

Abstract. While soil moisture information is essential for a wide range of hydrologic and climate applications, spatially continuous soil moisture data is only available from satellite observations or model simulations. Here we present a global, long-term dataset of soil moisture derived through machine learning trained with in-situ measurements, SoMo.ml. We train a Long Short-Term Memory (LSTM) model to extrapolate daily soil moisture dynamics in space and in time, based on in-situ data collected from more than 1,000 stations across the globe. SoMo.ml provides multi-layer soil moisture data (0–10 cm, 10–30 cm, and 30–50 cm) at 0.25° spatial and daily temporal resolution over the period 2000–2019. The performance of the resulting dataset is evaluated through cross validation and inter-comparison with existing soil moisture datasets. SoMo.ml performs especially well in terms of temporal dynamics, making it particularly useful for applications requiring time-varying soil moisture, such as anomaly detection and memory analyses. SoMo.ml complements the existing suite of modelled and satellite-based datasets given its distinct derivation, to support large-scale hydrological, meteorological, and ecological analyses.


2019 ◽  
Author(s):  
Luke Gregor ◽  
Alice D. Lebehot ◽  
Schalk Kok ◽  
Pedro M. Scheel Monteiro

Abstract. Over the last decade, advanced statistical inference and machine learning have been used to fill the gaps in sparse surface ocean CO2 measurements (Rödenbeck et al. 2015). The estimates from these methods have been used to constrain seasonal, interannual and decadal variability in sea-air CO2 fluxes and the drivers of these changes (Landschützer et al. 2015, 2016, Gregor et al. 2018). However, it is also becoming clear that these methods are converging towards a common bias and RMSE boundary: the wall, which suggests that pCO2 estimates are now limited by both data gaps and scale-sensitive observations. Here, we analyse this problem by introducing a new gap-filling method, an ensemble of six machine learning models (CSIR-ML6 version 2019a), where each model is constructed with a two-step clustering-regression approach. The ensemble is then statistically compared to well-established methods. The ensemble, CSIR-ML6, has an RMSE of 17.16 µatm and bias of 0.89 µatm when compared to a test-dataset kept separate from training procedures. However, when validating our estimates with independent datasets, we find that our method improves only incrementally on other gap-filling methods. We investigate the differences between the methods to understand the extent of the limitations of gap-filling estimates of pCO2. We show that disagreement between methods in the South Atlantic, southeastern Pacific and parts of the Southern Ocean are too large to interpret the interannual variability with confidence. We conclude that improvements in surface ocean pCO2 estimates will likely be incremental with the optimisation of gap-filling methods by (1) the inclusion of additional clustering and regression variables (e.g. eddy kinetic energy), (2) increasing the sampling resolution. Larger improvements will only be realised with an increase in CO2 observational coverage, particularly in today's poorly sampled areas.


2020 ◽  
Author(s):  
Sarah Schönbrodt-Stitt ◽  
Paolo Nasta ◽  
Nima Ahmadian ◽  
Markus Kurtenbach ◽  
Christopher Conrad ◽  
...  

<p>Mapping near-surface soil moisture (<em>θ</em>) is of tremendous relevance for a broad range of environment-related disciplines and for meteorological, ecological, hydrological and agricultural applications. Globally available products offer the opportunity to address <em>θ</em> in large-scale modelling with coarse spatial resolution, such as at the landscape level. However, <em>θ</em> estimation at higher spatial resolution is of vital importance for many small-scale applications. Therefore, we focus our study on a small-scale catchment (MFC2) belonging to the “Alento” hydrological observatory, located in southern Italy (Campania Region). The goal of this study is to develop new machine-learning approaches to estimate high grid-resolution (about 17 m cell size) <em>θ</em> maps, mainly from backscatter measurements retrieved from C-band Synthetic Aperture Radar (SAR) based on Sentinel-1 (S1) images and from gridded terrain attributes. A workflow comprising a total of 48 SAR-based <em>θ</em> patterns estimated for 24 satellite overpass dates (revisit time of 6 days), each with ascending and descending orbits, will be presented. To enable the mapping, SAR-based <em>θ</em> data were calibrated with in-situ measurements carried out with a portable device during eight measurement campaigns at the time of satellite overpasses (four overpass days in total in November 2018, each with an ascending and a descending overpass). After the calibration procedure, data validation was carried out from November 10, 2018 to March 28, 2019 using two stationary sensors monitoring <em>θ</em> at high temporal resolution (1-min recording time). The sensor locations reflected two contrasting field conditions: one bare soil plot (frequently kept clear, without disturbance from vegetation cover) and one non-bare soil plot (real-world condition). 
Point-scale ground observations of <em>θ</em> were compared to pixel-scale (17 m × 17 m), SAR-based <em>θ</em> estimated for the pixels corresponding to the positions of the stationary sensors. Mapping performance was assessed using the root mean squared error (RMSE). For a short-term time series of <em>θ</em> (Nov 2018) integrating 136 in situ, sensor-based <em>θ</em> (<em>θ</em><sub>insitu</sub>) and 74 gravimetric-based <em>θ</em> (<em>θ</em><sub>gravimetric</sub>) measurements during a total of eight S1 overpasses, mapping performance already proved to be satisfactory, with RMSE = 0.039 m<sup>3</sup> m<sup>-3</sup> and R² = 0.92, and RMSE = 0.041 m<sup>3</sup> m<sup>-3</sup> and R² = 0.91, respectively. First results further reveal that estimated satellite-based <em>θ</em> patterns respond to the evolution of rainfall. With the developed workflow and these results, we intend to contribute to improved environmental risk assessment by assimilating the results into hydrological models (e.g., HydroGeoSphere), and to support future studies on combined ground-based and SAR-based <em>θ</em> retrieval for forested land (future missions operating at longer wavelengths, e.g. NISAR L-band and Biomass P-band sensors).</p>


2020 ◽  
Vol 24 (10) ◽  
pp. 4793-4812
Author(s):  
Renaud Hostache ◽  
Dominik Rains ◽  
Kaniska Mallick ◽  
Marco Chini ◽  
Ramona Pelich ◽  
...  

Abstract. The main objective of this study is to investigate how brightness temperature observations from satellite microwave sensors may help to reduce errors and uncertainties in soil moisture and evapotranspiration simulations with a large-scale conceptual hydro-meteorological model. In addition, this study aims to investigate whether such a conceptual modelling framework, relying on parameter calibration, can reach the performance level of more complex physically based models for soil moisture simulations at a large scale. We use the publicly available ERA-Interim forcing data set and couple the Community Microwave Emission Modelling (CMEM) platform radiative transfer model with a hydro-meteorological model to enable soil moisture, evapotranspiration and brightness temperature simulations over the Murray–Darling basin in Australia. The hydro-meteorological model is configured using recent developments in the SUPERFLEX framework, which enables tailoring the model structure to the specific needs of the application and to data availability and computational requirements. The hydrological model is first calibrated using only a sample of the Soil Moisture and Ocean Salinity (SMOS) brightness temperature observations (2010–2011). Next, SMOS brightness temperature observations are sequentially assimilated into the coupled SUPERFLEX–CMEM model (2010–2015). For this experiment, a local ensemble transform Kalman filter is used. Our empirical results show that the SUPERFLEX–CMEM modelling chain is capable of predicting soil moisture at a performance level similar to that obtained for the same study area and with a quasi-identical experimental set-up using the Community Land Model (CLM). This shows that a simple model, when calibrated using globally and freely available Earth observation data, can yield performance levels similar to those of a physically based (uncalibrated) model. 
The correlation between simulated and in situ observed soil moisture ranges from 0.62 to 0.72 for the surface and root zone soil moisture. The assimilation of SMOS brightness temperature observations into the SUPERFLEX–CMEM modelling chain improves the correlation between predicted and in situ observed surface and root zone soil moisture by 0.03 on average, showing improvements similar to those obtained using the CLM land surface model. Moreover, at the same time the assimilation improves the correlation between predicted and in situ observed monthly evapotranspiration by 0.02 on average.


2020 ◽  
Author(s):  
Rene Orth ◽  
Sungmin Oh

<p>Soil moisture plays a key role in land-atmosphere interactions through its influence on the energy and water cycles. Furthermore, its spatiotemporal variations can affect the development and persistence of extreme weather events. Consequently, soil moisture information is required for a wide range of research and applications, such as agricultural monitoring, flood and drought prediction, climate projection, and carbon-cycle modeling. Despite its scientific and societal importance, observations of soil moisture are sparse, in particular across time and at large spatial scales. Only models and satellite retrievals can provide global soil moisture information. While the ability of land surface models to represent the complex land-atmosphere interplay is still limited, satellite-based soil moisture data are a valuable alternative. However, these products suffer from a scaling based on models, and can only capture the top few centimeters of the soil. </p><p>In this study, we aim to augment satellite-based soil moisture data using machine learning. For this purpose we integrate satellite soil moisture with multiple hydro-meteorological data streams to derive global gridded soil moisture using Long Short-Term Memory (LSTM) neural networks. These networks are trained using in-situ soil moisture measurements as target data. With the resulting self-learned relationships, the LSTMs can produce in-situ-like soil moisture globally. We further analyze the implications of using point-scale target data to infer large scale information. The new dataset is derived separately for the surface and the deeper soil, thereby extending beyond the range covered by the satellite-based products. The integration of many data streams and multiple soil moisture observations through a powerful synergistic technique offers the potential to yield high accuracy. This is tested through rigorous cross-validation of the derived dataset. 
Finally, the planned datasets will permit consistent long-term, large-scale analysis to enhance our understanding of the hydrology-biosphere-climate interplay, to better constrain models and to support hydrological hazards monitoring and climate projections.</p>


2019 ◽  
Vol 12 (12) ◽  
pp. 5113-5136
Author(s):  
Luke Gregor ◽  
Alice D. Lebehot ◽  
Schalk Kok ◽  
Pedro M. Scheel Monteiro

Abstract. Over the last decade, advanced statistical inference and machine learning have been used to fill the gaps in sparse surface ocean CO2 measurements (Rödenbeck et al., 2015). The estimates from these methods have been used to constrain seasonal, interannual and decadal variability in sea–air CO2 fluxes and the drivers of these changes (Landschützer et al., 2015, 2016; Gregor et al., 2018). However, it is also becoming clear that these methods are converging towards a common bias and root mean square error (RMSE) boundary: “the wall”, which suggests that pCO2 estimates are now limited by both data gaps and scale-sensitive observations. Here, we analyse this problem by introducing a new gap-filling method, an ensemble average of six machine-learning models (CSIR-ML6 version 2019a, Council for Scientific and Industrial Research – Machine Learning ensemble with Six members), where each model is constructed with a two-step clustering-regression approach. The ensemble average is then statistically compared to well-established methods. The ensemble average, CSIR-ML6, has an RMSE of 17.16 µatm and bias of 0.89 µatm when compared to a test dataset kept separate from training procedures. However, when validating our estimates with independent datasets, we find that our method improves only incrementally on other gap-filling methods. We investigate the differences between the methods to understand the extent of the limitations of gap-filling estimates of pCO2. We show that disagreement between methods in the South Atlantic, southeastern Pacific and parts of the Southern Ocean is too large to interpret the interannual variability with confidence. 
We conclude that improvements in surface ocean pCO2 estimates will likely be incremental with the optimisation of gap-filling methods by (1) the inclusion of additional clustering and regression variables (e.g. eddy kinetic energy), (2) increasing the sampling resolution and (3) successfully incorporating pCO2 estimates from alternate platforms (e.g. floats, gliders) into existing machine-learning approaches.
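The ensemble-average evaluation described in the abstract above can be sketched as follows. This is an illustration, not the CSIR-ML6 code: the member predictions are made-up numbers, only three members are shown for brevity, the clustering-regression step that produces each member is omitted, and bias is assumed here to mean the mean of estimate minus observation. The function names are hypothetical.

```python
import math


def ensemble_average(member_predictions):
    """Average equal-length prediction lists, one list per ensemble member."""
    n_members = len(member_predictions)
    return [sum(values) / n_members for values in zip(*member_predictions)]


def rmse(estimates, observations):
    """Root mean square error of estimates against held-out observations."""
    squared = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared) / len(squared))


def bias(estimates, observations):
    """Mean of (estimate - observation); the sign convention is an assumption."""
    return sum(e - o for e, o in zip(estimates, observations)) / len(observations)


# Illustrative pCO2 values in µatm (not data from the study).
members = [
    [360.0, 380.0],
    [364.0, 376.0],
    [362.0, 384.0],
]
test_observations = [360.0, 383.0]

average = ensemble_average(members)
print(average, rmse(average, test_observations), bias(average, test_observations))
```

Keeping the test observations separate from anything used to fit the members, as the abstract describes, is what makes the resulting RMSE and bias meaningful as out-of-sample scores.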

