Towards hybrid modeling of the global hydrological cycle
Abstract. Progress in machine learning, in conjunction with the increasing availability of relevant Earth observation data streams, may help overcome uncertainties in global hydrological models (GHMs) that arise from the complexity of the processes, the diversity and heterogeneity of the land surface and subsurface, and the scale dependency of processes and parameters. In this study, we exemplify a hybrid approach to global hydrological modeling that exploits the data-adaptiveness of machine learning to represent uncertain processes within a model structure based on physical principles such as mass conservation. Our H2M model simulates the dynamics of snow, soil moisture, and groundwater pools globally at 1° spatial resolution and a daily time step, with the simulated water fluxes depending on an embedded recurrent neural network. We trained the model simultaneously against observational products of terrestrial water storage (TWS) variations, runoff, evapotranspiration, and snow water equivalent using a multi-task learning approach. We find that H2M reproduces the key patterns of global water cycle components, with model performance at least on par with four state-of-the-art GHMs. The neural network learned hydrological responses of evapotranspiration and runoff generation to the antecedent soil moisture state that are qualitatively consistent with our understanding and theory. Simulated contributions of groundwater, soil moisture, and snowpack variability to TWS variations are plausible and within the large range of traditional GHMs. H2M indicates a somewhat stronger role of soil moisture for TWS variations in transitional and tropical regions compared to GHMs. Overall, we present a proof of concept for global hybrid hydrological modeling that provides a new, complementary, and data-driven perspective on global water cycle variations.
As Earth observations continue to increase, hybrid modeling has great potential to advance our capability to monitor and understand the Earth system by facilitating a data-adaptive yet physically consistent joint interpretation of heterogeneous data streams.
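To make the hybrid idea concrete, the following is a minimal sketch of one daily water-balance step for a single grid cell, not the H2M implementation. Everything in it is an illustrative assumption: the function `flux_fractions` is a hypothetical hand-written stand-in for the embedded recurrent neural network, the numeric constants are arbitrary, and storages and fluxes are assumed to be in mm. The only point it illustrates is that data-driven coefficients bounded between 0 and 1 can steer the fluxes while the pool updates conserve mass by construction.

```python
import math


def sigmoid(x):
    """Logistic function; keeps the stand-in coefficients in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def flux_fractions(soil, temp):
    """Hypothetical stand-in for the neural network (assumption, not H2M).

    Returns dimensionless fractions in [0, 1] that partition water between
    pools and fluxes. In a hybrid model these would be learned from data.
    """
    wet = sigmoid(soil / 50.0 - 2.0)   # wetter soil -> more runoff
    warm = sigmoid(temp / 5.0)         # warmer -> more melt and evaporation
    return {
        "melt": warm if temp > 0.0 else 0.0,
        "runoff": wet,
        "et": 0.05 * warm,
        "recharge": 0.02 * wet,
        "baseflow": 0.01,
    }


def step(state, precip, temp):
    """Advance (snow, soil, groundwater) by one day with a closed balance."""
    snow, soil, gw = state
    f = flux_fractions(soil, temp)

    # Partition precipitation by temperature (simple threshold assumption).
    snowfall = precip if temp <= 0.0 else 0.0
    rain = precip - snowfall

    # Snow pool: gains snowfall, loses a fraction of its storage as melt.
    melt = f["melt"] * snow
    snow = snow + snowfall - melt

    # Liquid input is split into fast runoff and infiltration.
    water_in = rain + melt
    fast_runoff = f["runoff"] * water_in
    soil = soil + water_in - fast_runoff

    # Soil pool loses water to evapotranspiration, then to recharge.
    et = f["et"] * soil
    soil -= et
    recharge = f["recharge"] * soil
    soil -= recharge

    # Groundwater pool: drains as baseflow, gains recharge.
    baseflow = f["baseflow"] * gw
    gw = gw + recharge - baseflow

    fluxes = {"et": et, "runoff": fast_runoff + baseflow}
    return (snow, soil, gw), fluxes


if __name__ == "__main__":
    state = (10.0, 100.0, 200.0)
    new_state, fluxes = step(state, precip=5.0, temp=3.0)
    # Change in total storage equals precipitation minus ET minus runoff.
    residual = sum(new_state) - sum(state) - (5.0 - fluxes["et"] - fluxes["runoff"])
    print(new_state, fluxes, residual)
```

Because every flux is a bounded fraction of an available storage or input, the pools cannot become negative and the change in total storage always equals precipitation minus evapotranspiration minus runoff. The multi-task training against observational products described above is not shown in this sketch.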