Using multi-model ensembles of CMIP5 global climate models to reproduce observed monthly rainfall and temperature with machine learning methods in Australia

2018
Vol 38 (13)
pp. 4891-4902
Author(s):  
Bin Wang ◽  
Lihong Zheng ◽  
De Li Liu ◽  
Fei Ji ◽  
Anthony Clark ◽  
...


Author(s):
Sourabh Shrivastava
Ram Avtar
Prasanta Kumar Bal

Global climate models (GCMs) with coarse horizontal resolution produce large biases over mountainous regions, and single-model outputs or simple multi-model ensemble (SMME) outputs are likewise associated with large biases. To predict extreme rainfall events, this study takes an alternative modeling approach, applying five different machine learning (ML) algorithms to reduce model biases and thereby improve the skill of the North American Multi-Model Ensemble (NMME) GCMs for Indian summer monsoon rainfall from 1982 to 2009. Random forest (RF), AdaBoost (Ada), gradient boosting (Grad), bagging (Bag), and extra trees (Extra) regression models are used, and the results from each model are compared against observations. In the SMME, a wet bias of 20 mm/day and an RMSE of up to 15 mm/day are found over the Himalayan region. However, all the ML models can bring the mean bias down to [Formula: see text] mm/day and the RMSE down to 2 mm/day. The interannual variability of the ML outputs is closer to observations than that of the SMME, and high correlations of 0.5 to 0.8 are found for all ML models, exceeding those of the SMME. Of the five ML models, RF and Grad perform best, showing high correlations over the Himalayan region. In conclusion, by taking full advantage of the different models, the proposed ML-based multi-model ensemble method is shown to be accurate and effective.
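A minimal sketch of the tree-ensemble bias-correction idea, using the five scikit-learn regressors named in the abstract on synthetic stand-ins for NMME member forecasts and observed rainfall at one grid point (the study's data, predictors, and tuning differ):

```python
# Hedged sketch: bias-correct multi-model rainfall with the five tree-ensemble
# regressors from the abstract; data are synthetic placeholders, not NMME/obs.
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_months, n_gcms = 336, 8                   # monsoon months 1982-2009, 8 members
obs = rng.gamma(2.0, 4.0, n_months)         # pseudo-observed rainfall (mm/day)
members = obs[:, None] + rng.normal(3.0, 4.0, (n_months, n_gcms))  # wet-biased GCMs

X_tr, X_te, y_tr, y_te = train_test_split(members, obs, test_size=0.25,
                                          random_state=0)
models = {
    "RF":    RandomForestRegressor(n_estimators=200, random_state=0),
    "Ada":   AdaBoostRegressor(n_estimators=200, random_state=0),
    "Grad":  GradientBoostingRegressor(n_estimators=200, random_state=0),
    "Bag":   BaggingRegressor(n_estimators=200, random_state=0),
    "Extra": ExtraTreesRegressor(n_estimators=200, random_state=0),
}

smme = X_te.mean(axis=1)                    # simple multi-model ensemble mean
print(f"SMME : bias={(smme - y_te).mean():6.2f}  "
      f"rmse={np.sqrt(mean_squared_error(y_te, smme)):5.2f}")
for name, mdl in models.items():
    pred = mdl.fit(X_tr, y_tr).predict(X_te)
    print(f"{name:5s}: bias={(pred - y_te).mean():6.2f}  "
          f"rmse={np.sqrt(mean_squared_error(y_te, pred)):5.2f}")
```

Each regressor learns the systematic wet bias from the training months, which is why the corrected fields can sit closer to observations than the equal-weight mean.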


Author(s):  
Jiayi Wang ◽  
Raymond K. W. Wong ◽  
Jun Mikyoung ◽  
Courtney Schumacher ◽  
Ramalingam Saravanan ◽  
...  

Abstract Predicting rain from large-scale environmental variables remains a challenging problem for climate models, and it is unclear how well numerical methods can predict the true characteristics of rainfall without smaller (storm) scale information. This study explores the ability of three statistical and machine learning methods to predict 3-hourly rain occurrence and intensity at 0.5° resolution over the tropical Pacific Ocean, using rain observations from the Global Precipitation Measurement (GPM) satellite radar and large-scale environmental profiles of temperature and moisture from the MERRA-2 reanalysis. We also separated the rain into different types (deep convective, stratiform, and shallow convective) because their varying kinematic and thermodynamic structures might respond to the large-scale environment in different ways. Our expectation was that the popular machine learning methods (i.e., the neural network and random forest) would outperform a standard statistical method (a generalized linear model) because of their more flexible structures, especially in predicting the highly skewed distribution of rain rates for each rain type. However, none of the methods clearly distinguishes itself from the others, and each method still has issues with predicting rain too often and not fully capturing the high end of the rain rate distributions, both of which are common problems in climate models. One implication of this study is that machine learning tools must be carefully assessed and are not necessarily applicable to solving all big data problems. Another implication is that traditional climate model approaches are not sufficient to predict extreme rain events and that other avenues need to be pursued.
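The occurrence/intensity split described above can be illustrated with a two-stage sketch: a logistic-regression GLM predicts whether it rains, and a random forest predicts the rate on raining samples. Everything below is a synthetic, hedged stand-in for the MERRA-2 profiles and GPM rain rates:

```python
# Hedged two-stage sketch: GLM for rain occurrence, random forest for intensity.
# Synthetic placeholders stand in for MERRA-2 profiles (X) and GPM rates (rate).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p = 20000, 40                            # samples x profile levels (T, q)
X = rng.normal(size=(n, p))                 # pseudo large-scale environment
raining = rng.random(n) < 1.0 / (1.0 + np.exp(-X[:, 0]))   # occurrence ~ X
rate = np.where(raining, np.exp(0.8 * X[:, 1] + rng.normal(0, 1, n)), 0.0)

occurrence = LogisticRegression(max_iter=1000).fit(X, raining)
wet = rate > 0
intensity = RandomForestRegressor(n_estimators=100,
                                  random_state=1).fit(X[wet], rate[wet])

p_rain = occurrence.predict_proba(X)[:, 1]          # P(rain | environment)
expected = p_rain * intensity.predict(X)            # unconditional expected rate
print("mean observed rate:", rate.mean().round(3),
      " mean predicted rate:", expected.mean().round(3))
```

The heavy upper tail of the intensity distribution is exactly where such regressors tend to underpredict, which is consistent with the shortfall the abstract reports.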


2021
Author(s):  
Manuel Celestino Vilela Teixeira Almeida ◽  
Yurii Shevchuk ◽  
Georgiy Kirillin ◽  
Pedro Matos Soares ◽  
Rita Margarida Antunes de Paula Cardoso ◽  
...  

Abstract. The complexity of state-of-the-art climate models requires high computational resources and imposes rather simplified parameterization of inland waters. The effect of lakes and reservoirs on the local and regional climate is commonly parameterized in regional or global climate modeling as a function of surface water temperature estimated by atmosphere-coupled one-dimensional lake models. The latter typically neglect one of the major transport mechanisms specific to artificial reservoirs: heat and mass advection due to in- and outflows. Incorporation of these essentially two-dimensional processes into lake parameterizations requires a trade-off between computational efficiency and physical soundness, which is addressed in this study. We evaluated the performance of the two most widely used lake parameterization schemes and a machine learning approach on high-resolution historical water temperature records from 24 reservoirs. Simulations were also performed at both variable and constant water levels to explore the differences in thermal structure between lakes and reservoirs. Our results highlight that surface water temperatures in reservoirs differ significantly from those found in lakes, reinforcing the need to include anthropogenic inflow and outflow controls in regional and global climate models. Our findings also highlight the efficiency of the machine learning approach, which may outperform process-based physical models both in accuracy and in computational requirements if applied to reservoirs with long-term observations available. A relationship between mean water retention times and the importance of inflows and outflows is established: reservoirs with a retention time shorter than ~100 days, if simulated without in- and outflow effects, tend to exhibit a statistically significant deviation in the computed surface temperatures regardless of their morphological characteristics.
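The ~100-day retention-time criterion above suggests a simple screening rule for deciding when a reservoir's in- and outflows must be represented; the sketch below uses hypothetical reservoir names and numbers:

```python
# Hedged screening sketch: reservoirs with a mean retention time (storage
# volume over mean outflow) under ~100 days likely need explicit in-/outflow
# treatment, per the relationship reported in the abstract. Values are made up.
def retention_time_days(volume_m3: float, outflow_m3_s: float) -> float:
    """Mean water retention time in days."""
    return volume_m3 / (outflow_m3_s * 86_400.0)

reservoirs = {                 # name: (storage volume m^3, mean outflow m^3/s)
    "reservoir_A": (5.0e8, 120.0),
    "reservoir_B": (2.0e9, 45.0),
}
for name, (volume, outflow) in reservoirs.items():
    rt = retention_time_days(volume, outflow)
    flag = "include in-/outflows" if rt < 100.0 else "in-/outflows negligible"
    print(f"{name}: retention time {rt:7.1f} d -> {flag}")
```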


2017
Vol 56 (12)
pp. 3245-3262
Author(s):  
A. Wootten ◽  
A. Terando ◽  
B. J. Reich ◽  
R. P. Boyles ◽  
F. Semazzi

Abstract. In recent years, climate model experiments have been increasingly oriented toward providing information that can support local and regional adaptation to the expected impacts of anthropogenic climate change. This shift has magnified the importance of downscaling as a means to translate coarse-scale global climate model (GCM) output to a finer scale that more closely matches the scale of interest. Applying this technique, however, introduces a new source of uncertainty into any resulting climate model ensemble. Here a method is presented, on the basis of a previously established variance decomposition method, to partition and quantify the uncertainty in climate model ensembles that is attributable to downscaling. The method is applied to the southeastern United States using five downscaled datasets that represent both statistical and dynamical downscaling techniques. The combined ensemble is highly fragmented, in that only a small portion of the complete set of downscaled GCMs and emission scenarios is typically available. The results indicate that the uncertainty attributable to downscaling approaches ~20% for large areas of the Southeast for precipitation and ~30% for extreme heat days (>35°C) in the Appalachian Mountains. However, attributable quantities are significantly lower for time periods when the full ensemble is considered but only a subsample of all models is available, suggesting that overconfidence could be a serious problem in studies that employ a single set of downscaled GCMs. This article concludes with recommendations to advance the design of climate model experiments so that the uncertainty that accrues when downscaling is employed is more fully and systematically considered.
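As a hedged illustration of such a partition (an ANOVA-style decomposition under a complete-ensemble assumption, not necessarily the authors' exact formulation), the sketch below splits the variance of projected changes at one grid cell into GCM, downscaling, and interaction/residual parts:

```python
# Hedged sketch: two-way ANOVA-style variance partition for one grid cell.
# Rows are GCMs, columns are downscaling methods; values are placeholder
# projected changes from a (hypothetically complete) ensemble.
import numpy as np

rng = np.random.default_rng(2)
n_gcm, n_ds = 6, 5
change = rng.normal(1.0, 0.5, (n_gcm, n_ds))

grand = change.mean()
gcm_eff = change.mean(axis=1) - grand        # GCM main effects
ds_eff = change.mean(axis=0) - grand         # downscaling main effects
resid = change - grand - gcm_eff[:, None] - ds_eff[None, :]

total = change.var()                         # equals the sum of the three parts
for label, part in [("GCM", np.mean(gcm_eff**2)),
                    ("downscaling", np.mean(ds_eff**2)),
                    ("interaction/residual", np.mean(resid**2))]:
    print(f"{label:>20s}: {part / total:5.1%} of ensemble variance")
```

The fragmentation issue noted in the abstract arises when many (GCM, downscaling) cells are missing, which biases these fractions unless the incompleteness is handled explicitly.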


2021
Author(s):  
Soulivanh Thao ◽  
Mats Garvik ◽  
Grégoire Mariethoz ◽  
Mathieu Vrac

Abstract Global climate models are the main tools for climate projections. Since many models exist, it is common to use multi-model ensembles to reduce biases and assess uncertainties in climate projections. Several approaches have been proposed to combine individual models and extract a robust signal from an ensemble. Among them, the Multi-Model Mean (MMM) is the most commonly used. Based on the assumption that the models are centered around the truth, it consists of averaging the ensemble, either with equal weights for all models or with weights adjusted to favor some models. In this paper, we propose a new alternative for reconstructing multi-decadal means of climate variables from a multi-model ensemble in which the local performance of the models is taken into account. This contrasts with the MMM, where a model has the same weight at all locations. Our approach is based on a computer vision method called graph cuts and consists of selecting, for each grid point, the most appropriate model while also considering the overall spatial consistency of the resulting field. The performance of the graph cuts approach is assessed in two experiments: one in which the ERA5 reanalyses are considered as the reference, and a perfect-model experiment in which each model is in turn considered as the reference. We show that the graph cuts approach generally results in lower biases than other model combination approaches such as the MMM, while at the same time preserving a similar level of spatial continuity.
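The energy behind this approach combines a local data cost (each model's mismatch with the reference at a grid point) with a smoothness cost that penalizes switching models between neighboring points. Exact minimization uses graph cuts (max-flow/alpha-expansion); the hedged sketch below minimizes the same Potts-type energy with simple iterated conditional modes to convey the idea:

```python
# Hedged sketch: per-grid-point model selection with a Potts smoothness term,
# minimized by iterated conditional modes (a simple stand-in for graph cuts).
import numpy as np

rng = np.random.default_rng(3)
ny, nx, n_models = 20, 30, 5
ref = rng.normal(size=(ny, nx))                        # reference (e.g., ERA5)
models = ref[None] + rng.normal(0, 1, (n_models, ny, nx))   # biased model fields

data_cost = (models - ref[None]) ** 2                  # local mismatch per model
labels = data_cost.argmin(axis=0)                      # init: best model per cell
lam = 1.0                                              # smoothness weight

for _ in range(10):                                    # ICM sweeps
    for i in range(ny):
        for j in range(nx):
            nbrs = [labels[a, b] for a, b in ((i-1, j), (i+1, j),
                                              (i, j-1), (i, j+1))
                    if 0 <= a < ny and 0 <= b < nx]
            cost = data_cost[:, i, j] + lam * np.array(
                [sum(k != n for n in nbrs) for k in range(n_models)])
            labels[i, j] = int(cost.argmin())

combined = models[labels, np.arange(ny)[:, None], np.arange(nx)[None, :]]
print("RMSE combined:", np.sqrt(((combined - ref) ** 2).mean()).round(3),
      " RMSE multi-model mean:",
      np.sqrt(((models.mean(0) - ref) ** 2).mean()).round(3))
```

Raising `lam` trades local accuracy for spatial continuity, which is the balance the paper evaluates.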


2022
Vol 15 (1)
pp. 173-197
Author(s):  
Manuel C. Almeida ◽  
Yurii Shevchuk ◽  
Georgiy Kirillin ◽  
Pedro M. M. Soares ◽  
Rita M. Cardoso ◽  
...  

Abstract. The complexity of state-of-the-art climate models requires high computational resources and imposes rather simplified parameterization of inland waters. The effect of lakes and reservoirs on the local and regional climate is commonly parameterized in regional or global climate modeling as a function of surface water temperature estimated by atmosphere-coupled one-dimensional lake models. The latter typically neglect one of the major transport mechanisms specific to artificial reservoirs: heat and mass advection due to inflows and outflows. Incorporation of these essentially two-dimensional processes into lake parameterizations requires a trade-off between computational efficiency and physical soundness, which is addressed in this study. We evaluated the performance of the two most widely used lake parameterization schemes and a machine-learning approach on high-resolution historical water temperature records from 24 reservoirs. Simulations were also performed at both variable and constant water levels to explore the differences in thermal structure between lakes and reservoirs. Our results highlight the need to include anthropogenic inflow and outflow controls in regional and global climate models. Our findings also highlight the efficiency of the machine-learning approach, which may outperform process-based physical models in both accuracy and computational requirements if applied to reservoirs with long-term observations available. Overall, the results suggest that the combined use of process-based physical models and machine-learning models will considerably improve the modeling of air–lake heat and moisture fluxes. A relationship between mean water retention times and the importance of inflows and outflows is established: reservoirs with a retention time shorter than ∼100 d, if simulated without inflow and outflow effects, tend to exhibit a statistically significant deviation in the computed surface temperatures regardless of their morphological characteristics.
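A hedged sketch of the machine-learning alternative (illustrative only; the study's features, model family, and tuning may differ): a gradient-boosting regressor maps daily meteorological forcing and inflow to observed surface water temperature for one reservoir:

```python
# Hedged sketch: data-driven surface water temperature for one reservoir.
# Features and records below are synthetic placeholders, not the study's data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n_days = 3650                                    # ten years of daily records
doy = np.arange(n_days) % 365
air_t = 10 + 12 * np.sin(2 * np.pi * doy / 365) + rng.normal(0, 2, n_days)
inflow = rng.gamma(2.0, 30.0, n_days)            # m^3/s, placeholder forcing
X = np.column_stack([air_t, inflow, doy])
y = 0.8 * air_t + 4 - 0.01 * inflow + rng.normal(0, 0.8, n_days)  # proxy target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05)
gbr.fit(X_tr, y_tr)
rmse = np.sqrt(((gbr.predict(X_te) - y_te) ** 2).mean())
print(f"surface water temperature RMSE: {rmse:.2f} °C")
```

Once trained on long observational records, such a model is far cheaper per time step than a process-based solver, which is the efficiency advantage the abstract points to.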


2021
Author(s):  
Gavin D. Madakumbura ◽  
Chad W. Thackeray ◽  
Jesse Norris ◽  
Naomi Goldenson ◽  
Alex Hall

Abstract Global climate models produce large increases in extreme precipitation when subject to anthropogenic forcing, but detecting this human influence in observations is challenging. Large internal variability makes the signal difficult to characterize. Models produce diverse precipitation responses to anthropogenic forcing, mirroring a variety of parameterization choices for subgrid-scale processes. Moreover, observations are inhomogeneously sampled in space and time, leading to multiple global datasets, each produced with a different homogenization technique. As a result, previous attempts to detect human influence on extreme precipitation have not incorporated internal variability or model uncertainty, and have been limited to specific regions and observational datasets. Using machine learning methods, we find a physically interpretable anthropogenic signal that is detectable in all global datasets. Detection occurs even when internal variability and model uncertainty are taken into account. Machine learning efficiently generates multiple lines of evidence supporting detection of an anthropogenic signal in extreme precipitation.
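A common ML detection strategy, given here as a hedged analog rather than the authors' exact pipeline, trains a regression on model-simulated extreme precipitation maps to predict the forcing year; recovering the correct time evolution from independent noisy "observations" indicates a detectable forced signal:

```python
# Hedged detection analog: learn the year from spatial patterns of extreme
# precipitation in simulations, then apply to held-out data. All synthetic.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
n_years, n_grid = 100, 500
years = np.arange(1920, 2020)
forced = 0.02 * (years - years[0])               # slow forced intensification
pattern = rng.normal(size=n_grid)                # spatial fingerprint
X_sim = forced[:, None] * pattern[None, :] + rng.normal(0, 1, (n_years, n_grid))

model = Ridge(alpha=10.0).fit(X_sim, years)      # learn year from the pattern
X_obs = forced[:, None] * pattern[None, :] + rng.normal(0, 1, (n_years, n_grid))
pred = model.predict(X_obs)
trend = np.polyfit(years, pred, 1)[0]
print(f"trend of predicted vs actual year: {trend:.2f} (near 1 => detectable)")
```

The ridge coefficients form an interpretable spatial map, loosely analogous to the "physically interpretable signal" the abstract describes.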


2021
Vol 12 (1)
Author(s):  
Gavin D. Madakumbura ◽  
Chad W. Thackeray ◽  
Jesse Norris ◽  
Naomi Goldenson ◽  
Alex Hall

Abstract The intensification of extreme precipitation under anthropogenic forcing is robustly projected by global climate models, but highly challenging to detect in the observational record. Large internal variability distorts this anthropogenic signal. Models produce diverse magnitudes of precipitation response to anthropogenic forcing, largely due to differing schemes for parameterizing subgrid-scale processes. Meanwhile, multiple global observational datasets of daily precipitation exist, developed using varying techniques and inhomogeneously sampled data in space and time. Previous attempts to detect human influence on extreme precipitation have not incorporated model uncertainty, and have been limited to specific regions and observational datasets. Using machine learning methods that can account for these uncertainties and are capable of identifying the time evolution of the spatial patterns, we find a physically interpretable anthropogenic signal that is detectable in all global observational datasets. Machine learning efficiently generates multiple lines of evidence supporting detection of an anthropogenic signal in global extreme precipitation.


2013
Vol 9 (2)
pp. 1565-1597
Author(s):  
K. Saito ◽  
T. Sueyoshi ◽  
S. Marchenko ◽  
V. Romanovsky ◽  
B. Otto-Bliesner ◽  
...  

Abstract. The global-scale distribution of frozen ground during the Last Glacial Maximum (LGM) was reconstructed using multi-model ensembles of global climate models and then compared with evidence-based knowledge and earlier numerical results. Modeled soil temperatures, taken from Paleoclimate Modelling Intercomparison Project Phase III (PMIP3) simulations, were used to diagnose the subsurface thermal regime and determine the underlying frozen ground type for the present day (pre-industrial; 0 k) and the LGM (21 k). This direct method was then compared with the earlier indirect method, which categorizes the underlying frozen ground type from surface air temperature, applied to both the PMIP2 (Phase II) and PMIP3 products. Both the direct and indirect diagnoses for 0 k showed strong agreement with the present-day observation-based map, although the soil temperature ensemble showed greater diversity among the models, partly due to the varying complexity of the implemented subsurface processes. The area of continuous permafrost estimated by the multi-model analysis was 25.6 million km² for the LGM, in contrast to 12.7 million km² for the pre-industrial control, whereas seasonally frozen ground increased from 22.5 million km² to 32.6 million km². These changes in area resulted mainly from the cooler climate at the LGM, but also from other factors, such as the presence of huge land ice sheets and the consequent expansion of total land area due to sea-level change. LGM permafrost boundaries modeled by the PMIP3 ensemble (improved over those of PMIP2 owing to higher spatial resolutions and improved climatology) also compared better with previous knowledge derived from geomorphological and geocryological evidence. Combined application of coupled climate models and detailed stand-alone physical-ecological models for cold-region terrestrial, paleo-, and modern climates will advance our understanding of the functionality and variability of the frozen ground subsystem in the global eco-climate system.
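The indirect diagnosis can be illustrated with a frost index computed from surface air temperature alone; the sketch below uses commonly cited frost-number-style bands (after Nelson and Outcalt), which are illustrative and not necessarily the exact thresholds of the PMIP analysis:

```python
# Hedged sketch of an indirect frozen-ground diagnosis from air temperature:
# frost index F = sqrt(DDF) / (sqrt(DDF) + sqrt(DDT)), with DDF/DDT the annual
# freezing/thawing degree-days. Classification bands here are illustrative.
import numpy as np

def frost_index(daily_air_t: np.ndarray) -> float:
    ddf = -daily_air_t[daily_air_t < 0].sum()     # freezing degree-days (deg C d)
    ddt = daily_air_t[daily_air_t > 0].sum()      # thawing degree-days (deg C d)
    return np.sqrt(ddf) / (np.sqrt(ddf) + np.sqrt(ddt))

def frozen_ground_type(f: float) -> str:
    if f >= 0.67:
        return "continuous permafrost"
    if f >= 0.60:
        return "discontinuous permafrost"
    if f >= 0.50:
        return "sporadic permafrost"
    return "seasonally frozen / unfrozen ground"

rng = np.random.default_rng(6)
t = -8 + 15 * np.sin(2 * np.pi * np.arange(365) / 365) + rng.normal(0, 3, 365)
f = frost_index(t)
print(f"frost index {f:.2f}: {frozen_ground_type(f)}")
```

The direct method in the abstract replaces this air-temperature proxy with modeled soil temperatures, which is why it is sensitive to how each model implements subsurface processes.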

