Assessment of Native Radar Reflectivity and Radar Rainfall Estimates for Discharge Forecasting in Mountain Catchments with a Random Forest Model

2020 ◽  
Vol 12 (12) ◽  
pp. 1986 ◽  
Author(s):  
Johanna Orellana-Alvear ◽  
Rolando Célleri ◽  
Rütger Rollenbeck ◽  
Paul Muñoz ◽  
Pablo Contreras ◽  
...  

Discharge forecasting is a key component of early warning systems and extremely useful for decision makers. Forecasting models require accurate rainfall estimations of high spatial resolution and other geomorphological characteristics of the catchment, which are rarely available in remote mountain regions such as the Andean highlands. While radar data are available in some mountain areas, the absence of a well-distributed rain gauge network makes it hard to obtain accurate rainfall maps. Thus, this study explored a Random Forest model and its ability to leverage native radar data (i.e., reflectivity) by providing a simplified but efficient discharge forecasting model for a representative mountain catchment in the southern Andes of Ecuador. This model was compared with another that used derived radar rainfall (i.e., rainfall depth) as input, obtained by transforming reflectivity to rainfall rate with a local Z-R relation and applying a rain gauge-based bias adjustment. In addition, the influence of a soil moisture proxy was evaluated. Radar and runoff data from April 2015 to June 2017 were used. Results showed that (i) model performance was similar when using either native or derived radar data as inputs (0.66 < NSE < 0.75; 0.72 < KGE < 0.78), so exhaustive pre-processing to obtain radar rainfall estimates can be avoided for discharge forecasting; and (ii) including a soil moisture representation as a model input did not significantly improve model performance (i.e., NSE increased from 0.66 to 0.68). Finally, this native radar data-based model constitutes a promising alternative for discharge forecasting in remote mountain regions where ground monitoring is scarce or hardly available.
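The two skill scores quoted in this abstract, NSE (Nash-Sutcliffe efficiency) and KGE (Kling-Gupta efficiency), can be sketched as follows. This is a minimal illustration and not the authors' code. The abstract does not say which KGE variant was used, so the original three-component form (correlation, variability ratio, bias ratio) is assumed, and the function names are hypothetical.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def kge(observed, simulated):
    """Kling-Gupta efficiency (assumed variant: correlation, std ratio, mean ratio)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated)) / n
    r = cov / (std_obs * std_sim)      # linear correlation
    alpha = std_sim / std_obs          # variability ratio
    beta = mean_sim / mean_obs         # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both scores equal 1 for a perfect forecast. NSE drops to 0 when the forecast is no better than the mean of the observed discharge.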

Energies ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 1786
Author(s):  
Linh T. T. Ho ◽  
Laurent Dubus ◽  
Matteo De Felice ◽  
Alberto Troccoli

Hydro power can provide a source of dispatchable low-carbon electricity and a storage solution in a climate-dependent energy mix with high shares of wind and solar production. Therefore, understanding the effect climate has on hydro power generation is critical to ensure a stable energy supply, particularly at a continental scale. Here, we introduce a framework using climate data to model hydro power generation at the country level based on a machine learning method, the random forest model, to produce a publicly accessible hydro power dataset from 1979 to present for twelve European countries. In addition to producing a consistent European hydro power generation dataset covering the past 40 years, the specific novelty of this approach is its focus on the lagged effect of climate variability on hydro power. Specifically, multiple lagged values of temperature and precipitation are used. Overall, the model shows promising results, with correlation values ranging between 0.85 and 0.98 for run-of-river and between 0.73 and 0.90 for reservoir-based generation. Compared to the more standard optimal lag approach, the normalised mean absolute error is reduced by an average of 10.23% and 5.99%, respectively. The model was also implemented over six Italian bidding zones to test its skill at the sub-country scale. The model performance is only slightly degraded at the bidding zone level, but this also depends on the actual installed capacity, with higher capacities displaying higher performance. The framework and results presented could provide a useful reference for applications such as pan-European (continental) hydro power planning and for system adequacy and extreme events assessments.
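The idea of feeding multiple lagged values of temperature and precipitation to the model can be sketched as a small feature-building step. This is an illustration under assumptions: the abstract does not give the lag set, the time step, or the variable names, so `lagged_rows`, the `precip`/`temp` keys, and the lags used below are hypothetical.

```python
def lagged_rows(series_by_name, lags):
    """Build one feature row per time step from lagged copies of each series.

    series_by_name maps a variable name to a list of values on a common
    time axis; lags is a list of non-negative offsets (0 = current step).
    Rows start at the first step for which every lag is available.
    """
    max_lag = max(lags)
    length = min(len(values) for values in series_by_name.values())
    names = sorted(series_by_name)
    rows = []
    for t in range(max_lag, length):
        row = {}
        for name in names:
            for lag in lags:
                row[f"{name}_lag{lag}"] = series_by_name[name][t - lag]
        rows.append(row)
    return rows
```

Each row could then serve as the predictor vector for the generation value at the same time step, so the model sees several past climate states at once rather than a single "optimal" lag.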


EP Europace ◽  
2019 ◽  
Vol 21 (9) ◽  
pp. 1307-1312 ◽  
Author(s):  
Wei-Syun Hu ◽  
Meng-Hsuen Hsieh ◽  
Cheng-Li Lin

Aims: We aimed to construct a random forest model to predict atrial fibrillation (AF) in a Chinese population. Methods and results: This study comprised 682 237 subjects with or without AF. Each subject had 19 features that included the subjects’ age, gender, underlying diseases, CHA2DS2-VASc score, and follow-up period. The data were split into train and test sets at an approximate 9:1 ratio: 614 013 data points were placed into the train set and 68 224 data points were placed into the test set. In this study, weighted average F1, precision, and recall values were used to measure prediction model performance. The F1, precision, and recall values were calculated across the train set, the test set, and all data. The area under the receiver operating characteristic (ROC) curve was also used to evaluate the performance of the prediction model. The prediction model achieved a k-fold cross-validation accuracy of 0.979 (k = 10). In the test set, the prediction model achieved an F1 value of 0.968, precision value of 0.958, and recall value of 0.979. The area under the ROC curve of the model was 0.948 (95% confidence interval 0.947–0.949). This model was validated with a separate dataset. Conclusions: This study showed a novel AF risk prediction scheme for Chinese individuals with random forest model methodology.
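The weighted average precision, recall, and F1 values mentioned above can be sketched as below. The abstract does not define the weighting, so this sketch assumes the common convention of weighting each class by its share of the true labels. The function name is hypothetical.

```python
from collections import Counter


def weighted_precision_recall_f1(y_true, y_pred):
    """Per-class precision, recall and F1, averaged with class-support weights."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        class_precision = tp / (tp + fp) if tp + fp else 0.0
        class_recall = tp / (tp + fn) if tp + fn else 0.0
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        weight = count / total  # share of true labels in this class
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1
    return precision, recall, f1
```

Under this weighting, the weighted recall is identical to overall accuracy. That would be consistent with the reported recall and cross-validation accuracy both being 0.979, although the abstract does not confirm the connection.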


2005 ◽  
Vol 22 (11) ◽  
pp. 1633-1655 ◽  
Author(s):  
S-G. Park ◽  
M. Maki ◽  
K. Iwanami ◽  
V. N. Bringi ◽  
V. Chandrasekar

In this paper, the attenuation-correction methodology presented in Part I is applied to radar measurements observed by the multiparameter radar at the X-band wavelength (MP-X) of the National Research Institute for Earth Science and Disaster Prevention (NIED), and is evaluated by comparison with scattering simulations using ground-based disdrometer data. Further, effects of attenuation on the estimation of rainfall amounts and drop size distribution parameters are also investigated. The joint variability of the corrected reflectivity and differential reflectivity shows good agreement with scattering simulations. Specific attenuation and differential attenuation, which are derived in the correction procedure, also show good agreement with scattering simulations. In addition, a composite rainfall-rate algorithm is proposed and evaluated by comparison with eight gauges. The radar-rainfall estimates from the uncorrected (or observed) ZH produce severe underestimation, even at short ranges from the radar and for stratiform rain events. In contrast, the reflectivity-based rainfall estimates from the attenuation-corrected ZH do not show such severe underestimation and show better agreement with rain gauge measurements. More accurate rainfall amounts can be obtained from a simple composite algorithm based on specific differential phase KDP, with the R(ZH_cor) estimates being used for low rainfall rates (KDP ≤ 0.3° km−1 or ZH_cor ≤ 35 dBZ). This improvement in accuracy of rainfall estimation based on KDP is a result of the insensitivity of the rainfall algorithm to natural variations of drop size distributions (DSDs). The ZH, ZDR, and KDP data are also used to infer the parameters (median volume diameter D0 and normalized intercept parameter Nw) of a normalized gamma DSD. The retrieval of D0 and Nw from the corrected radar data shows good agreement with that from disdrometer data in terms of the respective relative frequency histograms. The results of this study demonstrate that high-quality hydrometeorological information on rain events such as rainfall amounts and DSDs can be derived from X-band polarimetric radars.
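The switching logic of the composite rainfall-rate algorithm can be sketched as follows. Only the thresholds (KDP ≤ 0.3° km−1 or ZH_cor ≤ 35 dBZ) come from the abstract. The power-law coefficients are not given there, so the values below are placeholders: the R(ZH) branch uses the Marshall-Palmer relation as a stand-in, the R(KDP) coefficients `c` and `d` are illustrative only, and all function names are hypothetical.

```python
def rain_rate_from_zh(zh_cor_dbz, a=200.0, b=1.6):
    """R(ZH_cor) in mm/h from Z = a * R**b (Marshall-Palmer used as a stand-in)."""
    z_linear = 10.0 ** (zh_cor_dbz / 10.0)  # dBZ -> mm^6 m^-3
    return (z_linear / a) ** (1.0 / b)


def rain_rate_from_kdp(kdp, c=20.0, d=0.8):
    """R(KDP) in mm/h as c * KDP**d; c and d are placeholder values."""
    return c * kdp ** d


def composite_rain_rate(zh_cor_dbz, kdp):
    """Use R(ZH_cor) for low rainfall rates, R(KDP) otherwise."""
    if kdp <= 0.3 or zh_cor_dbz <= 35.0:
        return rain_rate_from_zh(zh_cor_dbz)
    return rain_rate_from_kdp(kdp)
```

The reflectivity branch handles light rain, where KDP is small and noisy. The KDP branch takes over for heavier rain, where the abstract attributes the gain in accuracy to its insensitivity to DSD variations.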


2019 ◽  
Vol 11 (14) ◽  
pp. 1632 ◽  
Author(s):  
Johanna Orellana-Alvear ◽  
Rolando Célleri ◽  
Rütger Rollenbeck ◽  
Jörg Bendix

Despite many efforts of the radar community, quantitative precipitation estimation (QPE) from weather radar data remains a challenging topic. The high resolution of X-band radar imagery in space and time comes with an intricate correction process of reflectivity. The steep and high mountain topography of the Andes enhances its complexity. This study aims to optimize the rainfall derivation of the highest X-band radar in the world (4450 m a.s.l.) by using a random forest (RF) model and single Plan Position Indicator (PPI) scans. The performance of the RF model was evaluated in comparison with the traditional step-wise approach by using both the Marshall-Palmer and a site-specific Z–R relationship. Since rain gauge networks are frequently unevenly distributed and hardly available in real time in mountain regions, bias adjustment was neglected. Results showed an improvement in the step-wise approach by using the site-specific (instead of the Marshall-Palmer) Z–R relationship. However, both models strongly underestimate the rainfall rate (correlation coefficient < 0.69; slope up to 12). In contrast, the RF model greatly outperformed the step-wise approach in all testing locations and on different rainfall events (correlation coefficient up to 0.83; slope = 1.04). The results are promising and unveil a different approach to overcome the high attenuation issues inherent to X-band radars.
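The Z–R step of the traditional step-wise approach can be sketched as a power-law inversion. The defaults below are the widely cited Marshall-Palmer coefficients (Z = 200 R^1.6). The site-specific coefficients are not stated in the abstract, so none are supplied here, and the function name is hypothetical.

```python
import math


def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b to get the rain rate R (mm/h) from reflectivity in dBZ.

    Defaults are the Marshall-Palmer coefficients; pass other a and b
    values to represent a site-specific relationship.
    """
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity (mm^6 m^-3)
    return (z_linear / a) ** (1.0 / b)
```

For example, `dbz_to_rain_rate(10 * math.log10(200.0))` returns a rate of about 1 mm/h, since that reflectivity corresponds to Z = 200.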


2021 ◽  
Vol 13 (18) ◽  
pp. 3657
Author(s):  
Chau-Ren Jung ◽  
Wei-Ting Chen ◽  
Shoji F. Nakayama

Satellite-based models for estimating concentrations of particulate matter with an aerodynamic diameter less than 2.5 μm (PM2.5) have seldom been developed for islands with complex topography in the monsoon area, where, unlike in continental regions, the transport of PM2.5 is influenced by both synoptic-scale winds and local-scale circulations. We validated Multi-Angle Implementation of Atmospheric Correction (MAIAC) aerosol optical depth (AOD) with ground observations in Japan and developed a 1-km-resolution national-scale model between 2011 and 2016 to estimate daily PM2.5 concentrations. A two-stage random forest model integrating MAIAC AOD with meteorological variables and land use data was applied to develop the model. The first-stage random forest model was used to impute the missing AOD values. The second-stage random forest model was then utilised to estimate ground PM2.5 concentrations. Ten-fold cross-validation was performed to evaluate the model performance. There was good consistency between MAIAC AOD and ground truth in Japan (correlation coefficient = 0.82 and 74.62% of data falling within the expected error). For model training, the model showed a training coefficient of determination (R2) of 0.98 and a root mean square error (RMSE) of 1.22 μg/m3. For the 10-fold cross-validation, the cross-validation R2 and RMSE of the model were 0.86 and 3.02 μg/m3, respectively. A subsite validation was used to validate the model at the grids overlapping with the AERONET sites, and the model performance was excellent at these sites with a validation R2 (RMSE) of 0.94 (1.78 μg/m3). Additionally, the model performance increased with increasing AOD coverage. The top-ten important predictors for estimating ground PM2.5 concentrations were day of the year, temperature, AOD, relative humidity, 10-m-height zonal wind, 10-m-height meridional wind, boundary layer height, precipitation, surface pressure, and population density. MAIAC AOD showed high retrieval accuracy in Japan. The performance of the satellite-based model was excellent, which showed that PM2.5 estimates derived from the model were reliable and accurate. These estimates can be used to assess both the short-term and long-term effects of PM2.5 on health outcomes in epidemiological studies.
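The ten-fold cross-validation used to evaluate the model can be sketched as an index-splitting step. The abstract does not say how the folds were formed (for instance by sample, by site, or by day), so this sketch assumes a plain random split over samples. The function name and the `seed` parameter are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample is held out exactly once. The cross-validation R2 and RMSE would then be computed from the held-out predictions, which is why they are lower than the training scores reported above.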


Water ◽  
2019 ◽  
Vol 11 (7) ◽  
pp. 1401 ◽  
Author(s):  
Shaodan Chen ◽  
Dunxian She ◽  
Liping Zhang ◽  
Mengyao Guo ◽  
Xin Liu

Soil moisture is an important indicator that is widely used in meteorology, hydrology, and agriculture. Two key problems must be addressed in the process of downscaling soil moisture: the selection of the downscaling method and the determination of the environmental variables, namely, the influencing factors of soil moisture. This study attempted to utilize machine learning and data mining algorithms to downscale the Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) soil moisture data from 25 km to 1 km and compared the advantages and disadvantages of the random forest model and the Cubist algorithm to determine the more suitable soil moisture downscaling method for the middle and lower reaches of the Yangtze River Basin (MLRYRB). At present, either the normalized difference vegetation index (NDVI) or a digital elevation model (DEM) is selected as the environmental variable for the downscaling models. In contrast, variables, such as albedo and evapotranspiration, are infrequently applied; nevertheless, this study selected these two environmental variables, which have a considerable impact on soil moisture. Thus, the selected environmental variables in the downscaling process included the longitude, latitude, elevation, slope, NDVI, daytime and nighttime land surface temperature (LST_D and LST_N, respectively), albedo, evapotranspiration (ET), land cover (LC) type, and aspect. This study achieved downscaling on a 16-day timescale based on Moderate Resolution Imaging Spectroradiometer (MODIS) data. A comparison of the random forest model with the Cubist algorithm revealed that the R2 of the random forest-based downscaling method is higher than that of the Cubist algorithm-based method by 0.0161; moreover, the root-mean-square error (RMSE) is reduced by 0.0006 and the mean absolute error (MAE) is reduced by 0.0014. 
Testing the accuracies of these two downscaling methods showed that the random forest model is more suitable than the Cubist algorithm for downscaling AMSR-E soil moisture data from 25 km to 1 km in the MLRYRB, which provides a theoretical basis for obtaining high spatial resolution soil moisture data.


2021 ◽  
Author(s):  
Christian Thiele ◽  
Gerrit Hirschfeld ◽  
Ruth von Brachel

Registries of clinical trials are a potential source for scientometric analysis of medical research and serve important functions for the research community and the public at large. Clinical trials that recruit patients in Germany are usually registered in the German Clinical Trials Register (DRKS) or in international registries such as ClinicalTrials.gov. Furthermore, the International Clinical Trials Registry Platform (ICTRP) aggregates trials from multiple primary registries. We queried the DRKS, ClinicalTrials.gov, and the ICTRP for trials with a recruiting location in Germany. Trials that were registered in multiple registries were linked using the primary and secondary identifiers and a Random Forest model based on various similarity metrics. We identified 35,912 trials that were conducted in Germany. The majority of the trials were registered in multiple databases. 32,106 trials were linked using primary IDs, 26 were linked using a Random Forest model, and 10,537 internal duplicates on ICTRP were identified using the Random Forest model after finding pairs with matching primary or secondary IDs. In cross-validation, the Random Forest increased the F1-score from 96.4% to 97.1% compared to a linkage based solely on secondary IDs on a manually labelled data set. 28% of all trials were registered in the German DRKS. 54% of the trials on ClinicalTrials.gov, 43% of the trials on the DRKS and 56% of the trials on the ICTRP were pre-registered. The ratio of pre-registered studies and the ratio of studies that are registered in the DRKS increased over time.
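The kind of pairwise features a linkage model might use can be sketched as below. The abstract mentions identifier matching and "various similarity metrics" without listing them. The title-similarity measure, the record layout (`ids`, `title`), and the example identifiers are therefore assumptions for illustration, and the Random Forest classification step itself is not shown.

```python
from difflib import SequenceMatcher


def pair_features(record_a, record_b):
    """Features for one candidate pair of registry records.

    Each record is assumed to be a dict with an "ids" list (primary and
    secondary identifiers) and a "title" string.
    """
    shared_id = bool(set(record_a["ids"]) & set(record_b["ids"]))
    title_similarity = SequenceMatcher(
        None, record_a["title"].lower(), record_b["title"].lower()
    ).ratio()
    return {"shared_id": shared_id, "title_similarity": title_similarity}


# Hypothetical records; the identifiers are made up for the example.
drks_entry = {"ids": ["DRKS-EXAMPLE-1", "NCT-EXAMPLE-9"], "title": "Trial of Drug X"}
ctgov_entry = {"ids": ["NCT-EXAMPLE-9"], "title": "trial of drug x"}
features = pair_features(drks_entry, ctgov_entry)
```

Pairs with a shared identifier can be linked directly. A classifier trained on features like these would only be needed for the remaining pairs, which matches the small number of trials (26) the abstract reports as linked by the model alone.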

