Improving the Pareto Frontier in multi-dataset calibration of hydrological models using metaheuristics

2021 ◽  
Author(s):  
Silja Stefnisdóttir ◽  
Anna E. Sikorska-Senoner ◽  
Eyjólfur I. Ásgeirsson ◽  
David C. Finger

Abstract. Hydrological models are crucial tools in water and environmental resource management, but they require careful calibration based on observed data. Model calibration remains a challenging task, especially if a multi-objective or multi-dataset calibration is necessary to generate realistic simulations of multiple flow components under consideration. In this study, we explore the value of three metaheuristics, i.e. (i) Monte Carlo (MC), (ii) Simulated Annealing (SA), and (iii) Genetic Algorithm (GA), for a multi-dataset calibration to simultaneously simulate streamflow, snow cover and glacier mass balances using the conceptual HBV model. Based on the results from a small glaciated catchment of the Rhone River in Switzerland, we show that all three metaheuristics can generate parameter sets that result in realistic simulations of all three variables. A detailed comparison of model simulations with these three metaheuristics reveals, however, that GA provides the most accurate simulations (with the lowest confidence intervals) for all three variables when using both the 100 and the 10 best parameter sets for each method. However, when considering the 100 best parameter sets per method, GA also yields some of the worst solutions from the pool of all methods' solutions. These findings are supported by a reduction of parameter equifinality and an improvement of the Pareto frontier for GA in comparison to the two other metaheuristic methods. Based on our results, we conclude that GA-based multi-dataset calibration leads to the most reproducible and consistent hydrological simulations when multiple variables are considered.
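A note on the Pareto terminology used here: one parameter set dominates another if it scores at least as well on all three calibration variables and strictly better on at least one; the Pareto frontier is the set of non-dominated solutions. A minimal Python sketch of that filtering step (a generic illustration with toy numbers, not the authors' code):

```python
from typing import List, Tuple

def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if objective vector `a` Pareto-dominates `b`.
    Objectives are efficiencies (higher is better), e.g. one score each
    for streamflow, snow cover and glacier mass balance."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the non-dominated objective vectors."""
    return [s for s in scores if not any(dominates(o, s) for o in scores if o != s)]

# Toy example with (streamflow, snow, glacier) efficiencies:
scores = [(0.85, 0.70, 0.60), (0.80, 0.75, 0.65), (0.70, 0.65, 0.55)]
print(pareto_front(scores))  # the third vector is dominated and drops out
```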

2021 ◽  
Author(s):  
Silja Stefnisdóttir ◽  
Anna E. Sikorska-Senoner ◽  
Eyjólfur I. Ásgeirsson ◽  
David C. Finger

Hydrological models are crucial components in water and environmental resource management, providing simulations of streamflow, snow cover, and glacier mass balances. Effective model calibration is, however, challenging, especially if a multi-objective or multi-dataset calibration is necessary to generate realistic simulations of all flow components under consideration.

In this study, we explore the value of metaheuristics for multi-dataset calibration to simulate streamflow, snow cover and glacier mass balances using the HBV model in the glaciated catchment of the Rhonegletscher in Switzerland. We evaluate the performance of three metaheuristic calibration methods, i.e. Monte Carlo (MC), Simulated Annealing (SA) and Genetic Algorithms (GA), with regard to these three datasets. For all three methods, we compare the model performance using the 100 best and the 10 best optimized parameter sets.

Our results demonstrate that all three metaheuristic methods can generate realistic simulations of the snow cover, the glacier mass balance and the streamflow. The comparison of the three methods reveals that GA provides the most accurate simulations (with the lowest confidence intervals) for all three datasets, for both the 100 and the 10 best simulations. However, when using all 100 simulations, GA also yields some of the worst solutions, which are eliminated if only the 10 best solutions are considered.

Based on our results, we conclude that GA-based multi-dataset calibration provides more accurate and more precise simulations than MC or SA. This conclusion is reinforced by a reduction of parameter equifinality and an improvement of the Pareto frontier for GA in comparison to the two other metaheuristic methods. The method should therefore lead to more reproducible and consistent hydrological simulations.
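For readers unfamiliar with the GA machinery, a heavily simplified calibration loop might look as follows. This is a sketch under our own assumptions (uniform crossover, random-reset mutation, elitism), not the study's implementation; `evaluate` and `bounds` are caller-supplied:

```python
import random

def ga_calibrate(evaluate, bounds, pop_size=50, generations=100,
                 mutation_rate=0.1, elite=5):
    """Minimal genetic-algorithm calibration sketch. `evaluate` maps a
    parameter vector to a scalar fitness (higher is better), e.g. a
    combination of streamflow, snow-cover and glacier mass-balance
    efficiencies; `bounds` is a list of (lo, hi) tuples per parameter."""
    def random_individual():
        return [random.uniform(lo, hi) for lo, hi in bounds]

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        next_pop = pop[:elite]                               # elitism
        while len(next_pop) < pop_size:
            p1, p2 = random.sample(pop[:pop_size // 2], 2)   # pick fit parents
            child = [random.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            for i, (lo, hi) in enumerate(bounds):            # random-reset mutation
                if random.random() < mutation_rate:
                    child[i] = random.uniform(lo, hi)
            next_pop.append(child)
        pop = next_pop
    return sorted(pop, key=evaluate, reverse=True)  # best parameter sets first
```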


2020 ◽  
Vol 501 (2) ◽  
pp. 1663-1676
Author(s):  
R Barnett ◽  
S J Warren ◽  
N J G Cross ◽  
D J Mortlock ◽  
X Fan ◽  
...  

Abstract. We present the results of a new, deeper, and complete search for high-redshift 6.5 < z < 9.3 quasars over 977 deg2 of the VISTA Kilo-Degree Infrared Galaxy (VIKING) survey. This exploits a new list-driven data set providing photometry in all bands Z, Y, J, H, Ks for all sources detected by VIKING in J. We use the Bayesian model comparison (BMC) selection method of Mortlock et al., producing a ranked list of just 21 candidates. The sources ranked 1, 2, 3, and 5 are the four known z > 6.5 quasars in this field. Additional observations of the other 17 candidates, primarily DESI Legacy Survey photometry and ESO FORS2 spectroscopy, confirm that none is a quasar. This is the first complete sample from the VIKING survey, and we provide the computed selection function. We include a detailed comparison of the BMC method against two other selection methods: colour cuts and minimum-χ2 SED fitting. We find that: (i) BMC produces eight times fewer false positives than colour cuts, while also reaching 0.3 mag deeper; (ii) the minimum-χ2 SED-fitting method is extremely efficient but reaches 0.7 mag less deep than the BMC method, and selects only one of the four known quasars. We show that BMC candidates rejected because their photometric SEDs have high χ2 values include bright examples of galaxies with very strong [O iii] λλ4959,5007 emission in the Y band, identified in fainter surveys by Matsuoka et al. This is a potential contaminant population in Euclid searches for faint z > 7 quasars, not previously accounted for, that requires better characterization.
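At its core, the BMC selection ranks each source by the posterior probability of the quasar hypothesis against the contaminant population. A schematic Python sketch (the prior weights and the two-population factorisation are illustrative placeholders; see Mortlock et al. for the actual method):

```python
def quasar_posterior(like_quasar: float, like_contaminant: float,
                     w_quasar: float = 1e-4, w_contaminant: float = 1.0) -> float:
    """Posterior probability that a source is a quasar, given photometric
    likelihoods already marginalised over each population's parameters.
    The prior weights encode the relative surface densities of the two
    populations; the values here are placeholders, not from the paper."""
    numerator = w_quasar * like_quasar
    return numerator / (numerator + w_contaminant * like_contaminant)

# Candidates are ranked by this probability and followed up from the
# top, which is how a short ranked list like the 21 candidates above
# is produced in principle.
```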


2015 ◽  
Vol 8 (2) ◽  
pp. 1787-1832 ◽  
Author(s):  
J. Heymann ◽  
M. Reuter ◽  
M. Hilker ◽  
M. Buchwitz ◽  
O. Schneising ◽  
...  

Abstract. Consistent and accurate long-term data sets of global atmospheric concentrations of carbon dioxide (CO2) are required for carbon cycle and climate related research. However, global data sets based on satellite observations may suffer from inconsistencies originating from the use of products derived from different satellites, as needed to cover a long enough time period. One source of inconsistency is the use of different retrieval algorithms. We address this potential issue by applying the same algorithm, the Bremen Optimal Estimation DOAS (BESD) algorithm, to two satellite instruments, SCIAMACHY onboard ENVISAT (March 2002–April 2012) and TANSO-FTS onboard GOSAT (launched in January 2009), to retrieve XCO2, the column-averaged dry-air mole fraction of CO2. BESD was initially developed for SCIAMACHY XCO2 retrievals. Here, we present the first detailed assessment of the new GOSAT BESD XCO2 product, which is generated and delivered to the MACC project for assimilation into ECMWF's Integrated Forecasting System (IFS). We describe the modifications of the BESD algorithm needed to retrieve XCO2 from GOSAT and present detailed comparisons with ground-based observations of XCO2 from the Total Carbon Column Observing Network (TCCON). We discuss detailed comparison results between all three XCO2 data sets (SCIAMACHY, GOSAT and TCCON). The comparison results demonstrate good consistency between the SCIAMACHY and GOSAT XCO2. For example, we found a mean difference for daily averages of −0.60 ± 1.56 ppm (mean difference ± standard deviation) for GOSAT-SCIAMACHY (linear correlation coefficient r = 0.82), −0.34 ± 1.37 ppm (r = 0.86) for GOSAT-TCCON and 0.10 ± 1.79 ppm (r = 0.75) for SCIAMACHY-TCCON. The remaining differences between GOSAT and SCIAMACHY are likely due to imperfect collocation (±2 h, 10° × 10° around TCCON sites), i.e. the observed air masses are not exactly identical, but likely also due to the still imperfect BESD retrieval algorithm, which will be continuously improved in the future. Our overarching goal is to generate a satellite-derived XCO2 data set appropriate for climate and carbon cycle research covering the longest possible time period. We therefore also plan to extend the existing SCIAMACHY and GOSAT data set discussed here with data from other missions (e.g., OCO-2, GOSAT-2, CarbonSat) in the future.
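The quoted statistics (mean difference ± standard deviation and linear correlation coefficient) can be reproduced for any pair of collocated daily-average series with a few lines of Python; this is a generic sketch, not the authors' processing code:

```python
import numpy as np

def compare_collocated_series(a: np.ndarray, b: np.ndarray):
    """Mean difference, its standard deviation and Pearson r between two
    collocated daily-average XCO2 series; NaN marks days missing in
    either record."""
    ok = ~np.isnan(a) & ~np.isnan(b)   # keep only days present in both
    diff = a[ok] - b[ok]
    r = np.corrcoef(a[ok], b[ok])[0, 1]
    return diff.mean(), diff.std(), r

# E.g. compare_collocated_series(gosat_daily, sciamachy_daily) would
# return the analogue of the quoted -0.60 ± 1.56 ppm and r = 0.82.
```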


2021 ◽  
Author(s):  
Markus Hrachowitz ◽  
Petra Hulsman ◽  
Hubert Savenije

Hydrological models are often calibrated with respect to flow observations at the basin outlet. As a result, flow predictions may seem reliable, but this is not necessarily the case for the spatiotemporal variability of system-internal processes, especially in large river basins. Satellite observations contain valuable information not only for poorly gauged basins with limited ground observations and for spatiotemporal model calibration, but also for stepwise model development. This study explored the value of satellite observations for improving our understanding of hydrological processes through stepwise model structure adaptation and for calibrating models both temporally and spatially. More specifically, satellite-based evaporation and total water storage anomaly observations were used to diagnose model deficiencies and to subsequently improve the hydrological model structure and the selection of feasible parameter sets. A distributed, process-based hydrological model was developed for the Luangwa river basin in Zambia and calibrated with respect to discharge as a benchmark. This model was modified stepwise by testing five alternative hypotheses related to the process of upwelling groundwater in wetlands, which was assumed to be negligible in the benchmark model, and to the spatial discretization of the groundwater reservoir. Each model hypothesis was calibrated with respect to (1) discharge and (2) multiple variables simultaneously, including discharge and the spatiotemporal variability in the evaporation and total water storage anomalies. The benchmark model calibrated with respect to discharge reproduced this variable well, as it did the basin-averaged evaporation and total water storage anomalies. However, the evaporation in wetland-dominated areas and the spatial variability in the evaporation and total water storage anomalies were poorly modelled. The model improved the most when upwelling groundwater flow from a distributed groundwater reservoir was introduced and the model was calibrated with respect to multiple variables simultaneously. This study showed that satellite-based evaporation and total water storage anomaly observations provide valuable information for an improved understanding of hydrological processes through stepwise model development and spatiotemporal model calibration.
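A multi-variable calibration of this kind needs a way to score discharge, evaporation and storage-anomaly performance jointly. A minimal sketch, assuming equal weights and the Nash-Sutcliffe efficiency as the metric (both are our assumptions, not necessarily the study's choices):

```python
import numpy as np

def nse(sim: np.ndarray, obs: np.ndarray) -> float:
    """Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def multi_variable_objective(sim: dict, obs: dict,
                             weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Joint calibration score over discharge, evaporation and total
    water storage anomalies; parameter sets are then accepted or
    rejected based on this combined value. Dictionary keys are our
    illustrative naming."""
    keys = ("discharge", "evaporation", "storage_anomaly")
    return sum(w * nse(sim[k], obs[k]) for w, k in zip(weights, keys))
```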


Author(s):  
Eugenia Rinaldi ◽  
Sylvia Thun

HiGHmed is a German consortium in which eight university hospitals have agreed to cross-institutional data exchange through novel medical informatics solutions. The HiGHmed Use Case Infection Control group has modelled a set of infection-related data in the openEHR format. In order to establish interoperability with the other German consortia belonging to the same national initiative, we mapped the openEHR information to the Fast Healthcare Interoperability Resources (FHIR) format recommended within the initiative. FHIR enables fast exchange of data thanks to the discrete and independent data elements into which information is organized. Furthermore, to explore the possibility of maximizing the analysis capabilities for our data set, we subsequently mapped the FHIR elements to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). The OMOP data model is designed to support research that identifies and evaluates associations between interventions and the outcomes caused by these interventions. Mapping across standards makes it possible to exploit their respective strengths while establishing and/or maintaining interoperability. This article provides an overview of our experience in mapping infection-control-related data across three different standards: openEHR, FHIR and OMOP CDM.
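As an illustration of what one such mapping step can look like, here is a minimal Python sketch converting a FHIR Observation to an OMOP CDM MEASUREMENT row. Field names follow FHIR R4 and OMOP CDM v5 conventions; the concept lookup and the numeric-ID assumption are placeholders of ours, not the consortium's actual pipeline:

```python
def lookup_concept_id(loinc_code: str) -> int:
    """Placeholder: a real pipeline queries the OMOP vocabulary tables
    for the standard concept mapped to this LOINC code."""
    vocab = {}  # e.g. loaded from the OMOP CONCEPT table
    return vocab.get(loinc_code, 0)  # 0 is OMOP's "no matching concept"

def fhir_observation_to_omop_measurement(obs: dict) -> dict:
    """Sketch of mapping a FHIR R4 Observation to an OMOP CDM v5
    MEASUREMENT row; assumes a numeric Patient/<id> reference and a
    single LOINC coding, which real data may not guarantee."""
    loinc = obs["code"]["coding"][0]["code"]
    quantity = obs.get("valueQuantity", {})
    return {
        "person_id": int(obs["subject"]["reference"].split("/")[-1]),
        "measurement_concept_id": lookup_concept_id(loinc),
        "measurement_datetime": obs["effectiveDateTime"],
        "value_as_number": quantity.get("value"),
        "unit_source_value": quantity.get("unit"),
        "measurement_source_value": loinc,
    }
```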


2021 ◽  
Author(s):  
Vazken Andréassian ◽  
Léonard Santos ◽  
Torben Sonnenborg ◽  
Alban de Lavenne ◽  
Göran Lindström ◽  
...  

Hydrological models are increasingly used under evolving climatic conditions. They should thus be evaluated with regard to their temporal transferability (application in different time periods) and extrapolation capacity (application beyond the range of known past conditions). In theory, the parameters of hydrological models are independent of climate. In practice, however, many published studies based on the Split-Sample Test (Klemeš, 1986) have shown that model performance decreases systematically when a model is used outside its calibration period. The RAT test proposed here aims at evaluating model robustness to a changing climate by assessing potential undesirable dependencies of hydrological model performance on climate variables. The test compares, over a long data period, the annual value of several climate variables (temperature, precipitation and aridity index) with the bias of the model in each year. If a significant relation exists between a climatic variable and the bias, the model is not considered robust to climate change in that catchment. The test has been compared to the Generalized Split-Sample Test (Coron et al., 2012) and showed similar results.

Here, we report on a large-scale application of the test for three hydrological models of different levels of complexity (GR6J, HYPE, MIKE-SHE) on a data set of 352 catchments in Denmark, France and Sweden. The results show that the test behaves differently depending on the evaluated variable (temperature, precipitation or aridity) and the hydrological characteristics of each catchment. They also show that, although of different levels of complexity, the three models are similarly robust over the overall data set. However, they are not robust in the same catchments and are thus not sensitive to the same hydrological characteristics. This example highlights the applicability of the RAT test regardless of the model set-up and calibration procedure, and its ability to provide a first evaluation of model robustness to climate change.

References

Coron, L., V. Andréassian, C. Perrin, J. Lerat, J. Vaze, M. Bourqui, and F. Hendrickx, 2012. Crash testing hydrological models in contrasted climate conditions: An experiment on 216 Australian catchments, Water Resour. Res., 48, W05552, doi:10.1029/2011WR011721

Klemeš, V., 1986. Operational testing of hydrological simulation models, Hydrol. Sci. J., 31, 13–24, doi:10.1080/02626668609491024
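The core of the RAT test as described above reduces to a dependency test between an annual climate variable and the annual model bias. A minimal sketch, assuming Spearman rank correlation and a 5% significance level (our choices for illustration; the paper defines the exact test used):

```python
import numpy as np
from scipy import stats

def rat_test(annual_climate: np.ndarray, annual_bias: np.ndarray,
             alpha: float = 0.05) -> bool:
    """Flag an undesirable dependency of model bias on a climate
    variable (temperature, precipitation or aridity index), one value
    per year. True means the model fails the test for this variable
    in this catchment."""
    rho, p_value = stats.spearmanr(annual_climate, annual_bias)
    return p_value < alpha
```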


2009 ◽  
Vol 13 (7) ◽  
pp. 1075-1089 ◽  
Author(s):  
M. Akhtar ◽  
N. Ahmad ◽  
M. J. Booij

Abstract. The most important climatological inputs required for the calibration and validation of hydrological models are temperature and precipitation, which can be derived from observational records or, alternatively, from regional climate models (RCMs). In this paper, meteorological station observations and results of the PRECIS (Providing REgional Climates for Impacts Studies) RCM, driven by ERA-40 reanalysis data and by output of the HadAM3P general circulation model (GCM), are used as input to the hydrological model. The objective is to investigate the effect of precipitation and temperature simulated with the PRECIS RCM nested in these two data sets on discharge simulated with the HBV model for three river basins in the Hindukush-Karakorum-Himalaya (HKH) region. Six HBV model experiments are designed: HBV-Met, HBV-ERA, HBV-Had, HBV-MetCRU-corrected, HBV-ERABenchmark and HBV-HadBenchmark, where HBV is driven by meteorological station data, data from PRECIS nested in ERA-40, data from PRECIS nested in HadAM3P, CRU-corrected meteorological station data, ERA-40 reanalysis data and HadAM3P GCM data, respectively. The present-day PRECIS simulations have a strong capacity to reproduce the spatial patterns of present-day climate characteristics. However, some quantitative biases exist in the HKH region, where the PRECIS RCM simulations underestimate temperature and overestimate precipitation with respect to CRU observations. The calibration and validation results of the HBV model experiments show that the performance of HBV-Met is better than that of the HBV models driven by other data sources. However, when the models are fed with input series from sources different from those used in calibration, HBV-Had is the most efficient model and HBV-Met has the smallest absolute relative error with respect to all other models. The uncertainties are higher in the least efficient models (i.e. HBV-MetCRU-corrected and HBV-ERABenchmark), whose parameters are also unrealistic. In terms of both robustness and uncertainty ranges, the HBV models calibrated with PRECIS output performed better than the other calibrated models, except for HBV-Met, which showed higher robustness. This suggests that in data-sparse regions such as the HKH region, data from regional climate models may be used as input to hydrological models for climate scenario studies.


1986 ◽  
Vol 82 ◽  
Author(s):  
S. Spooner ◽  
S. Iida ◽  
B. C. Larson

Abstract. A detailed comparison of small-angle neutron scattering (SANS) and large-angle X-ray diffraction methods for the characterization of precipitates was undertaken. Cobalt-rich precipitates on the order of 50 Å, developed after a 17-hour anneal at 570°C, were studied in a single-crystal sample with SANS and with diffuse X-ray scattering near the (400) Bragg peak. Each scattering data set was analyzed independently in terms of a distribution of precipitate sizes, and a detailed comparison of the resulting size distributions is made. A small interparticle interference effect is seen.
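Analyzing a scattering data set "in terms of a distribution of precipitate sizes" typically means fitting a superposition of single-particle form factors to the measured intensity. A generic Python sketch for near-spherical precipitates (a standard textbook model, not necessarily the exact analysis used here):

```python
import numpy as np

def sphere_form_factor(q: np.ndarray, radius: float) -> np.ndarray:
    """Normalised form factor P(q) of a homogeneous sphere (valid for q > 0)."""
    x = q * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

def model_intensity(q: np.ndarray, radii: np.ndarray,
                    weights: np.ndarray) -> np.ndarray:
    """Scattered intensity from a discrete size distribution: a number-
    weighted sum of sphere contributions, each scaled by the squared
    particle volume (hence R^6); absolute scale and background omitted."""
    return sum(w * r**6 * sphere_form_factor(q, r)
               for r, w in zip(radii, weights))

# Fitting `weights` (and possibly `radii`) to the measured I(q) then
# yields the size distribution compared between the two data sets.
```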


2017 ◽  
Vol 51 (1) ◽  
pp. 75-100 ◽  
Author(s):  
Adrian Burton ◽  
Hylke Koers ◽  
Paolo Manghi ◽  
Sandro La Bruzzo ◽  
Amir Aryani ◽  
...  

Purpose

Research data publishing is today widely regarded as crucial for reproducibility, for the proper assessment of scientific results, and as a way for researchers to get proper credit for sharing their data. However, several challenges need to be solved to fully realize its potential, one of them being the development of a global standard for links between research data and literature. Current linking solutions are mostly based on bilateral, ad hoc agreements between publishers and data centers. These operate in silos, so that content cannot be readily combined to deliver a network graph connecting research data and literature in a comprehensive and reliable way. The Research Data Alliance (RDA) Publishing Data Services Working Group (PDS-WG) aims to address this issue of fragmentation by bringing together different stakeholders to agree on a common infrastructure for sharing links between datasets and literature.

Design/methodology/approach

This paper presents the synergic effort of the RDA PDS-WG and the OpenAIRE infrastructure toward enabling a common infrastructure for exchanging data-literature links by realizing and operating the Data-Literature Interlinking (DLI) Service. The DLI Service populates and provides access to a graph of dataset-literature links (at the time of writing close to five million, and growing) collected from a variety of major data centers, publishers, and research organizations.

Findings

To achieve its objectives, the Service proposes an interoperable exchange data model and format, based on which it collects and publishes links, thereby offering the opportunity to validate this common approach in real-world scenarios, with real providers and consumers. Feedback from these actors will drive continuous refinement of both the data model and the exchange format, supporting the further development of the Service into an essential part of a universal, open, cross-platform, cross-discipline solution for collecting and sharing dataset-literature links.

Originality/value

This realization of the DLI Service is the first technical, cross-community, collaborative effort toward establishing a common infrastructure for facilitating the exchange of dataset-literature links. As a result of its operation and the underlying community effort, a new activity, named Scholix, has been initiated, involving technology-level stakeholders such as DataCite and CrossRef.
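To make the idea of an interoperable exchange format concrete, here is a minimal Python sketch of a dataset-literature link in the spirit of the Scholix work that grew out of this effort; the structure follows the Scholix information model only at a high level, and the identifiers are invented examples:

```python
# One dataset-literature link record, ready to be serialized as JSON.
link = {
    "Source": {
        "Identifier": {"ID": "10.5061/dryad.example", "IDScheme": "doi"},
        "Type": "dataset",
    },
    "Target": {
        "Identifier": {"ID": "10.1000/journal.example", "IDScheme": "doi"},
        "Type": "literature",
    },
    "RelationshipType": {"Name": "IsReferencedBy"},
    "LinkPublicationDate": "2017-01-01",
    "LinkProvider": [{"Name": "Example Data Center"}],
}
```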


Author(s):  
Alexander Mackenzie Rivero ◽  
Alberto Rodríguez Rodríguez ◽  
Edwin Joao Merchán Carreño ◽  
Rodrigo Martínez Béjar

The use of machine learning allows the creation of a predictive data model, as a result of the analysis of a data set with 286 instances and nine attributes belonging to the Institute of Oncology of the University Medical Centre, Ljubljana. The data are first preprocessed by applying intelligent data analysis techniques to eliminate missing values, and each attribute is evaluated in order to optimize the results. We used several classification algorithms, including J48 trees, random forest, Bayes net, naive Bayes, and decision table, in order to find the one that, given the characteristics of the data, yields the best classification percentage and therefore a better confusion matrix, using 66% of the data for learning and 33% for validating the model. Using this model, a predictor with 71.134% effectiveness is obtained for estimating whether or not breast cancer will recur.
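The described workflow (hold-out split, several classifiers, confusion matrix) is straightforward to sketch with scikit-learn equivalents; the data below are random placeholders for the 286 x 9 table, and DecisionTreeClassifier stands in for Weka-style J48 (a C4.5 implementation):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

# Placeholder data standing in for the 286-instance, 9-attribute table.
rng = np.random.default_rng(0)
X = rng.random((286, 9))
y = rng.integers(0, 2, 286)  # recurrence / no recurrence labels

# Hold out roughly a third of the data for validation, as in the study.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

for clf in (DecisionTreeClassifier(), RandomForestClassifier(), GaussianNB()):
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(type(clf).__name__, accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))
```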

