Impact of modellers' decisions on hydrological a priori predictions

2013 · Vol 10 (7) · pp. 8875-8944
Author(s): H. M. Holländer, H. Bormann, T. Blume, W. Buytaert, G. B. Chirico, ...

Abstract. The purpose of this paper is to stimulate a re-thinking of how we, the catchment hydrologists, could become reliable forecasters. A group of catchment modellers predicted the hydrological response of a man-made 6 ha catchment in its initial phase (Chicken Creek) without having access to the observed records. They used conceptually different model families, and their modelling experience differed widely. The prediction exercise was organized in three steps: (1) for the 1st prediction, the modellers received a basic data set describing the internal structure of the catchment (somewhat more complete than is usually available for a priori predictions in ungauged catchments); they did not obtain time series of stream flow, soil moisture or groundwater response. (2) Before the 2nd, improved prediction they inspected the catchment on-site and attended a workshop where the modellers presented and discussed their first attempts. (3) For their improved 3rd prediction they were offered additional data, charged pro forma at the cost of obtaining this additional information. Holländer et al. (2009) discussed the range of predictions obtained in step 1. Here, we detail the modellers' decisions in accounting for the various processes based on what they learned during the field visit (step 2) and add the final outcome of step 3, when the modellers made use of the additional data. We document the prediction progress as well as the learning process resulting from the availability of added information. For the 2nd and 3rd steps, the progress in prediction quality could be evaluated in relation to individual modelling experience and the costs of added information. We learned (i) that soft information such as the modeller's system understanding is as important as the model itself (hard information), (ii) that the sequence of modelling steps matters (field visit, interactions between differently experienced experts, choice of model, selection of available data, and methods for parameter guessing), and (iii) that added process understanding can be as efficient as adding data for improving the parameters needed to satisfy model requirements.
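Whether a blind discharge prediction "improves" from one step to the next is typically judged with a skill score such as the Nash-Sutcliffe efficiency. The minimal sketch below shows how such a score might be computed; the discharge values are purely illustrative and are not the Chicken Creek records.

```python
import numpy as np

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect prediction, 0 means the
    prediction is no better than the mean of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((observed - predicted) ** 2) / np.sum((observed - observed.mean()) ** 2)

# Purely illustrative daily discharge values (L/s), not Chicken Creek data.
obs = [0.4, 0.6, 1.2, 0.9, 0.5]
blind_prediction = [0.2, 0.5, 1.5, 1.1, 0.6]
print(f"NSE of the blind prediction: {nash_sutcliffe(obs, blind_prediction):.2f}")
```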

2014 · Vol 18 (6) · pp. 2065-2085
Author(s): H. M. Holländer, H. Bormann, T. Blume, W. Buytaert, G. B. Chirico, ...

Abstract. In practice, the catchment hydrologist is often confronted with the task of predicting discharge without the records needed for calibration. Here, we report the discharge predictions of 10 modellers – using the model of their choice – for the man-made Chicken Creek catchment (6 ha, northeast Germany, Gerwin et al., 2009b), and we analyse how well they improved their predictions over three steps as information was added prior to each step. The modellers predicted the catchment's hydrological response in its initial phase without having access to the observed records. They used conceptually different physically based models, and their modelling experience differed widely. Hence, they encountered two problems: (i) simulating discharge for an ungauged catchment and (ii) using models that were developed for catchments that are not in a state of landscape transformation. The prediction exercise was organized in three steps: (1) for the first prediction the modellers received a basic data set describing the catchment to a degree somewhat more complete than is usually available for a priori predictions of ungauged catchments; they did not obtain information on stream flow, soil moisture or groundwater response and therefore had to guess the initial conditions; (2) before the second prediction they inspected the catchment on-site and discussed their first prediction attempt; (3) for their third prediction they were offered additional data, charged pro forma at the cost of obtaining this additional information. Holländer et al. (2009) discussed the range of predictions obtained in step (1). Here, we detail the modellers' assumptions and decisions in accounting for the various processes. We document the prediction progress as well as the learning process resulting from the availability of added information. For the second and third steps, the progress in prediction quality is evaluated in relation to individual modelling experience and the costs of added information. In this qualitative analysis of a statistically small number of predictions we learned (i) that soft information such as the modeller's system understanding is as important as the model itself (hard information), (ii) that the sequence of modelling steps matters (field visit, interactions between differently experienced experts, choice of model, selection of available data, and methods for parameter guessing), and (iii) that added process understanding can be as efficient as adding data for improving the parameters needed to satisfy model requirements.


2018 · Vol 616 · pp. A13
Author(s): F. Spoto, P. Tanga, F. Mignard, J. Berthier, ...

Context. The Gaia spacecraft of the European Space Agency (ESA) has been securing observations of solar system objects (SSOs) since the beginning of its operations. Data Release 2 (DR2) contains the observations of a selected sample of 14,099 SSOs. These asteroids had already been identified and numbered by the Minor Planet Center. Positions are provided for each Gaia observation at CCD level. As additional information, complementary to astrometry, the apparent brightness of SSOs in the unfiltered G band is also provided for selected observations. Aims. We explain the processing of SSO data and describe the criteria we used to select the sample published in Gaia DR2. We then explore the data set to assess its quality. Methods. To exploit the main data product for the solar system in Gaia DR2, which is the epoch astrometry of asteroids, it is necessary to take into account the unusual properties of the uncertainty, as the position information is nearly one-dimensional. When this aspect is handled appropriately, an orbit fit can be obtained with post-fit residuals that are overall consistent with the a priori error model that was used to define individual values of the astrometric uncertainty. The role of both random and systematic errors is described. The distribution of residuals allowed us to identify possible contaminants in the data set (such as stars). Photometry in the G band was compared to values computed from reference asteroid shapes and to the flux registered at the corresponding epochs by the red and blue photometers (RP and BP). Results. The overall astrometric performance is close to expectations, with an optimal brightness range of G ~ 12−17. In this range, the typical transit-level accuracy is well below 1 mas. For fainter asteroids, increasing photon noise degrades the performance. Asteroids brighter than G ~ 12 are affected by reduced performance in the processing of their signals. The dramatic improvement brought by Gaia DR2 astrometry of SSOs is demonstrated by comparisons to archive data and by preliminary tests on the detection of subtle non-gravitational effects.
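Because the positional information is so strongly elongated along the scan direction, orbit-fit residuals are usually rotated into the along-scan (AL) / across-scan (AC) frame and weighted by their very different uncertainties. The minimal sketch below illustrates the idea; the rotation convention and all names are assumptions made for illustration, not the DR2 processing code.

```python
import numpy as np

def scan_weighted_chi2(d_ra_cos_dec, d_dec, scan_pos_angle, sigma_al, sigma_ac):
    """Chi-square contribution of one observation when the tangent-plane
    residual (d_ra*cos(dec), d_dec), in mas, is rotated into the along-scan
    (AL) / across-scan (AC) frame. scan_pos_angle is the position angle of
    the scan direction in radians (assumed measured from north through east)."""
    d_al = d_ra_cos_dec * np.sin(scan_pos_angle) + d_dec * np.cos(scan_pos_angle)
    d_ac = -d_ra_cos_dec * np.cos(scan_pos_angle) + d_dec * np.sin(scan_pos_angle)
    # sigma_al << sigma_ac, so the observation constrains the orbit almost
    # exclusively along the scan direction.
    return (d_al / sigma_al) ** 2 + (d_ac / sigma_ac) ** 2

# Illustrative numbers: a 0.5 mas AL uncertainty versus a ~600 mas AC uncertainty.
print(scan_weighted_chi2(0.8, -0.3, np.deg2rad(35.0), 0.5, 600.0))
```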


2012 · Vol 52 (No. 4) · pp. 188-196
Author(s): Y. Lei, S. Y. Zhang

Forest modellers have long faced the problem of selecting an appropriate mathematical model to describe empirical tree ontogenetic or size-shape relationships for tree species. A common practice is to develop many models (or a model pool) that include different functional forms, and then to select the most appropriate one for a given data set. However, this process may impose subjective restrictions on the functional form. In this process, little attention is paid to the features (e.g. asymptote and inflection point rather than asymptote versus non-asymptote) of different functional forms, or to the intrinsic curve of a given data set. In order to find a better way of comparing and selecting growth models, this paper describes and analyses the characteristics of the Schnute model. This model has a flexibility and versatility that have not yet been exploited in forestry. In this study, the Schnute model was applied to different data sets of selected forest species to determine their functional forms. The results indicate that the model shows some desirable properties for the examined data sets and allows the different intrinsic curve shapes, such as sigmoid, concave and other shapes, to be discerned. Since the suitable functional form for a given data set is usually not known prior to the comparison of candidate models, it is recommended that the Schnute model be used as a first step to determine an appropriate functional form for the data set under investigation, in order to avoid imposing a functional form a priori.
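For reference, the commonly quoted general case of the Schnute model (a ≠ 0, b ≠ 0) can be written and fitted as in the sketch below; the age-height data are synthetic and the starting values are illustrative only.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic age (years) and height (m) data for one species; illustrative only.
age = np.array([5, 10, 15, 20, 30, 40, 60, 80, 100], dtype=float)
height = np.array([2.1, 5.0, 8.2, 11.0, 15.8, 19.0, 23.2, 25.4, 26.5])

t1, t2 = age.min(), age.max()   # reference ages required by the model

def schnute(t, a, b, y1, y2):
    """Schnute (1981) growth model, general case (a != 0, b != 0):
    y(t) = [y1^b + (y2^b - y1^b) * (1 - exp(-a (t - t1))) /
                                   (1 - exp(-a (t2 - t1)))]^(1/b)."""
    frac = (1.0 - np.exp(-a * (t - t1))) / (1.0 - np.exp(-a * (t2 - t1)))
    return (y1 ** b + (y2 ** b - y1 ** b) * frac) ** (1.0 / b)

# Starting values are illustrative; the fitted a and b diagnose the intrinsic
# curve shape (sigmoid, concave, ...) in Schnute's parameter classification.
params, _ = curve_fit(schnute, age, height, p0=[0.05, 1.0, 2.0, 26.0], maxfev=10000)
print(dict(zip(["a", "b", "y1", "y2"], np.round(params, 3))))
```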


2006 · Vol 63 (6) · pp. 1414-1428
Author(s): Ronan Fablet

This paper deals with the analysis of images of biological tissue that involves ring structures, such as tree trunks, bivalve seashells, or fish otoliths, with a view to automating the acquisition of age and growth data. A bottom-up, template-based scheme extracts meaningful ridge and valley curve data using growth-adapted time-frequency filtering. Age and growth estimation is then stated as the Bayesian selection of a subset of ring curves, which combines a measure of curve significance with an a priori statistical growth model. Experiments on real samples demonstrate the efficiency of the proposed data extraction stage. Our Bayesian framework is shown to significantly outperform previous methods for the interpretation of a data set of 200 plaice otoliths and compares favorably with interexpert agreement rates (88% agreement with expert interpretations).
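The Bayesian selection step can be thought of as scoring each candidate subset of rings by a data (significance) term plus a growth-model prior. The toy sketch below illustrates that combination; the Gaussian increment prior and all names are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

def ring_subset_score(ring_radii, ring_significance, mean_increment, sd_increment):
    """Toy Bayesian score for a candidate subset of ring curves: a curve
    significance term plus an a priori growth term (illustrative only)."""
    # Data term: reward subsets whose rings were extracted with high significance.
    log_likelihood = np.sum(np.log(ring_significance))
    # Prior term: successive ring spacings should match expected growth increments
    # (here a simple Gaussian penalty on the deviations).
    increments = np.diff(np.sort(ring_radii))
    log_prior = -0.5 * np.sum(((increments - mean_increment) / sd_increment) ** 2)
    return log_likelihood + log_prior

# Illustrative candidate: ring radii (mm) and per-ring significance in (0, 1].
print(ring_subset_score([0.4, 0.9, 1.3, 1.6], [0.9, 0.8, 0.95, 0.7], 0.4, 0.1))
```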


1953 · Vol 6 (3) · pp. 335
Author(s): T. Pearcey, G. W. Hill

The general structure of typical programmes is considered in relation to the structure of the computer, in particular to its facilities which tend to render programmes invariant with regard to their position in the store. Full-scale programmes are constructed by a master programme piecing together a suitable selection of items from a library of "standard routines". The library contains "sub-routines", which are completely self-contained, require no additional information for their operation and are invariant with respect to their position in the store, and "routines", which are not so invariant and frequently require additional data to be provided during their entry into the store. The entry of special data for routines and the simplification of the construction of the master programme are facilitated by a special routine used whilst the entire programme is being entered into the store.


Author(s): Maria A. Milkova

Nowadays, information accumulates so rapidly that the concept of the usual iterative search requires revision. In a world oversaturated with information, comprehensively covering and analyzing the problem under study places high demands on the search methods. An innovative approach to search should flexibly take into account the large amount of already accumulated knowledge as well as a priori requirements on the results. The results, in turn, should immediately provide a roadmap of the direction being studied, with the possibility of as much detail as needed. An approach to search based on topic modeling, the so-called topic search, takes all these requirements into account and thereby streamlines the way of working with information, increases the efficiency of knowledge production, and helps to avoid cognitive biases in the perception of information, which is important at both the micro and the macro level. To demonstrate an application of topic search, the article considers the task of analyzing an import substitution program based on patent data. The program includes plans for 22 industries and contains more than 1,500 products and technologies proposed for import substitution. Patent search based on topic modeling makes it possible to search directly on blocks of a priori information – the terms of the industrial import substitution plans – and to obtain, as output, a selection of relevant documents for each industry. This approach not only provides a comprehensive picture of the effectiveness of the program as a whole, but also yields more detailed information about which groups of products and technologies have been patented.
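As a rough illustration of how topic-based retrieval against a block of a priori terms might look, the sketch below trains a small LDA model and ranks documents by topical similarity to one block of plan terms; the corpus, the query text and the library choice (scikit-learn) are all illustrative assumptions, not the article's actual pipeline.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical corpus of patent abstracts and one block of a priori plan terms.
patents = [
    "method for producing turbine blades from heat resistant alloy",
    "pharmaceutical composition for treatment of hypertension",
    "machine tool spindle with integrated cooling system",
]
plan_block = "machine tools spindle cooling metalworking equipment"

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(patents)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)                                  # document-topic mixtures
query_topics = lda.transform(vectorizer.transform([plan_block]))   # topic mixture of the plan block

# Rank patents by topical closeness to the a priori block of plan terms.
scores = cosine_similarity(query_topics, doc_topics).ravel()
for idx in np.argsort(scores)[::-1]:
    print(f"{scores[idx]:.2f}  {patents[idx]}")
```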


The review article discusses the possibilities of using fractal mathematical analysis to solve scientific and applied problems of modern biology and medicine. The authors show that only such an approach, belonging to the field of nonlinear mechanics, allows the chaotic component of the structure and function of living systems to be quantified; this is a priori important additional information that expands, in particular, the possibilities of diagnosis, differential diagnosis, and prediction of the course of physiological and pathological processes. A number of examples demonstrate the specific advantages of using fractal analysis for these purposes. The authors conclude that wider use of fractal analysis methods in medical and biological research is promising.
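One of the most common fractal measures in such studies is the box-counting dimension. The sketch below estimates it for a binary 2-D structure; the synthetic image and the box sizes are illustrative only.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal (box-counting) dimension of a binary 2-D structure:
    count occupied boxes at several scales, then fit the slope of
    log N(s) versus log(1/s)."""
    counts = []
    for s in box_sizes:
        h, w = mask.shape
        # Trim so the image tiles evenly, then count boxes containing any structure.
        trimmed = mask[:h - h % s, :w - w % s]
        boxes = trimmed.reshape(trimmed.shape[0] // s, s, trimmed.shape[1] // s, s)
        counts.append(np.count_nonzero(boxes.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

# Example: a synthetic noisy binary pattern (stand-in for a segmented structure).
rng = np.random.default_rng(0)
img = rng.random((128, 128)) > 0.7
print(f"estimated box-counting dimension: {box_counting_dimension(img):.2f}")
```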


Author(s): Laure Fournier, Lena Costaridou, Luc Bidaut, Nicolas Michoux, Frederic E. Lecouvet, ...

Abstract. Existing quantitative imaging biomarkers (QIBs) are associated with known biological tissue characteristics and follow a well-understood path of technical, biological and clinical validation before incorporation into clinical trials. In radiomics, novel data-driven processes extract numerous visually imperceptible statistical features from the imaging data with no a priori assumptions on their correlation with biological processes. The selection of relevant features (the radiomic signature) and their incorporation into clinical trials therefore require additional considerations to ensure meaningful imaging endpoints. Also, the number of radiomic features tested means that power calculations would result in sample sizes impossible to achieve within clinical trials. This article examines how the process of standardising and validating data-driven imaging biomarkers differs from that of biomarkers based on biological associations. Radiomic signatures are best developed initially on datasets that represent diversity of acquisition protocols as well as diversity of disease and of normal findings, rather than within clinical trials with standardised and optimised protocols, as this would risk the selection of radiomic features being linked to the imaging process rather than the pathology. Normalisation through discretisation and feature harmonisation are essential pre-processing steps. Biological correlation may be performed after the technical and clinical validity of a radiomic signature is established, but is not mandatory. Feature selection may be part of discovery within a radiomics-specific trial or represent exploratory endpoints within an established trial; a previously validated radiomic signature may even be used as a primary/secondary endpoint, particularly if associations are demonstrated with specific biological processes and pathways being targeted within clinical trials.
Key Points
• Data-driven processes like radiomics risk false discoveries due to the high dimensionality of the dataset compared to the sample size, making adequate diversity of the data, cross-validation and external validation essential to mitigate the risks of spurious associations and overfitting.
• Use of radiomic signatures within clinical trials requires multistep standardisation of image acquisition, image analysis and data mining processes.
• Biological correlation may be established after clinical validation but is not mandatory.
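As an example of the discretisation step mentioned above, the sketch below applies fixed-bin-width grey-level discretisation to a region of interest; the bin width and the intensity values are illustrative assumptions, not a recommendation from the article.

```python
import numpy as np

def discretise_fixed_bin_width(roi_intensities, bin_width=25.0):
    """Fixed-bin-width grey-level discretisation of a region of interest,
    one common radiomics pre-processing choice (the 25 HU bin width is
    illustrative, not a recommendation)."""
    shifted = roi_intensities - roi_intensities.min()
    return np.floor(shifted / bin_width).astype(int) + 1   # bins numbered from 1

# Example: hypothetical CT intensities (HU) inside a segmented lesion.
roi = np.array([-30.0, -5.0, 12.0, 40.0, 85.0, 110.0])
print(discretise_fixed_bin_width(roi))   # -> [1 2 2 3 5 6]
```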

