Data-driven catchment classification: application to the PUB problem

2011 ◽  
Vol 8 (1) ◽  
pp. 391-427 ◽  
Author(s):  
M. Di Prinzio ◽  
A. Castellarin ◽  
E. Toth

Abstract. The scientific community has identified objective criteria for catchment classification as one of the key research topics for improving the interpretation and representation of the spatiotemporal variability of streamflow. A promising approach to catchment classification makes use of unsupervised neural networks (Self Organising Maps, SOMs), which organise input data through non-linear techniques depending on the intrinsic similarity of the data themselves. Our study considers ~300 Italian catchments scattered nationwide, for which several descriptors of the streamflow regime and geomorphoclimatic characteristics are available. In the context of PUB (Prediction in Ungauged Basins), we qualitatively and quantitatively compare a reference classification, RC, with four alternative classifications, ACs. RC was identified by using indices of the streamflow regime as input to SOM, whereas the ACs were identified on the basis of catchment descriptors that can be derived for ungauged basins. One AC directly adopts the available catchment descriptors as input to SOM. The remaining ACs are identified by applying SOM to two sets of derived variables obtained by applying Principal Component Analysis (PCA, second AC) and Canonical Correlation Analysis (CCA, third and fourth ACs) to the available catchment descriptors. First, we measure the similarity between each AC and RC. Second, we use the ACs and RC to regionalize several streamflow indices and compare the ACs with RC in terms of accuracy of streamflow prediction. In particular, we perform an extensive cross-validation to quantify nationwide the accuracy of predictions in ungauged basins of mean annual runoff, mean annual flood, and flood quantiles associated with given exceedance probabilities. Results of the study show that CCA can significantly improve the effectiveness of SOM classifications for the PUB problem.
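The "PCA, then SOM" idea described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the descriptor matrix is synthetic, the 2x2 map size, learning-rate schedule and function names (`pca_scores`, `train_som`, `assign_classes`) are hypothetical choices, and nothing here reproduces the authors' actual configuration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardise the columns of X and project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions of the standardised data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def train_som(data, rows=2, cols=2, n_iter=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each node, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen samples.
    weights = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood radius
        x = data[rng.integers(len(data))]        # one random training sample
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighbourhood on the grid
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign_classes(data, weights):
    """Label each sample with the index of its nearest SOM node."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Synthetic stand-in for catchment descriptors (300 catchments, 6 descriptors).
rng = np.random.default_rng(42)
descriptors = rng.normal(size=(300, 6))

scores = pca_scores(descriptors, n_components=3)
som_weights = train_som(scores)
classes = assign_classes(scores, som_weights)
```

Feeding `descriptors` directly to `train_som` instead of `scores` would correspond to the variant that uses the raw descriptors as SOM input.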

2011 ◽  
Vol 15 (6) ◽  
pp. 1921-1935 ◽  
Author(s):  
M. Di Prinzio ◽  
A. Castellarin ◽  
E. Toth

Abstract. A promising approach to catchment classification makes use of unsupervised neural networks (Self Organising Maps, SOMs), which organise input data through non-linear techniques depending on the intrinsic similarity of the data themselves. Our study considers ∼300 Italian catchments scattered nationwide, for which several descriptors of the streamflow regime and geomorphoclimatic characteristics are available. We compare a reference classification, identified by using indices of the streamflow regime as input to SOM, with four alternative classifications, which were identified on the basis of catchment descriptors that can be derived for ungauged basins. One alternative classification adopts the available catchment descriptors as input to SOM; the remaining classifications are identified by applying SOM to sets of derived variables obtained by applying Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) to the available catchment descriptors. The comparison is performed relative to a PUB problem, that is, the prediction of several streamflow indices in ungauged basins. We perform an extensive cross-validation to quantify nationwide the accuracy of predictions of mean annual runoff, mean annual flood, and flood quantiles associated with given exceedance probabilities. Results of the study indicate that performing PCA and, in particular, CCA on the available set of catchment descriptors before applying SOM significantly improves the effectiveness of SOM classifications by reducing the uncertainty of hydrological predictions in ungauged sites.
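One way to picture how a classification can be scored for the PUB problem is a leave-one-out experiment: each catchment is treated as ungauged in turn and its index is estimated from the other members of its class. The abstract does not specify the cross-validation scheme or the regional estimator, so the leave-one-out design, the class-mean estimator, the function names and the toy numbers below are all assumptions made for illustration.

```python
import numpy as np

def loo_class_mean_predictions(values, classes):
    """Predict each site's value as the mean of the other sites in its class."""
    values = np.asarray(values, dtype=float)
    classes = np.asarray(classes)
    preds = np.full(values.shape, np.nan)
    for k in np.unique(classes):
        idx = np.flatnonzero(classes == k)
        if idx.size < 2:
            continue  # a single-member class has no neighbours to learn from
        total = values[idx].sum()
        # Mean of the class with the target site removed.
        preds[idx] = (total - values[idx]) / (idx.size - 1)
    return preds

def relative_rmse(obs, pred):
    """Root mean square of relative errors, ignoring sites without a prediction."""
    obs = np.asarray(obs, dtype=float)
    ok = ~np.isnan(pred)
    return float(np.sqrt(np.mean(((pred[ok] - obs[ok]) / obs[ok]) ** 2)))

# Toy example: five sites with two clearly different runoff regimes.
runoff = np.array([10.0, 12.0, 14.0, 100.0, 120.0])
good_classes = np.array([0, 0, 0, 1, 1])   # groups similar sites together
poor_classes = np.array([0, 1, 0, 1, 0])   # mixes the two regimes

good_preds = loo_class_mean_predictions(runoff, good_classes)
poor_preds = loo_class_mean_predictions(runoff, poor_classes)
good_error = relative_rmse(runoff, good_preds)
poor_error = relative_rmse(runoff, poor_preds)
```

In this toy setting the classification that groups hydrologically similar sites yields the smaller error, which is the kind of comparison the abstract describes between the reference and alternative classifications.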


2020 ◽  
Author(s):  
Xin Yi See ◽  
Benjamin Reiner ◽  
Xuelan Wen ◽  
T. Alexander Wheeler ◽  
Channing Klein ◽  
...  

Herein, we describe the use of iterative supervised principal component analysis (ISPCA) in de novo catalyst design. The regioselective synthesis of 2,5-dimethyl-1,3,4-triphenyl-1H-pyrrole (C) via Ti-catalyzed formal [2+2+1] cycloaddition of phenyl propyne and azobenzene was targeted as a proof of principle. The initial reaction conditions led to an unselective mixture of all possible pyrrole regioisomers. ISPCA was conducted on a training set of catalysts, and their performance was regressed against the scores from the top three principal components. Component loadings from this PCA space along with k-means clustering were used to inform the design of new test catalysts. The selectivity of a prospective test set was predicted in silico using the ISPCA model, and only optimal candidates were synthesized and tested experimentally. This data-driven predictive-modeling workflow was iterated, and after only three generations the catalytic selectivity was improved from 0.5 (statistical mixture of products) to over 11 (>90% C) by incorporating 2,6-dimethyl-4-(pyrrolidin-1-yl)pyridine as a ligand. The successful development of a highly selective catalyst without resorting to long, stochastic screening processes demonstrates the inherent power of ISPCA in de novo catalyst design and should motivate the general use of ISPCA in reaction development.
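The step of regressing performance against the scores of the top three principal components can be illustrated with ordinary principal component regression. This sketch is not ISPCA itself: it omits the supervised feature selection and the iteration, and the descriptor matrix, response and function names (`fit_pcr`, `predict_pcr`) are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, n_components=3):
    """Regress y on the scores of the leading principal components of standardised X."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T            # columns are component loadings
    scores = Z @ loadings
    A = np.column_stack([np.ones(len(y)), scores])   # intercept + scores
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mu": mu, "sd": sd, "loadings": loadings, "coef": coef}

def predict_pcr(model, X_new):
    """Predict the response for new descriptor rows with a fitted PCR model."""
    Z = (X_new - model["mu"]) / model["sd"]
    scores = Z @ model["loadings"]
    return model["coef"][0] + scores @ model["coef"][1:]

# Synthetic training set: 20 hypothetical catalysts, 3 descriptors each,
# with a response that is exactly linear in the descriptors.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(20, 3))
y_train = X_train @ np.array([1.0, 2.0, 3.0]) + 5.0

model = fit_pcr(X_train, y_train, n_components=3)
y_hat = predict_pcr(model, X_train)
```

In a screening workflow of the kind described, `predict_pcr` would be applied to descriptors of candidate catalysts that have not yet been synthesized, and only the most promising candidates would be taken forward.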


2021 ◽  
pp. 126975
Author(s):  
Hanlin Yin ◽  
Zilong Guo ◽  
Xiuwei Zhang ◽  
Jiaojiao Chen ◽  
Yanning Zhang

2012 ◽  
Vol 43 (6) ◽  
pp. 833-850 ◽  
Author(s):  
Ziqi Yan ◽  
Lars Gottschalk ◽  
Irina Krasovskaia ◽  
Jun Xia

The long-term mean value of runoff is the basic descriptor of available water resources. This paper focuses on the accuracy that can be achieved when mapping this variable across space and along main rivers for a given stream gauging network. Three stochastic interpolation schemes for estimating average annual runoff across space are evaluated and compared. Two of the schemes first interpolate runoff to a regular grid and then integrate the grid values along rivers. One of these schemes includes a constraint to account for the lateral water balance along the rivers. The third scheme interpolates runoff directly to points along rivers. A drainage basin in China with 20 gauging sites is used as a test area. In general, all three approaches reproduce the sample discharges along rivers, with postdiction errors along main river branches of around 10%. Using more objective cross-validation results, it was found that the two schemes based on basin integration, and especially the one with a constraint, performed significantly better than the one with direct interpolation to points along rivers. The analysis did not allow identification of a possible influence of surface water use.


2011 ◽  
Vol 8 (4) ◽  
pp. 7017-7053 ◽  
Author(s):  
Z. Bao ◽  
J. Liu ◽  
J. Zhang ◽  
G. Fu ◽  
G. Wang ◽  
...  

Abstract. Equifinality is unavoidable when transferring model parameters from gauged catchments to ungauged catchments for predictions in ungauged basins (PUB). A framework for estimating the three baseflow parameters of the variable infiltration capacity (VIC) model directly from soil and topography properties is presented. When the new parameter-setting methodology is used, the number of parameters that need to be calibrated is reduced from six to three, which leads to a decrease in equifinality and uncertainty. This is validated by Monte Carlo simulations in 24 hydro-climatic catchments in China. With the new parameter estimation approach, model parameters become more sensitive and the extent of the parameter space becomes smaller for a given goodness-of-fit threshold. This means that parameter uncertainty is reduced with the new parameter-setting methodology. In addition, the uncertainty of model simulation is estimated by the generalised likelihood uncertainty estimation (GLUE) methodology. The results indicate that the uncertainty of streamflow simulations, i.e., the confidence interval, is lower with the new parameter estimation methodology than with the original calibration methodology. The new baseflow parameter estimation framework could be applied in the VIC model and other appropriate models for PUB.
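The GLUE idea mentioned in the abstract (Monte Carlo sampling of parameters, a goodness-of-fit threshold that defines the "behavioural" sets, and prediction bounds derived from them) can be sketched as follows. The one-parameter linear reservoir is a hypothetical stand-in for the VIC model, and the Nash-Sutcliffe efficiency, the 0.7 threshold, the uniform prior and the 5%/95% bounds are assumptions chosen only for illustration.

```python
import numpy as np

def toy_model(k, rain):
    """Hypothetical linear reservoir: each step releases a fraction k of storage."""
    storage, flow = 0.0, []
    for p in rain:
        storage += p
        q = k * storage
        storage -= q
        flow.append(q)
    return np.array(flow)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency, used here as the informal likelihood measure."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def weighted_quantile(values, weights, q):
    """Quantile of values under the normalised weights."""
    order = np.argsort(values)
    cdf = np.cumsum(weights[order]) / np.sum(weights)
    idx = min(int(np.searchsorted(cdf, q)), len(cdf) - 1)
    return values[order][idx]

def glue(obs, rain, n_samples=2000, threshold=0.7, seed=0):
    """Return behavioural parameter values and 5%/95% prediction bounds."""
    rng = np.random.default_rng(seed)
    ks = rng.uniform(0.01, 0.99, size=n_samples)       # uniform prior on k
    sims = np.array([toy_model(k, rain) for k in ks])
    scores = np.array([nse(obs, s) for s in sims])
    keep = scores > threshold                           # behavioural parameter sets
    ks, sims, w = ks[keep], sims[keep], scores[keep]
    steps = range(len(obs))
    lower = np.array([weighted_quantile(sims[:, t], w, 0.05) for t in steps])
    upper = np.array([weighted_quantile(sims[:, t], w, 0.95) for t in steps])
    return ks, lower, upper

# Synthetic "observations" generated with k = 0.3.
rain = np.random.default_rng(7).gamma(shape=2.0, scale=3.0, size=50)
obs = toy_model(0.3, rain)
behavioural_k, lower, upper = glue(obs, rain)
```

The spread of `behavioural_k` and the width of `upper - lower` are simple proxies for the parameter and simulation uncertainty that the abstract reports as being reduced.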


2019 ◽  
Vol 35 (6) ◽  
Author(s):  
Daniel Vieira de Morais ◽  
Lorena Andrade Nunes ◽  
Vandira Pereira da Mata ◽  
Maria Angélica Pereira de Carvalho Costa ◽  
Geni da Silva Sodré ◽  
...  

Leaves are plant structures that express important traits of the environment where they live. Leaf description has allowed identification of plant species as well as investigation of the effects of factors such as gases, light, temperature, and herbivory on their development. This study described populations of Dalbergia ecastaphyllum in Brazil through leaf geometric morphometrics. We evaluated 200 leaves from four populations. The principal component analysis (PCA) showed that the first four principal components were responsible for 97.81% of variation. The non-parametric multivariate analysis of variance (NPMANOVA) indicated a significant difference between samples (p = 0.0001). The Mantel test showed no correlation between geographical distances and shape. The canonical variate analysis (CVA) indicated that the first two variables were responsible for 96.77% of total variation, while the cross-validation test showed an average of 83.33%. D. ecastaphyllum leaves are elliptical and ovate.
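A Mantel test of the kind cited above compares two distance matrices by permuting the rows and columns of one of them. The following is a minimal permutation sketch; the function name, the one-sided p-value, the number of permutations and the synthetic coordinates are assumptions, and it is not the software or settings used in the study.

```python
import numpy as np

def mantel_test(d1, d2, n_perm=999, seed=0):
    """Permutation Mantel test between two symmetric distance matrices.

    Returns the observed Pearson correlation of the upper triangles and a
    one-sided p-value for a positive association.
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    n = d1.shape[0]
    iu = np.triu_indices(n, k=1)              # upper triangle, diagonal excluded
    r_obs = np.corrcoef(d1[iu], d2[iu])[0, 1]
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns of d1 together to keep it a valid distance matrix.
        r = np.corrcoef(d1[perm][:, perm][iu], d2[iu])[0, 1]
        if r >= r_obs:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return r_obs, p_value

# Synthetic check: the second matrix is a scaled copy of the first,
# so the two are perfectly correlated.
points = np.random.default_rng(3).normal(size=(12, 2))
geo_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
shape_dist = 2.0 * geo_dist
r_obs, p_value = mantel_test(geo_dist, shape_dist)
```

A result like the one reported in the abstract (no correlation between geographical distance and shape) would correspond to a small `r_obs` and a large `p_value`.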

