Steep channel freezeup processes: understanding complexity with statistical and physical models

2015 ◽  
Vol 42 (9) ◽  
pp. 622-633 ◽  
Author(s):  
Mathieu Dubé ◽  
Benoit Turcotte ◽  
Brian Morse

The development of ice dams in steep channels dictates water level variations and influences flow rates and habitat conditions. Despite the dominance of ice dam development in cold region gravel bed channels, practicing engineers and scientists have access to very little quantitative information describing this complex freezeup process. This paper aims to fill this gap by presenting a large data set on the process. The substantial variations observed in formation and melting rates from one site to the next and from one year to the next at the same site are explained with a physically based numerical model that includes a complete heat budget applied to a single step-pool sequence. The model successfully simulates the entire development of an ice dam and shows that the process depends on multiple parameters, or degrees of freedom. It also reveals that morphological characteristics greatly influence ice dam dynamics.

2019 ◽  
Vol 31 (1) ◽  
pp. 176-207 ◽  
Author(s):  
Hugo C. C. Carneiro ◽  
Carlos E. Pedreira ◽  
Felipe M. G. França ◽  
Priscila M. V. Lima

The Wilkie, Stonham, and Aleksander recognition device (WiSARD) [Formula: see text]-tuple classifier is a multiclass weightless neural network capable of learning a given pattern in a single step. Its architecture is determined by the number of classes it should discriminate. A target class is represented by a structure called a discriminator, which is composed of [Formula: see text] RAM nodes, each of them addressed by an [Formula: see text]-tuple. Previous studies were carried out in order to mitigate an important problem of the WiSARD [Formula: see text]-tuple classifier: having its RAM nodes saturated when trained by a large data set. Finding the VC dimension of the WiSARD [Formula: see text]-tuple classifier was one of those studies. Although no exact value was found, tight bounds were discovered. Later, the bleaching technique was proposed as a means to avoid saturation. Recent empirical results with the bleaching extension showed that the WiSARD [Formula: see text]-tuple classifier can achieve high accuracies with low variance in a great range of tasks. Theoretical studies had not been conducted with that extension previously. This work presents the exact VC dimension of the basic two-class WiSARD [Formula: see text]-tuple classifier, which is linearly proportional to the number of RAM nodes belonging to a discriminator, and exponentially to their addressing tuple length, precisely [Formula: see text]. The exact VC dimension of the bleaching extension to the WiSARD [Formula: see text]-tuple classifier, whose value is the same as that of the basic model, is also produced. Such a result confirms that the bleaching technique is indeed an enhancement to the basic WiSARD [Formula: see text]-tuple classifier as it does no harm to the generalization capability of the original paradigm.


2021 ◽  
Author(s):  
Daniel Probst ◽  
Matteo Manica ◽  
Yves Gaëtan Nana Teukam ◽  
Alessandro Castrogiovanni ◽  
Federico Paratore ◽  
...  

Enzyme catalysts are an integral part of green chemistry strategies towards a more sustainable and resource-efficient chemical synthesis. However, the use of enzymes on unreported substrates and their specific stereo- and regioselectivity are domain-specific knowledge factors that require decades of field experience to master. This makes the retrosynthesis of given targets with biocatalysed reactions a significant challenge. Here, we use the molecular transformer architecture to capture the latent knowledge about enzymatic activity from a large data set of publicly available biochemical reactions, extending forward reaction and retrosynthetic pathway prediction to the domain of biocatalysis. We introduce the use of a class token based on the EC classification scheme that captures catalysis patterns among different enzymes belonging to the same hierarchical families. The forward prediction model achieves top-1 and top-5 accuracies of 49.6% and 62.7%, respectively, while the single-step retrosynthetic model shows top-1 and top-10 round-trip accuracies of 39.6% and 42.6%, respectively. Trained models and curated data are made publicly available with the hope of promoting enzymatic catalysis and making green chemistry more accessible through the use of digital technologies.


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 490
Author(s):  
Clément Mantoux ◽  
Baptiste Couvy-Duchesne ◽  
Federica Cacciamani ◽  
Stéphane Epelbaum ◽  
Stanley Durrleman ◽  
...  

Network analysis provides a rich framework to model complex phenomena, such as human brain connectivity. It has proven effective for understanding their natural properties and designing predictive models. In this paper, we study the variability within groups of networks, i.e., the structure of connection similarities and differences across a set of networks. We propose a statistical framework to model these variations based on manifold-valued latent factors. Each network adjacency matrix is decomposed as a weighted sum of matrix patterns with rank one. Each pattern is described as a random perturbation of a dictionary element. As a hierarchical statistical model, it enables the analysis of heterogeneous populations of adjacency matrices using mixtures. Our framework can also be used to infer the weight of missing edges. We estimate the parameters of the model using an Expectation-Maximization-based algorithm. Experimenting on synthetic data, we show that the algorithm is able to accurately estimate the latent structure in both low and high dimensions. We apply our model to a large data set of functional brain connectivity matrices from the UK Biobank. Our results suggest that the proposed model accurately describes the complex variability in the data set with a small number of degrees of freedom.
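
The decomposition described in this abstract, an adjacency matrix written as a weighted sum of rank-one patterns, can be illustrated with a minimal sketch. The symmetric form u uᵀ for each pattern, the toy vectors, and the function names are assumptions made for illustration; the random perturbation of dictionary elements and the EM estimation mentioned in the abstract are not modelled here.

```python
def rank_one(u):
    """Outer product u u^T as a nested list (a symmetric rank-one pattern)."""
    return [[a * b for b in u] for a in u]


def weighted_sum(weights, vectors):
    """Sum of w_k * u_k u_k^T over all (weight, vector) pairs."""
    n = len(vectors[0])
    total = [[0.0] * n for _ in range(n)]
    for w, u in zip(weights, vectors):
        pattern = rank_one(u)
        for i in range(n):
            for j in range(n):
                total[i][j] += w * pattern[i][j]
    return total


# Hypothetical toy dictionary for a 3-node network (not from the paper).
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
weights = [2.0, 0.5]
adjacency = weighted_sum(weights, vectors)
```

With only two weights, the sketch describes a full 3×3 matrix, which is the sense in which such a model uses a small number of degrees of freedom.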


Author(s):  
Yeshayahu Talmon

To achieve complete microstructural characterization of self-aggregating systems, one needs direct images in addition to quantitative information from non-imaging techniques, e.g., scattering or rheological measurements. Cryo-TEM enables us to image fluid microstructures at better than one nanometer resolution, with minimal specimen preparation artifacts. Direct images are used to determine the “building blocks” of the fluid microstructure; these are used to build reliable physical models with which quantitative information from techniques such as small-angle x-ray or neutron scattering can be analyzed. To prepare vitrified specimens of microstructured fluids, we have developed the Controlled Environment Vitrification System (CEVS), which enables us to prepare samples under controlled temperature and humidity conditions, thus minimizing microstructural rearrangement due to volatile evaporation or temperature changes. The CEVS may be used to trigger on-the-grid processes to induce formation of new phases, or to study intermediate, transient structures during change of phase (“time-resolved cryo-TEM”). Recently we have developed a new CEVS, where temperature and humidity are controlled by continuous flow of a mixture of humidified and dry air streams.


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks of statistical and computational intelligence modelling are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as genetic and micro-genetic algorithms (GA and MGA) are implemented on a large data set. A comparative analysis of selected learning methods is performed and evaluated. From the experiments performed, we find that the optimal population size is likely to be 20, which gives the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy is lower but still acceptable to managers.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because it can find non-intuitive regularities in high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods including multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM) were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best one among these methods with R²=0.84, MSE=0.55 for the training set and R²=0.83, MSE=0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
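
The two figures of merit quoted in this abstract, the coefficient of determination and the mean squared error, follow their standard definitions. The sketch below computes both on made-up values; the data and function names are illustrative assumptions and have no connection to the thrombin data set.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values (not from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

Reporting both metrics on the training and the test set, as the abstract does, lets a reader check that performance does not drop sharply on unseen compounds.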


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ruolan Zeng ◽  
Jiyong Deng ◽  
Limin Dang ◽  
Xinliang Yu

A three-descriptor quantitative structure–activity/toxicity relationship (QSAR/QSTR) model was developed for the skin permeability of a sufficiently large data set consisting of 274 compounds, by applying a support vector machine (SVM) together with a genetic algorithm. The optimal SVM model has a coefficient of determination R² of 0.946 and a root mean square (rms) error of 0.253 for the training set of 139 compounds, and an R² of 0.872 and an rms error of 0.302 for the test set of 135 compounds. Compared with other models reported in the literature, our SVM model shows better statistical performance while handling more samples in the test set. Applying an SVM algorithm to develop a nonlinear QSAR model for skin permeability was therefore successful.


Author(s):  
Lior Shamir

Several recent observations using large data sets of galaxies showed non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to have gravitational interaction. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. The results show that both data sets show a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to cosine dependence shows a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^\circ,\delta=47^\circ)$ and is well within the $1\sigma$ error range compared to the location of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^\circ,\delta=61^\circ)$.
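
Fitting an asymmetry to cosine dependence can be illustrated with a simple least-squares estimate of the amplitude in the model asymmetry ≈ a·cos(φ), where φ is the angular distance from a candidate axis. This is a minimal sketch under that assumed model, using synthetic values; it is not the procedure used in the paper, whose fitting details are not given in the abstract.

```python
import math


def fit_cosine_amplitude(angles_deg, asymmetry):
    """Least-squares amplitude a for the model asymmetry = a * cos(angle)."""
    cosines = [math.cos(math.radians(a)) for a in angles_deg]
    numerator = sum(c * y for c, y in zip(cosines, asymmetry))
    denominator = sum(c * c for c in cosines)
    return numerator / denominator


# Synthetic, noise-free asymmetry values with a known amplitude of 0.1.
angles = [0, 30, 60, 90, 120, 150, 180]
observed = [0.1 * math.cos(math.radians(a)) for a in angles]
amplitude = fit_cosine_amplitude(angles, observed)
```

Repeating such a fit for every candidate axis on the sky and keeping the axis with the best fit is one common way to locate a most likely dipole axis.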


Genetics ◽  
1997 ◽  
Vol 146 (3) ◽  
pp. 995-1010 ◽  
Author(s):  
Rafael Zardoya ◽  
Axel Meyer

The complete nucleotide sequence of the 16,407-bp mitochondrial genome of the coelacanth (Latimeria chalumnae) was determined. The coelacanth mitochondrial genome order is identical to the consensus vertebrate gene order which is also found in all ray-finned fishes, the lungfish, and most tetrapods. Base composition and codon usage also conform to typical vertebrate patterns. The entire mitochondrial genome was PCR-amplified with 24 sets of primers that are expected to amplify homologous regions in other related vertebrate species. Analyses of the control region of the coelacanth mitochondrial genome revealed the existence of four 22-bp tandem repeats close to its 3′ end. The phylogenetic analyses of a large data set combining genes coding for rRNAs, tRNA, and proteins (16,140 characters) confirmed the phylogenetic position of the coelacanth as a lobe-finned fish; it is more closely related to tetrapods than to ray-finned fishes. However, different phylogenetic methods applied to this largest available molecular data set were unable to resolve unambiguously the relationship of the coelacanth to the two other groups of extant lobe-finned fishes, the lungfishes and the tetrapods. Maximum parsimony favored a lungfish/coelacanth or a lungfish/tetrapod sistergroup relationship depending on which transversion:transition weighting is assumed. Neighbor-joining and maximum likelihood supported a lungfish/tetrapod sistergroup relationship.

