Efficient parameterization of large-scale dynamic models based on relative measurements

Author(s):  
Leonard Schmiester ◽  
Yannik Schälte ◽  
Fabian Fröhlich ◽  
Jan Hasenauer ◽  
Daniel Weindl

Abstract Motivation Mechanistic models of biochemical reaction networks facilitate the quantitative understanding of biological processes and the integration of heterogeneous datasets. However, some biological processes require the consideration of comprehensive reaction networks and therefore large-scale models. Parameter estimation for such models poses great challenges, in particular when the data are on a relative scale. Results Here, we propose a novel hierarchical approach combining (i) the efficient analytic evaluation of optimal scaling, offset and error model parameters with (ii) the scalable evaluation of objective function gradients using adjoint sensitivity analysis. We evaluate the properties of the methods by parameterizing a pan-cancer ordinary differential equation model (>1000 state variables, >4000 parameters) using relative protein, phosphoprotein and viability measurements. The hierarchical formulation improves optimizer performance considerably. Furthermore, we show that this approach allows estimating error model parameters with negligible computational overhead when no experimental estimates are available, providing an unbiased way to weight heterogeneous data. Overall, our hierarchical formulation is applicable to a wide range of models, and allows for the efficient parameterization of large-scale models based on heterogeneous relative measurements. Availability and implementation Supplementary code and data are available online at http://doi.org/10.5281/zenodo.3254429 and http://doi.org/10.5281/zenodo.3254441. Supplementary information Supplementary data are available at Bioinformatics online.
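To make the inner analytic step concrete, here is a minimal sketch (our illustration of the general idea, not the authors' implementation; the ODE simulation and adjoint machinery are omitted) of how, for a Gaussian error model y ≈ s·h + b, the optimal scaling s, offset b and noise variance sigma^2 can be computed in closed form from a simulated observable h and relative measurements y:

    import numpy as np

    def inner_analytic(h, y):
        """Closed-form inner optimization for y ~ s*h + b + N(0, sigma^2).
        h: simulated observable, y: relative measurements (1-D arrays)."""
        H = np.column_stack([h, np.ones_like(h)])       # design matrix [h, 1]
        (s, b), *_ = np.linalg.lstsq(H, y, rcond=None)  # least-squares s, b
        res = y - (s * h + b)
        sigma2 = np.mean(res ** 2)                      # ML estimate of the variance
        return s, b, sigma2

The outer optimizer then searches only over the dynamic parameters, with gradients from adjoint sensitivity analysis; the inner step costs a single least-squares solve per dataset.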


2000 ◽  
Vol 663 ◽  
Author(s):  
J. Samper ◽  
R. Juncosa ◽  
V. Navarro ◽  
J. Delgado ◽  
L. Montenegro ◽  
...  

Abstract FEBEX (Full-scale Engineered Barrier EXperiment) is a demonstration and research project dealing with the bentonite engineered barrier designed for sealing and containment of waste in a high-level radioactive waste repository (HLWR). It includes two main experiments: an in situ full-scale test performed at the Grimsel Test Site (GTS) and a mock-up test operating since February 1997 at CIEMAT facilities in Madrid (Spain) [1,2,3]. One of the objectives of FEBEX is the development and testing of conceptual and numerical models for the thermal, hydrodynamic and geochemical (THG) processes expected to take place in engineered clay barriers. A significant improvement in coupled THG modeling of the clay barrier has been achieved, both in terms of a better understanding of THG processes and of more sophisticated THG computer codes. The ability of these models to reproduce the observed THG patterns under a wide range of THG conditions enhances confidence in their prediction capabilities. Numerical THG models of heating and hydration experiments performed on small-scale laboratory cells provide excellent results for temperatures, water inflow and final water content in the cells [3]. Calculated concentrations at the end of the experiments reproduce most of the patterns of the measured data. In general, the fit of the concentrations of dissolved species is better than that of exchanged cations. These models were later used to simulate the evolution of the large-scale experiments (in situ and mock-up). Some thermo-hydrodynamic hypotheses and bentonite parameters were slightly revised during TH calibration of the mock-up test. The results of the reference model simultaneously reproduce the observed water inflows and bentonite temperatures and relative humidities. Although the model is highly sensitive to one-at-a-time variations in model parameters, the possibility of parameter combinations leading to similar fits cannot be precluded. The TH model of the in situ test is based on the same bentonite TH parameters and assumptions as the mock-up test. Granite parameters were slightly modified during the calibration process in order to reproduce the observed thermal and hydrodynamic evolution. The reference model properly captures relative humidities and temperatures in the bentonite [3]. It also reproduces the observed spatial distribution of water pressures and temperatures in the granite. Once the TH aspects of the model had been calibrated, predictions of the THG evolution of both tests were performed. Data from the dismantling of the in situ test, which is planned for the summer of 2001, will provide a unique opportunity to test and validate current THG models of the EBS.
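The sensitivity caveat above corresponds to a standard one-at-a-time (OAT) analysis; a generic sketch (placeholder names, not the FEBEX codes):

    import numpy as np

    def oat_sensitivity(simulate, params, observed, rel_step=0.1):
        """Relative change of the sum-of-squares misfit when each model
        parameter is perturbed by rel_step, one at a time."""
        base = np.sum((simulate(params) - observed) ** 2)
        sens = {}
        for name in params:
            p = dict(params)
            p[name] *= 1.0 + rel_step          # perturb a single parameter
            sens[name] = (np.sum((simulate(p) - observed) ** 2) - base) / base
        return sens

A strong OAT response for every parameter still does not exclude compensating parameter combinations, which is exactly the identifiability caveat raised above.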


2020 ◽  
Vol 36 (10) ◽  
pp. 3011-3017 ◽  
Author(s):  
Olga Mineeva ◽  
Mateo Rojas-Carulla ◽  
Ruth E Ley ◽  
Bernhard Schölkopf ◽  
Nicholas D Youngblut

Abstract Motivation Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies. Results We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state-of-the-art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications. Conclusions DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects. Availability and implementation DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED. Supplementary information Supplementary data are available at Bioinformatics online.
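As a rough illustration of this model class, here is a toy 1-D convolutional misassembly classifier over per-position contig features (our sketch under assumed inputs, not DeepMAsED's published architecture):

    import torch
    import torch.nn as nn

    class MisassemblyClassifier(nn.Module):
        """Toy classifier: convolutions over per-position contig features
        (e.g. read coverage, discordant-read rate, one-hot bases),
        max-pooled over the contig length, then a single logit."""
        def __init__(self, n_features=8):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(n_features, 32, kernel_size=9, padding=4), nn.ReLU(),
                nn.Conv1d(32, 32, kernel_size=9, padding=4), nn.ReLU(),
                nn.AdaptiveMaxPool1d(1), nn.Flatten(),
                nn.Linear(32, 1),
            )

        def forward(self, x):       # x: (batch, n_features, contig_length)
            return self.net(x).squeeze(-1)

    model = MisassemblyClassifier()
    probs = torch.sigmoid(model(torch.randn(4, 8, 5000)))  # 4 contigs, 5 kb each

Labels for training such a model come from the in silico pipeline: simulated assemblies are compared against the known reference genomes, so each contig can be marked as correct or misassembled.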


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data are scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to an improvement in classification accuracy of 1.91% to 6.69%. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data are scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, leading to an improvement in classification accuracy of 2.38% to 5.27%. The code and the pre-trained model will be available at https://github.com/linlei1214/SITS-BERT upon publication. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
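The pretext task described in both versions of this abstract can be sketched as follows (our illustration with assumed shapes and placeholder hyperparameters, not the authors' code):

    import torch
    import torch.nn as nn

    n_bands, d_model = 10, 64
    embed = nn.Linear(n_bands, d_model)
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2)
    head = nn.Linear(d_model, n_bands)

    def pretext_loss(series, corrupt_frac=0.15, noise_scale=0.5):
        """series: (batch, time, bands) reflectance time series of pixels."""
        mask = torch.rand(series.shape[:2]) < corrupt_frac     # (batch, time)
        noisy = series + noise_scale * torch.randn_like(series) * mask.unsqueeze(-1)
        recon = head(encoder(embed(noisy)))                    # reconstruct all steps
        return ((recon - series)[mask] ** 2).mean()            # score only corrupted ones

    loss = pretext_loss(torch.randn(8, 30, n_bands))  # 8 pixels, 30 acquisition dates
    loss.backward()

After pre-training on unlabeled pixels, the encoder weights initialize the classification network, which is then fine-tuned on the small labeled set.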


2018 ◽  
Author(s):  
Federica Eduati ◽  
Patricia Jaaks ◽  
Christoph A. Merten ◽  
Mathew J. Garnett ◽  
Julio Saez-Rodriguez

Abstract Mechanistic modeling of signaling pathways mediating patient-specific response to therapy can help to unveil resistance mechanisms and improve therapeutic strategies. Yet, creating such models for patients, in particular for solid malignancies, is challenging. A major hurdle in building these models is the limited material available, which precludes the generation of large-scale perturbation data. Here, we present an approach that couples ex vivo high-throughput screenings of cancer biopsies using microfluidics with logic-based modeling to generate patient-specific dynamic models of extrinsic and intrinsic apoptosis signaling pathways. We used the resulting models to investigate heterogeneity in pancreatic cancer patients, showing dissimilarities especially in the PI3K-Akt pathway. Variation in model parameters reflected the different tumor stages well. Finally, we used our dynamic models to effectively predict new personalized combinatorial treatments. Our results suggest that our combination of microfluidic experiments and mathematical modeling can be a novel tool toward cancer precision medicine.
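For intuition, here is a toy logic-based ODE of the kind described, with Hill-type transfer functions on a three-node receptor-PI3K/Akt-apoptosis chain (nodes, wiring and parameters are invented for illustration; this is not the paper's network):

    import numpy as np
    from scipy.integrate import solve_ivp

    def hill(x, k=0.5, n=3):
        return x**n / (k**n + x**n)                  # normalized activation

    def rhs(t, x, tau, drug):
        pi3k, akt, apo = x
        dpi3k = (hill(1.0 - drug) - pi3k) / tau[0]   # drug inhibits the input
        dakt = (hill(pi3k) - akt) / tau[1]           # PI3K activates Akt
        dapo = (hill(1.0 - akt) - apo) / tau[2]      # Akt represses apoptosis
        return [dpi3k, dakt, dapo]

    sol = solve_ivp(rhs, (0.0, 50.0), [0.1, 0.1, 0.1],
                    args=([5.0, 5.0, 10.0], 0.8))    # 80% drug effect
    print(sol.y[:, -1])   # low PI3K/Akt activity, high apoptosis

In the paper's setting, the node time constants and transfer-function parameters are the patient-specific quantities fitted to the microfluidic perturbation screens.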


2001 ◽  
Vol 124 (1) ◽  
pp. 62-66 ◽  
Author(s):  
Pei-Sun Zung ◽  
Ming-Hwei Perng

This paper presents a handy nonlinear dynamic model for the design of a two-stage pilot pressure relief servo-valve. Previous surveys indicate that the performance of existing control valves has been limited by the lack of an accurate dynamic model. However, most existing dynamic models of pressure relief valves are developed for the selection of a suitable valve for a hydraulic system, and assume model parameters that are not directly controllable during the manufacturing process. As a result, such models are less useful for a manufacturer eager to improve the performance of a pressure valve. In contrast, the model parameters in the present approach have been limited to dimensions measurable from the blueprints of the valve, such that a specific design can be evaluated by simulation before the valve is actually manufactured. Moreover, the resulting model shows excellent agreement with experiments over a wide range of operating conditions.
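As an illustration of this modeling philosophy, here is a minimal lumped-parameter relief-valve sketch (generic textbook form with made-up numbers, not the paper's equations), in which every parameter is a dimension or material property readable from a drawing or data sheet:

    import numpy as np
    from scipy.integrate import solve_ivp

    d = 0.01                                  # seat diameter, m
    A = np.pi * d**2 / 4                      # pressure-loaded seat area, m^2
    m, c, k, F0 = 0.02, 50.0, 2.0e5, 50.0     # mass kg, damping N*s/m, spring N/m, preload N
    Cd, rho = 0.7, 870.0                      # discharge coefficient, oil density kg/m^3
    p = 1.0e7                                 # upstream pressure, Pa

    def rhs(t, y):
        x, v = y                              # poppet lift and velocity
        dv = (p * A - F0 - k * max(x, 0.0) - c * v) / m
        return [v, dv]

    sol = solve_ivp(rhs, (0.0, 0.05), [0.0, 0.0], max_step=1e-4)
    lift = sol.y[0, -1]
    Q = Cd * np.pi * d * lift * np.sqrt(2 * p / rho)   # curtain-area orifice flow, m^3/s
    print(f"lift ~ {lift*1e3:.2f} mm, flow ~ {Q*1e3:.1f} L/s")

Because each constant maps to a manufacturable dimension, a candidate design can be simulated and revised before any hardware is made, which is the point the authors emphasize.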


2015 ◽  
Vol 15 (7) ◽  
pp. 10709-10738 ◽  
Author(s):  
M. Sikma ◽  
H. G. Ouwersloot

Abstract. We investigate the representation of convective transport of atmospheric compounds that can be applied in large-scale models. We focus on three key parameterizations that, when combined, express this transport: the area fraction of transporting clouds, the upward velocity in the cloud cores and the chemical concentrations at cloud base. The first two parameterizations combined represent the mass flux by clouds. To investigate the key parameterizations under a wide range of conditions, we use Large-Eddy Simulation model data for 10 meteorological situations, characterized by either shallow cumulus or stratocumulus clouds. In the analysis of the area fraction of clouds, we (i) simplify the independent variable used for the parameterization, Q1, by considering the variability in moisture rather than in the saturation deficit. We show that there is an unambiguous dependence of the area fraction of clouds on the simplified Q1, and update the parameters in the parameterization to account for this simplification. We (ii) further demonstrate that the independent variable has to be evaluated locally to capture cloud presence. Furthermore, we (iii) show that the area fraction of transporting clouds is not represented by the parameterization for the total cloud area fraction, as is currently applied in large-scale models. To capture cloud transport, a novel active cloud area fraction parameterization is proposed. Subsequently, the scaling of the upward velocity in the cloud cores by the Deardorff convective velocity scale and the literature parameterization for the concentration of atmospheric reactants at cloud base are verified and improved by analyzing 6 SCu cases. For the latter, we additionally discuss how the parameterization is affected by wind conditions. This study contributes to a more accurate estimation of convective transport in large-scale models, where this transport occurs at the sub-grid scale.
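In mass-flux form, the three parameterized quantities combine into a single sub-grid transport term; a minimal sketch (placeholder values, not the paper's fitted coefficients):

    def convective_flux(a_cc, w_core, c_cloud_base, c_slab_mean):
        """Sub-grid convective flux of a compound: (active cloud area
        fraction) * (core updraft velocity) * (cloud-base concentration
        excess over the slab mean)."""
        return a_cc * w_core * (c_cloud_base - c_slab_mean)

    # e.g. 5% active cloud cover, 1.2 m/s core updraft, 5 ppb excess:
    flux = convective_flux(0.05, 1.2, 40.0, 35.0)   # ppb m/s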


Author(s):  
Benjamin Hall ◽  
Anna Niarakis

Discrete, logic-based models are increasingly used to describe biological mechanisms. Initially introduced to study gene regulation, these models have evolved to cover various molecular mechanisms, such as signalling, transcription factor cooperativity and even metabolic processes. The abstract nature and amenability of discrete models to robust mathematical analyses make them appropriate for addressing a wide range of complex biological problems. Recent technological breakthroughs have generated a wealth of high-throughput data. Novel, literature-based representations of biological processes and emerging machine learning algorithms offer new opportunities for model construction. Here, we review recent efforts to incorporate omic data into logic-based models and discuss critical challenges in constructing and analysing integrative, large-scale, logic-based models of biological mechanisms.
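As a minimal example of the model class under discussion, consider a toy three-gene Boolean network with synchronous updates (genes and rules invented for illustration):

    # Each gene's next state is a logic function of the current state.
    rules = {
        "A": lambda s: not s["C"],           # C represses A
        "B": lambda s: s["A"],               # A activates B
        "C": lambda s: s["A"] and s["B"],    # A AND B activate C
    }

    state = {"A": True, "B": False, "C": False}
    for step in range(5):                    # synchronous update scheme
        state = {gene: rule(state) for gene, rule in rules.items()}
        print(step, state)

Omic data enter such models, for example, by constraining which rules are plausible or by fixing node states to match observed expression, which is the integration challenge this review addresses.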


2018 ◽  
Vol 35 (7) ◽  
pp. 1249-1251 ◽  
Author(s):  
Kai Li ◽  
Marc Vaudel ◽  
Bing Zhang ◽  
Yan Ren ◽  
Bo Wen

Abstract Summary Data visualization plays critical roles in proteomics studies, ranging from quality control of MS/MS data to validation of peptide identification results. Herein, we present PDV, an integrative proteomics data viewer that can be used to visualize a wide range of proteomics data, including database search results, de novo sequencing results, proteogenomics files, MS/MS data in mzML/mzXML format and data from public proteomics repositories. PDV is a lightweight visualization tool that enables intuitive and fast exploration of diverse, large-scale proteomics datasets on standard desktop computers in both graphical user interface and command line modes. Availability and implementation PDV software and the user manual are freely available at http://pdv.zhang-lab.org. The source code is available at https://github.com/wenbostar/PDV and is released under the GPL-3 license. Supplementary information Supplementary data are available at Bioinformatics online.

