Data Integration in Logic-based Models of Biological Mechanisms

Author(s):  
Benjamin Hall ◽  
Anna Niarakis

Discrete, logic-based models are increasingly used to describe biological mechanisms. Initially introduced to study gene regulation, these models evolved to cover various molecular mechanisms, such as signalling, transcription factor cooperativity, and even metabolic processes. The abstract nature and amenability of discrete models to robust mathematical analyses make them appropriate for addressing a wide range of complex biological problems. Recent technological breakthroughs have generated a wealth of high throughput data. Novel, literature-based representations of biological processes and emerging machine learning algorithms offer new opportunities for model construction. Here, we review recent efforts to incorporate omic data into logic-based models and discuss critical challenges in constructing and analysing integrative, large-scale, logic-based models of biological mechanisms.

Author(s):  
Benjamin Hall ◽  
Anna Niarakis

Discrete, logic-based models are increasingly used to describe biological mechanisms. Initially introduced to study gene regulation, these models evolved to cover various molecular mechanisms, such as signalling, transcription factor cooperativity, and even metabolic processes. The abstract nature and amenability of discrete models to robust mathematical analyses make them appropriate for addressing a wide range of complex biological problems. Recent technological breakthroughs have generated a wealth of high throughput data. Novel, literature-based representations of biological processes and emerging algorithms offer new opportunities for model construction. Here, we review up-to-date efforts to address challenging biological questions by incorporating omic data into logic-based models, and discuss critical difficulties in constructing and analysing integrative, large-scale, logic-based models of biological mechanisms.
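To make the idea of a discrete, logic-based model concrete, the following is a minimal sketch of a Boolean network with synchronous updates. The three nodes (A, B, C) and their rules are hypothetical and chosen only for illustration; they are not taken from the review.

```python
# Hypothetical three-node Boolean network; names and rules are illustrative only.
RULES = {
    "A": lambda s: s["A"],                 # A keeps its own value (acts as an input)
    "B": lambda s: s["A"] and not s["C"],  # B is activated by A and inhibited by C
    "C": lambda s: s["B"],                 # C is activated by B
}


def step(state):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def trajectory_to_attractor(state):
    """Iterate until a state repeats, then return the repeating cycle of states."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for s in trajectory_to_attractor(start):
        print(s)
```

With the input A switched on, the negative feedback loop between B and C produces a cyclic attractor; with A off, the network settles into a fixed point. Asynchronous or multi-valued update schemes, which are also common in this field, would need a different `step` function.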


2011 ◽  
Vol 39 (3) ◽  
pp. 719-723 ◽  
Author(s):  
Zharain Bawa ◽  
Charlotte E. Bland ◽  
Nicklas Bonander ◽  
Nagamani Bora ◽  
Stephanie P. Cartwright ◽  
...  

Membrane proteins are drug targets for a wide range of diseases. Having access to appropriate samples for further research underpins the pharmaceutical industry's strategy for developing new drugs. This is typically achieved by synthesizing a protein of interest in host cells that can be cultured on a large scale, allowing the isolation of the pure protein in quantities much higher than those found in the protein's native source. Yeast is a popular host as it is a eukaryote with similar synthetic machinery to that of the native human source cells of many proteins of interest, while also being quick, easy and cheap to grow and process. Even in these cells, the production of human membrane proteins can be plagued by low functional yields; we wish to understand why. We have identified molecular mechanisms and culture parameters underpinning high yields and have consolidated our findings to engineer improved yeast host strains. By relieving the bottlenecks to recombinant membrane protein production in yeast, we aim to contribute to the drug discovery pipeline, while providing insight into translational processes.


Author(s):  
Emir Kocer ◽  
Tsz Wai Ko ◽  
Jörg Behler

In the past two decades, machine learning potentials (MLPs) have reached a level of maturity that now enables applications to large-scale atomistic simulations of a wide range of systems in chemistry, physics, and materials science. Different machine learning algorithms have been used with great success in the construction of these MLPs. In this review, we discuss an important group of MLPs relying on artificial neural networks to establish a mapping from the atomic structure to the potential energy. In spite of this common feature, there are important conceptual differences among MLPs, which concern the dimensionality of the systems, the inclusion of long-range electrostatic interactions, global phenomena like nonlocal charge transfer, and the type of descriptor used to represent the atomic structure, which can be either predefined or learnable. A concise overview is given along with a discussion of the open challenges in the field. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2019 ◽  
Author(s):  
Yuhua Zhang ◽  
Corbin Quick ◽  
Ketian Yu ◽  
Alvaro Barbeira ◽  
Francesca Luca ◽  
...  

Transcriptome-wide association studies (TWAS), an integrative framework using expression quantitative trait loci (eQTLs) to construct proxies for gene expression, have emerged as a promising method to investigate the biological mechanisms underlying associations between genotypes and complex traits. However, challenges remain in interpreting TWAS results, especially regarding their causality implications. In this paper, we describe a new computational framework, probabilistic TWAS (PTWAS), to detect associations and investigate causal relationships between gene expression and complex traits. We use established concepts and principles from instrumental variables (IV) analysis to delineate and address the unique challenges that arise in TWAS. PTWAS utilizes probabilistic eQTL annotations derived from multi-variant Bayesian fine-mapping analysis, conferring higher power to detect TWAS associations than existing methods. Additionally, PTWAS provides novel functionalities to evaluate the causal assumptions and estimate tissue- or cell-type specific causal effects of gene expression on complex traits. These features make PTWAS uniquely suited for in-depth investigations of the biological mechanisms that contribute to complex trait variation. Using eQTL data across 49 tissues from GTEx v8, we apply PTWAS to analyze 114 complex traits using GWAS summary statistics from several large-scale projects, including the UK Biobank. Our analysis reveals an abundance of genes with strong evidence of eQTL-mediated causal effects on complex traits and highlights the heterogeneity and tissue-relevance of these effects across complex traits. We distribute software and eQTL annotations to enable users to perform rigorous TWAS analyses that leverage the full potential of the latest GTEx multi-tissue eQTL data.


Author(s):  
Gonca Erdemci-Tandogan ◽  
M. Lisa Manning

Large-scale tissue deformation during biological processes such as morphogenesis requires cellular rearrangements. The simplest rearrangement in confluent cellular monolayers involves neighbor exchanges among four cells, called a T1 transition, in analogy to foams. But unlike foams, cells must execute a sequence of molecular processes, such as endocytosis of adhesion molecules, to complete a T1 transition. Such processes could take a long time compared to other timescales in the tissue. In this work, we incorporate this idea by augmenting vertex models to require a fixed, finite time for T1 transitions, which we call the “T1 delay time”. We study how variations in T1 delay time affect tissue mechanics, by quantifying the relaxation time of tissues in the presence of T1 delays and comparing that to the cell-shape based timescale that characterizes fluidity in the absence of any T1 delays. We show that the molecular-scale T1 delay timescale dominates over the cell shape-scale collective response timescale when the T1 delay time is the larger of the two. We extend this analysis to tissues that become anisotropic under convergent extension, finding similar results. Moreover, we find that increasing the T1 delay time increases the percentage of higher-fold coordinated vertices and rosettes, and decreases the overall number of successful T1s, contributing to a more elastic-like – and less fluid-like – tissue response. Our work suggests that molecular mechanisms that act as a brake on T1 transitions could stiffen global tissue mechanics and enhance rosette formation during morphogenesis.


2020 ◽  
Vol 8 (12) ◽  
pp. 1889
Author(s):  
Annie Vera Hunnestad ◽  
Anne Ilse Maria Vogel ◽  
Evelyn Armstrong ◽  
Maria Guadalupe Digernes ◽  
Murat Van Ardelan ◽  
...  

Iron is an essential, yet scarce, nutrient in marine environments. Phytoplankton, and especially cyanobacteria, have developed a wide range of mechanisms to acquire iron and maintain their iron-rich photosynthetic machinery. Iron limitation studies often utilize either oceanographic methods to understand large scale processes, or laboratory-based, molecular experiments to identify underlying molecular mechanisms on a cellular level. Here, we aim to highlight the benefits of both approaches to encourage interdisciplinary understanding of the effects of iron limitation on cyanobacteria with a focus on avoiding pitfalls in the initial phases of collaboration. In particular, we discuss the use of trace metal clean methods in combination with sterile techniques, and the challenges faced when a new collaboration is set up to combine interdisciplinary techniques. Methods necessary for producing reliable data, such as High Resolution Inductively Coupled Plasma Mass Spectrometry (HR-ICP-MS), Flow Injection Analysis Chemiluminescence (FIA-CL), and 77K fluorescence emission spectroscopy are discussed and evaluated and a technical manual, including the preparation of the artificial seawater medium Aquil, cleaning procedures, and a sampling scheme for an iron limitation experiment is included. This paper provides a reference point for researchers to implement different techniques into interdisciplinary iron studies that span cyanobacteria physiology, molecular biology, and biogeochemistry.


2019 ◽  
Vol 14 (7) ◽  
pp. 614-620 ◽  
Author(s):  
Jiajing Chen ◽  
Jianan Zhao ◽  
Shiping Yang ◽  
Zhen Chen ◽  
Ziding Zhang

Background: As one of the most important reversible protein post-translational modification types, ubiquitination plays a significant role in the regulation of many biological processes, such as cell division, signal transduction, apoptosis and immune response. Protein ubiquitination usually occurs when a ubiquitin molecule is attached to a lysine on a target protein, which is also known as “lysine ubiquitination”. Objective: In order to investigate the molecular mechanisms of ubiquitination-related biological processes, the crucial first step is the identification of ubiquitination sites. However, conventional experimental methods for detecting ubiquitination sites are often time-consuming and a large number of ubiquitination sites remain unidentified. In this study, a ubiquitination site prediction method for Arabidopsis thaliana was developed using a Support Vector Machine (SVM). Methods: We collected 3009 experimentally validated ubiquitination sites on 1607 proteins in A. thaliana to construct the training set. Three feature encoding schemes were used to characterize the sequence patterns around ubiquitination sites, including AAC, Binary and CKSAAP. The minimum Redundancy Maximum Relevance (mRMR) feature selection method was employed to reduce the dimensionality of input features. Five-fold cross-validation and independent tests were used to evaluate the performance of the established models. Results: The combination of AAC and CKSAAP encoding schemes yielded the best performance, with an accuracy of 81.35% and an AUC of 0.868 in the independent test. We also generated an online predictor termed AraUbiSite, which is freely accessible at: http://systbio.cau.edu.cn/araubisite. Conclusion: We developed a well-performing prediction tool for large-scale ubiquitination site identification in A. thaliana. It is hoped that the current work will speed up the process of identification of ubiquitination sites in A. thaliana and help to further elucidate the molecular mechanisms of ubiquitination in plants.
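As an illustration of two of the encoding schemes named in the abstract above, the sketch below computes AAC (amino acid composition) and CKSAAP (composition of k-spaced amino acid pairs) features for a sequence window, following their commonly used definitions. The window string, the `k_max` value, and the choice to normalise by the number of valid pairs are assumptions made for this example; the abstract does not state the authors' settings, and this is not their implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(window):
    """Amino acid composition: fraction of each standard residue in the window."""
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues) or 1  # avoid division by zero for an empty window
    return [residues.count(a) / n for a in AMINO_ACIDS]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 features per gap value. Pairs containing non-standard
    characters (for example padding symbols) are skipped; this is an
    assumption of this sketch.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]  # residues separated by k positions
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


if __name__ == "__main__":
    window = "MSTAKLVEKGG"  # hypothetical window centred on a lysine
    vector = aac(window) + cksaap(window, k_max=2)
    print(len(vector))
```

Concatenating the two encodings gives a fixed-length vector (20 + 400 per gap value) that could then be passed to a feature selection step and a classifier such as an SVM.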


2018 ◽  
Author(s):  
Jin Li ◽  
Le Zheng ◽  
Akihiko Uchiyama ◽  
Lianghua Bin ◽  
Theodora M. Mauro ◽  
...  

A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.


2009 ◽  
Vol 28 (4) ◽  
pp. 248-261 ◽  
Author(s):  
Jadranka Dunđerski ◽  
Gordana Matić

Glucocorticoid Receptor in Health and Disease

Glucocorticoid hormones are essential for life, have a vital place in the treatment of inflammatory and autoimmune diseases and are increasingly implicated in the pathogenesis of a number of common disorders. Their action is mediated by an intracellular receptor protein, the glucocorticoid receptor (GR), functioning as a ligand-inducible transcription factor. Multiple synthetic glucocorticoids are used as potent anti-inflammatory and immunosuppressive agents, but their therapeutic usefulness is limited by a wide range and severity of side-effects. One of the most important pharmaceutical goals has been to design steroidal and non-steroidal GR ligands with profound therapeutic efficacy and reduced unwanted effects. The therapeutic benefit of glucocorticoid agonists is frequently compromised by resistance to glucocorticoids, which may depend on: access of the hormones to target cells, steroid metabolism, expression level and isoform composition of the GR protein, mutations and polymorphisms in the GR gene and association of the receptor with chaperone proteins. A major breakthrough in understanding the critical role of glucocorticoid signaling in the maintenance of homeostasis and the pathogenesis of diseases, as well as the molecular mechanisms underlying the therapeutic usefulness of anti-inflammatory drugs acting through the GR, is expected to result from the current progress in large-scale gene expression profiling technologies and computational biology.


Author(s):  
Leonard Schmiester ◽  
Yannik Schälte ◽  
Fabian Fröhlich ◽  
Jan Hasenauer ◽  
Daniel Weindl

Motivation: Mechanistic models of biochemical reaction networks facilitate the quantitative understanding of biological processes and the integration of heterogeneous datasets. However, some biological processes require the consideration of comprehensive reaction networks and therefore large-scale models. Parameter estimation for such models poses great challenges, in particular when the data are on a relative scale. Results: Here, we propose a novel hierarchical approach combining (i) the efficient analytic evaluation of optimal scaling, offset and error model parameters with (ii) the scalable evaluation of objective function gradients using adjoint sensitivity analysis. We evaluate the properties of the methods by parameterizing a pan-cancer ordinary differential equation model (>1000 state variables, >4000 parameters) using relative protein, phosphoprotein and viability measurements. The hierarchical formulation improves optimizer performance considerably. Furthermore, we show that this approach allows estimating error model parameters with negligible computational overhead when no experimental estimates are available, providing an unbiased way to weight heterogeneous data. Overall, our hierarchical formulation is applicable to a wide range of models, and allows for the efficient parameterization of large-scale models based on heterogeneous relative measurements. Availability and implementation: Supplementary code and data are available online at http://doi.org/10.5281/zenodo.3254429 and http://doi.org/10.5281/zenodo.3254441. Supplementary information: Supplementary data are available at Bioinformatics online.
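The analytic evaluation of a scaling parameter mentioned above can be illustrated with the standard weighted least-squares closed form. The sketch assumes a single observable with measurements y_i, unscaled model outputs h_i, known Gaussian noise standard deviations sigma_i, and a pure scaling y_i ≈ s * h_i with no offset. The function name and inputs are hypothetical, and this is a textbook special case rather than the authors' code, which also covers offsets and error model parameters.

```python
def optimal_scaling(measurements, simulations, sigmas):
    """Closed-form s minimising sum(((y - s * h) / sigma) ** 2).

    Setting the derivative with respect to s to zero gives
    s = sum(y * h / sigma**2) / sum(h**2 / sigma**2).
    """
    numerator = sum(
        y * h / sd**2 for y, h, sd in zip(measurements, simulations, sigmas)
    )
    denominator = sum(h * h / sd**2 for h, sd in zip(simulations, sigmas))
    if denominator == 0:
        raise ValueError("all simulated values are zero; scaling is not identifiable")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical relative measurements that are exactly twice the model output.
    print(optimal_scaling([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```

Because s has a closed form for any fixed set of dynamic parameters, an outer optimizer only needs to search over the dynamic parameters, which is the motivation for treating scaling parameters hierarchically.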

