Integrating –omics data into genome-scale metabolic network models: principles and challenges

Charlotte Ramon; Mattia G. Gollub; Jörg Stelling

doi:10.1042/ebc20180011

Essays in Biochemistry ◽

10.1042/ebc20180011 ◽

2018 ◽

Vol 62 (4) ◽

pp. 563-574 ◽

Cited By ~ 10

Author(s):

Charlotte Ramon ◽

Mattia G. Gollub ◽

Jörg Stelling

Keyword(s):

Data Integration ◽

Large Scale ◽

Network Models ◽

Omics Data ◽

Scale Models ◽

Common Framework ◽

Genome Scale ◽

Constraint Based Models ◽

Omics Data Integration

At genome scale, it is not yet possible to devise detailed kinetic models for metabolism because data on the in vivo biochemistry are too sparse. Predictive large-scale models for metabolism most commonly use the constraint-based framework, in which network structures constrain possible metabolic phenotypes at steady state. However, these models commonly leave many possibilities open, making them less predictive than desired. With increasingly available –omics data, it is appealing to increase the predictive power of constraint-based models (CBMs) through data integration. Many corresponding methods have been developed, but data integration is still a challenge and existing methods perform less well than expected. Here, we review the main approaches for the integration of different types of –omics data into CBMs, focussing on the methods’ assumptions and limitations. We argue that key assumptions – often derived from single-enzyme kinetics – do not generally apply in the context of networks, thereby explaining current limitations. Emerging methods bridging CBMs and biochemical kinetics may allow for –omics data integration in a common framework to provide more accurate predictions.
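For orientation, the constraint-based framework mentioned in this abstract is usually written as a set of linear constraints on the flux vector. The formulation below is a standard textbook sketch, not taken from the abstract itself: the symbols $S$ (stoichiometric matrix), $v$ (flux vector), $v^{\min}$ and $v^{\max}$ (flux bounds) and $c$ (objective weights) are assumptions introduced here for illustration, and the linear objective corresponds to flux balance analysis, which is one common choice rather than a requirement of the framework.

```latex
% Steady-state flux space of a constraint-based model (assumed notation):
% S is the stoichiometric matrix, v the vector of reaction fluxes.
\begin{aligned}
  S\,v &= 0 \\
  v^{\min} \le v &\le v^{\max}
\end{aligned}

% Flux balance analysis picks one point of that space by optimising
% an assumed linear objective c^T v (for example, biomass production):
\begin{aligned}
  \max_{v}\quad & c^{\top} v \\
  \text{s.t.}\quad & S\,v = 0, \\
                   & v^{\min} \le v \le v^{\max}
\end{aligned}
```

Because the constraints usually admit many feasible flux vectors, the model "leaves many possibilities open"; data integration methods typically act by tightening the bounds or adding further constraints.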

Application of multi-omics data integration and machine learning approaches to identify epigenetic and transcriptomic differences between in vitro and in vivo produced bovine embryos

PLoS ONE ◽

10.1371/journal.pone.0252096 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0252096

Author(s):

Maria B. Rabaglino ◽

Alan O’Doherty ◽

Jan Bojsen-Møller Secher ◽

Patrick Lonergan ◽

Poul Hyttel ◽

...

Keyword(s):

Machine Learning ◽

Data Integration ◽

Focal Adhesion ◽

Learning Approach ◽

Omics Data ◽

Machine Learning Approach ◽

Cluster 2 ◽

Omics Data Integration

Pregnancy rates for in vitro produced (IVP) embryos are usually lower than for embryos produced in vivo after ovarian superovulation (MOET). This is potentially due to alterations in their trophectoderm (TE), the outermost layer in physical contact with the maternal endometrium. The main objective was to apply a multi-omics data integration approach to identify both temporally differentially expressed and differentially methylated genes (DEG and DMG), between IVP and MOET embryos, that could impact TE function. To start, four and five published transcriptomic and epigenomic datasets, respectively, were processed for data integration. Second, DEG from day 7 to days 13 and 16 and DMG from day 7 to day 17 were determined in the TE from IVP vs. MOET embryos. Third, genes that were both DE and DM were subjected to hierarchical clustering and functional enrichment analysis. Finally, findings were validated through a machine learning approach with two additional datasets from day 15 embryos. There were 1535 DEG and 6360 DMG, with 490 overlapping genes, whose expression profiles at days 13 and 16 resulted in three main clusters. Cluster 1 (188) and Cluster 2 (191) genes were down-regulated at day 13 or day 16, respectively, while Cluster 3 genes (111) were up-regulated at both days, in IVP embryos compared to MOET embryos. The top enriched terms were the KEGG pathway "focal adhesion" in Cluster 1 (FDR = 0.003), and the cellular component "extracellular exosome" in Cluster 2 (FDR < 0.0001), also enriched in Cluster 1 (FDR = 0.04). According to the machine learning approach, genes in Cluster 1 showed a similar expression pattern between IVP and less developed (short) MOET conceptuses; and between MOET and DKK1-treated (advanced) IVP conceptuses. In conclusion, these results suggest that early conceptuses derived from IVP embryos exhibit epigenomic and transcriptomic changes that later affect their elongation and focal adhesion, impairing post-transfer survival.

A Multi-Omics Data Integration Method and Parametric Analysis on Large-Scale Colon Cancer Data

Journal of KIISE ◽

10.5626/jok.2020.47.8.779 ◽

2020 ◽

Vol 47 (8) ◽

pp. 779-786

Author(s):

Inuk Jung

Keyword(s):

Colon Cancer ◽

Data Integration ◽

Parametric Analysis ◽

Large Scale ◽

Integration Method ◽

Omics Data ◽

Cancer Data ◽

Colon Cancer Data ◽

Omics Data Integration

0311 - Effect of ammonia on the dynamics of anaerobic digestion microbiome: omics data integration in a time-course context

10.26226/morressier.5b5199beb1b87b000ecee94b ◽

2018 ◽

Author(s):

Olivier Chapleur

Keyword(s):

Anaerobic Digestion ◽

Data Integration ◽

Time Course ◽

Omics Data ◽

Omics Data Integration

Multi-omics data integration reveals correlated regulatory features of triple negative breast cancer

Molecular Omics ◽

10.1039/d1mo00117e ◽

2021 ◽

Author(s):

Kevin Chappell ◽

Kanishka Manna ◽

Charity L. Washam ◽

Stefan Graw ◽

Duah Alkam ◽

...

Keyword(s):

Breast Cancer ◽

Data Integration ◽

Triple Negative Breast Cancer ◽

Triple Negative ◽

Biological Pathways ◽

Omics Data ◽

Insight Into ◽

Omics Data Integration

Multi-omics data integration of triple negative breast cancer (TNBC) provides insight into biological pathways.

Integration of enzyme constraints in a genome-scale metabolic model of Aspergillus niger improves phenotype predictions

Microbial Cell Factories ◽

10.1186/s12934-021-01614-2 ◽

2021 ◽

Vol 20 (1) ◽

Author(s):

Jingru Zhou ◽

Yingping Zhuang ◽

Jianye Xia

Keyword(s):

Aspergillus Niger ◽

Large Scale ◽

Measurement Techniques ◽

Metabolic Model ◽

System Level ◽

Metabolic Phenotype ◽

Omics Data ◽

Prediction Ability ◽

Phenotype Prediction ◽

Genome Scale

Background: Genome-scale metabolic models (GSMMs) are powerful tools for the study of cellular metabolic characteristics. With the development of multi-omics measurement techniques in recent years, new methods that integrate multi-omics data into the GSMM show promising effects on the predicted results. They not only improve the accuracy of phenotype prediction but also enhance the reliability of the model for simulating complex biochemical phenomena, which can promote theoretical breakthroughs for specific gene target identification or a better understanding of cell metabolism at the system level. Results: Based on the basic GSMM iHL1210 of Aspergillus niger, we integrated large-scale enzyme kinetics and proteomics data to establish a GSMM based on enzyme constraints, termed a GEM with Enzymatic Constraints using Kinetic and Omics data (GECKO). The results show that enzyme constraints effectively improve the model’s phenotype prediction ability and extend the model’s potential to guide target gene identification by predicting metabolic phenotype changes of A. niger through simulated gene knockouts. In addition, enzyme constraints significantly reduced the solution space of the model, i.e., the flux variability of over 40.10% of metabolic reactions was significantly reduced. The new model also showed versatility in other aspects, such as estimating $k_{cat}$ values at large scale and predicting the differential expression of enzymes under different growth conditions. Conclusions: This study shows that incorporating enzyme abundance information into the GSMM is very effective for improving model performance with A. niger. The enzyme-constrained model can be used as a powerful tool for predicting the metabolic phenotype of A. niger by incorporating proteome data. In the foreseeable future, with the fast development of measurement techniques, and as more precise and richer quantitative proteomics data are obtained for A. niger, the enzyme-constrained GSMM will find broader application at the system level.
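To make the idea of an enzyme constraint concrete, the sketch below caps each reaction flux by the product of a turnover number and an enzyme abundance, v_j <= k_cat,j * [E_j], which is the general form of constraint used in enzyme-constrained models. It is a toy illustration only and not the GECKO implementation or the iHL1210 model: the reaction names, k_cat values, enzyme abundances and helper functions are all hypothetical, and the network is assumed to be a single unbranched pathway, for which the steady-state condition forces all fluxes to be equal.

```python
"""Toy enzyme-capacity constraint on an unbranched pathway (hypothetical data)."""

SECONDS_PER_HOUR = 3600.0


def enzyme_capped_bounds(kcat_per_s, enzyme_mmol_per_gdw):
    """Return flux upper bounds (mmol/gDW/h) as kcat * enzyme abundance."""
    return {
        rxn: kcat_per_s[rxn] * SECONDS_PER_HOUR * enzyme_mmol_per_gdw[rxn]
        for rxn in kcat_per_s
    }


def max_linear_pathway_flux(upper_bounds):
    """Maximum steady-state flux through an unbranched pathway.

    At steady state every reaction in a linear chain carries the same flux,
    so the largest feasible flux equals the smallest upper bound.
    """
    return min(upper_bounds.values())


# Hypothetical turnover numbers (1/s) and enzyme abundances (mmol/gDW).
kcat = {"R1": 100.0, "R2": 10.0, "R3": 50.0}
enzyme = {"R1": 1e-5, "R2": 2e-5, "R3": 1e-5}

bounds = enzyme_capped_bounds(kcat, enzyme)
pathway_flux = max_linear_pathway_flux(bounds)
bottleneck = min(bounds, key=bounds.get)  # reaction with the tightest cap

if __name__ == "__main__":
    print("upper bounds:", bounds)
    print("max pathway flux:", pathway_flux, "limited by", bottleneck)
```

In a genome-scale model the same bounds would be passed to a linear-programming solver rather than reduced with `min`, but the effect is the same in kind: measured enzyme levels shrink the feasible flux space, which is how such constraints reduce flux variability.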

MuSA: a graphical user interface for multi-OMICs data integration in radiogenomic studies

Scientific Reports ◽

10.1038/s41598-021-81200-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Mario Zanfardino ◽

Rossana Castaldo ◽

Katia Pane ◽

Ornella Affinito ◽

Marco Aiello ◽

...

Keyword(s):

User Interface ◽

Data Integration ◽

Graphical User Interface ◽

Data Science ◽

Heterogeneous Data ◽

Biological Information ◽

Omics Data ◽

Correlation Clustering ◽

Downstream Analysis ◽

Omics Data Integration

Analysis of large-scale omics data along with biomedical images has gained huge interest for predicting phenotypic conditions towards personalized medicine. Multiple layers of investigation, such as genomics, transcriptomics and proteomics, have led to high dimensionality and heterogeneity of data. Multi-omics data integration can provide a meaningful contribution to early diagnosis and an accurate estimate of prognosis and treatment in cancer. Some multi-layer data structures have been developed to integrate multi-omics biological information, but none of these has been developed and evaluated to include radiomic data. We proposed to use MultiAssayExperiment (MAE) as an integrated data structure to combine multi-omics data, facilitating the exploration of heterogeneous data. We improved the usability of the MAE by developing a Multi-omics Statistical Approaches (MuSA) tool that uses a Shiny graphical user interface to simplify the management and analysis of radiogenomic datasets. The capabilities of MuSA were shown using public breast cancer datasets from the TCGA-TCIA databases. The MuSA architecture is modular and is divided into pre-processing and downstream analysis. The pre-processing section allows data filtering and normalization. The downstream analysis section contains modules for data science such as correlation, clustering (i.e., heatmap) and feature selection methods. The results are dynamically shown in MuSA. The MuSA tool provides an easy-to-use way to create, manage and analyze radiogenomic data. The application is specifically designed to guide non-programmer researchers through different computational steps. Integration analysis is implemented in a modular structure, making MuSA easily expandable open-source software.

Multilevel heterogeneous omics data integration with kernel fusion

Briefings in Bioinformatics ◽

10.1093/bib/bby115 ◽

2018 ◽

Cited By ~ 1

Author(s):

Haitao Yang ◽

Hongyan Cao ◽

Tao He ◽

Tong Wang ◽

Yuehua Cui

Keyword(s):

Data Integration ◽

Omics Data ◽

Kernel Fusion ◽

Omics Data Integration

A Cloud Solution for Multi-omics Data Integration

2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld) ◽

10.1109/uic-atc-scalcom-cbdcom-iop-smartworld.2016.0096 ◽

2016 ◽

Author(s):

Fabio Tordini

Keyword(s):

Data Integration ◽

Omics Data ◽

Omics Data Integration

Multi-omics data integration in the Cloud

Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics ◽

10.1145/3388440.3414917 ◽

2020 ◽

Author(s):

Kawther Abdilleh ◽

Boris Aguilar ◽

J. Ross Thomson

Keyword(s):

Data Integration ◽

Omics Data ◽

Omics Data Integration

MIC-Drop: A platform for large-scale in vivo CRISPR screens

Science ◽

10.1126/science.abi8870 ◽

2021 ◽

pp. eabi8870

Author(s):

Saba Parvez ◽

Chelsea Herdman ◽

Manu Beerens ◽

Korak Chakraborti ◽

Zachary P. Harmer ◽

...

Keyword(s):

Large Scale ◽

Cultured Cells ◽

Cardiac Development ◽

Droplet Microfluidics ◽

Model Organisms ◽

Genetic Screens ◽

Large Numbers ◽

And Function ◽

Genome Scale

CRISPR-Cas9 can be scaled up for large-scale screens in cultured cells, but CRISPR screens in animals have been challenging because generating, validating, and keeping track of large numbers of mutant animals is prohibitive. Here, we report Multiplexed Intermixed CRISPR Droplets (MIC-Drop), a platform combining droplet microfluidics, single-needle en masse CRISPR ribonucleoprotein injections, and DNA barcoding to enable large-scale functional genetic screens in zebrafish. The platform can efficiently identify genes responsible for morphological or behavioral phenotypes. In one application, we show MIC-Drop can identify small molecule targets. Furthermore, in a MIC-Drop screen of 188 poorly characterized genes, we discover several genes important for cardiac development and function. With the potential to scale to thousands of genes, MIC-Drop enables genome-scale reverse-genetic screens in model organisms.