Increasing consensus of context-specific metabolic models by integrating data-inferred cell functions

Abstract Genome-scale metabolic models provide a valuable context for analyzing data from diverse high-throughput experimental techniques. Models can quantify the activities of diverse pathways and cellular functions. Since some metabolic reactions are only catalyzed in specific environments, several algorithms exist that build context-specific models. However, these methods make differing assumptions that influence the content and associated predictive capacity of resulting models, such that model content varies more with the method used than with the cell type. Here we overcome this problem with a novel framework for inferring the metabolic functions of a cell before model construction. For this, we curated a list of metabolic tasks and developed a framework to infer the activity of these functionalities from transcriptomic data. We protected the data-inferred tasks during the implementation of diverse context-specific model extraction algorithms for 44 cancer cell lines. We show that the protection of data-inferred metabolic tasks decreases the variability of models across extraction methods. Furthermore, resulting models better capture the actual biological variability across cell lines. This study highlights the potential of using biological knowledge, inferred from omics data, to obtain a better consensus between existing extraction algorithms. It further provides guidelines for the development of the next generation of data contextualization methods.


StanDep: capturing transcriptomic variability improves context-specific metabolic models

10.1101/594861 ◽

2019 ◽

Cited By ~ 1

Author(s):

Chintan J. Joshi ◽

Song-Min Schinn ◽

Anne Richelle ◽

Isaac Shamie ◽

Eyleen J. O’Rourke ◽

...

Keyword(s):

Expression Profiles ◽

Heuristic Method ◽

Housekeeping Genes ◽

Extraction Methods ◽

Expression Data ◽

Matlab Toolbox ◽

Metabolic Models ◽

Using Data ◽

Context Specific ◽

Genome Scale

Abstract Diverse algorithms can integrate transcriptomics with genome-scale metabolic models (GEMs) to build context-specific metabolic models. These algorithms require identification of a list of high-confidence (core) reactions from transcriptomics, but parameters related to identification of core reactions, such as thresholding of expression profiles, can significantly change model content. Importantly, current thresholding approaches are burdened with setting a single arbitrary threshold for all genes; this results in the removal of enzymes needed in small amounts and even of many housekeeping genes. Here, we describe StanDep, a novel heuristic method for using transcriptomics to identify core reactions prior to building context-specific metabolic models. StanDep clusters gene expression data based on their expression pattern across different contexts and determines thresholds for each cluster using data-dependent statistics, specifically standard deviation and mean. To demonstrate the use of StanDep, we built hundreds of models for the NCI-60 cancer cell lines. These models successfully increased the inclusion of housekeeping reactions, which are often lost in models built using standard thresholding approaches. Further, StanDep also provided a transcriptomic explanation for inclusion of lowly expressed reactions that were otherwise only supported by model extraction methods. Our study also provides novel insights into how cells may deal with context-specific and ubiquitous functions. StanDep is available as a MATLAB toolbox (https://github.com/LewisLabUCSD/StanDep).
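The idea of cluster-specific thresholds derived from mean and standard deviation can be illustrated with a minimal sketch. This is not StanDep's actual procedure: the cluster assignments are assumed to be given, the "mean minus one standard deviation" rule is a placeholder chosen for illustration, and all gene names and values are hypothetical.

```python
from statistics import mean, pstdev


def cluster_thresholds(expr, clusters):
    """One threshold per cluster: mean minus one standard deviation of all
    expression values pooled within that cluster (illustrative rule only)."""
    pooled = {}
    for gene, values in expr.items():
        pooled.setdefault(clusters[gene], []).extend(values)
    return {c: mean(v) - pstdev(v) for c, v in pooled.items()}


def active_calls(expr, clusters, thresholds):
    """Mark a gene active in a context when its value meets its cluster's threshold."""
    return {
        gene: [v >= thresholds[clusters[gene]] for v in values]
        for gene, values in expr.items()
    }


# Hypothetical log-expression values across three contexts.
expr = {
    "housekeeping_1": [1.0, 1.1, 1.0],
    "housekeeping_2": [1.2, 1.0, 1.1],
    "specific_1": [8.0, 0.5, 7.5],
    "specific_2": [0.4, 7.8, 8.2],
}
# Hypothetical cluster labels; a real workflow would derive them from the data.
clusters = {
    "housekeeping_1": "stable_low",
    "housekeeping_2": "stable_low",
    "specific_1": "variable_high",
    "specific_2": "variable_high",
}

thresholds = cluster_thresholds(expr, clusters)
calls = active_calls(expr, clusters, thresholds)

# A single global cutoff (here the overall mean) for comparison.
global_cut = mean([v for values in expr.values() for v in values])
global_calls = {g: [v >= global_cut for v in vals] for g, vals in expr.items()}

for gene in expr:
    print(gene, "cluster-specific:", calls[gene], "global:", global_calls[gene])
```

In this toy data the low but stable genes pass their own cluster threshold in every context, while the single global cutoff discards them everywhere, which is the failure mode the abstract attributes to single-threshold approaches.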


On the inconsistent treatment of gene-protein-reaction rules in context-specific metabolic models

10.1101/593277 ◽

2019 ◽

Author(s):

Miguel Ponce-de-León ◽

Iñigo Apaolaza ◽

Alfonso Valencia ◽

Francisco J. Planes

Keyword(s):

Gene Expression ◽

Metabolic Model ◽

Specific Model ◽

Omics Data ◽

Model Reconstruction ◽

Reconstruction Algorithms ◽

Reconstruction Methods ◽

Metabolic Models ◽

Context Specific ◽

Genome Scale

Abstract With the publication of high-quality genome-scale metabolic models for several organisms, the Systems Biology community has developed a number of algorithms for their analysis, making use of ever-growing -omics data. In particular, the reconstruction of the first genome-scale human metabolic model, Recon1, promoted the development of Context-Specific Model (CS-Model) reconstruction methods. This family of algorithms aims to identify the set of metabolic reactions that are active in a cell in a given condition using omics data, such as gene expression levels. Different CS-Model reconstruction algorithms have their own strengths and weaknesses depending on the problem under study and the omics data available. However, after careful inspection, we found that all of these algorithms share common issues in the way GPR rules and gene expression data are treated. The first issue is related to how gapfilling reactions are managed after the reconstruction is conducted. The second issue concerns the molecular context, which is used to build the CS-Model but neglected in subsequent analyses. To evaluate the effect of these issues, we reconstructed ~400 CS-Models of cancer cell lines and conducted gene essentiality analysis, using CRISPR-Cas9 essentiality data for validation purposes. Altogether, our results illustrate the importance of correcting the errors introduced during the GPR translation in many of the published metabolic reconstructions.
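For readers unfamiliar with gene-protein-reaction (GPR) rules, the sketch below shows one common way such a rule is evaluated. It does not reproduce the corrections proposed in the paper. The nested-tuple rule format, the min/max convention for AND/OR, and the gene names are all assumptions made for illustration.

```python
def gpr_score(rule, expression):
    """Score a GPR rule from gene-level values.

    AND (subunits of one complex) takes the minimum of its operands;
    OR (isozymes) takes the maximum. This min/max mapping is a convention
    assumed here, not something the abstract specifies.
    """
    if isinstance(rule, str):  # leaf: a gene identifier
        return expression[rule]
    op, *operands = rule
    scores = [gpr_score(r, expression) for r in operands]
    if op == "and":
        return min(scores)
    if op == "or":
        return max(scores)
    raise ValueError(f"unknown operator: {op!r}")


def genes_in(rule):
    """Collect every gene identifier that appears in a rule."""
    if isinstance(rule, str):
        return {rule}
    _, *operands = rule
    return set().union(*(genes_in(r) for r in operands))


def survives_knockout(rule, knocked_out):
    """True if the rule can still be satisfied after deleting one gene."""
    presence = {g: (0 if g == knocked_out else 1) for g in genes_in(rule)}
    return gpr_score(rule, presence) > 0


# Hypothetical rule: (geneA AND geneB) OR geneC
rule = ("or", ("and", "geneA", "geneB"), "geneC")
expression = {"geneA": 5.0, "geneB": 1.0, "geneC": 3.0}

reaction_score = gpr_score(rule, expression)
complex_only = ("and", "geneA", "geneB")

print("reaction score:", reaction_score)
print("survives geneA knockout (with isozyme):", survives_knockout(rule, "geneA"))
print("survives geneA knockout (complex only):", survives_knockout(complex_only, "geneA"))
```

The same rule structure drives both the expression mapping and the in silico knockout, so an error in how a rule is translated affects both the reconstructed model and any gene essentiality prediction made from it.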


Increasing consensus of context-specific metabolic models by integrating data-inferred cell functions

PLoS Computational Biology ◽

10.1371/journal.pcbi.1006867 ◽

2019 ◽

Vol 15 (4) ◽

pp. e1006867 ◽

Cited By ~ 18

Author(s):

Anne Richelle ◽

Austin W. T. Chiang ◽

Chih-Chung Kuo ◽

Nathan E. Lewis

Keyword(s):

Cell Functions ◽

Metabolic Models ◽

Context Specific


A flexible ontology for inference of emergent whole cell function from relationships between subcellular processes

10.1101/112201 ◽

2017 ◽

Author(s):

Jens Hansen ◽

David Meretzky ◽

Simeneh Woldesenbet ◽

Gustavo Stolovitzky ◽

Ravi Iyengar

Keyword(s):

Cell Biology ◽

Cell Function ◽

Biological Knowledge ◽

Dependent Manner ◽

Whole Cell ◽

Cell Functions ◽

Cell Responses ◽

Growing Cell ◽

Cell Ontology ◽

Context Specific

Abstract Whole cell responses arise from coordinated interactions between diverse human gene products functioning within various pathways underlying sub-cellular processes (SCPs). Lower-level SCPs interact to form higher-level SCPs, often in a context-specific manner, to give rise to whole cell function. We sought to determine if capturing such relationships enables us to describe the emergence of whole cell functions from interacting SCPs. We developed the “Molecular Biology of the Cell” ontology based on standard cell biology and biochemistry textbooks and review articles. Currently, our ontology contains 5,385 genes, 753 SCPs and 19,180 expertly curated gene-SCP associations. Our algorithm to populate the SCPs with genes enables extension of the ontology on demand and the adaptation of the ontology to the continuously growing cell biological knowledge. Since whole cell responses most often arise from the coordinated activity of multiple SCPs, we developed a dynamic enrichment algorithm that flexibly predicts SCP-SCP relationships beyond the current taxonomy. This algorithm enables us to identify interactions between SCPs as a basis for higher-order function in a context-dependent manner, allowing us to provide a detailed description of how SCPs together can give rise to whole cell functions. We conclude that this ontology can, from omics data sets, enable the development of detailed multidimensional SCP networks for predictive modeling of emergent whole cell functions.


Direct LC-MS/MS Analysis of Extra- and Intracellular Glycerophosphoinositol in Model Cancer Cell Lines

Frontiers in Immunology ◽

10.3389/fimmu.2021.646681 ◽

2021 ◽

Vol 12 ◽

Author(s):

Ana Margarida Campos ◽

Genoveffa Nuzzo ◽

Alessia Varone ◽

Paola Italiani ◽

Diana Boraschi ◽

...

Keyword(s):

Cell Lines ◽

Mammalian Cells ◽

Solid Phase ◽

Cell Types ◽

Eukaryotic Cell ◽

Water Soluble ◽

Extracellular Milieu ◽

Cell Functions

Glycerophosphoinositols (GPIs) are water-soluble bioactive phospholipid derivatives of increasing interest as intracellular and paracrine mediators of eukaryotic cell functions. The most representative compound of the family is glycerophosphoinositol (GroPIns), a ubiquitous component of mammalian cells that participates in cell proliferation, cell survival and cell response to stimuli. Levels and activity of this compound vary among cell types, and deciphering these functions requires accurate measurements in in vitro and in vivo models. The conventional approaches for the analysis of GroPIns pose several issues in terms of sensitivity and product resolution, especially when the product is in the extracellular milieu. Here we present a UPLC-MS study for the quantitative analysis of this lipid derivative in cells and, for the first time, culture supernatants. The method is based on a solid-phase extraction that allows for fast desalting and analyte concentration. The robustness of the procedure was tested on the simultaneous measurement of intra- and extracellular levels of GroPIns in a number of human cell lines, which showed that non-transformed cells are characterized by high extracellular levels of GroPIns, whereas tumor cells tended to have higher intracellular levels.


A pipeline for the reconstruction and evaluation of context-specific human metabolic models at a large-scale

10.1101/2021.07.22.453372 ◽

2021 ◽

Author(s):

Vítor Vieira ◽

Jorge Ferreira ◽

Miguel Rocha

Keyword(s):

Cell Line ◽

Cell Lines ◽

Cancer Cell ◽

Large Scale ◽

Breast Cancer Cell Lines ◽

Mathematical Framework ◽

Cancer Aggressiveness ◽

Transcriptomics Data ◽

Metabolic Models ◽

Context Specific

Constraint-based (CB) metabolic models provide a mathematical framework and scaffold for in silico cell metabolism analysis and manipulation. In the past decade, significant efforts have been made to model human metabolism, enabled by the increased availability of multi-omics datasets and curated genome-scale reconstructions, as well as the development of several algorithms for context-specific model (CSM) reconstruction. Although CSM reconstruction has revealed insights into the deregulated metabolism of several pathologies, the process of reconstructing representative models of human tissues still lacks benchmarks and appropriate integrated software frameworks, since many tools required for this process are still dispersed across various software platforms, some of which are proprietary. In this work, we address this challenge by assembling a scalable CSM reconstruction pipeline capable of integrating transcriptomics data in CB models. We combined omics preprocessing methods inspired by previous efforts with in-house implementations of existing CSM algorithms and new model refinement and validation routines, all implemented in the Troppo Python-based open-source framework. The pipeline was validated with multi-omics datasets from the Cancer Cell Line Encyclopedia (CCLE), also including reference fluxomics measurements for the MCF7 cell line. We reconstructed over 6000 models based on the Human-GEM template model for 733 cell lines featured in the CCLE, using MCF7 models as reference to find the best parameter combinations. These reference models outperform those of earlier studies using the same template when compared against gene essentiality and fluxomics experiments. We also analysed the heterogeneity of breast cancer cell lines, identifying key changes in metabolism related to cancer aggressiveness. Despite the many challenges in CB modelling, we demonstrate using our pipeline that combining transcriptomics data in metabolic models can be used to investigate key metabolic shifts. Significant limitations were found in these models' ability to provide reliable quantitative flux predictions, thus motivating further work in genome-wide phenotype prediction.


On the limits of active module identification

Briefings in Bioinformatics ◽

10.1093/bib/bbab066 ◽

2021 ◽

Author(s):

Olga Lazareva ◽

Jan Baumbach ◽

Markus List ◽

David B Blumenthal

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Small Diameter ◽

Extensive Study ◽

Biological Knowledge ◽

Expression Data ◽

Module Identification ◽

Ppi Networks ◽

Novel Algorithms ◽

Context Specific

Abstract In network and systems medicine, active module identification methods (AMIMs) are widely used for discovering candidate molecular disease mechanisms. To this end, AMIMs combine network analysis algorithms with molecular profiling data, most commonly, by projecting gene expression data onto generic protein–protein interaction (PPI) networks. Although active module identification has led to various novel insights into complex diseases, there is increasing awareness in the field that the combination of gene expression data and PPI network is problematic because up-to-date PPI networks have a very small diameter and are subject to both technical and literature bias. In this paper, we report the results of an extensive study where we analyzed for the first time whether widely used AMIMs really benefit from using PPI networks. Our results clearly show that, except for the recently proposed AMIM DOMINO, the tested AMIMs do not produce biologically more meaningful candidate disease modules on widely used PPI networks than on random networks with the same node degrees. AMIMs hence mainly learn from the node degrees and mostly fail to exploit the biological knowledge encoded in the edges of the PPI networks. This has far-reaching consequences for the field of active module identification. In particular, we suggest that novel algorithms are needed which overcome the degree bias of most existing AMIMs and/or work with customized, context-specific networks instead of generic PPI networks.
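The comparison against "random networks with the same node degrees" can be made concrete with a double edge swap, a standard way to randomize a network while preserving every node's degree. The abstract does not say which generator the authors used, so the sketch below is only one plausible construction; the function names and the toy edge list are hypothetical.

```python
import random


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Rewire an undirected simple graph by double edge swaps,
    keeping every node's degree unchanged."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = 0
    attempts = 0
    # Cap the number of attempts so the loop always terminates.
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Proposed rewiring: (a, b), (c, d) -> (a, d), (c, b)
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint: skip to avoid self-loops or no-op swaps
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Count how many edges touch each node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical toy interaction network.
ppi_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P4"),
    ("P3", "P4"), ("P4", "P5"), ("P5", "P6"),
]
shuffled = degree_preserving_shuffle(ppi_edges, n_swaps=10)

print("degrees preserved:", degrees(shuffled) == degrees(ppi_edges))
```

If a module-finding method scores equally well on such a shuffled network, its output depends on node degrees rather than on which specific proteins interact, which is the degree bias the abstract describes.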


Human Immunodeficiency Viruses Pseudotyped with SARS-CoV-2 Spike Proteins Infect a Broad Spectrum of Human Cell Lines through Multiple Entry Mechanisms

Viruses ◽

10.3390/v13060953 ◽

2021 ◽

Vol 13 (6) ◽

pp. 953

Author(s):

Chuan Xu ◽

Annie Wang ◽

Ke Geng ◽

William Honnen ◽

Xuening Wang ◽

...

Keyword(s):

Cell Lines ◽

Viral Entry ◽

Broad Spectrum ◽

A549 Cells ◽

Cell Types ◽

Pseudotyped Virus ◽

Angiotensin Converting Enzyme 2 ◽

Human Immunodeficiency Viruses ◽

Tyrosine Protein Kinase ◽

Overexpressing Cell

Severe acute respiratory syndrome-related coronavirus (SARS-CoV-2), the causative agent of coronavirus disease 19 (COVID-19), enters cells through attachment to the human angiotensin converting enzyme 2 (hACE2) via the receptor-binding domain (RBD) in the surface/spike (S) protein. Several pseudotyped viruses expressing SARS-CoV-2 S proteins are available, but many of these can only infect hACE2-overexpressing cell lines. Here, we report the use of a simple, two-plasmid, pseudotyped virus system comprising a SARS-CoV-2 spike-expressing plasmid and an HIV vector with or without vpr to investigate the SARS-CoV-2 entry event in various cell lines. When an HIV vector without vpr was used, pseudotyped SARS-CoV-2 viruses produced in the presence of fetal bovine serum (FBS) were able to infect only engineered hACE2-overexpressing cell lines, whereas viruses produced under serum-free conditions were able to infect a broader range of cells, including cells without hACE2 overexpression. When an HIV vector containing vpr was used, pseudotyped viruses were able to infect a broad spectrum of cell types regardless of whether viruses were produced in the presence or absence of FBS. Infection sensitivities of various cell types did not correlate with mRNA abundance of hACE2, TMPRSS2, or TMPRSS4. Pseudotyped SARS-CoV-2 viruses and replication-competent SARS-CoV-2 virus were equally sensitive to neutralization by an anti-spike RBD antibody in cells with high abundance of hACE2. However, the anti-spike RBD antibody did not block pseudotyped viral entry into cell lines with low abundance of hACE2. We further found that CD147 was involved in viral entry in A549 cells with low abundance of hACE2. Thus, our assay is useful for drug and antibody screening as well as for investigating cellular receptors, including hACE2, CD147, and tyrosine-protein kinase receptor UFO (AXL), for the SARS-CoV-2 entry event in various cell lines.


Infection, dissemination, and transmission efficiencies of Zika virus in Aedes aegypti after serial passage in mosquito or mammalian cell lines or alternating passage in both cell types

Parasites & Vectors ◽

10.1186/s13071-021-04726-1 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Lourdes G. Talavera-Aguilar ◽

Reyes A. Murrieta ◽

Sungmin Kiem ◽

Rosa C. Cetina-Trejo ◽

Carlos M. Baak-Baak ◽

...

Keyword(s):

Aedes Aegypti ◽

Cell Lines ◽

Zika Virus ◽

Cell Types ◽

Mosquito Cell ◽

Cell Passage ◽

Genetic Changes ◽

Synonymous Substitutions ◽

Vertebrate Cell

Abstract Background: Zika virus (ZIKV) is an arthropod-borne virus (arbovirus) with an urban transmission cycle that primarily involves humans and Aedes aegypti. Evidence suggests that the evolution of some arboviruses is constrained by their dependency on alternating between disparate (vertebrate and invertebrate) hosts. The goals of this study are to compare the genetic changes that occur in ZIKV after serial passaging in mosquito or vertebrate cell lines or alternate passaging in both cell types and to compare the replication, dissemination, and transmission efficiencies of the cell culture-derived viruses in Ae. aegypti. Methods: An isolate of ZIKV originally acquired from a febrile patient in Yucatan, Mexico, was serially passaged six times in African green monkey kidney (Vero) cells or Aedes albopictus (C6/36) cells or both cell types by alternating passage. A colony of Ae. aegypti from Yucatan was established, and mosquitoes were challenged with the cell-adapted viruses. Midguts, Malpighian tubules, ovaries, salivary glands, wings/legs and saliva were collected at various times after challenge and tested for evidence of virus infection. Results: Genome sequencing revealed the presence of two non-synonymous substitutions in the premembrane and NS1 regions of the mosquito cell-adapted virus and two non-synonymous substitutions in the capsid and NS2A regions of both the vertebrate cell-adapted and alternate-passaged viruses. Additional genetic changes were identified by intrahost variant frequency analysis. Virus maintained by continuous C6/36 cell passage was significantly more infectious in Ae. aegypti than viruses maintained by alternating passage and consecutive Vero cell passage. Conclusions: Mosquito cell-adapted ZIKV displayed greater in vivo fitness in Ae. aegypti compared to the other viruses, indicating that obligate cycling between disparate hosts carries a fitness cost. These data increase our understanding of the factors that drive ZIKV adaptation and evolution and underscore the important need to consider the in vivo passage histories of flaviviruses to be evaluated in vector competence studies.


The plasminogen system and cell surfaces: evidence for plasminogen and urokinase receptors on the same cell type.

The Journal of Cell Biology ◽

10.1083/jcb.103.6.2411 ◽

1986 ◽

Vol 103 (6) ◽

pp. 2411-2420 ◽

Cited By ~ 276

Author(s):

E F Plow ◽

D E Freaney ◽

J Plescia ◽

L A Miles

Keyword(s):

Cell Line ◽

Plasminogen Activator ◽

Cell Lines ◽

Specific Binding ◽

High Capacity ◽

Fetal Lung ◽

Fibroblast Cell ◽

Cell Types ◽

Cell Surfaces ◽

Catalytically Active

The capacity of cells to interact with the plasminogen activator, urokinase, and the zymogen, plasminogen, was assessed using the promyeloid leukemic U937 cell line and the diploid fetal lung GM1380 fibroblast cell line. Urokinase bound to both cell lines in a time-dependent, specific, and saturable manner (Kd = 0.8-2.0 nM). An active catalytic site was not required for urokinase binding to the cells, and 55,000-mol-wt urokinase was selectively recognized. Plasminogen also bound to the two cell lines in a specific and saturable manner. This interaction occurred with a Kd of 0.8-0.9 µM and was of very high capacity (1.6-3.1 × 10^7 molecules bound/cell). The interaction of plasminogen with both cell types was partially sensitive to trypsinization of the cells and required an unoccupied high-affinity lysine-binding site in the ligand. When plasminogen was added to the GM1380 cells, a line with high intrinsic plasminogen activator activity, the bound ligand comprised both plasminogen and plasmin. Urokinase, in catalytically active or inactive form, enhanced plasminogen binding to the two cell lines by 1.4-3.3-fold. Plasmin was the predominant form of the bound ligand when active urokinase was added, and preformed plasmin can also bind directly to the cells. Plasmin on the cell surface was also protected from its primary inhibitor, α2-antiplasmin. These results indicate that the two cell lines possess specific binding sites for plasminogen and urokinase, and a family of widely distributed cellular receptors for these components may be considered. Endogenous or exogenous plasminogen activators can generate plasmin on cell surfaces, and such activation may provide a mechanism for arming cell surfaces with the broad proteolytic activity of this enzyme.
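As a reading aid for the Kd and capacity values, "specific and saturable" binding is commonly summarized with a one-site binding isotherm. The abstract does not state which model the authors fitted, so the form below is an assumption; B is the amount of ligand bound per cell, [L] the free ligand concentration, and B_max the binding capacity.

```latex
B([L]) = \frac{B_{\max}\,[L]}{K_d + [L]},
\qquad
B(K_d) = \frac{B_{\max}\,K_d}{K_d + K_d} = \frac{B_{\max}}{2}
```

Under this assumed model, half of the plasminogen sites would be occupied at a free plasminogen concentration equal to the reported Kd, and occupancy would approach the reported capacity only at concentrations well above it.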
