Interpretable, scalable, and transferrable functional projection of large-scale transcriptome data using constrained matrix decomposition

2021
Author(s):
Nicholas Panchy
Kazuhide Watanabe
Tian Hong

Large-scale transcriptome data, such as single-cell RNA-sequencing data, have provided unprecedented resources for studying biological processes at the systems level. Numerous dimensionality reduction methods have been developed to visualize and analyze these transcriptome data. In addition, several existing methods allow inference of functional variations among samples using gene sets with known biological functions. However, it remains challenging to analyze transcriptomes with reduced dimensions that are both interpretable in terms of the dimensions’ directionalities and transferrable to new data. In this study, we used gene set non-negative principal component analysis (gsPCA) and non-negative matrix factorization (gsNMF) to analyze large-scale transcriptome datasets. We found that these methods provide low-dimensional information about the progression of biological processes in a quantitative manner, and their performance is comparable to that of existing functional variation analysis methods in distinguishing multiple cell states and samples from multiple conditions. Remarkably, upon training with a subset of data, these methods allow predictions of locations in the functional space using data from experimental conditions that were not exposed to the models. Specifically, our models predicted the extent of progression and reversion for cells in the epithelial-mesenchymal transition (EMT) continuum. These methods revealed a conserved EMT program among multiple types of single cells and tumor samples. Finally, we provide several recommendations on the choice between the two linear methods and the optimal algorithmic parameters. Our methods show that simple constrained matrix decomposition can produce low-dimensional information in a functionally interpretable and transferrable space, and can be widely useful for analyzing large-scale transcriptome data.

2021
Vol 12
Author(s):
Nicholas Panchy
Kazuhide Watanabe
Tian Hong

Large-scale transcriptome data, such as single-cell RNA-sequencing data, have provided unprecedented resources for studying biological processes at the systems level. Numerous dimensionality reduction methods have been developed to visualize and analyze these transcriptome data. In addition, several existing methods allow inference of functional variations among samples using gene sets with known biological functions. However, it remains challenging to analyze transcriptomes with reduced dimensions that are interpretable in terms of the dimensions’ directionalities, are transferrable to new data, and directly expose the contribution or association of individual genes. In this study, we used gene set non-negative principal component analysis (gsPCA) and non-negative matrix factorization (gsNMF) to analyze large-scale transcriptome datasets. We found that these methods provide low-dimensional information about the progression of biological processes in a quantitative manner, and their performance is comparable to that of existing functional variation analysis methods in distinguishing multiple cell states and samples from multiple conditions. Remarkably, upon training with a subset of data, these methods allow predictions of locations in the functional space using data from experimental conditions that were not exposed to the models. Specifically, our models predicted the extent of progression and reversion for cells in the epithelial-mesenchymal transition (EMT) continuum. These methods revealed a conserved EMT program among multiple types of single cells and tumor samples. Finally, we demonstrate that this approach is broadly applicable to data and gene sets beyond EMT and provide several recommendations on the choice between the two linear methods and the optimal algorithmic parameters. Our methods show that simple constrained matrix decomposition can produce low-dimensional information in a functionally interpretable and transferrable space, and can be widely useful for analyzing large-scale transcriptome data.
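The train-then-project idea in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes a rank-1 non-negative factorization fitted with standard multiplicative updates, and every function name, the toy expression values, and the single "program" pattern are hypothetical. The gene loadings are learned from a training subset, then held fixed while scores are computed for samples the model never saw.

```python
import random

EPS = 1e-9  # guards against division by zero in the updates


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def _ratio_update(current, num, den):
    """Elementwise multiplicative update: current * num / den."""
    return [[c * n / (d + EPS) for c, n, d in zip(cr, nr, dr)]
            for cr, nr, dr in zip(current, num, den)]


def update_scores(x, w, h):
    """Update sample scores W with gene loadings H held fixed."""
    ht = transpose(h)
    return _ratio_update(w, matmul(x, ht), matmul(matmul(w, h), ht))


def update_loadings(x, w, h):
    """Update gene loadings H with sample scores W held fixed."""
    wt = transpose(w)
    return _ratio_update(h, matmul(wt, x), matmul(matmul(wt, w), h))


def fit_nmf(x, k=1, n_iter=200, seed=0):
    """Factor a non-negative samples-by-genes matrix as X ~ W H."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in x]
    h = [[rng.random() + 0.1 for _ in x[0]] for _ in range(k)]
    for _ in range(n_iter):
        w = update_scores(x, w, h)
        h = update_loadings(x, w, h)
    return w, h


def project(x_new, h, n_iter=200):
    """Score unseen samples in the trained space; H is not modified."""
    w = [[1.0] * len(h) for _ in x_new]
    for _ in range(n_iter):
        w = update_scores(x_new, w, h)
    return w


# Toy data (made up): four genes from one hypothetical gene set, with each
# sample expressing the same pattern scaled by its degree of progression.
pattern = [1.0, 2.0, 0.5, 1.5]
train = [[t * g for g in pattern] for t in (1.0, 2.0, 3.0)]
unseen = [[t * g for g in pattern] for t in (2.5, 0.5)]

train_scores, loadings = fit_nmf(train, k=1)
new_scores = project(unseen, loadings)

print("training scores:", [round(s[0], 3) for s in train_scores])
print("projected scores:", [round(s[0], 3) for s in new_scores])
```

Because the loadings and scores are constrained to be non-negative, a larger score can only mean more of the learned program, which is one way to read the "directionality" the abstract refers to. Real data would need normalization and more than one component; this sketch skips both.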


2021
Vol 12 (2)
Author(s):
Xiaoli Liu
Zuwei Yin
Linping Xu
Huaimin Liu
Lifeng Jiang
...  

Long noncoding RNAs (lncRNAs) play crucial roles in regulating a variety of biological processes in lung adenocarcinoma (LUAD). In our study, we mainly explored the functional roles of a novel lncRNA, long intergenic non-protein coding RNA 1426 (LINC01426), in LUAD. Bioinformatics analysis showed that LINC01426 expression was upregulated in LUAD tissue. Functionally, silencing of LINC01426 clearly suppressed the proliferation, migration, epithelial–mesenchymal transition (EMT), and stemness of LUAD cells. We then observed that LINC01426 functioned through the hedgehog pathway in LUAD. The effect of LINC01426 knockdown could be fully reversed by adding the hedgehog pathway activator SAG. In addition, we showed that LINC01426 did not affect SHH transcription or its mRNA level. Pull-down silver staining and RIP assays revealed that LINC01426 could interact with USP22. Ubiquitination assays showed that LINC01426 and USP22 modulated SHH ubiquitination levels. Rescue assays verified that SHH overexpression rescued the cell growth, migration, and stemness suppressed by LINC01426 silencing. In conclusion, LINC01426 promotes LUAD progression by recruiting USP22 to stabilize SHH protein and thus activate the hedgehog pathway.


Author(s):  
Jinfen Wei
Zixi Chen
Meiling Hu
Ziqing He
Dawei Jiang
...  

Hypoxia is a characteristic of the tumor microenvironment (TME) and a major contributor to tumor progression. Yet the subtypes of tumor-associated non-malignant cells at single-cell resolution, and how they influence cancer progression in the hypoxic TME, remain largely unexplored. Here, we used RNA-seq data of 424,194 single cells from 108 patients to identify the subtypes of cancer cells, stromal cells, and immune cells; to evaluate their hypoxia scores; and to uncover potential interaction signals between these cells in vivo across six cancer types. We identified an SPP1+ tumor-associated macrophage (TAM) subpopulation that potentially enhanced epithelial–mesenchymal transition (EMT) by interacting with cancer cells in a paracrine manner. We prioritized SPP1 as a TAM-secreted factor acting on cancer cells and found that recombinant SPP1 protein significantly enhanced the migration and invasion of A549 lung cancer cells. In addition, prognostic analysis indicated that higher SPP1 expression was associated with worse clinical outcome in six cancer types. SPP1 expression was higher in hypoxia-high macrophages based on single-cell data, which was further validated by an in vitro experiment showing that SPP1 was upregulated in macrophages cultured under hypoxic compared with normoxic conditions. Additionally, a differential analysis demonstrated that hypoxia potentially influences extracellular matrix remodeling, glycolysis, and interleukin-10 signal activation in various cancer types. Our work clarifies the mechanisms underlying the intricate interactions between different cell subtypes within the hypoxic TME and proposes guidelines for the development of therapeutic targets specifically for patients with a high proportion of SPP1+ TAMs in hypoxic lesions.


2021
Author(s):
Ruoyan Li
John R. Ferdinand
Kevin W. Loudon
Georgina S. Bowyer
Lira Mamanova
...  

Tumour behaviour is dependent on the oncogenic properties of cancer cells and their multi-cellular interactions. These dependencies were examined through 270,000 single cell transcriptomes and 100 micro-dissected whole exomes obtained from 12 patients with kidney tumours. Tissue was sampled from multiple regions of tumour core, tumour-normal interface, normal surrounding tissues, and peripheral blood. We found that the principal spatial location of CD8+ T cell clonotypes largely defined exhaustion state, with clonotypic heterogeneity not explained by somatic intra-tumoural heterogeneity. De novo mutation calling from single cell RNA sequencing data allows us to lineage-trace and infer clonality of cells. We discovered six meta-programmes that distinguish tumour cell function. An epithelial-mesenchymal transition meta-programme, enriched at the tumour-normal interface, appears modulated through macrophage-expressed IL1B, potentially forming a therapeutic target.


2020
Author(s):
Gregor Sturm
Tamas Szabo
Georgios Fotakis
Marlene Haider
Dietmar Rieder
...  

Summary: Advances in single-cell technologies have enabled the investigation of T cell phenotypes and repertoires at unprecedented resolution and scale. Bioinformatic methods for the efficient analysis of these large-scale datasets are instrumental for advancing our understanding of adaptive immune responses in cancer, but also in infectious diseases like COVID-19. However, while well-established solutions are accessible for the processing of single-cell transcriptomes, no streamlined pipelines are available for the comprehensive characterization of T cell receptors. Here we propose Scirpy, a scalable Python toolkit that provides simplified access to the analysis and visualization of immune repertoires from single cells and seamless integration with transcriptomic data. Availability and implementation: Scirpy source code and documentation are available at https://github.com/icbi-lab/scirpy.


2017
Author(s):
Akpéli V. Nordor
Djamel Nehar-Belaid
Sophie Richon
David Klatzmann
Dominique Bellet
...  

Background: The placenta relies on phenotypes that are characteristic of cancer to successfully implant the embryo in the uterus during early pregnancy. Notably, it has to invade its host tissues, promote angiogenesis, while surviving hypoxia, and escape the immune system. Similarities in DNA methylation patterns between the placenta and cancers suggest that common epigenetic mechanisms may be involved in regulating these behaviors. Results: We show here that megabase-scale patterns of hypomethylation distinguish first from third trimester chorionic villi in the placenta, and that these patterns mirror those that distinguish many tumors from corresponding normal tissues. We confirmed these findings in villous cytotrophoblasts isolated from the placenta and identified a time window at the end of the first trimester, when these cells come into contact with maternal blood, as the likely time period for the methylome alterations. Furthermore, the large genomic regions affected by these patterns of hypomethylation encompass genes involved in pathways related to epithelial-mesenchymal transition (EMT), immune response and inflammation. Analyses of expression profiles corresponding to genes in these hypomethylated regions in colon adenocarcinoma tumors point to networks of differentially expressed genes previously implicated in carcinogenesis and placentogenesis, where nuclear factor kappa B (NF-kB) is a key hub. Conclusion: Taken together, our results suggest the existence of epigenetic switches involving large-scale changes of methylation in the placenta during pregnancy and in tumors during neoplastic transformation. The characterization of such epigenetic switches might lead to the identification of biomarkers and drug targets in oncology as well as in obstetrics and gynecology.


2016
Author(s):  
Klaus Fiedler

Alpha 1-6 fucosyltransferase (Fut8) is known for its properties as an enhancer of nonsmall cell lung cancer metastasis and as a suppressor in hepatocellular carcinoma cells (Hep3B). Promising candidates of affected molecules include E-cadherin. In its absence, during epithelial-mesenchymal transition, the pathway triggers signaling to the nucleus via β-catenin-TCF/LEF. Contrarily, in less metastatic tumors, Fut8 stimulates cell-cell adhesion. Regulated classes of molecules could also include the sorting machinery of polarized epithelial cells, sorted ligands or both, that may be altered in cellular transformation. I have analyzed here the cargo receptor VIP36 (Vesicular-integral membrane protein of 36 kD) for carbohydrate interaction. It has been described as a lectin in the ERGIC (endoplasmic reticulum-Golgi intermediate compartment), Golgi apparatus and plasma membrane. The docking reveals top-interacting carbohydrates of the N-glycan and O-glycan class that encompass N-linked glycans of high mannose and equally complex type which likely function as sorted ligands in epithelial cells. O-glycans score lower and include core 2 residue binding. I show that fucose core modifications by Fut8 stimulate binding of N-linked glycans to VIP36, which is known to be different from binding of galectins 3 and 9. This suggests that Fut8-upregulation may directly alter the affinity of sorted cargo and may enhance the sorting to the apical pathway as exemplified in hepatocytes and traffic to bile. High affinity binding of the ganglioside GM1 carbohydrate headgroup to VIP36 suggests a linkage with protein and glycosphingolipid apical transfer in epithelial cells. Thus, this fundamental approach with large-scale docking of 165 carbohydrates, including 19 N-glycan high mannose, 17 N-glycan hybrid, 9 N-glycan complex, 17 O-glycan core, 27 sialoside, 25 fucoside and 51 other glycan residues, suggests that linked cargo-receptor apical transport may provide a path to epithelial polarization that may be modulated by core fucosylation.


2020
Author(s):
Xiaohong Hou
Guiyin Zhou
Yinchun Fan
Qiang Zhang
Chengming Xiang
...  

Background: Glioblastoma (GBM) is one of the most malignant tumors that can afflict the central nervous system. Previous studies have observed individual differences in the treatment response to immune checkpoint inhibitors in glioblastoma. This study aims to ascertain the factors that may affect the efficacy of immunosuppressant therapy. Methods: The clinical data of this study were obtained from a public database. The data were then analyzed and processed with R software and corresponding R packages. To verify the results of the analysis, information was gathered from 89 GBM patients in our hospital, and the corresponding paraffin sections were stained and quantitatively analyzed by immunohistochemistry. Results: Both CD276 and HAVCR2 were significantly overexpressed in GBM and could be associated with patient prognosis. Analysis of single-cell RNA sequencing data and GBM data identified an immune subtype with poor prognosis. Further analysis found that high expression of CD276, HAVCR2 and CD163 was closely related to epithelial-mesenchymal transition (EMT) and could affect the prognosis of patients with high PD-L1 expression. GSVA enrichment analysis showed that CD276, HAVCR2 and CD163 might induce EMT via the JAK-STAT3 signaling pathway, and that RUNX1 and IKZF1 might be transcription factors that regulate high CD276/HAVCR2 expression. Conclusions: We found an immune subtype of GBM with poor prognosis; high expression of CD276, HAVCR2 and CD163 is closely related to EMT and may be one of the factors affecting the efficacy of anti-PD-L1 therapy.


Author(s):  
Nishanth Belugali Nataraj
Ilaria Marrocco
Yosef Yarden

Cancer is initiated largely by specific cohorts of genetic aberrations, which are generated by mutagens and often mimic active growth factor receptors, or downstream effectors. Once initiated cells outgrow and attract blood vessels, a multi-step process, called metastasis, disseminates cancer cells primarily through vascular routes. The major steps of the metastatic cascade comprise intravasation into blood vessels, circulation as single or collectives of cells, and eventual colonization of distant organs. Herein, we consider metastasis as a multi-step process that seized principles and molecular players employed by physiological processes, such as tissue regeneration and migration of neural crest progenitors. Our discussion contrasts the irreversible nature of mutagenesis, which establishes primary tumors, and the reversible epigenetic processes (e.g. epithelial–mesenchymal transition) underlying the establishment of micro-metastases and secondary tumors. Interestingly, analyses of sequencing data from untreated metastases inferred depletion of putative driver mutations among metastases, in line with the pivotal role played by growth factors and epigenetic processes in metastasis. Conceivably, driver mutations may not confer the same advantage in the microenvironment of the primary tumor and of the colonization site, hence phenotypic plasticity rather than rigid cellular states hardwired by mutations becomes advantageous during metastasis. We review the latest reported examples of growth factors harnessed by the metastatic cascade, with the goal of identifying opportunities for anti-metastasis interventions. In summary, because the overwhelming majority of cancer-associated deaths are caused by metastatic disease, understanding the complexity of metastasis, especially the roles played by growth factors, is vital for preventing, diagnosing and treating metastasis.

