Detecting Interactive Gene Groups for Single-Cell RNA-Seq Data Based on Co-Expression Network Analysis and Subgraph Learning

Cells ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1938 ◽  
Author(s):  
Xiucai Ye ◽  
Weihang Zhang ◽  
Yasunori Futamura ◽  
Tetsuya Sakurai

High-throughput sequencing technologies have enabled the generation of single-cell RNA-seq (scRNA-seq) data, which can be used to explore both genetic heterogeneity and phenotypic variation between cells. Several methods have been proposed to detect the genes that cause cell-to-cell variability, with the aim of understanding tumor heterogeneity. However, most existing methods detect the related genes separately, without considering gene interactions. In this paper, we proposed a novel learning framework to detect interactive gene groups in scRNA-seq data based on co-expression network analysis and subgraph learning. We first utilized spectral clustering to identify the subpopulations of cells. For each cell subpopulation, the differentially expressed genes were then selected to construct a gene co-expression network. Finally, the interactive gene groups were detected by learning the dense subgraphs embedded in the gene co-expression networks. We applied the proposed learning framework to a real cancer scRNA-seq dataset to detect interactive gene groups of different cancer subtypes. Systematic gene ontology enrichment analysis was performed to examine the detected gene groups by summarizing the key biological processes and pathways. Our analysis shows that different subtypes exhibit distinct gene co-expression networks and interactive gene groups with different functional enrichment. The interactive genes are expected to yield important references for understanding tumor heterogeneity.
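The last two steps of this framework (building a co-expression network, then looking for a dense subgraph) can be illustrated with a small sketch. The abstract does not say how the network is thresholded or which subgraph-learning method is used, so everything here is an assumption for illustration: the function names, the absolute Pearson cutoff of 0.8, and the use of greedy min-degree peeling as a stand-in for the authors' subgraph learning.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant gene carries no co-expression signal
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expr, threshold=0.8):
    """Connect two genes when |correlation| >= threshold (assumed cutoff).

    expr maps gene name -> list of expression values across cells.
    Each undirected edge is returned once, as a name-sorted tuple.
    """
    genes = sorted(expr)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                edges.add((g, h))
    return edges


def densest_subgraph(edges):
    """Greedy peeling: repeatedly drop a minimum-degree node and keep the
    node set with the highest edges-per-node density seen along the way.

    Assumes each undirected edge appears exactly once in `edges`.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n_edges = len(edges)
    best_nodes = set(adj)
    best_density = n_edges / len(adj) if adj else 0.0
    while len(adj) > 1:
        u = min(adj, key=lambda k: (len(adj[k]), k))  # ties broken by name
        n_edges -= len(adj[u])
        for v in adj.pop(u):
            adj[v].discard(u)
        density = n_edges / len(adj)
        if density > best_density:
            best_nodes, best_density = set(adj), density
    return best_nodes, best_density
```

In practice the input to `coexpression_edges` would be the differentially expressed genes of one cell subpopulation, so that one network and one set of gene groups is produced per subtype.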

Author(s):  
Xuepu Sun ◽  
Yu Guo ◽  
Yu Zhang ◽  
Peng Zhao ◽  
Zhaoqing Wang ◽  
...  

Transcriptomes and DNA methylation of colon cancer at the single-cell level are used to identify marker genes and improve diagnoses and therapies. Seven colon cancer subtypes are recognized based on single-cell RNA sequencing, and the differentially expressed genes regulated by dysregulated methylation are identified as marker genes for the different types of colon cancer. Compared with normal colon cells, the marker genes of the different types show marked specificity, especially the genes upregulated in tumors. Functional enrichment analysis of the marker genes indicates a possible relation between colon cancer and nervous system disease; moreover, a weak immune system is verified in colon cancer. The heightened expression of markers and the reduction of methylation in colon cancer promote tumor development through such an extensive mechanism that no biological process is enriched across the different types.


2020 ◽  
Author(s):  
Yuzhou Chang ◽  
Carter Allen ◽  
Changlin Wan ◽  
Dongjun Chung ◽  
Chi Zhang ◽  
...  

Abstract Summary Single-cell RNA-Seq (scRNA-Seq) data are useful for discovering cell heterogeneity and signature genes in specific cell populations in cancer and other complex diseases. Specifically, the investigation of functional gene modules (FGM) can help to understand gene interactive networks and complex biological processes. QUBIC2 is recognized as one of the most efficient and effective tools for FGM identification from scRNA-Seq data. However, its availability only as a C implementation has restricted its application to a few downstream analysis functionalities. We developed an R package named IRIS-FGM (Integrative scRNA-Seq Interpretation System for Functional Gene Module analysis) to support the investigation of FGMs and cell clustering using scRNA-Seq data. Empowered by QUBIC2, IRIS-FGM can effectively identify co-expressed and co-regulated FGMs, predict cell types/clusters, uncover differentially expressed genes, and perform functional enrichment analysis. It is noteworthy that IRIS-FGM can also take Seurat objects as input, which facilitates easy integration with existing analysis pipelines. Availability and Implementation IRIS-FGM is implemented in the R environment (as of version 3.6) with the source code freely available at https://github.com/OSU-BMBL/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Negin Sheybani ◽  
Mohammad Reza Bakhtiarizadeh ◽  
Abdolreza Salehi

Abstract In dairy cattle, endometritis is a severe infectious disease that occurs following parturition. It is clear that genetic factors are involved in the etiology of endometritis; however, the molecular pathogenesis of endometritis is not entirely understood. In this study, a systems biology approach was used to better understand the molecular mechanisms underlying the development of endometritis. Forty transcriptomic datasets comprising 20 RNA-Seq (GSE66825) and 20 miRNA-Seq (GSE66826) datasets were obtained from the GEO database. Next, the co-expressed modules were constructed based on RNA-Seq (Rb-modules) and miRNA-Seq (mb-modules) data, separately, using a weighted gene co-expression network analysis (WGCNA) approach. Preservation analysis was used to find the non-preserved Rb-modules in endometritis samples. Afterward, the non-preserved Rb-modules were assigned to the mb-modules to construct the integrated regulatory networks. Only highly connected genes (hubs) in the networks were considered, and functional enrichment analysis was used to identify the biological pathways associated with the development of the disease. Furthermore, additional bioinformatic analyses, including protein–protein interaction network analysis and miRNA target prediction, were applied to enhance the reliability of the results. Thirty-five Rb-modules and 10 mb-modules were identified, of which 19 and 10 modules were non-preserved, respectively; these were enriched in biological pathways related to endometritis, such as inflammation and ciliogenesis. Two non-preserved Rb-modules were significantly assigned to three mb-modules, and three and two important sub-networks in the Rb-modules were identified, respectively, including important mRNA, lncRNA and miRNA genes such as IRAK1, CASP3, CCDC40, CCDC39, ZMYND10, FOXJ1, TLR4, IL10, STAT3, FN1, AKT1, CD68, ENSBTAG00000049936, ENSBTAG00000050527, ENSBTAG00000051242, ENSBTAG00000049287, bta-miR-449, bta-miR-484, bta-miR-149, bta-miR-30b and bta-miR-423. The potential roles of these genes have been previously demonstrated in endometritis or related pathways, which reinforces the putative functions of the suggested integrated regulatory networks in endometritis pathogenesis. These findings may help further elucidate the underlying mechanisms of bovine endometritis.
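The idea of "highly connected genes (hubs)" in a weighted co-expression network can be made concrete with a short sketch. It assumes the unsigned WGCNA-style adjacency, where the absolute correlation between two genes is raised to a soft-threshold power, and it ranks genes by the sum of their adjacencies. The power of 6, the function names, and the ranking rule are illustrative assumptions; the abstract does not report the settings the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a[g][h] = |cor(g, h)| ** beta.

    expr maps gene name -> expression values across samples.
    beta is an assumed soft-threshold power, not a value from the paper.
    """
    genes = sorted(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def top_hubs(adjacency, k=1):
    """Rank genes by connectivity (sum of adjacencies) and return the top k."""
    connectivity = {g: sum(nbrs.values()) for g, nbrs in adjacency.items()}
    return sorted(connectivity, key=lambda g: (-connectivity[g], g))[:k]
```

Raising correlations to a power keeps strong co-expression close to 1 while pushing moderate correlations toward 0, which is why a gene with several near-perfect partners outranks one with many weak ones.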


2021 ◽  
Author(s):  
Weihao Chen ◽  
Zhifeng Li ◽  
Wei Sun ◽  
Mingxing Chu

Abstract Background: In sheep, FecB is an essential biomarker of fertility. Previous research has provided detailed insight into the regulation involving the estrus phase and FecB in reproduction-related tissues, including the hypothalamus, pituitary, and ovary. However, little is known about the interaction between mRNAs and lncRNAs in the sheep oviduct, which hosts embryo development and connects the ovary and the uterus. In the present study, RNA-Seq was performed to identify the transcriptomic profiles of mRNAs and lncRNAs in the oviduct during the estrus phase of sheep with FecB BB/++ genotypes. Results: In total, 21,863 lncRNAs and 43,674 mRNAs were identified; 57 DE lncRNAs and 637 DE mRNAs were revealed in the comparison between follicular phase and luteal phase, and 26 DE lncRNAs and 421 DE mRNAs were revealed in the comparison between the FecB BB genotype and the FecB ++ genotype. Functional enrichment analysis highlighted GO and KEGG terms related to reproduction, such as SAGA complex, ATP-binding cassette (ABC), Nestin, and Hippo signalling pathway. The DE-interaction network suggested that LNC_018420 may be a key regulator related to embryo development in the sheep oviduct. Conclusion: This was the first study to reveal the transcriptomic profiles of mRNAs and lncRNAs in the oviduct of FecB BB/++ sheep at the estrus phase using RNA-Seq. Our findings provide new understanding of the molecular mechanisms of mRNAs and lncRNAs underlying sheep embryo development and open new lines of investigation in sheep reproduction.


2020 ◽  
Author(s):  
Xi Pan ◽  
Jian-Hao Liu

Abstract Background Nasopharyngeal carcinoma (NPC) is a heterogeneous carcinoma for which the molecular mechanisms underlying tumor initiation, progression, and migration remain largely unclear. The purpose of the present study was to identify key biomarkers and small-molecule drugs for NPC screening, diagnosis, and therapy via gene expression profile analysis. Methods Raw microarray data of NPC were retrieved from the Gene Expression Omnibus (GEO) database and analyzed to screen out potential differentially expressed genes (DEGs). The key modules associated with histology grade and tumor stage were identified using weighted correlation network analysis (WGCNA). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of genes in the key module were performed to identify potential mechanisms. Candidate hub genes were selected based on the criteria of module membership (MM) and high connectivity. We then used receiver operating characteristic (ROC) curves to evaluate the diagnostic value of the hub genes. The Connectivity Map database was further used to screen for small-molecule drugs related to the hub genes. Results A total of 430 DEGs were identified based on two GEO datasets. The green gene module was considered the key module for the tumor stage of NPC via WGCNA. The results of functional enrichment analysis revealed that genes in the green module were enriched in regulation of cell cycle, p53 signaling pathway, and cell part morphogenesis. Furthermore, four DEG-related hub genes in the green module were considered the final hub genes. ROC analysis revealed that the final four hub genes had high areas under the curve, suggesting that these hub genes may be diagnostic biomarkers for NPC. Meanwhile, we screened out several small-molecule drugs that provide potential therapeutic targets for NPC. Conclusions Our research identified four potential prognostic biomarkers and several candidate small-molecule drugs for NPC, which may contribute new insights for NPC therapy.
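The "area under the curve" used to judge the hub genes can be computed directly from expression scores and sample labels. The sketch below uses the rank-based (Mann-Whitney) formulation of ROC AUC: the fraction of tumor/normal sample pairs in which the tumor sample has the higher score, counting ties as one half. The function name and the 1/0 label coding are assumptions for illustration, and this is not necessarily the software route the authors took.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a positive sample outscores a negative.

    scores: per-sample values (e.g. expression of one hub gene).
    labels: 1 for the positive class (e.g. tumor), 0 for the negative class.
    Tied pairs contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to a gene whose expression does not separate the two groups at all, and values near 1.0 correspond to near-perfect separation. For a gene that is downregulated in tumors the raw AUC falls below 0.5, so the scores would be negated first.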


2020 ◽  
Vol 49 (D1) ◽  
pp. D1420-D1430
Author(s):  
Dongqing Sun ◽  
Jin Wang ◽  
Ya Han ◽  
Xin Dong ◽  
Jun Ge ◽  
...  

Abstract Cancer immunotherapy targeting co-inhibitory pathways by checkpoint blockade shows remarkable efficacy in a variety of cancer types. However, only a minority of patients respond to treatment due to the stochastic heterogeneity of tumor microenvironment (TME). Recent advances in single-cell RNA-seq technologies enabled comprehensive characterization of the immune system heterogeneity in tumors but posed computational challenges on integrating and utilizing the massive published datasets to inform immunotherapy. Here, we present Tumor Immune Single Cell Hub (TISCH, http://tisch.comp-genomics.org), a large-scale curated database that integrates single-cell transcriptomic profiles of nearly 2 million cells from 76 high-quality tumor datasets across 27 cancer types. All the data were uniformly processed with a standardized workflow, including quality control, batch effect removal, clustering, cell-type annotation, malignant cell classification, differential expression analysis and functional enrichment analysis. TISCH provides interactive gene expression visualization across multiple datasets at the single-cell level or cluster level, allowing systematic comparison between different cell-types, patients, tissue origins, treatment and response groups, and even different cancer-types. In summary, TISCH provides a user-friendly interface for systematically visualizing, searching and downloading gene expression atlas in the TME from multiple cancer types, enabling fast, flexible and comprehensive exploration of the TME.


Author(s):  
Mohit Jha ◽  
Anvita Gupta ◽  
Sudha Singh ◽  
Khushhali Menaria Pandey

Co-infection with tuberculosis (TB) is the leading cause of death in human immunodeficiency virus (HIV) infected individuals. However, diagnosis of TB, particularly in the presence of an HIV co-infection, can be difficult owing to the high inaccuracy of conventional diagnostic strategies. Here we determine dysregulated pathways in TB-HIV co-infection and HIV infection using co-expression networks. Primarily, we used preservation statistics to identify gene modules that exhibit weak conservation of network topology between HIV infected and TB-HIV co-infected networks. Raw data were downloaded from Gene Expression Omnibus (GSE50834) and pre-processed. Co-expression networks for each condition (HIV infected and TB-HIV co-infected) were constructed independently. Preservation of HIV infected network edges was evaluated with respect to TB-HIV co-infected and vice versa using weighted correlation network analysis. Two out of the 22 modules were identified as exhibiting weak preservation in both conditions. Functional enrichment analysis identified that weakly preserved modules were pertinent to the condition under study. For instance, the weakly preserved TB-HIV co-infected module T1, enriched for genes associated with the mitochondrion, exhibited the highest fraction of gene interaction pairs exclusive to TB-HIV co-infection. In summary, we illustrated the use of preservation statistics to detect modules functionally linked with dysregulated pathways in disease, as exemplified by the mitochondrion module T1. Our analyses discovered gene clusters that are non-randomly linked with the disease. Highly specific gene pairs pointed to interactions between known markers of disease and favoured the identification of possible markers that are likely to be associated with the disease.


2021 ◽  
Vol 90 ◽  
pp. 107415
Author(s):  
Junyi Li ◽  
Wei Jiang ◽  
Henry Han ◽  
Jing Liu ◽  
Bo Liu ◽  
...  

2020 ◽  
Vol 36 (15) ◽  
pp. 4233-4239
Author(s):  
Di Ran ◽  
Shanshan Zhang ◽  
Nicholas Lytal ◽  
Lingling An

Abstract Motivation Single-cell RNA-sequencing (scRNA-seq) has become an important tool to unravel cellular heterogeneity, discover new cell (sub)types, and understand cell development at single-cell resolution. However, one major challenge in scRNA-seq research is the presence of ‘drop-out’ events, which are usually due to extremely low mRNA input or the stochastic nature of gene expression. In this article, we present a novel single-cell RNA-seq drop-out correction (scDoc) method, imputing drop-out events by borrowing information for the same gene from highly similar cells. Results scDoc is the first method that directly incorporates drop-out information into cell-to-cell similarity estimation, which is crucial in scRNA-seq drop-out imputation but has not been appropriately examined. We evaluated the performance of scDoc using both simulated data and real scRNA-seq studies. Results show that scDoc outperforms the existing imputation methods in terms of data visualization, cell subpopulation identification and differential expression detection in scRNA-seq data. Availability and implementation R code is available at https://github.com/anlingUA/scDoc. Supplementary information Supplementary data are available at Bioinformatics online.
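The general idea of "borrowing information for the same gene from highly similar cells" can be sketched in a few lines. This is a deliberately simplified stand-in and not the scDoc algorithm: it treats every zero as a drop-out, measures cell similarity with plain cosine similarity, and fills each zero with the similarity-weighted mean of the k most similar cells. The function names, the choice of cosine similarity, and the default k are all assumptions for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two expression vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def impute_zeros(cells, k=2):
    """Fill zero entries using the k most similar cells.

    cells: list of per-cell expression vectors sharing one gene order.
    Simplification: every zero is treated as a drop-out, whereas a real
    method would first estimate which zeros are technical.
    Returns a new matrix; the input is left unchanged.
    """
    imputed = [row[:] for row in cells]
    for i, row in enumerate(cells):
        neighbours = sorted(
            ((cosine(row, other), j) for j, other in enumerate(cells) if j != i),
            reverse=True,
        )[:k]
        total = sum(sim for sim, _ in neighbours)
        if total <= 0:
            continue  # no informative neighbours, leave the cell as observed
        for g, value in enumerate(row):
            if value == 0:
                imputed[i][g] = sum(sim * cells[j][g] for sim, j in neighbours) / total
    return imputed
```

The weakness of this naive version is the point the abstract raises: similarities computed on raw counts are themselves distorted by drop-outs, which is why a method would want to account for drop-out information when estimating cell-to-cell similarity.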


Author(s):  
Kim M. Summers ◽  
Stephen J. Bush ◽  
David A. Hume

Abstract The mononuclear phagocyte system (MPS) is a family of cells including progenitors, circulating blood monocytes, resident tissue macrophages and dendritic cells (DC) present in every tissue in the body. To test the relationships between markers and transcriptomic diversity in the MPS, we collected from NCBI-GEO >500 quality RNA-seq datasets generated from mouse MPS cells isolated from multiple tissues. The primary data were randomly down-sized to a depth of 10 million reads and requantified. The resulting dataset was clustered using the network analysis tool Graphia. A sample-to-sample matrix revealed that MPS populations could be separated based upon tissue of origin. Cells identified as classical DC subsets, cDC1 and cDC2, and lacking Fcgr1 (CD64), were centrally-located within the MPS cluster and no more distinct than other MPS cell types. A gene-to-gene correlation matrix identified large generic co-expression clusters associated with MPS maturation and innate immune function. Smaller co-expression gene clusters including the transcription factors that drive them showed higher expression within defined isolated cells, including macrophages and DC from specific tissues. They include a cluster containing Lyve1 that implies a function in endothelial cell homeostasis, a cluster of transcripts enriched in intestinal macrophages and a generic cDC cluster associated with Ccr7. However, transcripts encoding many other putative MPS subset markers including Adgre1, Itgax, Itgam, Clec9a, Cd163, Mertk, Retnla and H2-a/e (class II MHC) clustered idiosyncratically and were not correlated with underlying functions. The data provide no support for the concept of markers of M2 polarization or the specific adaptation of DC to present antigen to T cells. Co-expression of immediate early genes (e.g. Egr1, Fos, Dusp1) and inflammatory cytokines and chemokines (Tnf, Il1b, Ccl3/4) indicated that all tissue disaggregation protocols activate MPS cells. Tissue-specific expression clusters indicated that all cell isolation procedures also co-purify other unrelated cell types that may interact with MPS cells in vivo. Comparative analysis of public RNA-seq and single cell RNA-seq data from the same lung cell populations showed that the extensive heterogeneity implied by the global cluster analysis may be even greater at a single cell level with few markers strongly correlated with each other. This analysis highlights the power of large datasets to identify the diversity of MPS cellular phenotypes, and the limited predictive value of surface markers to define lineages, functions or subpopulations.
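Random down-sizing to a fixed depth, as described for the primary data, puts libraries of very different sizes on a comparable footing before correlation analysis. The authors down-sized the sequencing reads themselves and then requantified. The sketch below shows the same principle on an already-quantified count table, which is an assumption made to keep the example small: the function name, the seed parameter, and the toy counts are hypothetical.

```python
import random
from collections import Counter


def downsample_counts(counts, depth, seed=0):
    """Randomly keep `depth` reads, drawn without replacement.

    counts: dict mapping gene name -> integer read count.
    If the library already has `depth` reads or fewer it is returned unchanged.
    Expanding every read into a list is only practical for small examples.
    """
    total = sum(counts.values())
    if depth >= total:
        return dict(counts)
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    kept = Counter(random.Random(seed).sample(reads, depth))
    return {gene: kept.get(gene, 0) for gene in counts}


# Toy library of 100 reads reduced to 40; gene names are taken from the abstract.
library = {"Lyve1": 50, "Ccr7": 30, "Fcgr1": 20}
smaller = downsample_counts(library, depth=40)
```

Sampling without replacement guarantees that no gene ends up with more reads than it started with, and fixing the seed makes the down-sized table reproducible.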

