Integrative transcriptomics reveals sexually dimorphic microRNA control of the cholinergic/neurokine interface in schizophrenia and bipolar disorder

2019 ◽  
Author(s):  
Sebastian Lobentanzer ◽  
Geula Hanin ◽  
Jochen Klein ◽  
Hermona Soreq

Summary. RNA-sequencing analyses are often limited to identifying the transcripts with the lowest p-values, which does not address polygenic phenomena. To overcome this limitation, we developed an integrative approach that combines large-scale transcriptomic meta-analysis of patient brain tissues with single-cell sequencing data of CNS neurons, short RNA-sequencing of human male- and female-originated cell lines, and connectomics of transcription factor and microRNA interactions with perturbed transcripts. We used this pipeline to analyze cortical transcripts of schizophrenia and bipolar disorder patients. While these pathologies show massive transcriptional parallels, their clinically well-known sexual dimorphisms remain unexplained. Our method explicates the differences between afflicted men and women, and identifies disease-affected pathways of cholinergic transmission and gp130-family neurokine controllers of immune function, interlinked by microRNAs. This approach may open new perspectives for seeking biomarkers and therapeutic targets, also in other transmitter systems and diseases.

2020 ◽  
Author(s):  
Benedict Hew ◽  
Qiao Wen Tan ◽  
William Goh ◽  
Jonathan Wei Xiong Ng ◽  
Kenny Koh ◽  
...  

Abstract. Bacterial resistance to antibiotics is a growing problem that is projected to cause more deaths than cancer in 2050. Consequently, novel antibiotics are urgently needed. Since more than half of the available antibiotics target the bacterial ribosomes, proteins that are involved in protein synthesis are prime targets for the development of novel antibiotics. However, experimental identification of these potential antibiotic target proteins can be labor-intensive and challenging, as these proteins are likely to be poorly characterized and specific to few bacteria. In order to identify these novel proteins, we established a Large-Scale Transcriptomic Analysis Pipeline in Crowd (LSTrAP-Crowd), where 285 individuals processed 26 terabytes of RNA-sequencing data of the 17 most notorious bacterial pathogens. In total, the crowd processed 26,269 RNA-seq experiments and used the data to construct gene co-expression networks, which were used to identify more than a hundred uncharacterized genes that were transcriptionally associated with protein synthesis. We provide the identity of these genes together with the processed gene expression data. The data can be used to identify other vulnerabilities or bacteria, while our approach demonstrates how the processing of gene expression data can be easily crowdsourced.


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Francesca Pia Caruso ◽  
Luciano Garofano ◽  
Fulvio D'Angelo ◽  
Kai Yu ◽  
Fuchou Tang ◽  
...  

Abstract. Background: Single-cell RNA sequencing is the reference technique for characterizing the heterogeneity of the tumor microenvironment. The composition of the various cell types making up the microenvironment can significantly affect the way in which the immune system activates cancer rejection mechanisms. Understanding the cross-talk signals between immune cells and cancer cells is of fundamental importance for the identification of immuno-oncology therapeutic targets. Results: We present a novel method, single-cell Tumor–Host Interaction tool (scTHI), to identify significantly activated ligand–receptor interactions across clusters of cells from single-cell RNA sequencing data. We apply our approach to uncover the ligand–receptor interactions in glioma using 6 publicly available human glioma datasets encompassing 57,060 gene expression profiles from 71 patients. By leveraging this large-scale collection we show that unexpected cross-talk partners are highly conserved across different datasets in the majority of the tumor samples. This suggests that shared cross-talk mechanisms exist in glioma. Conclusions: Our results provide a complete map of the active tumor–host interaction pairs in glioma that can be therapeutically exploited to reduce the immunosuppressive action of the microenvironment in brain tumors.


2018 ◽  
Author(s):  
Xianwen Ren ◽  
Liangtao Zheng ◽  
Zemin Zhang

Abstract. Clustering is a prevalent means of analyzing single-cell RNA sequencing data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are a pressing need. Here we propose a new clustering framework based on random projection and feature construction for large-scale single-cell RNA sequencing data, which greatly improves clustering accuracy, robustness and computational efficiency for various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset with 68,578 human blood cells, our method achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while consuming only 66% of the memory used by the widely used software package SC3. Compared to k-means, the accuracy improvement can reach 3-fold depending on the dataset. An R implementation of the framework is available from https://github.com/Japrin/sscClust.
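
As a rough illustration of the random-projection idea named in this abstract, the sketch below maps high-dimensional expression vectors to a lower-dimensional space with a Gaussian random matrix, so that a downstream clustering step works on fewer features. This is a generic textbook random projection written for this listing, not the sscClust implementation; the function names `random_projection` and `euclidean` and all parameter values are assumptions.

```python
import math
import random


def random_projection(rows, k, seed=0):
    """Project each row (a list of floats) from d dimensions down to k.

    Uses a d x k matrix of Gaussian entries scaled by 1/sqrt(k), the
    standard construction under which pairwise distances are roughly
    preserved in expectation. Hypothetical helper, not part of sscClust.
    """
    rng = random.Random(seed)
    d = len(rows[0])
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [
        [sum(x * matrix[i][j] for i, x in enumerate(row)) for j in range(k)]
        for row in rows
    ]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Toy data: 5 "cells" with 200 "genes" each (values are arbitrary).
    data_rng = random.Random(42)
    cells = [[data_rng.random() for _ in range(200)] for _ in range(5)]
    reduced = random_projection(cells, k=20, seed=1)
    print("original distance :", euclidean(cells[0], cells[1]))
    print("projected distance:", euclidean(reduced[0], reduced[1]))
```

The projected distance is only an approximation of the original one, and the quality of the approximation depends on `k`; any clustering algorithm can then be run on `reduced` instead of the full matrix.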


2020 ◽  
Author(s):  
Hunyong Cho ◽  
Chuwen Liu ◽  
John S. Preisser ◽  
Di Wu

Summary. Measuring gene-gene dependence in single-cell RNA sequencing (scRNA-seq) count data is often of interest and remains challenging, because an unidentified portion of the zero counts represent non-detected RNA due to technical reasons. Conventional statistical methods that fail to account for technical zeros incorrectly measure the dependence among genes. To address this problem, we propose a bivariate zero-inflated negative binomial (BZINB) model constructed using a bivariate Poisson-gamma mixture with dropout indicators for the technical (excess) zeros. Parameters are estimated based on the EM algorithm and are used to measure the underlying dependence by decomposing the two sources of zeros. Compared to existing models, the proposed BZINB model is specifically designed for estimating dependence and is more flexible, while preserving the marginal zero-inflated negative binomial distributions. Additionally, it has a simple latent variable framework, allowing parameters to have clear and intuitive interpretations, and its computation is feasible with large-scale data. Using a recent scRNA-seq dataset, we illustrate model fitting and how the model-based measures can be different from naive measures. The inferential ability of the proposed model is evaluated in a simulation study. An R package ‘bzinb’ is available on CRAN.
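
To make the "Poisson-gamma mixture with dropout indicators" construction concrete, the sketch below simulates a single (univariate) zero-inflated negative binomial count: a dropout indicator forces a technical zero with some probability, and otherwise the count is Poisson with a gamma-distributed rate. This only illustrates the marginal distribution mentioned in the abstract; it is not the bivariate BZINB model and not the API of the `bzinb` package. The function names and parameter values are assumptions.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for small to moderate rates; slow for very large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = 1.0
    while True:
        count += 1
        product *= rng.random()
        if product <= threshold:
            return count - 1


def sample_zinb(shape, scale, dropout_prob, rng):
    """Draw one zero-inflated negative binomial count.

    With probability dropout_prob the count is a technical (excess) zero.
    Otherwise the rate is Gamma(shape, scale) and the count is Poisson
    given that rate, which is negative binomial marginally.
    """
    if rng.random() < dropout_prob:
        return 0  # technical zero from the dropout indicator
    rate = rng.gammavariate(shape, scale)
    return sample_poisson(rate, rng)  # may still be a biological zero


if __name__ == "__main__":
    rng = random.Random(7)
    counts = [sample_zinb(2.0, 1.5, 0.3, rng) for _ in range(10000)]
    zero_fraction = sum(1 for c in counts if c == 0) / len(counts)
    print("observed fraction of zeros:", zero_fraction)
```

The observed zeros mix both sources, which is why a naive dependence measure computed on raw counts can differ from one that separates technical zeros from zeros generated by the count distribution itself.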


2021 ◽  
Vol 15 (Supplement_1) ◽  
pp. S142-S143
Author(s):  
K Arnauts ◽  
C Lapierre ◽  
B Verstockt ◽  
S Verstockt ◽  
P Sudhakar ◽  
...  

Abstract. Background: Alterations in the intestinal microbiota play a pivotal role in the pathogenesis of inflammatory bowel diseases (IBD). Although there is considerable interest in correcting dysbiosis, the effects of microbial alterations are not fully understood. In addition, it is known that epithelial cells from IBD patients maintain intrinsic defects [1]. For that reason, our aim was to determine whether epithelial cells of UC patients are more sensitive to microbiota stimulation than those of non-IBD controls. Methods: Intestinal organoids of UC patients (n=8) and non-IBD controls (n=8) were grown as monolayers on Transwell inserts. Upon confluency (evaluated by transepithelial electrical resistance (TEER)), monolayers were stimulated for 24 hours with TNF-α (100 ng/ml), IL-1β (20 ng/ml) and flagellin (1 µg/ml) to mimic inflammation. Fresh fecal samples of a selected donor (n=1, high microbial cell count and presence of selected phyla [2]) and UC patients (n=3, endoscopic Mayo subscore ≥2) were filtered and stored in 0.9% NaCl. Monolayers were stimulated for 6 hours with 3×10^8 microbial cells (cell count by flow cytometry). RNA sequencing was performed with TruSeq for Illumina. Differentially expressed genes (DEG) were studied by DESeq2 (FDR <0.05). Results: Although TEER measurements indicated a higher epithelial cell permeability upon UC microbiota stimulation in UC patients compared to non-IBD controls (p=0.038; Mann-Whitney; Figure 1), we could not confirm this distinct response based on RNA sequencing data at principal component analysis (PCA). Several epithelial barrier genes were significantly upregulated between UC and non-IBD epithelium at nominal p-value, while only CLDN1 and CLDN18 were significant at FDR <0.05 (Figure 2). Clustering on PCA was driven by microbial treatment and not by epithelial origin (Figure 3). Inflamed monolayers of UC patients showed different baseline characteristics (129 DEG; e.g. HLA-G, MUC2, CLDN1, IL23A, PARP8; Figure 4A), but this did not translate into a different response upon microbiota exposure compared to non-IBD controls. Treatment with microbiota of UC patients (23 DEG; e.g. PARP9, TGFBI, ANXA13) or the selected donor (58 DEG; e.g. CCL5, CLDN18, TGFBI) only induced minor differences between epithelial cell types (Figure 4B). Conclusion: We observed no different response in epithelial cells of UC patients to microbiota stimulation compared to non-IBD epithelial cells at the transcriptomic level. Further validation on barrier integrity is needed. We observed no indications that microbial treatment would be less beneficial to UC patients, based on the epithelial cell response. Addition of (patient-specific) immune cells will contribute to unraveling host-microbiota interactions in IBD patients.


2020 ◽  
Author(s):  
Zeyu Jiao ◽  
Yinglei Lai ◽  
Jujiao Kang ◽  
Weikang Gong ◽  
Liang Ma ◽  
...  

Abstract. High-throughput technologies, such as magnetic resonance imaging (MRI) and DNA/RNA sequencing (DNA-seq/RNA-seq), have been increasingly used in large-scale association studies. With these technologies, important biomedical research findings have been generated. The reproducibility of these findings, especially from structural MRI (sMRI) and functional MRI (fMRI) association studies, has recently been questioned. There is an urgent demand for a reliable overall reproducibility assessment for large-scale high-throughput association studies. It is also desirable to understand the relationship between study reproducibility and sample size in an experimental design. In this study, we developed a novel approach: the mixture model reproducibility index (M2RI) for assessing study reproducibility of large-scale association studies. With M2RI, we performed study reproducibility analysis for several recent large sMRI/fMRI data sets. The advantages of our approach were clearly demonstrated, and the sample size requirements for different phenotypes were also clearly demonstrated, especially when compared to the Dice coefficient (DC). We applied M2RI to compare two MRI or RNA sequencing data sets. The reproducibility assessment results were consistent with our expectations. In summary, M2RI is a novel and useful approach for assessing study reproducibility, calculating sample sizes and evaluating the similarity between two closely related studies.


2018 ◽  
Author(s):  
Koen Van Den Berge ◽  
Katharina Hembach ◽  
Charlotte Soneson ◽  
Simone Tiberi ◽  
Lieven Clement ◽  
...  

Gene expression is the fundamental level at which the results of various genetic and regulatory programs are observable. The measurement of transcriptome-wide gene expression has convincingly switched from microarrays to sequencing in a matter of years. RNA sequencing (RNA-seq) provides a quantitative and open system for profiling transcriptional outcomes on a large scale and therefore facilitates a large diversity of applications, including basic science studies, but also agricultural or clinical situations. In the past 10 years or so, much has been learned about the characteristics of RNA-seq datasets as well as the performance of the myriad methods developed. In this review, we give an overall view of the developments in RNA-seq data analysis, including experimental design, with an explicit focus on quantification of gene expression and statistical approaches for differential expression. We also highlight emerging data types, such as single-cell RNA-seq and gene expression profiling using long-read technologies.


2018 ◽  
Author(s):  
Akdes Serin Harmancı ◽  
Arif O. Harmanci ◽  
Xiaobo Zhou

Abstract. RNA sequencing experiments generate large amounts of information about expression levels of genes. Although they are mainly used for quantifying expression levels, they contain much more biologically important information such as copy number variants (CNV). Here, we propose CaSpER, a signal processing approach for identification, visualization, and integrative analysis of focal and large-scale CNV events in multiscale resolution using either bulk or single-cell RNA sequencing data. CaSpER performs smoothing of the genome-wide RNA sequencing signal profiles in different multiscale resolutions, identifying CNV events at different length scales. CaSpER also employs a novel methodology for generation of a genome-wide B-allele frequency (BAF) signal profile from the reads and utilizes it in multiscale fashion for correction of CNV calls. The shift in allelic signal is used to quantify the loss-of-heterozygosity (LOH), which is valuable for CNV identification. CaSpER uses Hidden Markov Models (HMM) to assign copy number states to regions. The multiscale nature of CaSpER enables comprehensive analysis of focal and large-scale CNVs and LOH segments. CaSpER performs well in accuracy compared to gold standard SNP genotyping arrays. In particular, analysis of single-cell glioblastoma (GBM) RNA sequencing data with CaSpER reveals novel mutually exclusive and co-occurring CNV sub-clones at different length scales. Moreover, CaSpER discovers gene expression signatures of CNV sub-clones, performs gene ontology (GO) enrichment analysis and identifies potential therapeutic targets for the sub-clones. CaSpER increases the utility of RNA-sequencing datasets and complements other tools for complete characterization and visualization of the genomic and transcriptomic landscape of single-cell and bulk RNA sequencing data, especially in cancer research.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
MGP van der Wijst ◽  
DH de Vries ◽  
HE Groot ◽  
G Trynka ◽  
CC Hon ◽  
...  

In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes, many of which are obscured when using bulk methods. Rapid increase in throughput and reduction in cost per cell now allow this technology to be applied to large-scale population genetics studies. To fully leverage these emerging data resources, we have founded the single-cell eQTLGen consortium (sc-eQTLGen), aimed at pinpointing the cellular contexts in which disease-causing genetic variants affect gene expression. Here, we outline the goals, approach and potential utility of the sc-eQTLGen consortium. We also provide a set of study design considerations for future single-cell eQTL studies.


Author(s):  
Mingxuan Gao ◽  
Mingyi Ling ◽  
Xinwei Tang ◽  
Shun Wang ◽  
Xu Xiao ◽  
...  

Abstract. With the development of single-cell RNA sequencing (scRNA-seq) technology, it has become possible to perform large-scale transcript profiling for tens of thousands of cells in a single experiment. Many analysis pipelines have been developed for data generated from different high-throughput scRNA-seq platforms, which leaves users with the new challenge of choosing a workflow that is efficient, robust and reliable for a specific sequencing platform. Moreover, as the amount of public scRNA-seq data has increased rapidly, integrated analysis of scRNA-seq data from different sources has become increasingly popular. However, it remains unclear whether such integrated analysis would be biased if the data were processed by different upstream pipelines. In this study, we encapsulated seven existing high-throughput scRNA-seq data processing pipelines with Nextflow, a general integrative workflow management framework, and evaluated their performance in terms of running time, computational resource consumption and data analysis consistency using eight public datasets generated from five different high-throughput scRNA-seq platforms. Our work provides a useful guideline for the selection of scRNA-seq data processing pipelines based on their performance on different real datasets. In addition, these guidelines can serve as a performance evaluation framework for future developments in high-throughput scRNA-seq data processing.

