EST analysis of cDNA libraries from the entomopathogenic fungus Beauveria (Cordyceps) bassiana. I. Evidence for stage-specific gene expression in aerial conidia, in vitro blastospores and submerged conidia

The entomopathogenic fungus Beauveria (Cordyceps) bassiana holds much promise as a biological control agent for pests. B. bassiana produces at least three in vitro single cell infectious propagules, including aerial conidia, vegetative cells termed blastospores and submerged conidia, that display different morphological, biochemical and virulence properties. Populations of aerial conidia, blastospores and submerged conidia were produced on agar plates, rich liquid broth cultures and under conditions of nutrient limitation in submerged cultures, respectively. cDNA libraries were generated from mRNA isolated from each B. bassiana cell type and ∼2500 5′ end sequences were determined from each library. Sequences derived from aerial conidia clustered into 284 contigs and 963 singlets, with those derived from blastospores and submerged conidia forming 327 contigs and 788 singlets, and 303 contigs and 1079 singlets, respectively. Almost half (40–45 %) of the sequences in each library displayed either no significant similarity (e value >10⁻⁴) or similarity to hypothetical proteins found in the NCBI database. The expressed sequence tag dataset also included sequences representing a significant portion of proteins in cellular metabolism, information storage and processing, transport and cell processes, including cell division and posttranslational modifications. Transcripts encoding a diverse array of pathogenicity-related genes, including proteases, lipases, esterases, phosphatases and enzymes producing toxic secondary metabolites, were also identified. Comparative analysis between the libraries identified 2416 unique sequences, of which 20–30 % were unique to each library, and only ∼6 % of the sequences were shared between all three libraries. The unique and divergent representation of the B. bassiana transcriptome in the cDNA libraries from each cell type suggests robust differential gene expression profiles in response to environmental conditions.
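The library comparison described in this abstract comes down to set operations on sequence identifiers. The sketch below is only an illustration of that idea, not the authors' pipeline: the function name `compare_libraries` and the toy identifiers (`c1` to `c7`) are hypothetical, and real data would first need clustering into contigs and singlets.

```python
def compare_libraries(libraries):
    """Summarize overlap between named sets of sequence identifiers.

    `libraries` maps a library name to the set of unique sequence
    identifiers (contigs or singlets) observed in that library.
    """
    # Every identifier seen in at least one library.
    all_ids = set().union(*libraries.values())
    # Identifiers present in every library.
    shared_by_all = set.intersection(*libraries.values())
    # Identifiers found in one library and in no other.
    unique = {}
    for name, ids in libraries.items():
        others = set().union(*(v for k, v in libraries.items() if k != name))
        unique[name] = ids - others
    return all_ids, shared_by_all, unique


# Toy identifiers, invented for illustration only.
libraries = {
    "aerial_conidia": {"c1", "c2", "c3", "c7"},
    "blastospores": {"c1", "c4", "c5", "c7"},
    "submerged_conidia": {"c1", "c5", "c6"},
}
all_ids, shared, unique = compare_libraries(libraries)
for name, ids in unique.items():
    print(name, "unique fraction:", len(ids) / len(all_ids))
print("shared by all:", len(shared) / len(all_ids))
```

Reporting each count as a fraction of the pooled identifier set is one way to arrive at percentages of the kind quoted in the abstract.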

A computational method for direct imputation of cell type-specific expression profiles and cellular compositions from bulk-tissue RNA-Seq in brain disorders

10.1101/2020.05.28.121483 ◽

2020 ◽

Author(s):

Abolfazl Doostparast Torshizi ◽

Jubao Duan ◽

Kai Wang

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Complex Diseases ◽

Specific Gene ◽

Cellular Composition ◽

Rna Seq ◽

Cell Type ◽

Specific Expression ◽

Cell Type Specific Expression ◽

Cell Type Specific

Abstract The importance of cell type-specific gene expression in disease-relevant tissues is increasingly recognized in genetic studies of complex diseases. However, the vast majority of gene expression studies are conducted on bulk tissues, necessitating computational approaches to infer biological insights on cell type-specific contribution to diseases. Several computational methods are available for cell type deconvolution (that is, inference of cellular composition) from bulk RNA-Seq data, but cannot impute cell type-specific expression profiles. We hypothesize that with external prior information such as single cell RNA-seq (scRNA-seq) and population-wide expression profiles, it can be computationally tractable and identifiable to estimate both cellular composition and cell type-specific expression from bulk RNA-Seq data. Here we introduce CellR, which addresses cross-individual gene expression variations by employing genome-wide tissue-wise expression signatures from GTEx to adjust the weights of cell-specific gene markers. It then transforms the deconvolution problem into a linear programming model while taking into account inter/intra cellular correlations, and uses a multi-variate stochastic search algorithm to estimate the expression level of each gene in each cell type. Extensive analyses on several complex diseases such as schizophrenia, Alzheimer’s disease, Huntington’s disease, and type 2 diabetes validated the efficiency of CellR, while revealing how specific cell types contribute to different diseases. We conducted numerical simulations on human cerebellum to generate pseudo-bulk RNA-seq data and demonstrated its efficiency in inferring cell-specific expression profiles. Moreover, we inferred cell-specific expression levels from bulk RNA-seq data on schizophrenia and computed differentially expressed genes within certain cell types. Using the predicted gene expression profile on excitatory neurons, we were able to reproduce our recently published findings on TCF4 being a master regulator in schizophrenia and showed how this gene and its targets are enriched in excitatory neurons. In summary, CellR not only compares favorably (in both accuracy and stability of inference) against competing approaches on inferring cellular composition from bulk RNA-seq data, but also allows direct imputation of cell type-specific gene expression, opening new doors to re-analyze gene expression data on bulk tissues in complex diseases.

TMOD-13. MODELING THE GENETIC, TRANSCRIPTOMIC, AND CELLULAR HETEROGENEITY OF GLIOBLASTOMA USING TUMOR ORGANOIDS

Neuro-Oncology ◽

10.1093/neuonc/noz175.1112 ◽

2019 ◽

Vol 21 (Supplement_6) ◽

pp. vi265-vi265

Author(s):

Daniel Zhang ◽

Fadi Jacob ◽

Ryan Salinas ◽

Phuong Nguyen ◽

Guo-li Ming ◽

...

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Cellular Heterogeneity ◽

Definitive Treatment ◽

Functional Study ◽

Specific Gene ◽

Fresh Tissue ◽

Mechanistic Investigation

Abstract Glioblastoma exhibits enormous genetic, transcriptional, and cellular heterogeneity at the macroscopic level across regions of the tumor as well as at the microscopic level between neighboring cells, all of which present significant challenges towards creating a definitive treatment for this devastating disease. We have developed a method of generating glioblastoma organoids (GBOs) from fresh tissue obtained directly from surgical resection and maintaining them in a defined medium without bFGF/EGF. Whole exome sequencing revealed that GBOs maintain the genomic landscape of their parent tumors. Somatic and copy number variants are present in the GBOs at allele frequencies or copy ratios similar to those in the parent tumor, suggesting that the relative proportions of clonal populations are largely maintained in the organoids. Bulk transcriptomic analysis demonstrated strong gene expression correlations between the parent tumor and corresponding GBOs through 12 weeks of culture. Some tumors were sampled at multiple different anatomic regions, and the corresponding GBOs maintained region-specific gene expression signatures and genomic variants. EGFRvIII, a tumor-specific variant targeted in a number of emerging therapies, also remains present in the GBOs at similar transcript frequencies, reflecting the native heterogeneity of the parent tumor. Finally, we used single cell transcriptomics to examine cellular heterogeneity and found that GBOs contain many different cell types whose gene expression profiles are similar to those of the matching cell type in the corresponding parent tumor. Notably, these GBOs retain neoplastic as well as non-neoplastic cells, such as tumor associated macrophages / microglia, T-cells, endothelial cells, stromal cells, and oligodendrocytes. These GBOs preserve complex tumor heterogeneity in an in vitro environment, creating opportunities for extended manipulation, characterization, and functional study for mechanistic investigation and therapeutic testing.

SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references

Briefings in Bioinformatics ◽

10.1093/bib/bbz166 ◽

2020 ◽

Cited By ~ 13

Author(s):

Meichen Dong ◽

Aatish Thennavan ◽

Eugene Urrutia ◽

Yun Li ◽

Charles M Perou ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Mixed Cell ◽

Single Cell Rna Sequencing

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.
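The pseudo-bulk benchmarking idea mentioned in this abstract can be illustrated with a deliberately small sketch: mix two reference expression profiles at a known proportion, then recover that proportion by constrained least squares. This is not the SCDC algorithm (which handles many cell types and combines several references); the function names and the four-gene profiles below are hypothetical.

```python
def make_pseudo_bulk(profile_a, profile_b, fraction_a):
    """Mix two cell-type profiles gene by gene at a known proportion."""
    return [fraction_a * a + (1.0 - fraction_a) * b
            for a, b in zip(profile_a, profile_b)]


def estimate_fraction(bulk, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A.

    Solves min_p sum_g (bulk_g - p*a_g - (1-p)*b_g)^2 in closed form,
    then clips the result to the interval [0, 1].
    """
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical")
    numer = sum((x - b) * d for x, b, d in zip(bulk, profile_b, diff))
    return min(1.0, max(0.0, numer / denom))


# Invented four-gene reference profiles for two cell types.
alpha_cells = [10.0, 0.0, 5.0, 2.0]
beta_cells = [1.0, 8.0, 5.0, 3.0]

true_fraction = 0.3
bulk = make_pseudo_bulk(alpha_cells, beta_cells, true_fraction)
estimate = estimate_fraction(bulk, alpha_cells, beta_cells)
print("true:", true_fraction, "estimated:", round(estimate, 6))
```

Because the pseudo-bulk sample is noise-free and built from the same references, the known proportion serves as ground truth and the estimate should match it up to floating-point error. Real benchmarks add noise and use references from a different dataset.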

Digital sorting of complex tissues for cell type-specific gene expression profiles

BMC Bioinformatics ◽

10.1186/1471-2105-14-89 ◽

2013 ◽

Vol 14 (1) ◽

pp. 89 ◽

Cited By ~ 108

Author(s):

Yi Zhong ◽

Ying-Wooi Wan ◽

Kaifang Pang ◽

Lionel ML Chow ◽

Zhandong Liu

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific

New experimental and computational approaches to the analysis of gene expression.

Acta Biochimica Polonica ◽

10.18388/abp.1998_4351 ◽

1998 ◽

Vol 45 (4) ◽

pp. 929-934 ◽

Cited By ~ 20

Author(s):

J A Rafalski ◽

M Hanafey ◽

G H Miao ◽

A Ching ◽

J M Lee ◽

...

Keyword(s):

Gene Expression ◽

Data Base ◽

Developmental Stages ◽

Expressed Sequence Tag ◽

Expression Profiles ◽

Soybean Seed ◽

Fluorescent Labeling ◽

Cdna Libraries ◽

Northern Analysis ◽

Expression Levels

Public and private EST (Expressed Sequence Tag) programs provide access to a large number of ESTs from a number of plant species, including Arabidopsis, corn, soybean, rice and wheat. In addition to the homology of each EST to genes in GenBank, information about homology to all other ESTs in the data base can be obtained. To estimate expression levels of genes represented in the DuPont EST data base we count the number of times each gene has been seen in different cDNA libraries, from different tissues, developmental stages or induction conditions. This quantitation of message levels is quite accurate for highly expressed messages and, unlike conventional Northern blots, allows comparison of expression levels between different genes. Lists of the most highly expressed genes in different libraries can be compiled. Also, if EST data are available for cDNA libraries derived from different developmental stages, gene expression profiles across development can be assembled. We present an example of such a profile for soybean seed development. Gene expression data obtained from Electronic Northern analysis can be confirmed and extended beyond the realm of highly expressed genes by using high density DNA arrays. The ESTs identified as interesting can be arrayed on nylon or glass and probed with labeled total first-strand cDNA from the tissue of interest. Two-color fluorescent labeling allows accurate mRNA ratio measurements. We are currently using the DNA array technology to study chemical induction of gene expression and the biosynthesis of oil, carbohydrate and protein in developing seeds.
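The counting approach described in this abstract ("Electronic Northern") can be sketched in a few lines: tally how often each gene appears among the ESTs of a library and divide by the library size so that libraries of different depth can be compared. The sketch assumes each EST has already been assigned to a gene; the function name, gene identifiers and library names are hypothetical.

```python
from collections import Counter


def electronic_northern(est_assignments):
    """Return per-library relative EST abundance for each gene.

    `est_assignments` maps a library name to a list with one gene
    identifier per sequenced EST.
    """
    profiles = {}
    for library, genes in est_assignments.items():
        total = len(genes)
        if total == 0:
            profiles[library] = {}
            continue
        counts = Counter(genes)
        # Normalize by library size so libraries are comparable.
        profiles[library] = {gene: n / total for gene, n in counts.items()}
    return profiles


# Invented EST-to-gene assignments for two developmental stages.
ests = {
    "early_seed": ["g1", "g1", "g2", "g3"],
    "late_seed": ["g2", "g2", "g2", "g1", "g4"],
}
profiles = electronic_northern(ests)
for gene in ["g1", "g2"]:
    print(gene, [profiles[lib].get(gene, 0.0) for lib in ests])
```

As the abstract notes, such counts are only reliable for highly expressed messages. A gene seen once or twice in a library gives a very noisy estimate.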

Strategies for cellular deconvolution in human brain RNA sequencing data

10.1101/2020.01.19.910976 ◽

2020 ◽

Cited By ~ 1

Author(s):

Olukayode A. Sosina ◽

Matthew N Tran ◽

Kristen R Maynard ◽

Ran Tao ◽

Margaret A. Taub ◽

...

Keyword(s):

Gene Expression ◽

Cell Size ◽

Expression Profiles ◽

Brain Regions ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Sequencing Data ◽

Target User ◽

Size Estimates

Abstract Statistical deconvolution strategies have emerged over the past decade to estimate the proportion of various cell populations in homogenate tissue sources like brain using gene expression data. Here we show that several existing deconvolution algorithms estimate the RNA composition of homogenate tissue, which relates to the amount of RNA attributable to each cell type, and not the cellular composition, which relates to the underlying fraction of cells. Incorporating “cell size” parameters into RNA-based deconvolution algorithms can successfully recover cellular fractions in homogenate brain RNA-seq data. Lastly, we show that using both cell sizes and cell type-specific gene expression profiles from brain regions other than the target/user-provided bulk tissue RNA-seq dataset consistently results in biased cell fractions. We report several independently constructed cell size estimates as a community resource and extend the MuSiC framework to accommodate these cell size estimates (https://github.com/xuranw/MuSiC/).
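The distinction between RNA fractions and cell fractions can be made concrete with a small sketch. It assumes, for illustration only, that "cell size" means the relative amount of RNA contributed per cell of a given type. Under that assumption, dividing each RNA fraction by its cell size and renormalizing gives cell fractions. The function name and the numbers are hypothetical, and this is not the MuSiC API.

```python
def rna_to_cell_fractions(rna_fractions, cell_sizes):
    """Convert RNA fractions to cell fractions using per-type cell sizes.

    Assumes the RNA fraction of type k is proportional to
    (cell fraction of k) * (cell size of k).
    """
    scaled = {}
    for cell_type, fraction in rna_fractions.items():
        size = cell_sizes[cell_type]
        if size <= 0:
            raise ValueError("cell size must be positive")
        scaled[cell_type] = fraction / size
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("all RNA fractions are zero")
    return {cell_type: v / total for cell_type, v in scaled.items()}


# Invented values: neurons assumed to carry 4x the RNA of glia.
rna_fractions = {"neurons": 0.8, "glia": 0.2}
cell_sizes = {"neurons": 4.0, "glia": 1.0}
cell_fractions = rna_to_cell_fractions(rna_fractions, cell_sizes)
print(cell_fractions)
```

In this toy case, a sample in which 80 % of the RNA comes from neurons contains equal numbers of neurons and glia, which shows how far the two quantities can diverge.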

SCDC: Bulk Gene Expression Deconvolution by Multiple Single-Cell RNA Sequencing References

10.1101/743591 ◽

2019 ◽

Cited By ~ 1

Author(s):

Meichen Dong ◽

Aatish Thennavan ◽

Eugene Urrutia ◽

Yun Li ◽

Charles M. Perou ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Mixed Cell ◽

Single Cell Rna Sequencing

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.

Methods to analyze cell type-specific gene expression profiles from heterogeneous cell populations

Animal Cells and Systems ◽

10.1080/19768354.2016.1191544 ◽

2016 ◽

Vol 20 (3) ◽

pp. 113-117 ◽

Cited By ~ 5

Author(s):

Jane Jung ◽

Hosung Jung

Keyword(s):

Gene Expression ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Cell Populations ◽

Specific Gene ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific

Sources of Variation in Cell-Type RNA-Seq Profiles

10.21203/rs.2.23415/v1 ◽

2020 ◽

Cited By ~ 1

Author(s):

Johan Gustafsson ◽

Felix Held ◽

Jonathan Robinson ◽

Elias Björnson ◽

Rebecka Jörnsten ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Specific Gene ◽

Rna Seq ◽

Cell Type ◽

Specific Gene Expression ◽

Cell Type Specific ◽

Technical Factors

Abstract Background: Cell-type specific gene expression profiles are needed for many computational methods operating on bulk RNA-Seq samples, such as deconvolution of cell-type fractions and digital cytometry. However, the gene expression profile of a cell type can vary substantially due to both technical factors and biological differences in cell state and surroundings, reducing the efficacy of such methods. Here, we investigated which factors contribute most to this variation. Results: We evaluated different normalization methods, quantified the magnitude of variation introduced by different sources, and examined the differences between UMI-based single-cell RNA-Seq and bulk RNA-Seq. We applied methods such as random forest regression to a collection of publicly available bulk and single-cell RNA-Seq datasets containing B and T cells, and found that the technical variation across laboratories is of the same magnitude as the biological variation across cell types. Tissue of origin and cell subtype are less important but still substantial factors, while the difference between individuals is relatively small. We also show that much of the difference between UMI-based single-cell and bulk RNA-Seq methods can be explained by the number of read duplicates per mRNA molecule in the single-cell sample. Conclusions: Our work shows the importance of either matching or correcting for technical factors when creating cell-type specific gene expression profiles that are to be used together with bulk samples.
