From 1D sequence to 3D chromatin dynamics and cellular functions: a phase separation perspective

Abstract: The higher-order chromatin structure plays a non-negligible role in gene regulation. However, the mechanism by which different chromatin structures form in different cells, and the sequence dependence of this process, remain to be elucidated. As the nucleotide distributions in the human and mouse genomes are highly uneven, we identified CGI forest and prairie genomic domains based on CGI density, which segregates genomic elements along the genome better than GC content does. The genome is thereby divided into two sequentially, epigenetically, and transcriptionally distinct types of region. These two types of megabase-sized domains spatially segregate, but to a different extent in different cell types. Overall, the forests and prairies gradually segregate from each other in development, differentiation, and senescence. The multi-scale forest-prairie spatial intermingling is cell-type specific and increases in differentiation, thus helping to define cell identity. We propose that the phase separation of the 1D mosaic sequence in space, serving as a potential driving force, together with cell-type-specific epigenetic marks and transcription factors, shapes the chromatin structure in different cell types and gives them distinct genomic properties. The mosaicity of a species' genome, manifested as alternating forests and prairies, could be related to biological processes such as differentiation, aging and body temperature control.
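
The abstract contrasts CGI density with GC content as a way to partition the genome. The sketch below illustrates the quantities involved; it is not the authors' pipeline. The CpG observed/expected ratio is the standard statistic used when calling CpG islands (the widely cited Gardiner-Garden and Frommer criteria require at least 200 bp, GC content above 50%, and an observed/expected ratio above 0.6, which are not necessarily the criteria used in this study). The function names, the `cgi_starts` input, and the window size are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def windowed_counts(cgi_starts, genome_length, window):
    """Number of CGIs starting in each consecutive window (a simple density track).

    cgi_starts is assumed to be a list of 0-based start coordinates of
    already-called CpG islands; how they were called is outside this sketch.
    """
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for start in cgi_starts:
        counts[start // window] += 1
    return counts
```

A density track of this kind could then be segmented into CGI-rich and CGI-poor domains; the segmentation rule itself is not given in the abstract and is deliberately left out here.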

SeqEnhDL: sequence-based classification of cell type-specific enhancers using deep learning models

10.1101/2020.05.13.093997 ◽

2020 ◽

Author(s):

Yupeng Wang ◽

Rosario B. Jaime-Lara ◽

Abhrarup Roy ◽

Ying Sun ◽

Xinyue Liu ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

DNA Sequences ◽

Cell Types ◽

Learning Models ◽

Cell Type ◽

Coding Sequences ◽

Sequence Features ◽

Cell Type Specific ◽

Different Cell Types

Abstract: We propose SeqEnhDL, a deep learning framework for classifying cell type-specific enhancers based on sequence features. DNA sequences of “strong enhancer” chromatin states in nine cell types from the ENCODE project were retrieved to build and test enhancer classifiers. For any DNA sequence, sequential k-mer (k=5, 7, 9 and 11) fold changes relative to randomly selected non-coding sequences were used as features for deep learning models. Three deep learning models were implemented: a multi-layer perceptron (MLP), a convolutional neural network (CNN) and a recurrent neural network (RNN). All models in SeqEnhDL outperform state-of-the-art enhancer classifiers, including gkm-SVM and DanQ, at distinguishing cell type-specific enhancers from randomly selected non-coding sequences. Moreover, SeqEnhDL is able to directly discriminate enhancers from different cell types, which has not been achieved by other enhancer classifiers. Our analysis suggests that both enhancers and their tissue specificity can be accurately identified from their sequence features. SeqEnhDL is publicly available at https://github.com/wyp1125/SeqEnhDL.
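
The feature described above, k-mer fold change relative to background sequences, can be sketched as follows. This is one plausible reading of "sequential k-mer fold changes", not SeqEnhDL's actual code: each position of a sequence is mapped to the log2 ratio of its k-mer's frequency in a foreground (enhancer) set versus a background (non-coding) set. The function names, the pseudocount, and the use of log2 are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k):
    """Count all overlapping k-mers across a collection of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts, sum(counts.values())


def log2_fold_changes(seq, foreground, background, k, pseudo=1.0):
    """Per-position log2(foreground frequency / background frequency).

    A pseudocount (an assumption of this sketch) keeps the ratio finite
    for k-mers that are absent from one of the two sets.
    """
    fg_counts, fg_total = kmer_counts(foreground, k)
    bg_counts, bg_total = kmer_counts(background, k)
    n_kmers = 4 ** k  # number of possible DNA k-mers
    seq = seq.upper()
    features = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        fg = (fg_counts[kmer] + pseudo) / (fg_total + pseudo * n_kmers)
        bg = (bg_counts[kmer] + pseudo) / (bg_total + pseudo * n_kmers)
        features.append(log2(fg / bg))
    return features
```

The resulting vector keeps positional order, so it could be fed to an MLP, a CNN or an RNN; a positive value marks a k-mer enriched in the foreground set.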

Yeast alpha 2 repressor positions nucleosomes in TRP1/ARS1 chromatin.

Molecular and Cellular Biology ◽

10.1128/mcb.10.5.2247 ◽

1990 ◽

Vol 10 (5) ◽

pp. 2247-2260 ◽

Cited By ~ 77

Author(s):

S Y Roth ◽

A Dean ◽

R T Simpson

Keyword(s):

Binding Site ◽

Mating Type ◽

Chromatin Structure ◽

Nucleosome Position ◽

Cell Types ◽

Alpha Cells ◽

Cell Type ◽

Yeast Plasmid ◽

A Cells ◽

Cell Type Specific

The yeast alpha 2 repressor suppresses expression of a-mating-type-specific genes in haploid alpha and diploid a/alpha cell types. We inserted the alpha 2-binding site into the multicopy TRP1/ARS1 yeast plasmid and examined the effects of alpha 2 on the chromatin structure of the derivative plasmids in alpha cells, a cells, and a/alpha cells. Whereas no effect on nucleosome position was observed in a cells, nucleosomes were precisely and stably positioned over sequences flanking the alpha 2 operator in alpha and a/alpha cells. In addition, when the alpha 2 operator was located upstream of the TRP1 gene, an extended array of positioned nucleosomes was formed in alpha cells and a/alpha cells, with formation of a nucleosome not present in a cells, and TRP1 mRNA production was substantially reduced. These data indicate that alpha 2 causes a positioning of nucleosomes over sequences proximal to its operator in TRP1/ARS1 chromatin and suggest that changes in chromatin structure may be related to alpha 2 repression of cell-type-specific genes.

The complement of desmosomal plaque proteins in different cell types.

The Journal of Cell Biology ◽

10.1083/jcb.101.4.1442 ◽

1985 ◽

Vol 101 (4) ◽

pp. 1442-1454 ◽

Cited By ~ 122

Author(s):

P Cowin ◽

H P Kapprell ◽

W W Franke

Keyword(s):

Guinea Pig ◽

Cell Types ◽

Cardiac Cells ◽

Myocardial Tissue ◽

Cell Type ◽

Wide Range ◽

Specific Manner ◽

A Cell ◽

Cell Type Specific ◽

Different Cell Types

Desmosomal plaque proteins have been identified in immunoblotting and immunolocalization experiments on a wide range of cell types from several species, using a panel of monoclonal murine antibodies to desmoplakins I and II and a guinea pig antiserum to desmosomal band 5 protein. Specifically, we have taken advantage of the fact that certain antibodies react with both desmoplakins I and II, whereas others react only with desmoplakin I, indicating that desmoplakin I contains unique regions not present on the closely related desmoplakin II. While some of these antibodies recognize epitopes conserved between chick and man, others display a narrow species specificity. The results show that proteins whose size, charge, and biochemical behavior are very similar to those of desmoplakin I and band 5 protein of cow snout epidermis are present in all desmosomes examined. These include examples of simple and pseudostratified epithelia and myocardial tissue, in addition to those of stratified epithelia. In contrast, in immunoblotting experiments, we have detected desmoplakin II only among cells of stratified and pseudostratified epithelial tissues. This suggests that the desmosomal plaque structure varies in its complement of polypeptides in a cell-type specific manner. We conclude that the obligatory desmosomal plaque proteins, desmoplakin I and band 5 protein, are expressed in a coordinate fashion but independently from other differentiation programs of expression such as those specific for either epithelial or cardiac cells.

Detection of cell-type-specific risk-CpG sites in epigenome-wide association studies

10.1101/415109 ◽

2018 ◽

Cited By ~ 1

Author(s):

Xiangyu Luo ◽

Can Yang ◽

Yingying Wei

Keyword(s):

High Resolution ◽

Statistical Method ◽

Individual Cell ◽

Association Studies ◽

Cell Types ◽

Cell Type ◽

CpG Sites ◽

CpG Site ◽

Cell Type Specific ◽

Different Cell Types

In epigenome-wide association studies, the measured signal for each sample is a mixture of methylation profiles from different cell types. Current approaches to association detection only report whether a cytosine-phosphate-guanine (CpG) site is associated with the phenotype; they cannot determine the cell type in which the risk-CpG site is affected by the phenotype. Here, we propose a statistical method, HIgh REsolution (HIRE), which not only substantially improves the power of association detection at the aggregated level compared with existing methods but also enables the detection of risk-CpG sites for individual cell types.
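
The mixture idea in this abstract can be made concrete with a small numeric sketch. It is not the HIRE model, only the weighted-average relationship the abstract starts from: the bulk methylation level at a CpG site is the sum of cell-type-specific levels weighted by cell-type proportions. All names and numbers below are invented for illustration.

```python
def bulk_methylation(proportions, levels):
    """Bulk methylation at one CpG site as a proportion-weighted average."""
    return sum(p * m for p, m in zip(proportions, levels))


# Hypothetical sample composition: three cell types.
proportions = [0.7, 0.2, 0.1]

# Hypothetical methylation levels per cell type at one CpG site.
control_levels = [0.5, 0.5, 0.3]
case_levels = [0.5, 0.5, 0.6]  # only the third (rare) cell type changes

bulk_control = bulk_methylation(proportions, control_levels)
bulk_case = bulk_methylation(proportions, case_levels)

# The cell-type-level change is 0.3, but the bulk signal moves by only
# proportion * change = 0.1 * 0.3, and it does not say which cell type moved.
bulk_shift = bulk_case - bulk_control
```

A shift confined to a rare cell type is diluted in the aggregate signal, which is why a method that models the mixture explicitly can both gain power and attribute the association to a cell type.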

Yeast alpha 2 repressor positions nucleosomes in TRP1/ARS1 chromatin

Molecular and Cellular Biology ◽

10.1128/mcb.10.5.2247-2260.1990 ◽

1990 ◽

Vol 10 (5) ◽

pp. 2247-2260

Author(s):

S Y Roth ◽

A Dean ◽

R T Simpson

Keyword(s):

Binding Site ◽

Mating Type ◽

Chromatin Structure ◽

Nucleosome Position ◽

Cell Types ◽

Alpha Cells ◽

Cell Type ◽

Yeast Plasmid ◽

A Cells ◽

Cell Type Specific

Detecting cell-type-specific allelic expression imbalance by integrative analysis of bulk and single-cell RNA sequencing data

10.1101/2020.08.26.267815 ◽

2020 ◽

Author(s):

Jiaxin Fan ◽

Xuran Wang ◽

Rui Xiao ◽

Mingyao Li

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Cell Types ◽

Allelic Expression ◽

RNA Seq ◽

Allelic Expression Imbalance ◽

Cell Type ◽

Single Cell RNA Sequencing ◽

Cell Type Specific ◽

Different Cell Types

Abstract: Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells from a large number of individuals, and such data are still cost-prohibitive to generate. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets, and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI with cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.

Author Summary: Detection of allelic expression imbalance (AEI), a phenomenon where the two alleles of a gene differ in their expression magnitude, is a key step towards the understanding of phenotypic variations among individuals. Existing methods detect AEI using bulk RNA sequencing (RNA-seq) data and ignore AEI variations among different cell types. Although single-cell RNA sequencing (scRNA-seq) has enabled the characterization of cell-to-cell heterogeneity in gene expression, its high cost has limited its application in AEI analysis. To overcome this limitation, we developed BSCET to characterize cell-type-specific AEI using the widely available bulk RNA-seq data by integrating cell-type composition information inferred from scRNA-seq samples. Since the degree of AEI may vary with disease phenotypes, we further extended BSCET to detect genes whose cell-type-specific AEIs are associated with clinical factors. Through extensive benchmark evaluations and analyses of two pancreatic islet bulk RNA-seq datasets, we demonstrated BSCET’s ability to refine bulk-level AEI to cell-type resolution, and to identify genes whose cell-type-specific AEIs are associated with the progression of type 2 diabetes. With the vast amount of easily accessible bulk RNA-seq data, we believe BSCET will be a valuable tool for elucidating cell type contributions in human diseases.
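
The point that AEI can be missed in bulk data when its direction is opposite in different cell types can be illustrated with a toy calculation. This is not BSCET's statistical model; it only shows how a bulk allelic ratio arises as a mixture of cell-type-specific ratios. The function name and all numbers are hypothetical.

```python
def bulk_allelic_ratio(proportions, expression, ref_fraction):
    """Fraction of bulk reads from the reference allele of one gene.

    proportions  -- cell type proportions in the tissue
    expression   -- total expression of the gene in each cell type
    ref_fraction -- fraction of that expression from the reference allele
    """
    ref_reads = sum(p * e * r for p, e, r in zip(proportions, expression, ref_fraction))
    total_reads = sum(p * e for p, e in zip(proportions, expression))
    return ref_reads / total_reads


# Two equally abundant cell types with strong but opposite imbalance.
balanced_bulk = bulk_allelic_ratio(
    proportions=[0.5, 0.5],
    expression=[100.0, 100.0],
    ref_fraction=[0.8, 0.2],
)

# Same imbalance, but the first cell type now dominates the tissue.
skewed_bulk = bulk_allelic_ratio(
    proportions=[0.9, 0.1],
    expression=[100.0, 100.0],
    ref_fraction=[0.8, 0.2],
)
```

In the first case the bulk ratio is approximately 0.5, so the gene looks balanced even though each cell type is strongly imbalanced; in the second it is pulled toward the dominant cell type. Because the bulk ratio depends on composition, knowing the cell type proportions is what makes the cell-type-specific ratios recoverable from many bulk samples.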

Detecting cell-type-specific allelic expression imbalance by integrative analysis of bulk and single-cell RNA sequencing data

PLoS Genetics ◽

10.1371/journal.pgen.1009080 ◽

2021 ◽

Vol 17 (3) ◽

pp. e1009080

Author(s):

Jiaxin Fan ◽

Xuran Wang ◽

Rui Xiao ◽

Mingyao Li

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Cell Types ◽

Allelic Expression ◽

RNA Seq ◽

Allelic Expression Imbalance ◽

Cell Type ◽

Single Cell RNA Sequencing ◽

Cell Type Specific ◽

Different Cell Types

Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells from a large number of individuals, and such data are still cost-prohibitive to generate. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets, and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI with cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.

Single-Cell Regulatory Network Inference and Clustering Identifies Cell-Type Specific Expression Pattern of Transcription Factors in Mouse Sciatic Nerve

Frontiers in Cellular Neuroscience ◽

10.3389/fncel.2021.676515 ◽

2021 ◽

Vol 15 ◽

Author(s):

Mingchao Li ◽

Qing Min ◽

Matthew C. Banton ◽

Xinpeng Dun

Keyword(s):

Transcription Factors ◽

Sciatic Nerve ◽

Single Cell ◽

Cell Types ◽

Cell Type ◽

Specific Expression ◽

Single Cell RNA Sequencing ◽

Cell Type Specific Expression ◽

Cell Type Specific ◽

Different Cell Types

Advances in single-cell RNA sequencing technologies and bioinformatics methods allow for both the identification of cell types in a complex tissue and the large-scale gene expression profiling of various cell types in a mixture. In this report, we analyzed a single-cell RNA sequencing (scRNA-seq) dataset for the intact adult mouse sciatic nerve and examined cell-type specific transcription factor expression and activity during peripheral nerve homeostasis. In total, we identified 238 transcription factors expressed in nine different cell types of intact mouse sciatic nerve. Vascular smooth muscle cells express the fewest transcription factors, with 17 identified. Myelinating Schwann cells (mSCs) express the most, with 61 identified. We created a cell-type specific expression map for the 238 identified transcription factors. Our results not only provide valuable information about the expression pattern of transcription factors in different cell types of adult peripheral nerves but also facilitate future studies to understand the function of key transcription factors in peripheral nerve homeostasis and disease.

Automated identification of cell-type–specific genes and alternative promoters

10.1101/2021.12.01.470587 ◽

2021 ◽

Author(s):

Mickaël Mendez ◽

Jayson Harshbarger ◽

Michael M. Hoffman

Keyword(s):

Random Forest ◽

Individual Cell ◽

Cell Types ◽

Differentially Expressed ◽

Pairwise Comparisons ◽

Alternative Promoters ◽

Cell Type ◽

Bootstrap Approach ◽

Cell Type Specific ◽

Different Cell Types

Background: Identifying key transcriptional features, such as genes or transcripts, involved in cellular differentiation remains a challenging problem. Current methods for identifying key transcriptional features predominantly rely on pairwise comparisons among different cell types. These methods also identify long lists of differentially expressed transcriptional features. Combining the results from many such pairwise comparisons to find the transcriptional features specific only to one cell type is not straightforward. Thus, one must have a principled method for amalgamating pairwise cell type comparisons that makes full use of prior knowledge about the developmental relationships between cell types. Method: We developed Cell Lineage Analysis (CLA), a computational method which identifies transcriptional features with expression patterns that discriminate cell types, incorporating Cell Ontology knowledge on the relationship between different cell types. CLA uses random forest classification with a stratified bootstrap to increase the accuracy of binary classifiers when each cell type has a different number of samples. Regularized random forest results in a classifier that selects few but important transcriptional features. For each cell type pair, CLA runs multiple instances of regularized random forest and reports the transcriptional features consistently selected. CLA not only discriminates individual cell types but can also discriminate lineages of cell types related in the developmental hierarchy. Results: We applied CLA to Functional Annotation of the Mammalian Genome 5 (FANTOM5) data and identified discriminative transcription factor and long non-coding RNA (lncRNA) genes for 71 human cell types. With capped analysis of gene expression (CAGE) data, CLA identified individual cell-type–specific alternative promoters for cell surface markers. Compared to random forest with a standard bootstrap approach, CLA's stratified bootstrap approach improved the accuracy of gene expression classification models for more than 95% of 2060 cell type pairs examined. Applied to 10X Genomics single-cell RNA-seq data for CD14+ monocytes and FCGR3A+ monocytes, CLA selected only 13 discriminative genes. These genes included the top 9 out of 370 significantly differentially expressed genes obtained from conventional differential expression analysis methods. Discussion: Our CLA method combines tools to simplify the interpretation of transcriptome datasets from many cell types. It automates the identification of the most differentially expressed genes for each cell type pair. CLA's lineage score allows easy identification of the best transcriptional markers for each cell type and lineage in both bulk and single-cell transcriptomic data. Availability: CLA is available at https://cla.hoffmanlab.org. We deposited the version of the CLA source with which we ran our experiments at https://doi.org/10.5281/zenodo.3630670. We deposited other analysis code and results at https://doi.org/10.5281/zenodo.5735636.
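
The stratified bootstrap mentioned in the Method section can be sketched in a few lines. The abstract does not spell out CLA's exact sampling scheme, so the version below is an assumption: it resamples with replacement within each class and draws the same number of samples from every class, so that a cell type with few samples is not swamped by one with many. The function name and the equal-size rule are hypothetical.

```python
import random
from collections import defaultdict


def stratified_bootstrap(labels, n_per_class=None, rng=None):
    """Return bootstrap sample indices drawn separately within each class.

    labels      -- class label of each sample (e.g. cell type names)
    n_per_class -- draws per class; defaults to the smallest class size
                   (an assumption of this sketch, not taken from CLA)
    rng         -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    if n_per_class is None:
        n_per_class = min(len(indices) for indices in by_class.values())
    sample = []
    for label in sorted(by_class):
        # Sampling with replacement inside one class keeps classes balanced.
        sample.extend(rng.choices(by_class[label], k=n_per_class))
    return sample
```

For example, with ten samples of one cell type and three of another, each bootstrap replicate would contain three indices from each class, whereas a standard bootstrap over all thirteen samples would mostly draw from the larger class.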

Complex transcriptional regulation and independent evolution of fungal-like traits in a relative of animals

eLife ◽

10.7554/elife.08904 ◽

2015 ◽

Vol 4 ◽

Cited By ~ 39

Author(s):

Alex de Mendoza ◽

Hiroshi Suga ◽

Jon Permanyer ◽

Manuel Irimia ◽

Iñaki Ruiz-Trillo

Keyword(s):

Cell Types ◽

Cell Type ◽

Dynamic Regulation ◽

Transcriptional Dynamics ◽

Genome Regulation ◽

A Cell ◽

Cell Type Specific ◽

Gene Modules ◽

Multicellular Organisms ◽

Different Cell Types

Cell-type specification through differential genome regulation is a hallmark of complex multicellularity. However, it remains unclear how this process evolved during the transition from unicellular to multicellular organisms. To address this question, we investigated transcriptional dynamics in the ichthyosporean Creolimax fragrantissima, a relative of animals that undergoes coenocytic development. We find that Creolimax utilizes dynamic regulation of alternative splicing, long inter-genic non-coding RNAs and co-regulated gene modules associated with animal multicellularity in a cell-type specific manner. Moreover, our study suggests that the different cell types of the three closest animal relatives (ichthyosporeans, filastereans and choanoflagellates) are the product of lineage-specific innovations. Additionally, a proteomic survey of the secretome reveals adaptations to a fungal-like lifestyle. In summary, the diversity of cell types among protistan relatives of animals and their complex genome regulation demonstrate that the last unicellular ancestor of animals was already capable of elaborate specification of cell types.
