Spectacle: Faster and more accurate chromatin state annotation using spectral learning

2014 ◽  
Author(s):  
Jimin Song ◽  
Kevin C Chen

Recently, a wealth of epigenomic data has been generated by biochemical assays and next-generation sequencing (NGS) technologies. In particular, histone modification data generated by the ENCODE project and other large-scale projects show specific patterns associated with regulatory elements in the human genome. It is important to build a unified statistical model to decipher the patterns of multiple histone modifications in a cell type to annotate chromatin states such as transcription start sites, enhancers and transcribed regions rather than to map histone modifications individually to regulatory elements. Several genome-wide statistical models have been developed based on hidden Markov models (HMMs). These methods typically use the Expectation-Maximization (EM) algorithm to estimate the parameters of the model. Here we used spectral learning, a state-of-the-art parameter estimation algorithm in machine learning. We found that spectral learning plus a few (up to five) iterations of local optimization of the likelihood outperforms the standard EM algorithm. We also evaluated our software implementation called Spectacle on independent biological datasets and found that Spectacle annotated experimentally defined functional elements such as enhancers significantly better than a previous state-of-the-art method. Spectacle can be downloaded from https://github.com/jiminsong/Spectacle.
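To make the HMM setting concrete, the following is a minimal sketch, not Spectacle's code, of how an already-trained chromatin-state HMM can label each genomic bin with Viterbi decoding. It assumes that each bin is summarized as presence/absence of each histone mark and that marks are emitted independently given the state; that emission model, the function name `viterbi_binary_marks`, and all toy parameters are illustrative assumptions. Parameter estimation (EM or spectral learning) is not shown.

```python
import math

def viterbi_binary_marks(obs, init, trans, emit):
    """Most likely state path for an HMM with independent Bernoulli emissions.

    obs:   list of bins, each a tuple of 0/1 flags (one per histone mark)
    init:  init[k]     = P(first bin is in state k)
    trans: trans[i][j] = P(next state is j | current state is i)
    emit:  emit[k][m]  = P(mark m is present | state k), strictly in (0, 1)
    """
    n_states = len(init)

    def log_emission(state, marks):
        # Sum of per-mark Bernoulli log-probabilities for this bin.
        return sum(math.log(p if m else 1.0 - p)
                   for p, m in zip(emit[state], marks))

    # score[k]: log-probability of the best path that ends in state k.
    score = [math.log(init[k]) + log_emission(k, obs[0])
             for k in range(n_states)]
    backpointers = []
    for marks in obs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + math.log(trans[i][j]))
            pointers.append(best_i)
            new_score.append(score[best_i] + math.log(trans[best_i][j])
                             + log_emission(j, marks))
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy example (hypothetical numbers): state 0 is rich in mark 0, state 1 is background.
obs = [(1, 0), (1, 0), (0, 0), (0, 0)]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.1]]
path = viterbi_binary_marks(obs, init, trans, emit)
print(path)
```

In this toy run the first two bins, which carry mark 0, are assigned to state 0 and the remaining bins to the background state.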

2020 ◽  
Author(s):  
Sylvan C. Baca ◽  
David Y. Takeda ◽  
Ji-Heui Seo ◽  
Justin Hwang ◽  
Sheng Yu Ku ◽  
...  

Lineage plasticity, the ability of a cell to alter its identity, is an increasingly common mechanism of adaptive resistance to targeted therapy in cancer. An archetypal example is the development of neuroendocrine prostate cancer (NEPC) after treatment of prostate adenocarcinoma (PRAD) with inhibitors of androgen signaling. NEPC is an aggressive variant of prostate cancer that aberrantly expresses genes characteristic of neuroendocrine (NE) tissues and no longer depends on androgens. To investigate the epigenomic basis of this resistance mechanism, we profiled histone modifications in NEPC and PRAD patient-derived xenografts (PDXs) using chromatin immunoprecipitation and sequencing (ChIP-seq). We identified a vast network of cis-regulatory elements (N~15,000) that are recurrently activated in NEPC. The FOXA1 transcription factor (TF), which pioneers androgen receptor (AR) chromatin binding in the prostate epithelium, is reprogrammed to NE-specific regulatory elements in NEPC. Despite loss of dependence upon AR, NEPC maintains FOXA1 expression and requires FOXA1 for proliferation and expression of NE lineage-defining genes. Ectopic expression of the NE lineage TFs ASCL1 and NKX2-1 in PRAD cells reprograms FOXA1 to bind to NE regulatory elements and induces enhancer activity as evidenced by histone modifications at these sites. Our data establish the importance of FOXA1 in NEPC and provide a principled approach to identifying novel cancer dependencies through epigenomic profiling.


2017 ◽  
Author(s):  
Can Wang ◽  
Shihua Zhang

Histone modifications are widely recognized to play vital roles in gene regulation and cell identity. The Roadmap Epigenomics Consortium generated a reference catalogue of several key histone modifications across more than 100 human cell types and tissues. Decoding these epigenomes into functional regulatory elements is a challenging task in computational biology. To this end, we adopted a differential chromatin modification analysis framework to comprehensively determine and characterize cell type-specific regulatory elements (CSREs) and their histone modification codes in the human epigenomes of five histone modifications across 127 tissues or cell types. The CSREs are significantly associated with cell type-specific biological functions, diseases, and cell identity. Clustering of CSREs by their specificity signals reveals diverse histone codes, demonstrating the diversity of functional roles of CSREs within the same cell or tissue. Finally, the dynamics of CSREs across closely related cell types or tissues can give a detailed view of developmental processes such as normal tissue development and cancer occurrence.


2019 ◽  
Vol 35 (14) ◽  
pp. i23-i30 ◽  
Author(s):  
Sahar Tavakoli ◽  
Shibu Yooseph

Motivation: The interactions among the constituent members of a microbial community play a major role in determining the overall behavior of the community and the abundance levels of its members. These interactions can be modeled using a network whose nodes represent microbial taxa and edges represent pairwise interactions. A microbial network is typically constructed from a sample-taxa count matrix that is obtained by sequencing multiple biological samples and identifying taxa counts. From large-scale microbiome studies, it is evident that microbial community compositions and interactions are impacted by environmental and/or host factors. Thus, it is not unreasonable to expect that a sample-taxa matrix generated as part of a large study involving multiple environmental or clinical parameters can be associated with more than one microbial network. However, to our knowledge, microbial network inference methods proposed thus far assume that the sample-taxa matrix is associated with a single network. Results: We present a mixture model framework to address the scenario when the sample-taxa matrix is associated with K microbial networks. This count matrix is modeled using a mixture of K Multivariate Poisson Log-Normal distributions and parameters are estimated using a maximum likelihood framework. Our parameter estimation algorithm is based on the minorization–maximization principle combined with gradient ascent and block updates. Synthetic datasets were generated to assess the performance of our approach on absolute count data, compositional data and normalized data. We also addressed the recovery of sparse networks based on an l1-penalty model. Availability and implementation: MixMPLN is implemented in R and is freely available at https://github.com/sahatava/MixMPLN. Supplementary information: Supplementary data are available at Bioinformatics online.
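For readers unfamiliar with the model class, a mixture of K Multivariate Poisson Log-Normal distributions can be written out as below. This is the standard textbook formulation, given as a sketch; the symbols (c_i for the component label of sample i, z_i for its latent Gaussian vector, d for the number of taxa) are notation introduced here, and the paper's exact parameterization may differ.

```latex
\begin{align*}
c_i &\sim \mathrm{Categorical}(\pi_1, \dots, \pi_K), \\
\mathbf{z}_i \mid c_i = k &\sim \mathcal{N}_d(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k), \\
y_{ij} \mid z_{ij} &\sim \mathrm{Poisson}\!\left(e^{z_{ij}}\right), \quad j = 1, \dots, d, \\
p(\mathbf{y}_i) &= \sum_{k=1}^{K} \pi_k \int_{\mathbb{R}^d}
  \prod_{j=1}^{d} \frac{e^{y_{ij} z_j}\, e^{-e^{z_j}}}{y_{ij}!}\,
  \mathcal{N}_d(\mathbf{z}; \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)\, d\mathbf{z}.
\end{align*}
```

The integral has no closed form, which is one reason iterative schemes such as the minorization–maximization approach mentioned in the abstract are needed for maximum likelihood estimation.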


PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009821
Author(s):  
Donghui Choe ◽  
Richard Szubin ◽  
Saugat Poudel ◽  
Anand Sastry ◽  
Yoseb Song ◽  
...  

RNA sequencing techniques have enabled the systematic elucidation of gene expression (RNA-Seq), transcription start sites (differential RNA-Seq), transcript 3′ ends (Term-Seq), and post-transcriptional processes (ribosome profiling). The main challenge of transcriptomic studies is to remove ribosomal RNAs (rRNAs), which comprise more than 90% of the total RNA in a cell. Here, we report a low-cost and robust bacterial rRNA depletion method, RiboRid, based on the enzymatic degradation of rRNA by thermostable RNase H. The method incorporates experimental measures to minimize nonspecific degradation of mRNA and is capable of depleting pre-rRNAs, which often comprise a large portion of RNA even after rRNA depletion. We demonstrated highly efficient rRNA removal, with efficiencies of up to 99.99%, for various transcriptome studies, including RNA-Seq, Term-Seq, and ribosome profiling, at a cost of approximately $10 per sample. RiboRid is expected to be a robust approach for large-scale high-throughput bacterial transcriptomic studies.
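As a rough back-of-the-envelope illustration of what these figures imply, the sketch below computes the share of the remaining RNA that is still rRNA after depletion. It assumes, for simplicity, that rRNA is exactly 90% of total RNA and that non-rRNA molecules are left untouched; the function name `residual_rrna_fraction` is hypothetical.

```python
def residual_rrna_fraction(rrna_share, removal_efficiency):
    """Fraction of the remaining RNA that is still rRNA after depletion."""
    rrna_left = rrna_share * (1.0 - removal_efficiency)
    other_rna = 1.0 - rrna_share  # assumed untouched by the depletion step
    return rrna_left / (rrna_left + other_rna)

# Figures taken from the abstract: rRNA at 90% of total RNA, 99.99% removal.
residual = residual_rrna_fraction(rrna_share=0.90, removal_efficiency=0.9999)
print(f"{residual:.4%}")
```

Under these assumptions, rRNA falls from about 90% of the sample to roughly 0.09% of what remains.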


2018 ◽  
Author(s):  
Uwe Schwartz ◽  
Attila Németh ◽  
Sarah Diermeier ◽  
Josef Exler ◽  
Stefan Hansch ◽  
...  

Packaging of DNA into chromatin regulates DNA accessibility and, consequently, all DNA-dependent processes, such as transcription, recombination, repair, and replication. The nucleosome is the basic packaging unit of DNA, forming arrays that are suggested, by biochemical studies, to fold hierarchically into ordered higher-order structures of chromatin. This defined organization of chromatin has recently been questioned by microscopy studies, which propose a rather irregular structure. To gain more insight into the principles of chromatin organization, we applied an in situ differential MNase-seq strategy and analyzed in silico the results of complete and partial digestions of human chromatin. We investigated whether different levels of chromatin packaging exist in the cell. To this end, we assessed the accessibility of chromatin within distinct domains of kb to Mb genomic regions using statistical data analyses and computer modelling. We found no difference in the degree of compaction between domains of euchromatin and heterochromatin or between other sequence and epigenomic features of chromatin. Thus, our data suggest the absence of differentially compacted domains of higher-order structures of chromatin. Moreover, we identified only local structural changes, with individual hyper-accessible nucleosomes surrounding regulatory elements, such as enhancers and transcription start sites. The regulatory sites per se are occupied by structurally altered nucleosomes, exhibiting increased MNase sensitivity. Our findings provide biochemical evidence that supports an irregular model of large-scale chromatin organization.


2021 ◽  
Author(s):  
Li Yao ◽  
Jin Liang ◽  
Abdullah Ozer ◽  
Alden King-Yung Leung ◽  
John T. Lis ◽  
...  

Mounting evidence supports the idea that transcriptional patterns serve as more specific identifiers of active enhancers than histone marks; however, the optimal strategy to identify active enhancers both experimentally and computationally has not been determined. In this study, we compared 13 genome-wide RNA sequencing assays in K562 cells and showed that the nuclear run-on followed by cap-selection assay (namely, GRO/PRO-cap) has significant advantages in eRNA detection and active enhancer identification. We also introduced a new analytical tool, Peak Identifier for Nascent-Transcript Sequencing (PINTS), to identify active promoters and enhancers genome-wide and pinpoint the precise location of the 5′ transcription start sites (TSSs) within these regulatory elements. Finally, we compiled a comprehensive enhancer candidate compendium based on the detected eRNA TSSs available in 120 cell and tissue types. To facilitate the exploration and prioritization of these enhancer candidates, we also built a user-friendly web server (https://pints.yulab.org) for the compendium with various additional genomic and epigenomic annotations. With knowledge of the best available assays and pipelines, this large-scale annotation of candidate enhancers will pave the way for selecting and characterizing their functions in a time-, labor-, and cost-effective manner in the future.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Sylvan C. Baca ◽  
David Y. Takeda ◽  
Ji-Heui Seo ◽  
Justin Hwang ◽  
Sheng Yu Ku ◽  
...  

Lineage plasticity, the ability of a cell to alter its identity, is an increasingly common mechanism of adaptive resistance to targeted therapy in cancer. An archetypal example is the development of neuroendocrine prostate cancer (NEPC) after treatment of prostate adenocarcinoma (PRAD) with inhibitors of androgen signaling. NEPC is an aggressive variant of prostate cancer that aberrantly expresses genes characteristic of neuroendocrine (NE) tissues and no longer depends on androgens. Here, we investigate the epigenomic basis of this resistance mechanism by profiling histone modifications in NEPC and PRAD patient-derived xenografts (PDXs) using chromatin immunoprecipitation and sequencing (ChIP-seq). We identify a vast network of cis-regulatory elements (N~15,000) that are recurrently activated in NEPC. The FOXA1 transcription factor (TF), which pioneers androgen receptor (AR) chromatin binding in the prostate epithelium, is reprogrammed to NE-specific regulatory elements in NEPC. Despite loss of dependence upon AR, NEPC maintains FOXA1 expression and requires FOXA1 for proliferation and expression of NE lineage-defining genes. Ectopic expression of the NE lineage TFs ASCL1 and NKX2-1 in PRAD cells reprograms FOXA1 to bind to NE regulatory elements and induces enhancer activity as evidenced by histone modifications at these sites. Our data establish the importance of FOXA1 in NEPC and provide a principled approach to identifying cancer dependencies through epigenomic profiling.

