HERON: A Novel Tool Enables Identification of Long, Weakly Enriched Genomic Domains in ChIP-seq Data

2021 ◽  
Vol 22 (15) ◽  
pp. 8123
Author(s):  
Anna Macioszek ◽  
Bartek Wilczynski

The explosive development of next-generation sequencing-based technologies has allowed us to take an unprecedented look at many molecular signatures of the non-coding genome. In particular, the ChIP-seq (Chromatin ImmunoPrecipitation followed by sequencing) technique is now very commonly used to assess the proteins associated with different non-coding DNA regions genome-wide. While the analysis of such data related to transcription factor binding is relatively straightforward, many modified histone variants, such as H3K27me3, are very important for the process of gene regulation but are very difficult to interpret. We propose a novel method, called HERON (HiddEn MaRkov mOdel based peak calliNg), for genome-wide data analysis that is able to detect DNA regions enriched for a certain feature, even in difficult settings of weakly enriched long DNA domains. We demonstrate the performance of our method both on simulated and experimental data.
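The idea of calling long, weakly enriched domains with a hidden Markov model can be illustrated with a minimal sketch. The abstract does not describe HERON's actual states, emission distributions, or training procedure, so everything below is an assumption made for illustration: two states (background and enriched), Poisson emissions over binned read counts, fixed hypothetical rates `lam_bg` and `lam_enriched`, and a single self-transition probability `p_stay`. Decoding uses the standard Viterbi algorithm.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing k reads under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, lam_bg=2.0, lam_enriched=4.0, p_stay=0.95):
    """Most likely state path for binned counts (0 = background, 1 = enriched).

    All parameters are illustrative assumptions, not HERON's real settings.
    """
    if not counts:
        return []
    lams = (lam_bg, lam_enriched)
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    # Uniform prior over the two states for the first bin.
    score = [math.log(0.5) + poisson_logpmf(counts[0], lam) for lam in lams]
    back = []  # back[t][s] = best predecessor of state s at bin t + 1

    for k in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                best, prev = stay, s
            else:
                best, prev = switch, 1 - s
            new_score.append(best + poisson_logpmf(k, lams[s]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

A high `p_stay` makes state changes expensive, so an isolated noisy bin tends to stay in the background state, while a long run of modestly elevated bins can accumulate enough evidence to be labeled as one domain. That property is what makes this family of models a natural fit for broad marks such as H3K27me3.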

2019 ◽  
Vol 2 (4) ◽  
pp. e201900318 ◽  
Author(s):  
Junaid Akhtar ◽  
Piyush More ◽  
Steffen Albrecht ◽  
Federico Marini ◽  
Waldemar Kaiser ◽  
...  

Chromatin immunoprecipitation (ChIP) followed by next generation sequencing (ChIP-Seq) is a powerful technique to study transcriptional regulation. However, the requirement of millions of cells to generate results with high signal-to-noise ratio precludes it in the study of small cell populations. Here, we present a tagmentation-assisted fragmentation ChIP (TAF-ChIP) and sequencing method to generate high-quality histone profiles from low cell numbers. The data obtained from the TAF-ChIP approach are amenable to standard tools for ChIP-Seq analysis, owing to its high signal-to-noise ratio. The epigenetic profiles from TAF-ChIP approach showed high agreement with conventional ChIP-Seq datasets, thereby underlining the utility of this approach.


2017 ◽  
Author(s):  
George Busby ◽  
Ryan Christ ◽  
Gavin Band ◽  
Ellen Leffler ◽  
Quang Si Le ◽  
...  

Abstract Gene-flow from an ancestrally differentiated group has been shown to be a powerful source of selectively advantageous variants. To understand whether recent gene-flow may have contributed to adaptation among humans in sub-Saharan Africa, we applied a novel method to identify deviations in ancestry inferred from genome-wide data in 48 populations. Among the signals of ancestry deviation that we find in the Fula, an historically pastoralist ethnic group from the Gambia, is the region that encodes the lactose persistence phenotype, LCT/MCM6, which has the highest proportion of Eurasian ancestry in the genome. The region with the lowest proportion of non-African ancestry is across DARC, which encodes the Duffy null phenotype and is protective for Plasmodium vivax malaria. In the Jola from the Gambia and a Khoesan speaking group from Namibia we find multiple regions with inferred ancestry deviation including the Major Histocompatibility Complex. Our analysis shows the potential for adaptive gene-flow in recent human history.


2018 ◽  
Author(s):  
Dejian Zhao ◽  
Deyou Zheng

Abstract Background Noise and artifacts may arise in several steps of the next-generation sequencing (NGS) process. Recently, an NGS library preparation method called SMART, or Switching Mechanism At the 5’ end of the RNA Transcript, was introduced to prepare ChIP-seq (chromatin immunoprecipitation and deep sequencing) libraries from small amounts of DNA material. The protocol adds Ts to the 3’ end of DNA templates, which is subsequently recognized and used by SMART poly(dA) primers for reverse transcription and then addition of PCR primers and sequencing adapters. The poly(dA) primers, however, can anneal to poly(T) sequences in a genome and amplify DNA fragments that are not enriched in the immunoprecipitated DNA templates. This off-target amplification results in false signals in the ChIP-seq data. Results Here, we show that the off-target ChIP-seq reads derived from false amplification of poly(T/A) genomic sequences have unique and strand-specific features. Accordingly, we develop a tool (called “SMARTcleaner”) that can exploit the features to remove SMART ChIP-seq artifacts. Application of SMARTcleaner to several SMART ChIP-seq datasets demonstrates that it can remove reads from off-target amplification effectively, leading to improved ChIP-seq peaks and results. Conclusions SMARTcleaner could identify and clean the false signals in SMART-based ChIP-seq libraries, leading to improvement in peak calling, and downstream data analysis and interpretation.


2018 ◽  
Author(s):  
Junaid Akhtar ◽  
Piyush More ◽  
Steffen Albrecht ◽  
Federico Marini ◽  
Waldemar Kaiser ◽  
...  

Abstract Chromatin immunoprecipitation (ChIP) followed by next generation sequencing (ChIP-Seq) is a powerful technique to study transcriptional regulation. However, the requirement of millions of cells to generate results with high signal-to-noise ratio precludes it in the study of small cell populations. Here, we present a Tagmentation-Assisted Fragmentation ChIP (TAF-ChIP) and sequencing method to generate high-quality results from low cell numbers. The data obtained from the TAF-ChIP approach is amenable to standard tools for ChIP-Seq analysis, owing to its high signal-to-noise ratio. The epigenetic profiles from TAF-ChIP approach showed high agreement with conventional ChIP-Seq datasets, thereby underlining the utility of this approach.


2022 ◽  
Author(s):  
William M Yashar ◽  
Garth Kong ◽  
Jake VanCampen ◽  
Brittany M Smith ◽  
Daniel J Coleman ◽  
...  

Genome-wide mapping of the histone modification landscape is critical to understanding transcriptional regulation. Cleavage Under Targets and Tagmentation (CUT&Tag) is a new method for profiling the localization of covalent histone modifications, offering improved sensitivity and decreased cost compared with Chromatin Immunoprecipitation Sequencing (ChIP-seq). Here, we present GoPeaks, a peak calling method specifically designed for histone modification CUT&Tag data. GoPeaks implements a Binomial distribution and stringent read count cut-off to nominate candidate genomic regions. We compared the performance of GoPeaks against commonly used peak calling algorithms to detect H3K4me3, H3K4me1, and H3K27Ac peaks from CUT&Tag data. These histone modifications display a range of peak profiles and are frequently used in epigenetic studies. We found GoPeaks robustly detects genome-wide histone modifications and, notably, identifies H3K27Ac with improved sensitivity compared to other standard peak calling algorithms.
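Combining a binomial test with a read count cut-off can be sketched as follows. This is not GoPeaks itself: the abstract does not give its bin size, null probability, multiple-testing handling, or thresholds, so the uniform null `p = 1 / number_of_bins` and the hypothetical parameters `min_count` and `alpha` are assumptions chosen only to make the example concrete.

```python
import math


def binom_sf(k, n, p):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    total = 0.0
    for i in range(k, n + 1):
        # Work in log space so large n does not overflow.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def nominate_bins(counts, min_count=5, alpha=1e-3):
    """Indices of bins passing both the count cut-off and the binomial test.

    Assumes at least two bins and a uniform null in which each read falls
    into any bin with equal probability (an illustrative simplification).
    """
    n_reads = sum(counts)
    p_null = 1.0 / len(counts)
    return [
        i for i, k in enumerate(counts)
        if k >= min_count and binom_sf(k, n_reads, p_null) < alpha
    ]
```

The count cut-off is checked first, so sparse bins are rejected before any p-value is computed. In low-background data such as CUT&Tag, a bin with only a handful of reads could otherwise look significant purely because the expected count is close to zero.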


2021 ◽  
Vol 7 (3) ◽  
pp. eabd9036
Author(s):  
Sara Saez-Atienzar ◽  
Sara Bandres-Ciga ◽  
Rebekah G. Langston ◽  
Jonggeol J. Kim ◽  
Shing Wan Choi ◽  
...  

Despite the considerable progress in unraveling the genetic causes of amyotrophic lateral sclerosis (ALS), we do not fully understand the molecular mechanisms underlying the disease. We analyzed genome-wide data involving 78,500 individuals using a polygenic risk score approach to identify the biological pathways and cell types involved in ALS. This data-driven approach identified multiple aspects of the biology underlying the disease that resolved into broader themes, namely, neuron projection morphogenesis, membrane trafficking, and signal transduction mediated by ribonucleotides. We also found that genomic risk in ALS maps consistently to GABAergic interneurons and oligodendrocytes, as confirmed in human single-nucleus RNA-seq data. Using two-sample Mendelian randomization, we nominated six differentially expressed genes (ATG16L2, ACSL5, MAP1LC3A, MAPKAPK3, PLXNB2, and SCFD1) within the significant pathways as relevant to ALS. We conclude that the disparate genetic etiologies of this fatal neurological disease converge on a smaller number of final common pathways and cell types.
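For readers unfamiliar with the term, a polygenic risk score is commonly computed as a weighted sum of risk-allele dosages. The sketch below shows only that basic form; the abstract does not specify the study's variant selection, weighting, or pathway-level aggregation, and the function name and inputs are hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages: risk-allele counts per variant (0, 1, or 2, or imputed fractions).
    weights: per-variant effect sizes, e.g. log odds ratios from a GWAS.
    Both inputs are illustrative; real pipelines also handle missing
    genotypes, strand alignment, and linkage disequilibrium.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))
```

Restricting `weights` to the variants assigned to one gene set gives a score for that set alone, which is one plausible way a score of this kind can be related to individual biological pathways.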


2021 ◽  
Vol 7 (13) ◽  
pp. eabe4414
Author(s):  
Guido Alberto Gnecchi-Ruscone ◽  
Elmira Khussainova ◽  
Nurzhibek Kahbatkyzy ◽  
Lyazzat Musralina ◽  
Maria A. Spyrou ◽  
...  

The Scythians were a multitude of horse-warrior nomad cultures dwelling in the Eurasian steppe during the first millennium BCE. Because of the lack of first-hand written records, little is known about the origins and relations among the different cultures. To address these questions, we produced genome-wide data for 111 ancient individuals retrieved from 39 archaeological sites from the first millennia BCE and CE across the Central Asian Steppe. We uncovered major admixture events in the Late Bronze Age forming the genetic substratum for two main Iron Age gene-pools emerging around the Altai and the Urals respectively. Their demise was mirrored by new genetic turnovers, linked to the spread of the eastern nomad empires in the first centuries CE. Compared to the high genetic heterogeneity of the past, the homogenization of the present-day Kazakhs gene pool is notable, likely a result of 400 years of strict exogamous social rules.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Taras K Oleksyk ◽  
Walter W Wolfsberger ◽  
Alexandra M Weber ◽  
Khrystyna Shchubelka ◽  
Olga T Oleksyk ◽  
...  

Abstract Background The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage. Results The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest to-date survey of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population. Conclusions Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information bringing forth a wealth of novel, endemic and medically related alleles.


Nature ◽  
2021 ◽  
Vol 592 (7853) ◽  
pp. 253-257 ◽  
Author(s):  
Mateja Hajdinjak ◽  
Fabrizio Mafessoni ◽  
Laurits Skov ◽  
Benjamin Vernot ◽  
Alexander Hübner ◽  
...  

Abstract Modern humans appeared in Europe by at least 45,000 years ago [1–5], but the extent of their interactions with Neanderthals, who disappeared by about 40,000 years ago [6], and their relationship to the broader expansion of modern humans outside Africa are poorly understood. Here we present genome-wide data from three individuals dated to between 45,930 and 42,580 years ago from Bacho Kiro Cave, Bulgaria [1,2]. They are the earliest Late Pleistocene modern humans known to have been recovered in Europe so far, and were found in association with an Initial Upper Palaeolithic artefact assemblage. Unlike two previously studied individuals of similar ages from Romania [7] and Siberia [8] who did not contribute detectably to later populations, these individuals are more closely related to present-day and ancient populations in East Asia and the Americas than to later west Eurasian populations. This indicates that they belonged to a modern human migration into Europe that was not previously known from the genetic record, and provides evidence that there was at least some continuity between the earliest modern humans in Europe and later people in Eurasia. Moreover, we find that all three individuals had Neanderthal ancestors a few generations back in their family history, confirming that the first European modern humans mixed with Neanderthals and suggesting that such mixing could have been common.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
An Zheng ◽  
Michael Lamkin ◽  
Yutong Qiu ◽  
Kevin Ren ◽  
Alon Goren ◽  
...  

Abstract Background A major challenge in evaluating quantitative ChIP-seq analyses, such as peak calling and differential binding, is a lack of reliable ground truth data. Accurate simulation of ChIP-seq data can mitigate this challenge, but existing frameworks are either too cumbersome to apply genome-wide or unable to model a number of important experimental conditions in ChIP-seq. Results We present ChIPs, a toolkit for rapidly simulating ChIP-seq data using statistical models of key experimental steps. We demonstrate how ChIPs can be used for a range of applications, including benchmarking analysis tools and evaluating the impact of various experimental parameters. ChIPs is implemented as a standalone command-line program written in C++ and is available from https://github.com/gymreklab/chips. Conclusions ChIPs is an efficient ChIP-seq simulation framework that generates realistic datasets over a flexible range of experimental conditions. It can serve as an important component in various ChIP-seq analyses where ground truth data are needed.

