CoBRA: Containerized Bioinformatics workflow for Reproducible ChIP/ATAC-seq Analysis - from differential peak calling to pathway analysis

2020
Author(s): Xintao Qiu, Avery S. Feit, Ariel Feiglin, Yingtian Xie, Nikolas Kesten, ...

Abstract: ChIP-seq and ATAC-seq have become essential methods for measuring protein-DNA interactions and chromatin accessibility. However, there is a need for a scalable and reproducible pipeline that incorporates correct normalization between samples, adjustment for copy number variations, and integration of new downstream analysis tools. Here we present CoBRA, a modularized computational workflow that quantifies ChIP-seq and ATAC-seq peak regions and performs unsupervised and supervised analysis. CoBRA provides a comprehensive, state-of-the-art ChIP-seq and ATAC-seq analysis pipeline that is usable by scientists with limited computational experience. This enables researchers to gain rapid insight into protein-DNA interactions and chromatin accessibility through sample clustering, differential peak calling, motif enrichment, comparison of sites to a reference database, and pathway analysis.

Code availability: https://bitbucket.org/cfce/cobra

2021
Author(s): Jeremiah Suryatenggara, Kol Jia Yong, Danielle E. Tenen, Daniel G. Tenen, Mahmoud A. Bassal

Abstract: ChIP-Seq is a technique used to analyse protein-DNA interactions. The protein-DNA complex is pulled down using a protein-specific antibody, after which the bound DNA fragments are sequenced and analysed. A key bioinformatics analysis step is "peak" calling: identifying regions of enrichment. Benchmarking studies have consistently shown that no optimal peak caller exists. Peak callers have distinct selectivity and specificity characteristics, which are often not additive, and their results seldom overlap completely. In the absence of a universal peak caller, we reasoned that one ought to use multiple peak callers to 1) gauge peak confidence, as determined through detection by multiple algorithms, and 2) survey the protein-bound landscape more thoroughly by capturing peaks that individual peak callers miss owing to algorithmic limitations and biases. We therefore developed an integrated ChIP-Seq Analysis Pipeline (ChIP-AP), which performs all analysis steps from raw fastq files to the final result and uses four commonly used peak callers to analyse datasets more thoroughly and comprehensively. Results are integrated and presented in a single file, enabling users to apply selectivity and sensitivity thresholds to select the consensus peak set, the union peak set, or any subset in between, and so explore the protein-bound landscape more confidently and comprehensively (https://github.com/JSuryatenggara/ChIP-AP).
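The idea of choosing between a union set, a consensus set, or anything in between can be illustrated with a minimal Python sketch. It is not ChIP-AP's implementation: the caller names, the coordinates, and the helper functions `merge_peaks` and `select_peaks` are hypothetical, and peaks are assumed to be half-open `(chrom, start, end)` intervals that are merged whenever they overlap.

```python
from collections import defaultdict

# Hypothetical peak calls (illustrative only): caller -> [(chrom, start, end)]
calls = {
    "caller_A": [("chr1", 100, 200), ("chr1", 500, 600)],
    "caller_B": [("chr1", 150, 250), ("chr2", 10, 50)],
    "caller_C": [("chr1", 180, 220), ("chr1", 550, 650)],
    "caller_D": [("chr1", 120, 210)],
}


def merge_peaks(calls):
    """Merge overlapping peaks across callers and record the supporting callers."""
    by_chrom = defaultdict(list)
    for caller, peaks in calls.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, caller))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        cur_callers = set()
        for start, end, caller in sorted(by_chrom[chrom]):
            if cur_start is not None and start < cur_end:
                # Overlaps the region being built: extend it and note the caller.
                cur_end = max(cur_end, end)
                cur_callers.add(caller)
            else:
                # No overlap: emit the finished region and start a new one.
                if cur_start is not None:
                    merged.append((chrom, cur_start, cur_end, frozenset(cur_callers)))
                cur_start, cur_end, cur_callers = start, end, {caller}
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end, frozenset(cur_callers)))
    return merged


def select_peaks(merged, min_callers):
    """Keep merged regions detected by at least `min_callers` distinct callers."""
    return [(c, s, e) for c, s, e, callers in merged if len(callers) >= min_callers]


merged = merge_peaks(calls)
union_set = select_peaks(merged, 1)            # detected by any caller
intermediate_set = select_peaks(merged, 2)     # detected by at least two callers
consensus_set = select_peaks(merged, len(calls))  # detected by every caller
```

With this toy input, the union set contains three regions, the two-caller set contains two, and the consensus set contains only `("chr1", 100, 250)`. One caveat of this simple merging rule is that chains of partially overlapping peaks are joined into a single region, which can overstate how many callers agree on any one position.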

