Background:
Transcription factors are DNA-binding proteins that play key roles in
many fundamental biological processes. Unraveling their interactions with DNA is essential
for identifying their target genes and understanding regulatory networks. Genome-wide identification
of their binding sites has become feasible thanks to recent progress in experimental and computational
approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques
for mapping transcription factor binding sites genome-wide.
Objective:
This review provides an overview of these three techniques, including their
experimental procedures, computational approaches, and popular analytic tools.
Conclusion:
ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques for studying genome-wide
in vivo protein-DNA interactions. Owing to the rapid development of next-generation
sequencing technology, array-based ChIP-chip has largely fallen out of use and ChIP-seq has become the
most widely used technique for identifying transcription factor binding sites genome-wide. The
more recently developed ChIP-exo further improves the spatial resolution to the single-nucleotide level. Numerous
tools have been developed to analyze ChIP-chip, ChIP-seq, and ChIP-exo data. However,
different programs employ different mechanisms or underlying algorithms, so
each inherently carries its own set of statistical assumptions and biases. Choosing the
most appropriate analytic program for a given experiment therefore requires careful consideration.
Moreover, most programs offer only a command-line interface, so installing and using them
requires basic computational expertise in Unix/Linux.