Assessment of the CTCF Binding Sites and Repeat-Positions Upstream of the Human H19 Gene
Abstract
The human H19 and IGF2 genes share an Imprinting Control Region (ICR) that regulates gene expression in a parent-of-origin dependent manner. Understanding the sequence organization of the ICR is critical to accurately localizing the abnormalities associated with diseases including Beckwith-Wiedemann and Silver-Russell syndromes. Previous studies established that the ICR of the H19-IGF2 imprinted domain includes several repeated DNA segments. Using BLAST, BLAT, and Clustal Omega, I conducted detailed sequence comparisons to evaluate the annotation of the unique repeats upstream of the H19 transcription start site (TSS) and to investigate the extent of similarity among the various repeats. Initial analyses confirmed the existence of two DNA segments consisting of two types of repeats (A and B). However, I found that one of the repeats (B7) is unlikely to be a partial repeat. I provide the genomic positions of the various repeats in build hg19 of the human genome. I also evaluated the previously predicted CTCF sites (1 to 7) in the context of the ENCODE data, including the positions of DNase I HS clusters and the results of ChIP assays. My evaluations did not support the existence of CTCF site 5. Furthermore, the ENCODE data revealed a previously unknown chromatin boundary (consisting of CTCF, RAD21, and SMC3) in a CpG island (CpG27) between the A1 repeat and the H19 TSS. A sequence within this boundary corresponds to a newly discovered CTCF site, which I named CTCF site 8. The discovery of this chromatin boundary in CpG27 has mechanistic implications.