Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562

2017 ◽  
Author(s):  
Bo Zhou ◽  
Steve S. Ho ◽  
Stephanie U. Greer ◽  
Xiaowei Zhu ◽  
John M. Bell ◽  
...  

ABSTRACT K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and also most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high resolution, SNVs and Indels (both corrected for CN in aneuploid regions), loss of heterozygosity, mega-base-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs) including small and large-scale complex SVs and non-reference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we re-analyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.


2018 ◽  
Author(s):  
Bo Zhou ◽  
Steve S. Ho ◽  
Stephanie U. Greer ◽  
Noah Spies ◽  
John M. Bell ◽  
...  

SUMMARY The HepG2 cancer cell line is one of the most widely used cell lines in biomedical research and one of the main cell lines of ENCODE. Vast numbers of functional genomics and epigenomics datasets have been produced to characterize its biology. However, the correct interpretation of such data requires an understanding of the cell line’s genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of HepG2 genome characteristics: copy numbers of chromosomal segments, SNVs and Indels (corrected for aneuploidy), phased haplotypes extending to entire chromosome arms, loss of heterozygosity, retrotransposon insertions, structural variants (SVs) including complex and somatic genomic rearrangements. We also identified allele-specific expression and DNA methylation genome-wide and assembled an allele-specific CRISPR/Cas9 targeting map.

SIGNIFICANCE Haplotype-resolved and comprehensive whole-genome analysis of a widely used cell line for cancer research and ENCODE, HepG2, serves as an essential resource for unlocking complex cancer gene regulation using a genome-integrated framework and also provides genomic context for the analysis of ~1,000 functional datasets to date on ENCODE for biological discovery. We also demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.



2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Ying Wang ◽  
Jidong Ru ◽  
Xianglian Meng ◽  
Jianhua Song ◽  
Qingfeng Jiang ◽  
...  

Single nucleotide polymorphisms (SNPs) play a significant role in microRNA (miRNA) generation, processing, and function and contribute to multiple phenotypes and diseases. Therefore, whole-genome analysis of how SNPs affect miRNA maturation mechanisms is important for precision medicine. The present study established an SNP-associated pre-miRNA (SNP-pre-miRNA) database, named miRSNPBase, and constructed SNP-pre-miRNA sequences. We also identified phenotypes and disease biomarker-associated isoform miRNA (isomiR) based on miRFind, which was developed in our previous study. We identified functional SNPs and isomiRs. We analyzed the biological characteristics of functional SNPs and isomiRs and studied their distribution in different ethnic groups using whole-genome analysis. Notably, we used individuals from Great Britain (GBR) as examples and identified isomiRs and isomiR-associated SNPs (iso-SNPs). We performed sequence alignments of isomiRs and miRNA sequencing data to verify the identified isomiRs and further revealed GBR ethnographic epigenetic dominant biomarkers. The SNP-pre-miRNA database consisted of 886 pre-miRNAs and 2640 SNPs. We analyzed the effects of SNP type, SNP location, and SNP-mediated free energy change during mature miRNA biogenesis and found that these factors were closely associated with mature miRNA biogenesis. Remarkably, 158 isomiRs were verified in the miRNA sequencing data for the 18 GBR samples. Our results indicated that SNPs affected the mature miRNA processing mechanism and contributed to the production of isomiRs. This mechanism may have important significance for epigenetic changes and diseases.



Author(s):  
Magdalena Wysocka ◽  
Tamar Monteiro ◽  
Carine de Pina ◽  
Deisy Gonçalves ◽  
Sandrine de Pina ◽  
...  




2016 ◽  
Vol 79 ◽  
pp. 44-50 ◽  
Author(s):  
Jana McGinnis ◽  
Jennifer Laplante ◽  
Matthew Shudt ◽  
Kirsten St. George


2007 ◽  
Vol 9 (12) ◽  
pp. 2993-3007 ◽  
Author(s):  
Lisa Y. Stein ◽  
Daniel J. Arp ◽  
Paul M. Berube ◽  
Patrick S. G. Chain ◽  
Loren Hauser ◽  
...  

