Resolving complex structures at oncovirus integration loci with conjugate graph
ABSTRACT
Oncovirus integrations cause complex structural variations (SVs) in host genomes. We propose a conjugate graph model to reconstruct the rearranged local genomic map (LGM) at integrated loci. Simulation tests demonstrate the reliability and credibility of the algorithm. Applying the algorithm to whole-genome sequencing data from human papillomavirus (HPV)- and hepatitis B virus (HBV)-infected cancer samples yielded biological insights into oncovirus integrations. In the HPV- and HBV-integrated cancer samples, we observed five patterns by which oncovirus integrations affect the host genome: exon loss, promoter gain, hyper-amplification of tumor genes, viral cis-regulation inserted at a single intron, and viral cis-regulation inserted at an intergenic region. We found that focal duplications and host SVs are frequent in HPV-integrated LGMs, whereas focal deletions and complex viral SVs are prevalent in HBV-integrated LGMs. Furthermore, based on the results of our method, we found that enhanced microhomology-mediated end joining (MMEJ) might lead to both HPV and HBV integrations, and we conjecture that HPV integrations might occur mainly during DNA replication.