GRIDSS2: harnessing the power of phasing and single breakends in somatic structural variant detection
Abstract
Here we present GRIDSS2, a general-purpose structural variant caller optimised for tumour/normal somatic calling. Using cell line and patient sample validation together with cohort-level comparisons, we show that GRIDSS2 outperforms recent state-of-the-art tools. We demonstrate that GRIDSS2 retains high sensitivity and precision even for small events by identifying a small (32-100 bp) duplication signature strongly associated with colorectal cancer in 3,782 metastatic cancers deeply sequenced by the Hartwig Medical Foundation. Essential to the high precision achieved by GRIDSS2 is the novel reporting of single breakend variants: structural variants in which only one side can be unambiguously determined. We show that including single breakends reduces the false negative rate from 10.4% to 3.4%. Demonstrating the power of single breakend calling in genomic regions traditionally considered inaccessible to short-read callers, we find that 47% of somatic centromeric breaks are repaired to non-centromeric sequence, with chromosome 1 exhibiting a unique centromeric rearrangement signature. Finally, we show that somatic structural variants are highly clustered, with GRIDSS2 able to phase 16% of somatic structural variants in the Hartwig cohort from short-read sequencing alone.