Profiling Germline Adaptive Immune Receptor Repertoire with gAIRR Suite
ABSTRACT
The genetic profiling of the germline Adaptive Immune Receptor Repertoire (AIRR), including T cell receptor (TR) and immunoglobulin (IG), might be medically important but is currently intractable due to high genetic diversity and complex recombination. In this study, we developed the gAIRR Suite, which comprises three modules. gAIRR-seq, a probe capture-based targeted sequencing pipeline, profiles genomic sequences of TR and IG from individual DNA samples. The computational pipelines gAIRR-call and gAIRR-annotate call alleles from gAIRR-seq reads and whole-genome assemblies, respectively. We applied gAIRR-seq and gAIRR-call to genotype TRV and TRJ alleles of Genome in a Bottle (GIAB) DNA samples with 100% accuracy. gAIRR-annotate profiled the alleles of 13 high-quality whole-genome assemblies from 6 samples and further discovered 79 novel TRV alleles and 11 novel TRJ alleles. We validated a 65-kbp and a 10-kbp structural variant for HG002 on chromosomes 7 and 14, where TRD and J alleles reside. We also uncovered a disagreement between the human reference genomes GRCh37 and GRCh38 in the TR regions: GRCh37 possesses a 270-kbp inversion and a 10-kbp deletion in chromosome 7 relative to GRCh38. The gAIRR Suite might benefit genetic studies and future clinical applications for various immune-related phenotypes.