High-throughput Interpretation of Killer-cell Immunoglobulin-like Receptor Short-read Sequencing Data with PING

2021 ◽  
Author(s):  
Wesley Marin ◽  
Ravi Dandekar ◽  
Danillo G. Augusto ◽  
Tasneem Yusufali ◽  
Bianca Heyn ◽  
...  

The killer-cell immunoglobulin-like receptor (KIR) complex on chromosome 19 encodes receptors that modulate the activity of natural killer cells, and variation in these genes has been linked to infectious and autoimmune disease, as well as having a bearing on pregnancy and transplant outcomes. The medical relevance and high variability of KIR genes make short-read sequencing an attractive technology for interrogating the region, providing a high-throughput, high-fidelity sequencing method that is cost-effective. However, because this gene complex is characterized by extensive nucleotide polymorphism, structural variation including gene fusions and deletions, and a high level of homology between genes, its interrogation at high resolution has been thwarted by bioinformatic challenges, with most studies limited to examining presence or absence of specific genes. Here, we present the PING (Pushing Immunogenetics to the Next Generation) pipeline, which incorporates empirical data, novel alignment strategies and a custom alignment processing workflow to enable high-throughput KIR sequence analysis from short-read data. PING provides KIR gene copy number classification functionality for all KIR genes through use of a comprehensive alignment reference. The gene copy number determined per individual enables an innovative genotype determination workflow using genotype-matched references. Together, these methods address the challenges imposed by the structural complexity and overall homology of the KIR complex. To assess the accuracy of copy number and genotype determination, we applied PING to European and African validation cohorts and a synthetic dataset. PING demonstrated exceptional copy number determination performance across all datasets and robust genotype determination performance. Finally, an investigation into discordant genotypes for the synthetic dataset provides insight into misaligned reads, advancing our understanding of the interpretation of short-read sequencing data in complex genomic regions. PING promises to support a new era of studies of KIR polymorphism, delivering highly accurate, high-resolution KIR genotypes and enabling high-quality, high-throughput KIR genotyping for disease and population studies.

2021 ◽  
Vol 17 (8) ◽  
pp. e1008904
Author(s):  
Wesley M. Marin ◽  
Ravi Dandekar ◽  
Danillo G. Augusto ◽  
Tasneem Yusufali ◽  
Bianca Heyn ◽  
...  



2020 ◽  
Author(s):  
Timour Baslan ◽  
Sam Kovaka ◽  
Fritz J. Sedlazeck ◽  
Yanming Zhang ◽  
Robert Wappel ◽  
...  

Genome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in terms of the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms. This represents a challenge for applications that require high read counts, such as CNA inference. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that sequencing short DNA molecules reproducibly returns high read counts and allows high-quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short molecule nanopore sequencing returns more sensitive, accurate copy number information in a cost-effective and expeditious manner, including for multiplex samples. Our results provide a framework for the sequencing of relatively short DNA molecules on nanopore devices, with applications in research and medicine that include, but are not limited to, CNAs.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Arman Shahrisa ◽  
Maryam Tahmasebi-Birgani ◽  
Hossein Ansari ◽  
Zahra Mohammadi ◽  
Vinicio Carloni ◽  
...  

Background: Hepatocellular carcinoma (HCC) is the most common type of liver cancer and occurs predominantly in patients with previous liver conditions. In the absence of an ideal screening modality, HCC is usually diagnosed at an advanced stage. Recent studies show that loss or gain of genomic material can activate oncogenes or inactivate tumor suppressor genes, predisposing cells toward carcinogenesis. Here, we evaluated both the copy number alteration (CNA) and RNA sequencing data of 361 HCC samples in order to locate the frequently altered chromosomal regions and identify the affected genes. Results: Our data show that chr1q and chr8p are two hotspot regions for genomic amplifications and deletions, respectively. Among the amplified genes, YY1AP1 (chr1q22) showed the strongest correlation between CNA and gene expression. Moreover, it showed a positive correlation between CNA and tumor grade. Among the deleted genes, CHMP7 (chr8p21.3) showed the strongest correlation between CNA and gene expression. The protein products of both genes interact with other cellular proteins to carry out various functional roles. These interacting proteins include ASH1L, ZNF496, YY1, ZMYM4, CHMP4A, CHMP5, CHMP2A and CHMP3, some of which are encoded by well-known cancer-related genes. Conclusions: Our in-silico analysis demonstrates the importance of copy number alterations in the pathology of HCC. These findings open the door for future studies that evaluate our results by performing additional experiments.


Blood ◽  
2013 ◽  
Vol 121 (23) ◽  
pp. 4703-4707 ◽  
Author(s):  
Vivien Béziat ◽  
James A. Traherne ◽  
Lisa L. Liu ◽  
Jyothi Jayaraman ◽  
Monika Enqvist ◽  
...  

Key Points: KIR gene copy number variation influences NK cell education at the repertoire level due to a linear effect on KIR expression. KIR gene dose has no effect on NK cell education at the single-cell level.


2021 ◽  
Vol 12 ◽  
Author(s):  
Leonardo M. Amorim ◽  
Danillo G. Augusto ◽  
Neda Nemat-Gorgani ◽  
Gonzalo Montero-Martin ◽  
Wesley M. Marin ◽  
...  

The KIR (killer-cell immunoglobulin-like receptor) region is characterized by structural variation and high sequence similarity among genes, imposing technical difficulties for analysis. We undertook the most comprehensive study to date of KIR genetic diversity in a large population sample, applying next-generation sequencing in 2,130 United States European-descendant individuals. Data were analyzed using our custom bioinformatics pipeline specifically designed to address technical obstacles in determining KIR genotypes. Precise gene copy number determination allowed us to identify a set of uncommon gene-content KIR haplotypes accounting for 5.2% of structural variation. In this cohort, KIR2DL4 is the framework gene that varies most in copy number (6.5% of all individuals). We identified phased high-resolution alleles in large multi-locus insertions and also likely founder haplotypes from which they were deleted. Additionally, we observed 250 alleles at 5-digit resolution, of which 90 have frequencies ≥1%. We found sequence patterns that were consistent with the presence of novel alleles in 398 (18.7%) individuals and contextualized multiple orphan dbSNPs within the KIR complex. We also identified a novel KIR2DL1 variant, Pro151Arg, and demonstrated by molecular dynamics that this substitution is predicted to affect interaction with HLA-C. No previous study has explored the full range of structural and sequence variation of KIR as we present here. We demonstrate that pairing high-throughput sequencing with state-of-the-art computational tools in a large cohort permits exploration of all aspects of KIR variation, including determination of population-level haplotype diversity, improving understanding of the KIR system and providing an important reference for future studies.

