High-throughput Interpretation of Killer-cell Immunoglobulin-like Receptor Short-read Sequencing Data with PING

The killer-cell immunoglobulin-like receptor (KIR) complex on chromosome 19 encodes receptors that modulate the activity of natural killer cells, and variation in these genes has been linked to infectious and autoimmune disease, as well as to pregnancy and transplant outcomes. The medical relevance and high variability of KIR genes make short-read sequencing an attractive technology for interrogating the region, providing a high-throughput, high-fidelity sequencing method that is cost-effective. However, because this gene complex is characterized by extensive nucleotide polymorphism, structural variation including gene fusions and deletions, and a high level of homology between genes, its interrogation at high resolution has been thwarted by bioinformatic challenges, with most studies limited to examining the presence or absence of specific genes. Here, we present the PING (Pushing Immunogenetics to the Next Generation) pipeline, which incorporates empirical data, novel alignment strategies and a custom alignment processing workflow to enable high-throughput KIR sequence analysis from short-read data. PING provides KIR gene copy number classification functionality for all KIR genes through use of a comprehensive alignment reference. The gene copy number determined per individual enables an innovative genotype determination workflow using genotype-matched references. Together, these methods address the challenges imposed by the structural complexity and overall homology of the KIR complex. To assess copy number and genotype determination accuracy, we applied PING to European and African validation cohorts and a synthetic dataset. PING demonstrated exceptional copy number determination performance across all datasets and robust genotype determination performance. Finally, an investigation into discordant genotypes for the synthetic dataset provides insight into misaligned reads, advancing our understanding of the interpretation of short-read sequencing data in complex genomic regions. PING promises to support a new era of studies of KIR polymorphism, delivering highly accurate, high-resolution KIR genotypes and enabling high-quality, high-throughput KIR genotyping for disease and population studies.

PLoS Computational Biology ◽ 10.1371/journal.pcbi.1008904 ◽ 2021 ◽ Vol 17 (8) ◽ pp. e1008904

Author(s): Wesley M. Marin ◽ Ravi Dandekar ◽ Danillo G. Augusto ◽ Tasneem Yusufali ◽ Bianca Heyn ◽ ...

Keyword(s): High Resolution ◽ High Throughput ◽ Copy Number ◽ Killer Cell ◽ Gene Copy Number ◽ Synthetic Dataset ◽ Gene Copy ◽ Sequencing Data ◽ Short Read ◽ Short Read Sequencing


High-resolution analyses of gene copy number reveal new insights into the prognosis and progression of breast cancers

Breast Cancer Research and Treatment ◽ 10.1007/s10549-010-1146-y ◽ 2010 ◽ Vol 128 (1) ◽ pp. 41-43

Author(s): Alejandro Gru ◽ D. Craig Allred

Keyword(s): High Resolution ◽ Copy Number ◽ Gene Copy Number ◽ Gene Copy ◽ Breast Cancers


High resolution copy number inference in cancer using short-molecule nanopore sequencing

10.1101/2020.12.28.424602 ◽ 2020

Author(s): Timour Baslan ◽ Sam Kovaka ◽ Fritz J. Sedlazeck ◽ Yanming Zhang ◽ Robert Wappel ◽ ...

Keyword(s): Copy Number ◽ Cost Effective ◽ Chromosome Analysis ◽ Ease Of Use ◽ Precision Oncology ◽ Nanopore Sequencing ◽ DNA Molecules ◽ Sequencing Data ◽ Short Read ◽ Short Read Sequencing

Abstract: Genome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms. This represents a challenge for applications that require high read counts, such as CNA inference. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that sequencing short DNA molecules reproducibly returns high read counts and allows high-quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short-molecule nanopore sequencing returns more sensitive and accurate copy number information in a cost-effective and expeditious manner, including for multiplex samples. Our results provide a framework for the sequencing of relatively short DNA molecules on nanopore devices, with applications in research and medicine that include, but are not limited to, CNAs.


Quantum Dots-Enabled High-Resolution Analysis of Gene Copy Number Variation

IEEE Nanotechnology Magazine ◽ 10.1109/mnano.2011.940950 ◽ 2011 ◽ Vol 5 (2) ◽ pp. 23-27 ◽ Cited By ~ 1

Author(s): Yi Zhang ◽ Tza-Huei Wang

Keyword(s): Quantum Dots ◽ High Resolution ◽ Copy Number Variation ◽ Copy Number ◽ Gene Copy Number ◽ Gene Copy ◽ Gene Copy Number Variation ◽ High Resolution Analysis ◽ Number Variation ◽ Resolution Analysis


Quantum dots-enabled high resolution analysis of gene copy number variation

2011 6th IEEE International Conference on Nano/Micro Engineered and Molecular Systems ◽ 10.1109/nems.2011.6017475 ◽ 2011

Author(s): Yi Zhang ◽ Tza-Huei Wang

Keyword(s): Quantum Dots ◽ High Resolution ◽ Copy Number Variation ◽ Copy Number ◽ Gene Copy Number ◽ Gene Copy ◽ Gene Copy Number Variation ◽ High Resolution Analysis ◽ Number Variation ◽ Resolution Analysis


The pattern of gene copy number alteration (CNAs) in hepatocellular carcinoma: an in silico analysis

Molecular Cytogenetics ◽ 10.1186/s13039-021-00553-2 ◽ 2021 ◽ Vol 14 (1)

Author(s): Arman Shahrisa ◽ Maryam Tahmasebi-Birgani ◽ Hossein Ansari ◽ Zahra Mohammadi ◽ Vinicio Carloni ◽ ...

Keyword(s): Gene Expression ◽ Hepatocellular Carcinoma ◽ In Silico ◽ Copy Number ◽ Copy Number Alteration ◽ Gene Copy Number ◽ In Silico Analysis ◽ Gene Copy ◽ Sequencing Data ◽ Silico Analysis

Background: Hepatocellular carcinoma (HCC) is the most common type of liver cancer and occurs predominantly in patients with previous liver conditions. In the absence of an ideal screening modality, HCC is usually diagnosed at an advanced stage. Recent studies show that loss or gain of genomic material can activate oncogenes or inactivate tumor suppressor genes, predisposing cells toward carcinogenesis. Here, we evaluated both the copy number alteration (CNA) and RNA sequencing data of 361 HCC samples in order to locate the frequently altered chromosomal regions and identify the affected genes. Results: Our data show that chr1q and chr8p are hotspot regions for genomic amplifications and deletions, respectively. Among the amplified genes, YY1AP1 (chr1q22) showed the largest correlation between CNA and gene expression. Moreover, it showed a positive correlation between CNA and tumor grade. Among the deleted genes, CHMP7 (chr8p21.3) showed the largest correlation between CNA and gene expression. Protein products of both genes interact with other cellular proteins to carry out various functional roles. These include ASH1L, ZNF496, YY1, ZMYM4, CHMP4A, CHMP5, CHMP2A and CHMP3, some of which are well-known cancer-related genes. Conclusions: Our in silico analysis demonstrates the importance of copy number alterations in the pathology of HCC. These findings open the door to future studies that test our results with additional experiments.


Influence of KIR gene copy number on natural killer cell education

Blood ◽ 10.1182/blood-2012-10-461442 ◽ 2013 ◽ Vol 121 (23) ◽ pp. 4703-4707 ◽ Cited By ~ 56

Author(s): Vivien Béziat ◽ James A. Traherne ◽ Lisa L. Liu ◽ Jyothi Jayaraman ◽ Monika Enqvist ◽ ...

Keyword(s): Copy Number ◽ NK Cell ◽ Killer Cell ◽ Gene Copy Number ◽ Gene Copy ◽ Linear Effect ◽ Gene Copy Number Variation ◽ Key Points ◽ Number Variation ◽ KIR Gene

Key Points: KIR gene copy number variation influences NK cell education at the repertoire level due to a linear effect on KIR expression. KIR gene dose has no effect on NK cell education at the single-cell level.


High-Resolution Analysis of Gene Copy Number Alterations in Human Prostate Cancer Using CGH on cDNA Microarrays: Impact of Copy Number on Gene Expression

Neoplasia ◽ 10.1593/neo.03439 ◽ 2004 ◽ Vol 6 (3) ◽ pp. 240-247 ◽ Cited By ~ 87

Author(s): Maija Wolf ◽ Spyro Mousses ◽ Sampsa Hautaniemi ◽ Ritva Karhu ◽ Pia Huusko ◽ ...

Keyword(s): Gene Expression ◽ Prostate Cancer ◽ High Resolution ◽ Copy Number ◽ Gene Copy Number ◽ Gene Copy ◽ Human Prostate Cancer ◽ Copy Number Alterations ◽ High Resolution Analysis ◽ Resolution Analysis


High-Resolution Characterization of KIR Genes in a Large North American Cohort Reveals Novel Details of Structural and Sequence Diversity

Frontiers in Immunology ◽ 10.3389/fimmu.2021.674778 ◽ 2021 ◽ Vol 12

Author(s): Leonardo M. Amorim ◽ Danillo G. Augusto ◽ Neda Nemat-Gorgani ◽ Gonzalo Montero-Martin ◽ Wesley M. Marin ◽ ...

Keyword(s): High Resolution ◽ Copy Number ◽ Structural Variation ◽ High Throughput Sequencing ◽ Full Range ◽ Large Population ◽ Killer Cell ◽ Population Sample ◽ Haplotype Diversity ◽ Gene Copy

The KIR (killer-cell immunoglobulin-like receptor) region is characterized by structural variation and high sequence similarity among genes, imposing technical difficulties for analysis. We undertook the most comprehensive study to date of KIR genetic diversity in a large population sample, applying next-generation sequencing in 2,130 United States European-descendant individuals. Data were analyzed using our custom bioinformatics pipeline specifically designed to address technical obstacles in determining KIR genotypes. Precise gene copy number determination allowed us to identify a set of uncommon gene-content KIR haplotypes accounting for 5.2% of structural variation. In this cohort, KIR2DL4 is the framework gene that varies most in copy number (6.5% of all individuals). We identified phased high-resolution alleles in large multi-locus insertions and also likely founder haplotypes from which they were deleted. Additionally, we observed 250 alleles at 5-digit resolution, of which 90 have frequencies ≥1%. We found sequence patterns that were consistent with the presence of novel alleles in 398 (18.7%) individuals and contextualized multiple orphan dbSNPs within the KIR complex. We also identified a novel KIR2DL1 variant, Pro151Arg, and demonstrated by molecular dynamics that this substitution is predicted to affect interaction with HLA-C. No previous study has explored the full range of structural and sequence variation of KIR as we present here. We demonstrate that pairing high-throughput sequencing with state-of-the-art computational tools in a large cohort permits exploration of all aspects of KIR variation, including determination of population-level haplotype diversity, improving understanding of the KIR system and providing an important reference for future studies.
