Targeted Sequencing Identifies the Genetic Variants Associated with High-altitude Polycythemia in the Tibetan Population

Author(s):  
Zhiying Zhang ◽  
Lifeng Ma ◽  
Xiaowei Fan ◽  
Kun Wang ◽  
Lijun Liu ◽  
...  

Abstract: High-altitude polycythemia (HAPC) is characterized by excessive proliferation of erythrocytes resulting from the hypobaric hypoxic conditions at high altitude. The genetic variants and molecular mechanisms of HAPC remain unclear in highlanders. To detect possible genetic variants associated with the disease, we recruited 141 Tibetan dwellers, including 70 HAPC patients and 71 healthy controls. We performed targeted sequencing on 529 genes associated with oxygen metabolism and erythrocyte regulation, and used unconditional logistic regression analysis and gene ontology (GO) analysis to investigate the genetic variations of HAPC. We identified 12 single nucleotide variants, harbored in 12 genes, associated with the risk of HAPC (4.7 ≤ odds ratio ≤ 13.6; 7.6E-08 ≤ p-value ≤ 1E-04). Pathway enrichment analysis of these genes indicated that three pathways, the PI3K-AKT pathway, the JAK-STAT pathway, and the HIF-1 pathway, are essential, with p-values of 3.70E-08, 1.28E-07, and 3.98E-06, respectively. We are hopeful that our results will provide a reference for research into the etiology of HAPC. However, additional genetic risk factors and functional investigations are necessary to confirm our results further.
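The odds ratios quoted above come from unconditional logistic regression. As a rough illustration only: for a single binary carrier/non-carrier predictor with no covariates, the logistic-regression odds ratio equals the cross-product ratio of a 2×2 table. The sketch below computes that ratio with a Woolf (log-based) 95% confidence interval. The function name and the carrier counts are hypothetical and are not taken from the study, which may have adjusted for covariates.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Return (odds ratio, 95% CI lower bound, 95% CI upper bound).

    Uses the cross-product ratio of a 2x2 table and Woolf's log-scale
    standard error. If any cell is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) so the ratio stays finite.
    """
    cells = [case_carriers, case_noncarriers, control_carriers, control_noncarriers]
    if min(cells) == 0:
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells

    ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for 70 cases and 71 controls (NOT study data).
    ratio, lower, upper = odds_ratio(30, 40, 8, 63)
    print(f"OR = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

A confidence interval that excludes 1 suggests an association at roughly the 5% level, although a study testing many variants would also need a multiple-testing correction.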

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gavin W. Wilson ◽  
Mathieu Derouet ◽  
Gail E. Darling ◽  
Jonathan C. Yeung

Abstract: Identifying single nucleotide variants has become common practice for droplet-based single-cell RNA-seq experiments; however, no pipeline currently exists to maximize variant calling accuracy. Furthermore, molecular duplicates generated in these experiments have not been utilized to optimally detect variant co-expression. Herein, we introduce scSNV, designed from the ground up to “collapse” molecular duplicates and accurately identify variants and their co-expression. We demonstrate that scSNV is fast, with a reduced false-positive variant call rate, and enables the co-detection of genetic variants and A>G RNA edits across twenty-two samples.


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 4207-4207
Author(s):  
Brian S White ◽  
Irena Lanc ◽  
Daniel Auclair ◽  
Robert Fulton ◽  
Mark A Fiala ◽  
...  

Abstract Background: Multiple myeloma (MM) is a hematologic cancer characterized by a diversity of genetic lesions: translocations, copy number alterations (CNAs), and single nucleotide variants (SNVs). The prognostic value of translocations and of CNAs has been well established. Determining the clinical significance of SNVs, which are recurrently mutated at much lower frequencies, and how this significance is impacted by translocations and CNAs requires additional, large-scale correlative studies. Such studies can be facilitated by cost-effective targeted sequencing approaches. Hence, we designed a single-platform targeted sequencing approach capable of detecting all three variant types. Methods: We designed oligonucleotide probes complementary to the coding regions of 467 genes and to the IgH and MYC loci, allowing a probe to closely match at most 5 regions within the genome. Genes were selected if they were expressed in an independent RNA-seq MM data set and harbored germline SNP-filtered variants that: (1) occurred with frequency >3%, (2) were clustered in hotspots, (3) occurred in recurrently mutated "cancer genes" (as annotated in COSMIC or MutSig), or (4) occurred in genes involved in DNA repair and/or B-cell biology. IgH and MYC tiling was unbiased (with respect to annotated features within the loci) and spanned from 50 kilobasepairs (kbps) upstream of both regions to 50 kbps downstream of IgH and 100 kbps downstream of MYC. Results: We performed targeted sequencing of 96 CD138-enriched samples derived from MM patients, as well as matched peripheral blood leukocyte normal controls. Sequencing depth (mean 107X) was commensurate with that of available exome sequencing data from these samples (mean 71X). Samples harbored a mean of 25 non-silent variants, including those in known MM-associated genes: NRAS (24%), KRAS (22%), FAM46C (17%), TP53 (10%), DIS3 (8%), and BRAF (3%). Variants detected by both platforms showed a strong correlation (r^2 = 0.8).
The capture array detected activating, oncogenic variants in NRAS Q61K (n=3 patients) and KRAS G12C/D/R/V (n=5) that were not detected in exome data. Additionally, we found non-silent, capture-specific variants in MTOR (3%) and in two transcription-related genes that have been previously implicated in cancer: ZFHX4 (5%) and CHD3 (5%). To assess the potential role of deep subclonal variants and our ability to detect them, we performed additional sequencing (mean 565X) on six of the tumor/normal pairs. This revealed 14 manually-reviewed, non-silent variants that were not detected by the initial targeted sequencing. These had a mean variant allele frequency of 2.8% and included mutations in DNMT3A and FAM46C. At least one of these 14 variants occurred in five of the six re-sequenced samples. This highlights the importance of this additional depth, which will be used in future studies. Our approach successfully detected CNAs near expected frequencies, including hyperdiploidy (52%), del(13) (43%), and gain of 1q (35%). Similarly, it inferred IgH translocations at expected frequencies: t(4;14) (14%), t(6;14) (3%), t(11;14) (15%), and t(14;20) (1%). As expected, translocations occur predominantly within the IgH constant region, but also frequently 5' (i.e., telomeric) of the IGHM switch region, and occasionally within the V and D regions. We detected MYC-associated translocations, whose frequencies have been the subject of debate, at 10% (n=9 patients), with five involving IgH, three having both partners in or near MYC, and one having both types. Finally, our platform detected novel IgH translocations with partners near DERL3 (n=2), MYCN (n=1), and FLT3 (n=1). Additional evidence suggests that DERL3 and MYCN may be targets of IgH-induced overexpression: of 84 RNA-seq patient samples, six exhibited outlying expression of DERL3, including one sample in which we detected the translocation in corresponding DNA, and one exhibited outlying expression of MYCN.
Conclusion: Our MM-specific targeted sequencing strategy is capable of detecting deeply subclonal SNVs, in addition to CNAs and IgH and MYC translocations. Though additional validation is required, particularly with respect to translocation detection, we anticipate that such technology will soon enable clinical testing on a single sequencing platform. Disclosures: Vij: Celgene, Onyx, Takeda, Novartis, BMS, Sanofi, Janssen, Merck: Consultancy; Takeda, Onyx: Research Funding.


2021 ◽  
Author(s):  
Turki Sobahy ◽  
Meshari Alazmi

Genomic medicine stands to be revolutionized through the understanding of single nucleotide variants (SNVs) and their expression in single-gene disorders (Mendelian diseases). Computational tools can play a vital role in the exploration of such variations and their pathogenicity. Consequently, we developed the ensemble prediction tool AllelePred to identify deleterious SNVs and disease-causative genes. In comparison to other tools, our classifier achieves higher accuracy, precision, F1 score, and coverage for different types of coding variants. Furthermore, this research analyzes and structures 168,945 broad-spectrum genetic variants from the genomes of the Saudi population to demonstrate the accuracy of the model. When compared, AllelePred was able to structure the unlabeled Saudi genetic variants of the dataset to mimic the data characteristics of the known labeled data. On this basis, we accumulated a list of highly probable deleterious variants that we recommend for further experimental validation prior to medical diagnostic usage.


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0251585
Author(s):  
Pete Heinzelman ◽  
Philip A. Romero

Understanding how human ACE2 genetic variants differ in their recognition by SARS-CoV-2 can facilitate the leveraging of ACE2 as an axis for treating and preventing COVID-19. In this work, we experimentally interrogate thousands of ACE2 mutants to identify over one hundred human single-nucleotide variants (SNVs) that are likely to have altered recognition by the virus, and make the complementary discovery that ACE2 residues distant from the spike interface influence the ACE2-spike interaction. These findings illuminate new links between ACE2 sequence and spike recognition, and could find substantial utility in further fundamental research that augments epidemiological analyses and clinical trial design in the contexts of both existing strains of SARS-CoV-2 and novel variants that may arise in the future.


Epigenomics ◽  
2020 ◽  
Vol 12 (18) ◽  
pp. 1633-1650
Author(s):  
Xi Xu ◽  
Chaoju Gong ◽  
Yunfeng Wang ◽  
Yanyan Hu ◽  
Hong Liu ◽  
...  

Aim: We aim to identify driving genes of colorectal cancer (CRC) through multi-omics analysis. Materials & methods: We downloaded multi-omics data of CRC from The Cancer Genome Atlas dataset. Integrative analysis of single-nucleotide variants, copy number variations, DNA methylation and differentially expressed genes identified candidate genes that carry CRC risk. Kernel genes were extracted from the weighted gene co-expression network analysis. A competing endogenous RNA network composed of CRC-related genes was constructed. Biological roles of genes were further investigated in vitro. Results: We identified LRRC26 and REP15 as novel prognosis-related driving genes for CRC. LRRC26 hindered tumorigenesis of CRC in vitro. Conclusion: Our study identified novel driving genes and may provide new insights into the molecular mechanisms of CRC.


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Vladimir Avramović ◽  
Simona Denise Frederiksen ◽  
Marjana Brkić ◽  
Maja Tarailo-Graovac

Abstract Background Genetic variation databases provide invaluable information on the presence and frequency of genetic variants in the ‘untargeted’ human population, aggregated with the primary goal to facilitate the interpretation of clinically important variants. The presence of somatic variants in such databases can affect variant assessment in undiagnosed rare disease (RD) patients. Previously, the impact of somatic mosaicism was only considered in relation to two Mendelian disease-associated genes. Here, we expand the analyses to identify additional mosaicism-prone genes in blood-derived reference population databases. Results To identify additional mosaicism-prone genes relevant to RDs, we focused on known/previously established ClinVar pathogenic and likely pathogenic single-nucleotide variants, residing in genes associated with early onset, severe autosomal dominant diseases. We asked whether any of these variants are present in a higher-than-expected frequency in the reference population databases and whether there is evidence of somatic origin (i.e., allelic imbalance) rather than germline heterozygosity (~ half of the reads supporting alternative allele). The mosaicism-prone genes identified were further categorized according to the processes they are involved in. Beyond the previously reported ASXL1 and DNMT3A, we identified 7 additional autosomal dominant RD-associated genes with known pathogenic single-nucleotide variants present in the reference population databases and good evidence of allelic imbalance: BRAF, CBL, FGFR3, IDH2, KRAS, PTPN11 and SETBP1. From this group of 9 genes, the majority (n = 7) was important for hematopoiesis. In addition, 4 of these genes were involved in cell proliferation. Further assessment of the known 156 hematopoietic genes led to identification of 48 genes (21 not yet associated with RDs) with at least some evidence of mosaicism detectable in reference population databases. 
Conclusions These results stress the importance of considering genes involved in hematopoiesis and cell proliferation when interpreting the presence and frequency of genetic variants in blood-derived reference population databases, both public and private. This is especially important when considering new variants of uncertain significance in known hematopoietic/cell proliferation RD genes and future novel gene–disease associations involving this class of genes.
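The distinction drawn above between germline heterozygosity (about half of the reads supporting the alternative allele) and allelic imbalance can be sketched with an exact two-sided binomial test against an expected alternative-allele fraction of 0.5. This is only a minimal illustration: the function names, the read counts, and the 0.05 threshold are assumptions, and the abstract does not state which test or cutoff the authors used.

```python
import math


def binomial_two_sided_p(alt_reads, total_reads, expected_fraction=0.5):
    """Exact two-sided binomial p-value for observing `alt_reads` of `total_reads`.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Probabilities are computed in log space via lgamma so that
    deep coverage does not underflow.
    """
    n, p = total_reads, expected_fraction

    def prob(k):
        log_p = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p)
        )
        return math.exp(log_p)

    probs = [prob(k) for k in range(n + 1)]
    observed = probs[alt_reads]
    # Small relative tolerance so ties are not lost to floating-point noise.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-7)))


def shows_allelic_imbalance(alt_reads, total_reads, alpha=0.05):
    """Flag a variant whose alt-read fraction departs from ~0.5 (hypothetical cutoff)."""
    return binomial_two_sided_p(alt_reads, total_reads) < alpha


if __name__ == "__main__":
    # Hypothetical read counts, not taken from any database.
    print(shows_allelic_imbalance(48, 100))  # near half the reads
    print(shows_allelic_imbalance(12, 100))  # skewed toward the reference allele
```

In practice a skewed fraction can also reflect mapping bias or copy number changes, so a flag from a test like this is a starting point for review rather than proof of somatic origin.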


Author(s):  
Sergey Abramov ◽  
Alexandr Boytsov ◽  
Dariia Bykova ◽  
Dmitry D. Penzar ◽  
Ivan Yevshin ◽  
...  

Abstract: Sequence variants in gene regulatory regions alter gene expression and contribute to phenotypes of individual cells and the whole organism, including disease susceptibility and progression. Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Differential transcription factor binding in heterozygous genomic loci provides a natural source of information on such regulatory variants. We present a novel approach to call the allele-specific transcription factor binding events at single-nucleotide variants in ChIP-Seq data, taking into account the joint contribution of aneuploidy and local copy number variation, which is estimated directly from variant calls. We have conducted a meta-analysis of more than 7 thousand ChIP-Seq experiments and assembled the database of allele-specific binding events listing more than half a million entries at nearly 270 thousand single-nucleotide polymorphisms for several hundred human transcription factors and cell types. These polymorphisms are enriched for associations with phenotypes of medical relevance and often overlap eQTLs, making them candidates for causality by linking variants with molecular mechanisms. Specifically, there is a special class of switching sites, where different transcription factors preferably bind alternative alleles, thus revealing allele-specific rewiring of molecular circuitry.


Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 3233-3233
Author(s):  
Julia A. Meyer ◽  
Laura E. Hogan ◽  
Jinhua Wang ◽  
Jun J. Yang ◽  
Jay Patel ◽  
...  

Abstract 3233 Introduction: Relapsed ALL carries a very poor prognosis despite intensive therapy, indicating the need for new insights into disease mechanisms. We have previously used gene expression profiling (Hogan et al. ASH 2009) and copy number analysis (Yang et al. Blood 2008) in paired diagnosis and relapsed ALL samples to better understand the biologic mechanisms leading to recurrent disease. To create an integrated genomic profile of ALL, we have now focused on high throughput RNA sequencing to detect changes in the transcriptome from diagnosis to relapse. Patients/Methods: To date we have sequenced 6 matched diagnosis/relapse pairs (i.e. 12 marrow samples) from B-precursor ALL patients enrolled on Children's Oncology Group (COG) P9906 and AALL0232 trials. RNA libraries were prepared from poly-A selected RNA and sequenced using 54 base pair single end reads using the Illumina Genome Analyzer IIx. Each sample was sequenced in at least 7 lanes, generating an average of 100 million reads per sample. BWA (v0.5.8) was used to align the reads to the human genome, producing an average of 53 million mapped reads. Samtools (v0.1.8) was then used to predict genetic variants across the genome, filtering out variants with a low mapping quality (<Q20), sub-optimal alignment (X:1>0), low coverage (<8X), or overlap with known single nucleotide polymorphisms (SNPs) from dbSNP (r131) or the 1000 Genomes Project. Results: We observed a total of 119,000 genetic variants across all samples, with comparable overall mutational burden at relapse and diagnosis. To identify candidate lesions that may indicate a selection for common chemoresistance pathways, we focused our analysis on relapse-enriched, non-synonymous variants. 8,486 non-synonymous variants (insertions/deletions and single nucleotide variants [SNV]) were identified that occurred more often at relapse compared to diagnosis.
Our analysis was focused on relapse-enriched SNVs that coded for non-synonymous changes, of which 154 were prioritized for validation. Validation was completed using matched genomic DNA samples and PCR products were directly sequenced. Mutation calls were made by manual review of tracings using the Mutation Surveyor program from Softgenetics. Thirty-three percent of predicted SNV loci were validated, but upon further sequencing of matched germline samples, five relapse-specific mutations were confirmed. Mutations in COBRA1, FAM120A, RGS12, SND2, and SMEK2 were found in individual patient relapse samples. Validation is currently ongoing to confirm additional SNVs and an expanded validation of mutations will be completed in an additional 66 matched diagnosis/relapse pairs from COG 9906 and AALL 0232 and 0331 studies. Relapse-specific isoforms identifying alternative exon usage were also detected in 15 genes, all of which were shared amongst multiple patients. In addition, a significant increase (p=6.7×10⁻⁶) was observed in the number of poly-adenylation sites in the genes of the relapse samples. Conclusions: While isoform-specific expression was shared amongst patients at relapse, all relapse-specific mutations were private, and our data to date indicate that a diversity of mechanisms contribute to relapsed disease. Further sequencing analysis of our expanded cohort of samples will determine the mutation and isoform expression prevalence, as well as the functional significance and the potential therapeutic relevance. Disclosures: No relevant conflicts of interest to declare.
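The variant filters listed in the Methods (mapping quality below Q20, any sub-optimal alignment, coverage below 8X, overlap with known SNPs) can be sketched as a simple predicate over variant records. This is an illustrative sketch only: the `VariantCall` record, its field names, and the in-memory set of known SNPs are hypothetical and do not reproduce the Samtools output format or the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical minimal variant record (not a Samtools/VCF structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    mapping_quality: int   # Phred-scaled mapping quality of supporting reads
    suboptimal_hits: int   # number of sub-optimal alignments reported by the aligner
    depth: int             # read coverage at the site


def passes_filters(call, known_snps, min_mapq=20, min_depth=8):
    """Return True if the call survives all four filters described in the text."""
    if call.mapping_quality < min_mapq:
        return False  # low mapping quality (<Q20)
    if call.suboptimal_hits > 0:
        return False  # sub-optimal alignments present
    if call.depth < min_depth:
        return False  # low coverage (<8X)
    if (call.chrom, call.pos, call.alt) in known_snps:
        return False  # overlaps a known SNP (e.g. from dbSNP or 1000 Genomes)
    return True


def filter_calls(calls, known_snps):
    """Keep only the calls that pass every filter, preserving input order."""
    return [c for c in calls if passes_filters(c, known_snps)]


if __name__ == "__main__":
    # Hypothetical data for illustration.
    known = {("chr1", 100, "T")}
    calls = [
        VariantCall("chr1", 100, "C", "T", 40, 0, 30),  # known SNP
        VariantCall("chr2", 500, "A", "G", 37, 0, 12),  # passes
        VariantCall("chr3", 900, "G", "A", 15, 0, 50),  # low mapping quality
    ]
    for call in filter_calls(calls, known):
        print(call)
```

Keeping the thresholds as parameters makes it easy to see how sensitive the final candidate list is to each cutoff.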


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Sergey Abramov ◽  
Alexandr Boytsov ◽  
Daria Bykova ◽  
Dmitry D. Penzar ◽  
Ivan Yevshin ◽  
...  

Abstract: Sequence variants in gene regulatory regions alter gene expression and contribute to phenotypes of individual cells and the whole organism, including disease susceptibility and progression. Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Differential transcription factor binding in heterozygous genomic loci provides a natural source of information on such regulatory variants. We present a novel approach to call the allele-specific transcription factor binding events at single-nucleotide variants in ChIP-Seq data, taking into account the joint contribution of aneuploidy and local copy number variation, which is estimated directly from variant calls. We have conducted a meta-analysis of more than 7 thousand ChIP-Seq experiments and assembled the database of allele-specific binding events listing more than half a million entries at nearly 270 thousand single-nucleotide polymorphisms for several hundred human transcription factors and cell types. These polymorphisms are enriched for associations with phenotypes of medical relevance and often overlap eQTLs, making them candidates for causality by linking variants with molecular mechanisms. Specifically, there is a special class of switching sites, where different transcription factors preferably bind alternative alleles, thus revealing allele-specific rewiring of molecular circuitry.


2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Luobu Gesang ◽  
Lamu Gusang ◽  
Ciren Dawa ◽  
Gawa Gesang ◽  
Kang Li

Background. The hypoxic conditions at high altitudes are a great threat to survival, creating pressure for adaptation. A growing number of high-altitude residents fail to adapt and develop high-altitude polycythemia (HAPC), a condition featuring excessive erythrocytosis. As a high-altitude sickness, the etiology of HAPC is still unclear. Methods. In this study, we report a whole-genome sequencing-based study of 10 native Tibetans with HAPC and 10 control subjects, followed by genotyping of 21 variants selected from the discovered single nucleotide variants (SNVs) in an independent cohort (232 cases and 266 controls). Results. We discovered the egl nine homologue 3 (egln3/phd3) gene (14q13.1, rs1346902, P=1.91×10⁻⁵) and the PPP1R2P1 (Protein Phosphatase 1 Regulatory Inhibitor Subunit 2) gene (6p21.32, rs521539, P=0.012). Our results indicated an unbiased framework to identify etiological mechanisms of HAPC and showed that egln3/phd3 and PPP1R2P1 may be associated with the susceptibility to HAPC. Egln3/phd3b is associated with hypoxia-inducible factor subunit α (HIFα). Protein Phosphatase 1 Regulatory Inhibitor is associated with reactive oxygen species (ROS) and oxidative stress. Conclusions. Our genome sequencing conducted in Tibetan HAPC patients identified egln3/phd3 and PPP1R2P1 as associated with HAPC.

