haplotype phasing
Recently Published Documents


TOTAL DOCUMENTS: 71 (FIVE YEARS 25)

H-INDEX: 13 (FIVE YEARS 2)

Author(s):  
Maaike van der Lee ◽  
William J. Rowell ◽  
Roberta Menafra ◽  
Henk-Jan Guchelaar ◽  
Jesse J. Swen ◽  
...  

Abstract The use of pharmacogenomics in clinical practice is becoming standard of care. However, due to the complex genetic makeup of pharmacogenes, not all genetic variation is currently accounted for. Here, we show the utility of long-read sequencing to resolve complex pharmacogenes by analyzing a well-characterised sample. These data consist of long reads that were processed to resolve phased haploblocks. 73% of pharmacogenes were fully covered in one phased haploblock, including 9/15 genes that are 100% complex. Variant calling accuracy in the pharmacogenes was high, with 99.8% recall and 100% precision for SNVs and 98.7% precision and 98.0% recall for indels. For the majority of gene-drug interactions in the DPWG and CPIC guidelines, the associated genes could be fully resolved (62% and 63% respectively). Together, these findings suggest that long-read sequencing data offers promising opportunities in elucidating complex pharmacogenes and haplotype phasing while maintaining accurate variant calling.


2021 ◽  
Author(s):  
Jan Forster ◽  
David Laehnemann ◽  
Annette Paschen ◽  
Alexander Schramm ◽  
Martin Schuler ◽  
...  

Motivation: Haplotype phasing approaches have been shown to improve the accuracy of the search space for neoantigen prediction by determining whether adjacent variants are located on the same chromosomal copy. However, the aneuploid nature of cancer cells, as well as the admixture of different clones in tumor bulk sequencing data, challenges current diploid-based phasing algorithms. We present microphaser, a small-scale phasing approach to improve the haplotyping of variants in cancer samples. Microphaser aims to create a more accurate neopeptidome for downstream neoantigen prediction. Results: Microphaser achieves high concordance with the state-of-the-art phasing-aware neoantigen prediction pipeline neoepiscope, with differences in edge cases and an improved filtering step. Availability: Microphaser is written in the Rust programming language. It is made available via GitHub (https://github.com/koesterlab/microphaser) and Bioconda. The corresponding prediction pipeline (https://github.com/snakemake-workflows/dna-seq-neoantigen-prediction) has been written within the Snakemake workflow management system and can be deployed as part of the snakemake-workflows project.
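
The idea of checking whether two adjacent variants sit on the same chromosomal copy can be illustrated with a minimal read-backed sketch. This is not microphaser's algorithm: the function name `phase_pair`, the `"ref"`/`"alt"` labels, and the thresholds `min_reads` and `min_fraction` are all hypothetical, and the simple majority rule ignores the aneuploidy and clonal admixture the abstract describes.

```python
from collections import Counter


def phase_pair(read_alleles, min_reads=3, min_fraction=0.8):
    """Classify two nearby variants as 'cis', 'trans', or 'ambiguous'.

    read_alleles: one (allele_at_variant_1, allele_at_variant_2) tuple per
    read that spans both sites, each allele being "ref" or "alt".
    Thresholds are illustrative assumptions, not published defaults.
    """
    counts = Counter(read_alleles)
    # Reads carrying both alt alleles support the same chromosomal copy.
    cis = counts[("alt", "alt")]
    # Reads carrying exactly one alt allele support different copies.
    trans = counts[("alt", "ref")] + counts[("ref", "alt")]
    # ref/ref reads say nothing about where the alt alleles sit.
    informative = cis + trans
    if informative < min_reads:
        return "ambiguous"
    if cis / informative >= min_fraction:
        return "cis"
    if trans / informative >= min_fraction:
        return "trans"
    return "ambiguous"


# Six reads carry both alt alleles, five carry neither, one is discordant.
reads = [("alt", "alt")] * 6 + [("ref", "ref")] * 5 + [("alt", "ref")]
print(phase_pair(reads))
```

Variants called `cis` would be translated together into one candidate peptide, whereas `trans` variants would yield separate peptides.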


Author(s):  
Ziad Al Bkhetan ◽  
Gursharan Chana ◽  
Cheng Soon Ong ◽  
Benjamin Goudey ◽  
Kotagiri Ramamohanarao

Abstract Motivation The high accuracy of recent haplotype phasing tools is enabling the integration of haplotype (or phase) information more widely in genetic investigations. One such possibility is phase-aware expression quantitative trait loci (eQTL) analysis, where haplotype-based analysis has the potential to detect associations that may otherwise be missed by standard SNP-based approaches. Results We present eQTLHap, a novel method to investigate associations between gene expression and genetic variants, considering their haplotypic and genotypic effects. Using multiple simulations based on real data, we demonstrate that phase-aware eQTL analysis significantly outperforms typical SNP-based methods when the causal genetic architecture involves multiple SNPs. We show that phase-aware eQTL analysis is robust to phasing errors, showing only a minor impact (<4%) on sensitivity. Applying eQTLHap to real GEUVADIS and GTEx datasets detects numerous novel eQTLs undetected by a single-SNP approach, with 22 eQTLs replicating across studies or tissue types, highlighting the utility of phase-aware eQTL analysis. Availability and implementation https://github.com/ziadbkh/eQTLHap. Contact [email protected] Supplementary information Supplementary data are available at Briefings in Bioinformatics online.


2021 ◽  
Vol 12 ◽  
Author(s):  
Junfu Guo ◽  
Chang Shi ◽  
Xi Chen ◽  
Ou Wang ◽  
Ping Liu ◽  
...  

Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single base level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormal large gaps between co-barcoded reads to detect potential breakpoints and reconstruct complex structural variants (SVs). Haplotype phasing by co-barcoded reads increases the signal to noise ratio, and barcode sharing profiles are used to filter out false positives. We integrate the short read SV caller smoove for smaller variants with stLFRsv. The integrated pipeline was evaluated on the well-characterized genome HG002/NA24385, and 74.5% precision and a 22.4% recall rate were obtained for deletions. stLFRsv revealed some large variants not included in the benchmark set that were verified by long reads or assembly. For the HG001/NA12878 genome, stLFRsv also achieved the best performance for both resource usage and the detection of large variants. Our work indicates that co-barcoded read technology has the potential to improve genome completeness.
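
The gap-based breakpoint signal described above can be sketched as follows. This is an illustrative simplification and not the stLFRsv implementation: the function name `large_gaps`, the `(barcode, position)` input format, and the 50 kbp threshold are assumptions chosen for the example.

```python
from collections import defaultdict


def large_gaps(reads, max_gap=50_000):
    """Return (barcode, left_pos, right_pos) for abnormally large gaps.

    reads: iterable of (barcode, position) pairs on a single chromosome.
    max_gap: hypothetical threshold; reads sharing a barcode come from the
    same long fragment, so a gap far beyond the typical fragment length
    hints at a potential breakpoint.
    """
    by_barcode = defaultdict(list)
    for barcode, pos in reads:
        by_barcode[barcode].append(pos)

    gaps = []
    for barcode, positions in by_barcode.items():
        positions.sort()
        # Compare each read with the next read carrying the same barcode.
        for left, right in zip(positions, positions[1:]):
            if right - left > max_gap:
                gaps.append((barcode, left, right))
    return gaps


reads = [
    ("bc1", 1_000), ("bc1", 5_000), ("bc1", 120_000),
    ("bc2", 2_000), ("bc2", 9_000),
]
print(large_gaps(reads))
```

A real pipeline would additionally require support from several barcodes and use barcode sharing between the two sides of the gap to filter false positives, as the abstract notes.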


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fuman Jiang ◽  
Weiqiang Liu ◽  
Longmei Zhang ◽  
Yulai Guo ◽  
Min Chen ◽  
...  

Abstract Noninvasive prenatal testing (NIPT) for single gene disorders remains challenging. One approach that allows for accurate detection of the slight increase of the maternally inherited allele is the relative haplotype dosage (RHDO) analysis, which requires the construction of parental haplotypes. Recently, nanopore sequencing technologies have become available and may be an ideal tool for direct construction of haplotypes. Here, we explored the feasibility of combining nanopore sequencing with the RHDO analysis in NIPT of β-thalassemia. Thirteen families at risk for β-thalassemia were recruited. The targeted region of parental genomic DNA was amplified by long-range PCR of 10 kb and 20 kb amplicons. Parental haplotypes were constructed using nanopore sequencing and next generation sequencing data. Fetal inheritance of parental haplotypes was classified by the RHDO analysis using data from maternal plasma DNA sequencing. Haplotype phasing was achieved in 12 families using data from the 10 kb library. Data from the 20 kb library performed better, with haplotype phasing achieved in all 13 families. Fetal status was correctly classified in 12 out of 13 families. Thus, targeted nanopore sequencing combined with the RHDO analysis is feasible for NIPT of β-thalassemia.


2020 ◽  
Vol 19 ◽  
pp. 162-173
Author(s):  
Nenad Svrzikapa ◽  
Kenneth A. Longo ◽  
Nripesh Prasad ◽  
Ramakrishna Boyanapalli ◽  
Jeffrey M. Brown ◽  
...  

Author(s):  
Shabbeer Hassan ◽  
Ida Surakka ◽  
Marja-Riitta Taskinen ◽  
Veikko Salomaa ◽  
Aarno Palotie ◽  
...  

Abstract Previous research has shown that using population-specific reference panels has a significant effect on downstream population genomic analyses like haplotype phasing, genotype imputation, and association, especially in the context of population isolates. Here, we developed a high-resolution recombination rate map at 10 and 50 kb scales using high-coverage (20–30×) whole-genome sequencing data of 55 family trios from Finland and compared it to recombination rates of non-Finnish Europeans (NFE). We tested the downstream effects of the population-specific recombination rates in statistical phasing and genotype imputation in Finns as compared to the same analyses performed using the NFE-based recombination rates. We found that Finnish recombination rates have a moderately high correlation (Spearman’s ρ = 0.67–0.79) with NFE, although on average (across all autosomal chromosomes), Finnish rates (2.268 ± 0.4209 cM/Mb) are 12–14% lower than NFE (2.641 ± 0.5032 cM/Mb). The Finnish recombination map was found to have no significant effect on haplotype phasing accuracy (switch error rates ~2%) or average imputation concordance rates (97–98% for common, 92–96% for low frequency and 78–90% for rare variants). Our results suggest that haplotype phasing and genotype imputation mostly depend on population-specific contexts like appropriate reference panels and their sample size, but not on population-specific recombination maps. Even though recombination rate estimates differed somewhat between the Finnish and NFE populations, haplotyping and imputation were not noticeably affected by the recombination map used. Therefore, the currently available HapMap recombination maps seem robust for population-specific phasing and imputation pipelines, even in the context of relatively isolated populations like Finland.


Author(s):  
Ziad Al Bkhetan ◽  
Gursharan Chana ◽  
Kotagiri Ramamohanarao ◽  
Karin Verspoor ◽  
Benjamin Goudey

Abstract Haplotype phasing is a critical step for many genetic applications but incorrect estimates of phase can negatively impact downstream analyses. One proposed strategy to improve phasing accuracy is to combine multiple independent phasing estimates to overcome the limitations of any individual estimate. However, such a strategy is yet to be thoroughly explored. This study provides a comprehensive evaluation of consensus strategies for haplotype phasing. We explore the performance of different consensus paradigms, and the effect of specific constituent tools, across several datasets with different characteristics and their impact on the downstream task of genotype imputation. Based on the outputs of existing phasing tools, we explore two different strategies to construct haplotype consensus estimators: voting across outputs from multiple phasing tools and voting across multiple outputs of a single non-deterministic tool. We find that the consensus approach from multiple tools reduces switch error (SE) by an average of 10% compared to any constituent tool when applied to European populations and has the highest accuracy regardless of population ethnicity, sample size, variant density or variant frequency. Furthermore, the consensus estimator improves the accuracy of the downstream task of genotype imputation carried out by the widely used Minimac3, pbwt and BEAGLE5 tools. Our results provide guidance on how to produce the most accurate phasing estimates and the trade-offs that a consensus approach may have. Our implementation of consensus haplotype phasing, consHap, is available freely at https://github.com/ziadbkh/consHap. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
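
A voting-based consensus and the switch error metric can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the consHap implementation: each haplotype is assumed to be a list of 0/1 alleles at the heterozygous sites of one sample, voting is done on the relative phase of adjacent sites (so that an arbitrary swap of haplotype labels does not matter), and all function names are hypothetical.

```python
def relative_phase(hap):
    """True where the allele changes between adjacent heterozygous sites."""
    return [a != b for a, b in zip(hap, hap[1:])]


def consensus_haplotype(haps):
    """Majority vote across tools on each adjacent-site relative phase.

    With an even number of tools, ties resolve to "no flip".
    """
    flips_per_tool = [relative_phase(h) for h in haps]
    consensus = [0]  # orientation of the first site is arbitrary
    for votes in zip(*flips_per_tool):
        flip = sum(votes) * 2 > len(votes)
        consensus.append(consensus[-1] ^ int(flip))
    return consensus


def switch_error_rate(estimate, truth):
    """Fraction of adjacent site pairs whose relative phase is wrong."""
    est, tru = relative_phase(estimate), relative_phase(truth)
    if not tru:
        return 0.0
    return sum(e != t for e, t in zip(est, tru)) / len(tru)


truth = [0, 1, 1, 0, 0, 1]
tool_a = [0, 1, 1, 0, 1, 0]  # one switch error after the fourth site
tool_b = [0, 0, 0, 1, 1, 0]  # one switch error after the first site
tool_c = [1, 0, 1, 0, 0, 1]  # labels swapped, one switch error after the second site

consensus = consensus_haplotype([tool_a, tool_b, tool_c])
for name, hap in [("a", tool_a), ("b", tool_b), ("c", tool_c), ("consensus", consensus)]:
    print(name, switch_error_rate(hap, truth))
```

In this toy case each tool makes a single switch error at a different position, so the majority vote recovers the true phase; when tools share the same error, voting cannot correct it.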


Author(s):  
Hiroki Konishi ◽  
Rui Yamaguchi ◽  
Kiyoshi Yamaguchi ◽  
Yoichi Furukawa ◽  
Seiya Imoto

Abstract Motivation In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering precise DNA sequences from noisy and complicated nanopore raw signals remains a crucial demand for downstream analyses based on higher-quality nanopore sequencing, although various basecallers have been introduced to date. Results To address this need, we developed a novel basecaller, Halcyon, that incorporates neural-network techniques frequently used in the field of machine translation. Our model employs monotonic-attention mechanisms to learn semantic correspondences between nucleotides and signal levels without any pre-segmentation against input signals. We evaluated performance with a human whole-genome sequencing dataset and demonstrated that Halcyon outperformed existing third-party basecallers and achieved competitive performance against the latest Oxford Nanopore Technologies’ basecallers. Availability and implementation The source code (halcyon) can be found at https://github.com/relastle/halcyon. Contact [email protected]

