Clinical Exome Performance for Reporting Secondary Genetic Findings

2015 ◽  
Vol 61 (1) ◽  
pp. 213-220 ◽  
Author(s):  
Jason Y Park ◽  
Peter Clark ◽  
Eric Londin ◽  
Marialuisa Sponziello ◽  
Larry J Kricka ◽  
...  

Abstract. BACKGROUND: Reporting clinically actionable incidental genetic findings in the course of clinical exome testing is recommended by the American College of Medical Genetics and Genomics (ACMG). However, the performance of clinical exome methods for reporting small subsets of genes has not been previously reported. METHODS: In this study, 57 exome data sets performed as clinical (n = 12) or research (n = 45) tests were retrospectively analyzed. Exome sequencing data were examined for adequacy in the detection of potentially pathogenic variant locations in the 56 genes described in the ACMG incidental findings recommendation. All exons of the 56 genes were examined for adequacy of sequencing coverage. In addition, nucleotide positions annotated in HGMD (Human Gene Mutation Database) were examined. RESULTS: The 56 ACMG genes have 18 336 nucleotide variants annotated in HGMD. None of the 57 exome data sets possessed an HGMD variant. The clinical exome test had inadequate coverage for >50% of HGMD variant locations in 7 genes. Six exons from 6 different genes had consistent failure across all 3 test methods; these exons had high GC content (76%–84%). CONCLUSIONS: The use of clinical exome sequencing for the interpretation and reporting of subsets of genes requires recognition of the substantial possibility of inadequate depth and breadth of sequencing coverage at clinically relevant locations. Inadequate depth of coverage may contribute to false-negative clinical exome results.
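
The two quantities this abstract relies on, exon GC content and adequacy of coverage at a set of positions, can be illustrated with a minimal Python sketch. The function names and the minimum depth of 20 reads are assumptions made for illustration; the abstract does not state the threshold the study used.

```python
from typing import Sequence


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def fraction_adequately_covered(depths: Sequence[int], min_depth: int = 20) -> float:
    """Return the fraction of positions whose read depth is at least min_depth.

    min_depth=20 is an illustrative assumption, not the study's threshold.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depth >= min_depth for depth in depths) / len(depths)


if __name__ == "__main__":
    # Hypothetical exon sequence and per-position read depths.
    exon = "GCGCGGCCATGCGGCCGCGC"
    depths = [3, 5, 8, 12, 25, 30, 22, 9, 4, 2]
    print(f"GC content: {gc_fraction(exon):.0%}")
    print(f"Adequately covered: {fraction_adequately_covered(depths):.0%}")
```

A gene would then be flagged when the covered fraction of its annotated variant positions falls below a chosen cutoff, such as the 50% figure quoted in the results.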

2017 ◽  
Vol 14 (3) ◽  
Author(s):  
Jamie Alnasir ◽  
Hugh P. Shanahan

Abstract. Detecting sources of bias in transcriptomic data is essential to determine signals of biological significance. We outline a novel method to detect sequence-specific bias in short-read Next Generation Sequencing data. This is based on determining intra-exon correlations between specific motifs. This requires a mild assumption that short reads sampled from specific regions from the same exon will be correlated with each other. This has been implemented on Apache Spark and used to analyse two D. melanogaster eye-antennal disc data sets generated at the same laboratory. The wild type data set in Drosophila indicates a variation due to motif GC content that is more significant than that found due to exon GC content. The software is available online and could be applied for cross-experiment transcriptome data analysis in eukaryotes.


Author(s):  
Michael I. Love ◽  
Alena Myšičková ◽  
Ruping Sun ◽  
Vera Kalscheuer ◽  
Martin Vingron ◽  
...  

Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
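
The idea of a hidden Markov model over read counts can be sketched in Python as a toy Viterbi decoder. This is not the exomeCopy model: the Poisson emissions, the three copy-number states, the `stay_prob` value, and the use of background depth as the only covariate are all simplifying assumptions for illustration.

```python
import math
from typing import List, Sequence

# Assumed state space: 1 = heterozygous deletion, 2 = normal, 3 = duplication.
COPY_STATES = (1, 2, 3)


def poisson_logpmf(k: int, mean: float) -> float:
    """Log probability of observing k events under a Poisson with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def viterbi_copy_number(counts: Sequence[int], background: Sequence[float],
                        stay_prob: float = 0.98) -> List[int]:
    """Return the most likely copy-number path for per-window read counts.

    background[t] is the expected count at window t for a normal (2-copy)
    sample, e.g. taken from a control set; it must be positive.
    """
    if not counts or len(counts) != len(background):
        raise ValueError("counts and background must be non-empty and equal length")
    n = len(COPY_STATES)
    switch = (1.0 - stay_prob) / (n - 1)
    log_trans = [[math.log(stay_prob if i == j else switch) for j in range(n)]
                 for i in range(n)]

    def emit(t: int, s: int) -> float:
        # Expected depth scales linearly with copy number relative to 2 copies.
        return poisson_logpmf(counts[t], background[t] * COPY_STATES[s] / 2.0)

    score = [math.log(1.0 / n) + emit(0, s) for s in range(n)]
    back: List[List[int]] = []
    for t in range(1, len(counts)):
        new_score, pointers = [], []
        for s in range(n):
            best = max(range(n), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best)
            new_score.append(score[best] + log_trans[best][s] + emit(t, s))
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [COPY_STATES[s] for s in path]


if __name__ == "__main__":
    counts = [100, 98, 103, 52, 49, 47, 101, 99]  # hypothetical window counts
    print(viterbi_copy_number(counts, [100.0] * len(counts)))
```

The transition penalty is what keeps single noisy windows from being called as CNVs: a state change is only accepted when several consecutive windows support it.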


2017 ◽  
Author(s):  
Paul D. Blischak ◽  
Laura S. Kubatko ◽  
Andrea D. Wolfe

Abstract. Motivation: Genotyping and parameter estimation using high throughput sequencing data are everyday tasks for population geneticists, but methods developed for diploids are typically not applicable to polyploid taxa. This is due to their duplicated chromosomes, as well as the complex patterns of allelic exchange that often accompany whole genome duplication (WGD) events. For WGDs within a single lineage (autopolyploids), inbreeding can result from mixed mating and/or double reduction. For WGDs that involve hybridization (allopolyploids), alleles are typically inherited through independently segregating subgenomes. Results: We present two new models for estimating genotypes and population genetic parameters from genotype likelihoods for auto- and allopolyploids. We then use simulations to compare these models to existing approaches at varying depths of sequencing coverage and ploidy levels. These simulations show that our models typically have lower levels of estimation error for genotype and parameter estimates, especially when sequencing coverage is low. Finally, we also apply these models to two empirical data sets from the literature. Overall, we show that the use of genotype likelihoods to model non-standard inheritance patterns is a promising approach for conducting population genomic inferences in polyploids. Availability: A C++ program, EBG, is provided to perform inference using the models we describe. It is available under the GNU GPLv3 on GitHub: https://github.com/pblischak/polyploid-genotyping. Contact: [email protected].
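
As a rough illustration of what a genotype likelihood is for a polyploid, the Python sketch below uses a generic binomial read-sampling model with a symmetric sequencing error rate. It is not the EBG model, which additionally estimates population-level parameters; the function name and the default `error_rate` are assumptions.

```python
from math import comb
from typing import List


def genotype_likelihoods(alt_reads: int, total_reads: int, ploidy: int,
                         error_rate: float = 0.01) -> List[float]:
    """Return P(read data | genotype g) for g = 0..ploidy alternative allele copies.

    Assumes reads are sampled independently and that each base is miscalled
    as the other allele with probability error_rate (an illustrative value).
    """
    if ploidy < 1:
        raise ValueError("ploidy must be at least 1")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    likelihoods = []
    for g in range(ploidy + 1):
        dosage = g / ploidy
        # Probability that a single read shows the alternative allele.
        p_alt = dosage * (1 - error_rate) + (1 - dosage) * error_rate
        likelihoods.append(
            comb(total_reads, alt_reads)
            * p_alt ** alt_reads
            * (1 - p_alt) ** (total_reads - alt_reads)
        )
    return likelihoods


if __name__ == "__main__":
    # Hypothetical tetraploid site: 5 alternative reads out of 20.
    for g, lik in enumerate(genotype_likelihoods(5, 20, ploidy=4)):
        print(f"g={g}: {lik:.3e}")
```

At low coverage several genotypes have similar likelihoods, which is why carrying the full likelihood vector forward, rather than a single hard genotype call, matters for the downstream parameter estimates.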


2020 ◽  
Author(s):  
Furkan Özden ◽  
Can Alkan ◽  
A. Ercüment Çiçek

Abstract. Accurate and efficient detection of copy number variants (CNVs) is of critical importance due to their significant association with complex genetic diseases. Although algorithms working on whole genome sequencing (WGS) data provide stable results with mostly-valid statistical assumptions, copy number detection on whole exome sequencing (WES) data has mostly been a losing game with extremely high false discovery rates. This is unfortunate as WES data is cost efficient, compact and relatively ubiquitous. The bottleneck is primarily due to the non-contiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses matched WES and WGS data and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that the model consistently improves performance in a manner independent of (i) sequencing technology, (ii) exome capture kit and (iii) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets and to broaden its application. The code and the models are available at https://github.com/ciceklab/DECoNT.


2019 ◽  
Author(s):  
Kexue Li ◽  
Lili Wang ◽  
Lizhen Shi ◽  
Li Deng ◽  
Zhong Wang

Abstract. Motivation: Metagenome assembly from short next-generation sequencing data is a challenging process due to its large scale and computational complexity. Clustering short reads before assembly offers a unique opportunity for parallel downstream assembly of genomes with individualized optimization. However, current read clustering methods suffer from either false negative (under-clustering) or false positive (over-clustering) problems. Results: Building on SpaRC, a previously developed scalable read clustering method on Apache Spark that has very low false positives, we extended its capability by adding a new method to further cluster small clusters. This method exploits statistics derived from multiple samples in a dataset to reduce the under-clustering problem. Using a synthetic dataset from mouse gut microbiomes we show that this method has the potential to cluster almost all of the reads from genomes with sufficient sequencing coverage. We also explored several clustering parameters that differentially affect genomes with various sequencing coverage. Availability: https://bitbucket.org/berkeleylab/jgi-sparc/ Contact: [email protected]


2018 ◽  
Author(s):  
Vijay Kumar Pounraja ◽  
Gopal Jayakar ◽  
Matthew Jensen ◽  
Neil Kelkar ◽  
Santhosh Girirajan

Abstract. Copy-number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome sequencing data are limited by high false positive rates and low concordance due to the inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn-diagram approaches to identify “high-confidence” CNVs. However, this approach is inadequate, as it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework (https://github.com/girirajanlab/CN_Learn) that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (~90%) and recall (~85%) rates while maintaining robust performance even when trained with minimal data (~30 samples). CN-Learn recovers twice as many CNVs compared to individual callers or Venn diagram-based approaches, with features such as exome capture probe count, caller concordance and GC content providing the most discriminatory power. In fact, about 58% of all true CNVs recovered by CN-Learn were either singletons or calls that lacked support from at least one caller. Our study underscores the limitations of current approaches for CNV identification and provides an effective method that yields high-quality CNVs for application in clinical diagnostics.
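
The limitation of Venn-diagram intersection described in this abstract can be made concrete with a small Python sketch that computes precision and recall for intersected versus pooled calls. The caller names, CNV identifiers, and validated set are hypothetical toy data, and matching CNVs by identical identifier is a simplification; this does not reproduce CN-Learn's classifier.

```python
from typing import Dict, Set, Tuple


def precision_recall(calls: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of a call set against a validated truth set."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical calls from three callers; "fp*" entries are false positives.
caller_calls: Dict[str, Set[str]] = {
    "caller_a": {"cnv1", "cnv2", "cnv3", "fp1"},
    "caller_b": {"cnv1", "cnv2", "cnv4", "fp2"},
    "caller_c": {"cnv1", "cnv5", "fp3"},
}
validated: Set[str] = {"cnv1", "cnv2", "cnv3", "cnv4", "cnv5"}

# Venn-style "high-confidence" set: calls made by every caller.
intersection = set.intersection(*caller_calls.values())
# Pooled set: calls made by any caller.
union = set.union(*caller_calls.values())

if __name__ == "__main__":
    print("intersection:", precision_recall(intersection, validated))
    print("union:", precision_recall(union, validated))
```

In this toy data the intersection is precise but recovers only one of five true CNVs, while the union recovers all of them at the cost of false positives. A learned classifier over caller-specific and genomic features aims to sit between those two extremes.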


Transfusion ◽  
2016 ◽  
Vol 56 (11) ◽  
pp. 2744-2749 ◽  
Author(s):  
Keolu Fox ◽  
Jill M. Johnsen ◽  
Bradley P. Coe ◽  
Chris D. Frazar ◽  
Alexander P. Reiner ◽  
...  

Author(s):  
Amal Elfatih ◽  
Idris Mohammed ◽  
Doua Abdelrahman ◽  
Borbala Mifsud

The application of whole genome/exome sequencing technologies in clinical genetics and research has resulted in the discovery of incidental findings unrelated to the primary purpose of genetic testing. The American College of Medical Genetics and Genomics published guidelines for reporting pathogenic and likely pathogenic variants that are deemed to be medically actionable, which allowed us to learn about the epidemiology of incidental findings in different populations. However, consensus guidelines for variant reporting and classification are still lacking. We conducted a systematic literature review of incidental findings in whole genome/exome sequencing studies to obtain a comprehensive understanding of variable reporting and classification methods for variants that are deemed to be medically actionable across different populations. The review highlights the elements that demand further consideration or adjustment.


2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Jinhwa Kong ◽  
Jaemoon Shin ◽  
Jungim Won ◽  
Keonbae Lee ◽  
Unjoo Lee ◽  
...  

Copy number variations (CNVs) are structural variants associated with human diseases. Recent studies have identified disease-related genes by extracting rare de novo and transmitted CNVs from exome sequencing data. The need for more efficient and accurate methods has increased, but CNV detection remains a challenging problem due to coverage biases, as well as the sparse, small-sized, and noncontinuous nature of exome sequencing. In this study, we developed a new CNV detection method, ExCNVSS, based on read coverage depth evaluation and scale-space filtering to resolve these problems. We also developed ExCNVSS_noRatio, a version of ExCNVSS for cases where only test data are available and no matched control is provided. To evaluate the performance of our method, we tested it with 11 different simulated data sets and data from 10 real HapMap samples. The results demonstrated that ExCNVSS outperformed three other state-of-the-art methods and that our method corrected for coverage biases and detected CNVs of all sizes even without matched control data.


1999 ◽  
Vol 79 (4) ◽  
pp. 625-632 ◽  
Author(s):  
Warren K. Coleman ◽  
G. C. C. Tai

The capacity of a colour chart and a reflectance photometer (Agtron) to accurately determine chipping quality of potato tubers was assessed using data sets taken over a 4-yr period for 17–32 cultivars. Both tests gave a high diagnostic accuracy for chipping quality regardless of sampling time from storage or the occurrence of high temperature reconditioning when evaluated by receiver operating characteristic (ROC) curve analysis. Receiver operating characteristic curve analysis showed that both tuber glucose content and chip colour provided good diagnostic performance in correctly separating processing from non-processing tubers over a range of growing and storage conditions. Identification of chipping from non-chipping tuber samples from a 13 °C storage across a range of cultivars and growing conditions occurred with a minimum chipping colour threshold range of 41–47 or a maximum glucose concentration range of 4.3–5.4 mmol L−1 of tuber cell sap. The practical value of a test can depend on such factors as prevalence of chippers in a tuber population as well as the cost of misclassifications, i.e., costs associated with false positive or false negative test results and expressed in relative terms as the unit cost ratio. An examination of Prevalence-Value-Accuracy (PVA) plots for one of the data sets indicated that total misclassification costs could increase rapidly, depending on the prevalence of chipping tubers and the relative amounts of false negative and false positive costs. Maximum costs were consistently associated with a prevalence of 50% chippers and a unit cost ratio of 0.5. In a tuber sample containing a high prevalence of chippers (50–70%) and a low unit cost ratio (<0.2), an acceptable colour threshold determined by PVA-Threshold (PVAT) plots would be approximately 40 to 50 from the Agriculture and Agri-Food Canada colour chart. 
However, if the colour chart was used for screening tuber samples with a low prevalence (20–40%) of chippers and a unit cost ratio >0.20, a threshold between 60 and 65 would be optimum. The latter range would be conservative and agrees with, and supports, current industry standards, which reside at 60 or better. Since a good diagnostic test should be repeatable and subject to minimal inter-observer variation, the more objective glucose or reflectance photometric tests may be preferable and provide acceptable diagnostic accuracy for processing quality. However, the present study indicates that all three test methods are acceptable for accurately separating chipping from non-chipping tubers regardless of sampling or storage protocols. Key words: Potato, colour chart, reflectance colorimetry, glucose content

