Frequency and management of medically actionable incidental findings from genome and exome sequencing data; A systematic review

Author(s):  
Amal Elfatih ◽  
Idris Mohammed ◽  
Doua Abdelrahman ◽  
Borbala Mifsud

The application of whole genome/exome sequencing technologies in clinical genetics and research has resulted in the discovery of incidental findings unrelated to the primary purpose of genetic testing. The American College of Medical Genetics and Genomics published guidelines for reporting pathogenic and likely pathogenic variants that are deemed to be medically actionable, which allowed us to learn about the epidemiology of incidental findings in different populations. However, consensus guidelines for variant reporting and classification are still lacking. We conducted a systematic literature review of incidental findings in whole genome/exome sequencing studies to obtain a comprehensive understanding of variable reporting and classification methods for variants that are deemed to be medically actionable across different populations. The review highlights the elements that demand further consideration or adjustment.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Chong Chu ◽  
Rebeca Borges-Monroy ◽  
Vinayak V. Viswanadham ◽  
Soohyun Lee ◽  
Heng Li ◽  
...  

Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kelley Paskov ◽  
Jae-Yoon Jung ◽  
Brianna Chrisman ◽  
Nate T. Stockham ◽  
Peter Washington ◽  
...  

Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since they allow us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample. Results: We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites.
Conclusion: Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
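The idea of observing sequencing errors as Mendelian errors can be illustrated with a minimal sketch. It is not the authors' estimation method: it only flags biallelic trio genotypes that no combination of parental alleles could produce. Genotypes are assumed to be encoded as alternate-allele counts (0, 1, 2), and the function names are hypothetical.

```python
from itertools import product


def transmissible_alleles(genotype):
    """Alleles (0 = reference, 1 = alternate) that a parent with the
    given alternate-allele count (0, 1 or 2) can pass to a child."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(child, mother, father):
    """True if the child's alternate-allele count can be formed from one
    maternal allele plus one paternal allele."""
    return any(
        m + f == child
        for m, f in product(transmissible_alleles(mother),
                            transmissible_alleles(father))
    )


def mendelian_error_rate(trios):
    """Fraction of (child, mother, father) genotype triples that violate
    Mendelian inheritance. Returns 0.0 for an empty input."""
    trios = list(trios)
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in trios)
    return errors / len(trios)


# Illustrative (made-up) genotypes at four sites: (child, mother, father).
example_sites = [(1, 0, 1), (2, 0, 1), (0, 2, 2), (1, 1, 1)]
example_rate = mendelian_error_rate(example_sites)
```

Only a fraction of genotyping errors show up this way (an error can still yield a Mendelian-consistent trio), which is why the abstract describes extrapolating from observed Mendelian errors to genome-wide estimates.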


2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i186-i193
Author(s):  
Matthew A Myers ◽  
Simone Zaccaria ◽  
Benjamin J Raphael

Motivation: Recent single-cell DNA sequencing technologies enable whole-genome sequencing of hundreds to thousands of individual cells. However, these technologies have ultra-low sequencing coverage (<0.5× per cell), which has limited their use to the analysis of large copy-number aberrations (CNAs) in individual cells. While CNAs are useful markers in cancer studies, single-nucleotide mutations are equally important, both in cancer studies and in other applications. However, ultra-low coverage sequencing yields single-nucleotide mutation data that are too sparse for current single-cell analysis methods. Results: We introduce SBMClone, a method to infer clusters of cells, or clones, that share groups of somatic single-nucleotide mutations. SBMClone uses a stochastic block model to overcome sparsity in ultra-low coverage single-cell sequencing data, and we show that SBMClone accurately infers the true clonal composition on simulated datasets with coverage as low as 0.2×. We applied SBMClone to single-cell whole-genome sequencing data from two breast cancer patients obtained using two different sequencing technologies. On the first patient, sequenced using the 10X Genomics CNV solution with sequencing coverage ≈0.03×, SBMClone recovers the major clonal composition when incorporating a small amount of additional information. On the second patient, where pre- and post-treatment tumor samples were sequenced using DOP-PCR with sequencing coverage ≈0.5×, SBMClone shows that tumor cells are present in the post-treatment sample, contrary to published analysis of this dataset. Availability and implementation: SBMClone is available on the GitHub repository https://github.com/raphael-group/SBMClone. Supplementary information: Supplementary data are available at Bioinformatics online.


2016 ◽  
Vol 44 (2) ◽  
pp. 292-308 ◽  
Author(s):  
Maya Sabatello ◽  
Paul S. Appelbaum

Whole genome and exome sequencing (WGS/WES) techniques raise hope for a new scale of diagnosis, prevention, and prediction of genetic conditions, and improved care for children. For these hopes to materialize, extensive genomic research with children will be needed. However, the use of WGS/WES in pediatric research settings raises considerable challenges for families, researchers, and policy development. In particular, the possibility that these techniques will generate genetic findings unrelated to the primary goal of sequencing has stirred intense debate about whether, which, how, and when these secondary or incidental findings (SFs) should be returned to parents and minors. The debate is even more pronounced when the subjects are adolescents, for whom decisions about return of SFs may have particular implications. In this paper, we consider the rise of “genomic citizenship” and the main challenges that arise for these stakeholders: adolescents' involvement in decisions relating to return of genomic SFs, the types of SFs that should be offered, privacy protections, and communication between researchers and adolescents about SFs. We argue that adolescents' involvement in genomic SF-related decisions acknowledges their status as valuable stakeholders without detracting from broader familial interests, and promotes more informed genomic citizens.


2021 ◽  
Vol 12 ◽  
Author(s):  
Davide Bolognini ◽  
Alberto Magi

Structural variants (SVs) are genomic rearrangements that involve at least 50 nucleotides and are known to have a serious impact on human health. While prior short-read sequencing technologies have often proved inadequate for a comprehensive assessment of structural variation, more recent long reads from Oxford Nanopore Technologies have already been proven invaluable for the discovery of large SVs and hold the potential to facilitate the resolution of the full SV spectrum. With many long-read sequencing studies to follow, it is crucial to assess factors affecting current SV calling pipelines for nanopore sequencing data. In this brief research report, we evaluate and compare the performances of five long-read SV callers across four long-read aligners using both real and synthetic nanopore datasets. In particular, we focus on the effects of read alignment, sequencing coverage, and variant allele depth on the detection and genotyping of SVs of different types and size ranges and provide insights into precision and recall of SV callsets generated by integrating the various long-read aligners and SV callers. The computational pipeline we propose is publicly available at https://github.com/davidebolo1993/EViNCe and can be adjusted to further evaluate future nanopore sequencing datasets.


2018 ◽  
Author(s):  
Li Fang ◽  
Charlly Kao ◽  
Michael V Gonzalez ◽  
Fernanda A Mafra ◽  
Renata Pellegrino da Silva ◽  
...  

Linked-read sequencing provides long-range information on short-read sequencing data by barcoding reads originating from the same DNA molecule, and can improve the detection and breakpoint identification for structural variants (SVs). We present LinkedSV for SV detection on linked-read sequencing data. LinkedSV considers barcode overlapping and enriched fragment endpoints as signals to detect large SVs, while it leverages read depth, paired-end signals and local assembly to detect small SVs. Benchmarking studies demonstrate that LinkedSV outperforms existing tools, especially on exome data and on somatic SVs with low variant allele frequencies. We demonstrate clinical cases where LinkedSV identifies disease causal SVs from linked-read exome sequencing data missed by conventional exome sequencing, and show examples where LinkedSV identifies SVs missed by high-coverage long-read sequencing. In summary, LinkedSV can detect SVs missed by conventional short-read and long-read sequencing approaches, and may resolve negative cases from clinical genome/exome sequencing studies.


2020 ◽  
Author(s):  
Yingxi Yang ◽  
Yuchen Yang ◽  
Le Huang ◽  
Jai G. Broome ◽  
Adolfo Correa ◽  
...  

With advances in whole genome sequencing (WGS) technology, multiple statistical methods for aggregate association testing have been developed. Many common approaches aggregate variants in a given genomic window of a fixed/varying size and are not reliant on existing knowledge to define appropriate test units, resulting in most identified regions not being clearly linked to genes, limiting biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to the genes they affect, can be leveraged to predefine variant sets for aggregate testing in WGS. Therefore, in this paper we propose the eSCAN (Scan the Enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG with the advantages of increased incorporation of genomic annotation. eSCAN searches biologically meaningful windows, increasing power and aiding biological interpretation, as demonstrated by simulation studies under a wide range of scenarios. We also apply eSCAN for association analysis of blood cell traits using TOPMed WGS data from Women’s Health Initiative (WHI) and Jackson Heart Study (JHS). Results from this real data example show that eSCAN is able to capture more significant signals, and these signals are of shorter length and drive association of larger regions detected by other methods.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yongmei Zhao ◽  
Li Tai Fang ◽  
Tsai-wei Shen ◽  
Sulbha Choudhari ◽  
Keyur Talsania ◽  
...  

With the rapid advancement of sequencing technologies, next generation sequencing (NGS) analysis has been widely applied in cancer genomics research. More recently, NGS has been adopted in clinical oncology to advance personalized medicine. Clinical applications of precision oncology require accurate tests that can distinguish tumor-specific mutations from artifacts introduced during NGS processes or data analysis. Therefore, there is an urgent need to develop best practices in cancer mutation detection using NGS and for standard reference data sets for systematically measuring accuracy and reproducibility across platforms and methods. Within the SEQC2 consortium context, we established paired tumor-normal reference samples and generated whole-genome (WGS) and whole-exome sequencing (WES) data using sixteen library protocols and seven sequencing platforms at six different centers. We systematically interrogated somatic mutations in the reference samples to identify factors affecting detection reproducibility and accuracy in cancer genomes. These large cross-platform/site WGS and WES datasets using well-characterized reference samples will represent a powerful resource for benchmarking NGS technologies and bioinformatics pipelines, and for cancer genomics studies.


Plants ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 376
Author(s):  
Peterson W. Wambugu ◽  
Marie-Noelle Ndjiondjop ◽  
Robert Henry

African rice (Oryza glaberrima) has a pool of genes for resistance to diverse biotic and abiotic stresses, making it an important genetic resource for rice improvement. African rice has potential for breeding for climate resilience and adapting rice cultivation to climate change. Over the last decade, there have been tremendous technological and analytical advances in genomics that have dramatically altered the landscape of rice research. Here we review the remarkable advances in knowledge that have been witnessed in the last few years in the area of genetics and genomics of African rice. Advances in cheap DNA sequencing technologies have fuelled development of numerous genomic and transcriptomic resources. Genomics has been pivotal in elucidating the genetic architecture of important traits thereby providing a basis for unlocking important trait variation. Whole genome re-sequencing studies have provided great insights on the domestication process, though key studies continue giving conflicting conclusions and theories. However, the genomic resources of African rice appear to be under-utilized as there seems to be little evidence that these vast resources are being productively exploited for example in practical rice improvement programmes. Challenges in deploying African rice genetic resources in rice improvement and the genomics efforts made in addressing them are highlighted.


Author(s):  
Raquel Neves ◽  
David J. Tester ◽  
Michael A. Simpson ◽  
Elijah R. Behr ◽  
Michael J. Ackerman ◽  
...  

Background: Sudden cardiac arrest (SCA) and sudden unexplained death (SUD) are feared sequelae of many genetic heart diseases. In rare circumstances, pathogenic variants in cardiomyopathy-susceptibility genes may result in electrical instability leading to SCA/SUD before any structural manifestations of underlying cardiomyopathy are evident. Methods: Collectively, 38 unexplained SCA survivors (21 males; mean age at SCA 26.4±13.1 years), 68 autopsy-inconclusive SUD cases (49 males; mean age at death 20.4±9.0 years) without disease-causative variants in the channelopathy genes, and 973 ostensibly healthy controls were included. Following exome sequencing, ultrarare (minor allele frequency ≤0.00005 in any ethnic group within Genome Aggregation Database [gnomAD, n=141 456 individuals]) nonsynonymous variants identified in 24 ClinGen adjudicated definitive/strong evidence cardiomyopathy-susceptibility genes were analyzed. Eligible variants were adjudicated as pathogenic, likely pathogenic, or variant of uncertain significance in accordance with current American College of Medical Genetics and Genomics guidelines. Results: Overall, 7 out of 38 (18.4%) SCA survivors and 14 out of 68 (20.5%) autopsy-inconclusive, channelopathic-negative SUD cases had at least one nonsynonymous variant classified as pathogenic, likely pathogenic, or of uncertain significance within a strong-evidence cardiomyopathy-susceptibility gene. Following American College of Medical Genetics and Genomics criterion variant adjudication, a pathogenic or likely pathogenic variant was identified in 3 out of 38 (7.9%; P=0.05) SCA survivors and 8 out of 68 (11.8%; P=0.0002) autopsy-inconclusive SUD cases compared to 20 out of 973 (2.1%) European controls.
Interestingly, the yield of pathogenic/likely pathogenic variants was significantly greater in autopsy-inconclusive SUD cases with documented interstitial fibrosis (4/11, 36%) compared with only 4 out of 57 (7%, P<0.02) SUD cases without ventricular fibrosis. Conclusion: Our data further support the inclusion of strong-evidence cardiomyopathy-susceptibility genes on the genetic testing panels used to evaluate unexplained SCA survivors and autopsy-inconclusive/negative SUD decedents. However, to avoid diagnostic miscues, the careful interpretation of genetic test results in patients without overt phenotypes is vital.
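The case-control comparison reported above can be explored with a small calculation. The abstract does not say which statistical test produced its P values, so the sketch below assumes a two-sided Fisher's exact test on the 2×2 table of carriers versus non-carriers. The function name `fisher_exact_two_sided` is illustrative, and the result need not match the published P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1 = a + b          # e.g. number of cases
    col1 = a + c          # e.g. total carriers (cases + controls)
    n = a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Probability that x of the col1 carriers fall in the first row.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts taken from the abstract: 8 of 68 SUD cases and 20 of 973
# controls carried a pathogenic or likely pathogenic variant.
p_sud_vs_controls = fisher_exact_two_sided(8, 68 - 8, 20, 973 - 20)
```

Exact integer binomial coefficients from `math.comb` avoid overflow and rounding problems with the large control group; only the final division produces a float.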

