Evaluation of Whole Exome Sequencing as an Alternative to BeadChip and Whole Genome Sequencing in Human Population Genetic Analysis

ABSTRACT: Understanding the underlying genetic structure of human populations is of fundamental interest to both the biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely used methods for collecting variant information at the DNA level are whole genome sequencing, which remains costly, and the more economical array-based techniques, which simultaneously genotype a pre-selected set of variable DNA sites in the human genome. The largest publicly accessible set of human genomic sequence data available today originates from exome sequencing, which covers around 1.2% of the whole genome (approximately 30 million base pairs). In this study, we compared the exome dataset with the array-based dataset and with the gold-standard whole genome dataset using the same population genetic analysis methods. Our results draw attention to some of the inherent problems that arise from using pre-selected SNP sets for population genetic analysis. Additionally, we demonstrate that exome sequencing provides a better alternative to array-based methods for population genetic analysis. We also propose a strategy for unbiased variant collection from exome data and offer a bioinformatics protocol for proper data processing.

Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis

BMC Genomics ◽

10.1186/s12864-018-5168-x ◽

2018 ◽

Vol 19 (1) ◽

Cited By ~ 2

Author(s):

Zoltán Maróti ◽

Zsolt Boldogkői ◽

Dóra Tombácz ◽

Michael Snyder ◽

Tibor Kalmár

Keyword(s):

Genetic Analysis ◽

Whole Genome Sequencing ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Genome Sequencing ◽

Population Genetic ◽

Human Population ◽

Whole Genome ◽

Population Genetic Analysis ◽

Whole Exome

Phylogenetic and population genetic analysis of Salmonella enterica subsp. enterica serovar Infantis strains isolated in Japan using whole genome sequence data

Infection Genetics and Evolution ◽

10.1016/j.meegid.2014.06.012 ◽

2014 ◽

Vol 27 ◽

pp. 62-68 ◽

Cited By ~ 12

Author(s):

Eiji Yokoyama ◽

Koichi Murakami ◽

Yuh Shiwa ◽

Taichiro Ishige ◽

Naoshi Ando ◽

...

Keyword(s):

Genetic Analysis ◽

Genome Sequence ◽

Salmonella Enterica ◽

Population Genetic ◽

Sequence Data ◽

Whole Genome Sequence ◽

Whole Genome ◽

Population Genetic Analysis ◽

Genome Sequence Data

Estimating sequencing error rates using families

BioData Mining ◽

10.1186/s13040-021-00259-6 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Kelley Paskov ◽

Jae-Yoon Jung ◽

Brianna Chrisman ◽

Nate T. Stockham ◽

Peter Washington ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Exome Sequencing ◽

Genome Sequencing ◽

Variant Calling ◽

Error Rates ◽

Sequencing Error ◽

Whole Genome ◽

Sequencing Data ◽

Sequencing Platform ◽

Whole Exome

Abstract
Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates, since they allow us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample.
Results: We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method's versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that (1) sequencing error rates between samples in the same dataset can vary by over an order of magnitude; (2) variant calling performance decreases substantially in low-complexity regions of the genome; (3) variant calling performance in whole exome sequencing data decreases with distance from the nearest target region; (4) variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood; and (5) whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites.
Conclusion: Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
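
The abstract above relies on the idea that some sequencing errors surface as Mendelian errors, that is, child genotypes that cannot be produced from the parents' genotypes. The Python sketch below only illustrates that underlying check for a single parent-child trio; it is not the authors' estimation method, and it does not compute precision or recall. It assumes biallelic autosomal sites with genotypes encoded as alternate-allele counts (0, 1, 2), and the function names `is_mendelian_consistent` and `mendelian_error_rate` are hypothetical.

```python
from itertools import product

# Assumption: genotypes are alternate-allele counts (0, 1 or 2) at
# biallelic autosomal sites. None marks a missing call.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}  # alleles each genotype can transmit


def is_mendelian_consistent(child, mother, father):
    """Return True if the child genotype can arise from the two parents."""
    possible = {m + f for m, f in product(GAMETES[mother], GAMETES[father])}
    return child in possible


def mendelian_error_rate(trios):
    """Fraction of fully called sites where the trio is inconsistent.

    `trios` is an iterable of (child, mother, father) genotype tuples.
    Sites with any missing call are skipped.
    """
    checked = 0
    errors = 0
    for child, mother, father in trios:
        if None in (child, mother, father):
            continue
        checked += 1
        if not is_mendelian_consistent(child, mother, father):
            errors += 1
    return errors / checked if checked else float("nan")


if __name__ == "__main__":
    sites = [(1, 0, 0), (1, 0, 1), (None, 1, 1), (2, 1, 1)]
    print(mendelian_error_rate(sites))
```

As the abstract notes, only a fraction of sequencing errors are observable this way: an erroneous call that still yields a genotype compatible with the parents passes this check, so the observed rate is not by itself an estimate of the sequencing error rate.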

Whole Genome Sequencing and Complete Genetic Analysis Reveals Novel Pathways to Glycopeptide Resistance in Staphylococcus aureus

PLoS ONE ◽

10.1371/journal.pone.0021577 ◽

2011 ◽

Vol 6 (6) ◽

pp. e21577 ◽

Cited By ~ 45

Author(s):

Adriana Renzoni ◽

Diego O. Andrey ◽

Ambre Jousselin ◽

Christine Barras ◽

Antoinette Monod ◽

...

Keyword(s):

Staphylococcus Aureus ◽

Genetic Analysis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Whole Genome ◽

Glycopeptide Resistance

Whole genome sequencing provides better diagnostic yield and future value than whole exome sequencing

The Medical Journal of Australia ◽

10.5694/mja17.01176 ◽

2018 ◽

Vol 209 (5) ◽

pp. 197-199 ◽

Cited By ~ 10

Author(s):

John S Mattick ◽

Marcel Dinger ◽

Nicole Schonrock ◽

Mark Cowley

Keyword(s):

Whole Genome Sequencing ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Genome Sequencing ◽

Diagnostic Yield ◽

Whole Genome ◽

Whole Exome ◽

Future Value

Whole-genome sequencing offers additional but limited clinical utility compared with reanalysis of whole-exome sequencing

Genetics in Medicine ◽

10.1038/gim.2018.41 ◽

2018 ◽

Vol 20 (11) ◽

pp. 1328-1333 ◽

Cited By ~ 44

Author(s):

Ahmed Alfares ◽

Taghrid Aloraini ◽

Lamia Al subaie ◽

Abdulelah Alissa ◽

Ahmed Al Qudsi ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Genome Sequencing ◽

Clinical Utility ◽

Whole Genome ◽

Whole Exome

Divergent variant patterns among 19 patients with Rubinstein‐Taybi syndrome uncovered by comprehensive genetic analysis including whole genome sequencing

Clinical Genetics ◽

10.1111/cge.14103 ◽

2021 ◽

Author(s):

Yumi Enomoto ◽

Takayuki Yokoi ◽

Yoshinori Tsurusaki ◽

Hiroaki Murakami ◽

Makiko Tominaga ◽

...

Keyword(s):

Genetic Analysis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Whole Genome

Whole-genome sequencing of Chinese centenarians reveals important genetic variants in aging

Human Genomics ◽

10.1186/s40246-020-00271-7 ◽

2020 ◽

Vol 14 (1) ◽

Cited By ~ 1

Author(s):

Shuhua Shen ◽

Chao Li ◽

Luwei Xiao ◽

Xiaoming Wang ◽

Hang Lv ◽

...

Keyword(s):

Genetic Analysis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Genetic Variants ◽

Whole Genome

Genome Sequencing of Polydrug-, Multidrug-, and Extensively Drug-Resistant Mycobacterium tuberculosis Strains from South India

Microbiology Resource Announcements ◽

10.1128/mra.01388-18 ◽

2019 ◽

Vol 8 (12) ◽

Author(s):

Sivakumar Shanmugam ◽

Narender Kumar ◽

Dina Nair ◽

Mohan Natrajan ◽

Srikanth Prasad Tripathy ◽

...

Keyword(s):

Mycobacterium Tuberculosis ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

South India ◽

Sequence Data ◽

Whole Genome ◽

Drug Resistant ◽

Resistance Mutations ◽

Content Type ◽

Extensively Drug Resistant

The genomes of 16 clinical Mycobacterium tuberculosis isolates were subjected to whole-genome sequencing to identify mutations related to resistance to one or more anti-Mycobacterium drugs. The sequence data will help in understanding the genomic characteristics of M. tuberculosis isolates and their resistance mutations prevalent in South India.
