A Statistical Framework for Mapping Risk Genes from De Novo Mutations in Whole-Genome-Sequencing Studies

Abstract Analysis of de novo mutations (DNMs) from sequencing data of nuclear families has identified risk genes for many complex diseases, including multiple neurodevelopmental and psychiatric disorders. Most of these efforts have focused on mutations in protein-coding sequences. Evidence from genome-wide association studies (GWAS) strongly suggests that variants important to human diseases often lie in non-coding regions. Extending DNM-based approaches to non-coding sequences is, however, challenging because the functional significance of non-coding mutations is difficult to predict. We propose a new statistical framework for analyzing DNMs from whole-genome sequencing (WGS) data. This method, TADA-Annotations (TADA-A), is a major advance over the TADA method we developed earlier for DNM analysis in coding regions. TADA-A can incorporate many functional annotations, such as conservation and enhancer marks; learn from data which annotations are informative of pathogenic mutations; and combine coding and non-coding mutations at the gene level to detect risk genes. It also supports meta-analysis of multiple DNM studies while adjusting for study-specific technical effects. We applied TADA-A to WGS data of ∼300 autism family trios across five studies and discovered several new autism risk genes. The software is freely available for all research uses.
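The abstract does not spell out the model, so the following is only an illustrative sketch of how functional annotations could modulate expected DNM counts. It assumes a Poisson log-linear form, in which each site has a baseline mutation rate that is scaled by the exponential of a weighted sum of annotation indicators. The function name `dnm_log_likelihood`, its parameters, and the toy numbers are hypothetical and are not taken from the TADA-A software.

```python
import math

def dnm_log_likelihood(counts, base_rates, annotations, betas):
    """Poisson log-likelihood of observed DNM counts (assumed model form).

    counts      -- observed DNM count per site
    base_rates  -- expected count per site under the baseline mutation model (> 0)
    annotations -- per-site list of annotation values (e.g. 0/1 indicators)
    betas       -- one log relative-risk coefficient per annotation
    """
    total = 0.0
    for y, mu, x in zip(counts, base_rates, annotations):
        # Annotations scale the baseline rate multiplicatively.
        rate = mu * math.exp(sum(b * xi for b, xi in zip(betas, x)))
        total += y * math.log(rate) - rate - math.lgamma(y + 1)
    return total

# Toy data: two sites, one annotation; the mutated site carries the annotation.
counts = [0, 1]
base_rates = [0.1, 0.2]
annotations = [[0], [1]]

ll_null = dnm_log_likelihood(counts, base_rates, annotations, [0.0])
ll_alt = dnm_log_likelihood(counts, base_rates, annotations, [math.log(2)])
print(ll_alt > ll_null)  # an informative annotation raises the likelihood here
```

Under this assumed form, "learning which annotations are informative" would amount to estimating the coefficients in `betas` by maximizing the likelihood; a coefficient near zero marks an uninformative annotation.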


Detection and phasing of single base de novo mutations in biopsies from human in vitro fertilized embryos by advanced whole-genome sequencing

Genome Research ◽

10.1101/gr.181255.114 ◽

2015 ◽

Vol 25 (3) ◽

pp. 426-434 ◽

Cited By ~ 31

Author(s):

Brock A. Peters ◽

Bahram G. Kermani ◽

Oleg Alferov ◽

Misha R. Agarwal ◽

Mark A. McElwain ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Whole Genome ◽

De Novo Mutations ◽

Single Base


Whole-genome Sequencing Reveals De-novo Mutations Associated with Nonsyndromic Cleft Lip/Palate

10.21203/rs.3.rs-1064924/v1 ◽

2021 ◽

Author(s):

Waheed Awotoye ◽

Peter A. Mossey ◽

Jacqueline B. Hetmanski ◽

Lord Jephthah Joojo Gowans ◽

Mekonen A. Eshete ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Cleft Lip ◽

De Novo ◽

Association Studies ◽

Whole Genome ◽

Loss Of Function ◽

De Novo Mutations ◽

Pathogenic Variants ◽

Cleft Lip Palate

Abstract The majority (85%) of nonsyndromic cleft lip with or without cleft palate (nsCL/P) cases occur sporadically, suggesting a role for de novo mutations (DNMs) in the etiology of nsCL/P. To identify high impact DNMs that contribute to the risk of nsCL/P, we conducted whole genome sequencing (WGS) analyses in 130 African case-parent trios (affected probands and unaffected parents). We identified 162 high confidence protein-altering DNMs that contribute to the risk of nsCL/P. These include novel loss-of-function DNMs in the ACTL6A, ARHGAP10, MINK1, TMEM5 and TTN genes; as well as missense variants in ACAN, DHRS3, DLX6, EPHB2, FKBP10, KMT2D, RECQL4, SEMA3C, SEMA4D, SHH, TP63, and TULP4. Experimental evidence showed that ACAN, DHRS3, DLX6, EPHB2, FKBP10, KMT2D, MINK1, RECQL4, SEMA3C, SEMA4D, SHH, TP63, and TTN genes contribute to facial development and mutations in these genes could contribute to CL/P. Association studies have identified TULP4 as a potential cleft candidate gene, while ARHGAP10 interacts with CTNNB1 to control WNT signaling. DLX6, EPHB2, SEMA3C and SEMA4D harbor novel damaging DNMs that may affect their role in neural crest migration and palatal development. This discovery of pathogenic DNMs also confirms the power of WGS analysis of trios in the discovery of potential pathogenic variants.


Whole Genome Sequencing To Identify De Novo Mutations In Bipolar Disorder

European Neuropsychopharmacology ◽

10.1016/j.euroneuro.2016.09.417 ◽

2017 ◽

Vol 27 ◽

pp. S384

Author(s):

Fernando Goes ◽

Mehdi Pirooznia ◽

Martin Tehan ◽

Paula Wolyniec ◽

John McGrath ◽

...

Keyword(s):

Bipolar Disorder ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Whole Genome ◽

De Novo Mutations


Whole Genome Sequencing of a Vietnamese Family from a Dioxin Contamination Hotspot Reveals Novel Variants in the Son with Undiagnosed Intellectual Disability

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph15122629 ◽

2018 ◽

Vol 15 (12) ◽

pp. 2629 ◽

Cited By ~ 1

Author(s):

Dang Nguyen ◽

Hai Nguyen ◽

Thuy Nguyen ◽

Thi Nguyen ◽

Kaoru Nakano ◽

...

Keyword(s):

Intellectual Disability ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Genetic Alterations ◽

Whole Genome ◽

Missense Mutations ◽

De Novo Mutations ◽

Bioinformatics Analyses ◽

Dioxin Contamination

Although it has been a half-century since dioxin-contaminated herbicides were used to defoliate the landscape during the Vietnam War, dioxin contamination “hotspots” still remain in Vietnam. Environmental and health impacts of these hotspots need to be evaluated. Intellectual disability (ID) is one of the diseases found in the children of people exposed to the herbicides. This study aims to identify genetic alterations in a patient whose family lived in a dioxin hotspot. The patient’s father had a highly elevated dioxin concentration. The patient was affected with undiagnosed moderate ID. To analyze de novo mutations and genetic variations, and to identify causal gene(s) for ID, we performed whole genome sequencing (WGS) of the proband and his parents. Two de novo missense mutations were detected, one each in the ETS2 and ZNF408 genes. Compound heterozygosity was identified in the CENPF and TTN genes. Existing knowledge of the genes and bioinformatics analyses suggest that ETS2, ZNF408, and CENPF might be promising candidates for ID causative genes.


Whole genome sequencing in multiplex families reveals novel inherited and de novo genetic risk in autism

10.1101/338855 ◽

2018 ◽

Cited By ~ 5

Author(s):

Elizabeth K. Ruzzo ◽

Laura Pérez-Cano ◽

Jae-Yoon Jung ◽

Lee-kai Wang ◽

Dorna Kashef-Haghighi ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Autism Spectrum ◽

Whole Genome ◽

Single Nucleotide Variants ◽

Risk Genes ◽

Protein Protein Interaction ◽

Genetics Research ◽

Multiplex Families

Abstract Genetic studies of autism spectrum disorder (ASD) have revealed a complex, heterogeneous architecture, in which the contribution of rare inherited variation remains relatively unexplored. We performed whole-genome sequencing (WGS) in 2,308 individuals from families containing multiple affected children, including analysis of single nucleotide variants (SNV) and structural variants (SV). We identified 16 new ASD-risk genes, including many supported by inherited variation, and provide statistical support for 69 genes in total, including previously implicated genes. These risk genes are enriched in pathways involving negative regulation of synaptic transmission and organelle organization. We identify a significant protein-protein interaction (PPI) network seeded by inherited, predicted damaging variants disrupting highly constrained genes, including members of the BAF complex and established ASD risk genes. Analysis of WGS also identified SVs affecting non-coding regulatory regions in developing human brain, implicating NR3C2 and a recurrent 2.5 kb deletion within the promoter of DLG2. These data lend support to studying multiplex families for identifying inherited risk for ASD. We provide these data through the Hartwell Autism Research and Technology Initiative (iHART), an open access cloud-computing repository for ASD genetics research.


A whole-genome sequencing–based novel preimplantation genetic testing method for de novo mutations combined with chromosomal balanced translocations

Journal of Assisted Reproduction and Genetics ◽

10.1007/s10815-020-01921-4 ◽

2020 ◽

Vol 37 (10) ◽

pp. 2525-2533

Author(s):

Ping Yuan ◽

Jun Xia ◽

Songbang Ou ◽

Ping Liu ◽

Tao Du ◽

...

Keyword(s):

Genetic Testing ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Whole Genome ◽

De Novo Mutations ◽

Preimplantation Genetic Testing ◽

Testing Method ◽

Balanced Translocations


De novo mutations discovered in 8 Mexican American families through whole genome sequencing

BMC Proceedings ◽

10.1186/1753-6561-8-s1-s24 ◽

2014 ◽

Vol 8 (S1) ◽

Cited By ~ 7

Author(s):

Heming Wang ◽

Xiaofeng Zhu

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Mexican American ◽

De Novo ◽

Whole Genome ◽

De Novo Mutations ◽

Mexican American Families ◽

American Families


De novo mutations identified by whole-genome sequencing implicate chromatin modifications in obsessive-compulsive disorder

Science Advances ◽

10.1126/sciadv.abi6180 ◽

2022 ◽

Vol 8 (2) ◽

Author(s):

Guan Ning Lin ◽

Weichen Song ◽

Weidi Wang ◽

Pei Wang ◽

Huan Yu ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

De Novo ◽

Chromatin Modification ◽

Whole Genome ◽

Chromatin Modifications ◽

De Novo Mutations ◽

Obsessive Compulsive ◽

Compulsive Disorder

Trio-based whole-genome sequencing identified the role of chromatin modification in OCD pathology.


Effective variant filtering and expected candidate variant yield in studies of rare human disease

npj Genomic Medicine ◽

10.1038/s41525-021-00227-3 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Brent S. Pedersen ◽

Joe M. Brown ◽

Harriet Dashnow ◽

Amelia D. Wallace ◽

Matt Velinder ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Rare Disease ◽

Genome Sequencing ◽

Autosomal Dominant ◽

De Novo ◽

Autosomal Dominant Inheritance ◽

Compound Heterozygous ◽

Whole Genome ◽

Dominant Inheritance ◽

Family Based

Abstract In studies of families with rare disease, it is common to screen for de novo mutations, as well as recessive or dominant variants that explain the phenotype. However, the filtering strategies and software used to prioritize high-confidence variants vary from study to study. In an effort to establish recommendations for rare disease research, we explore effective guidelines for variant (SNP and INDEL) filtering and report the expected number of candidates for de novo dominant, recessive, and autosomal dominant modes of inheritance. We derived these guidelines using two large family-based cohorts that underwent whole-genome sequencing, as well as two family cohorts with whole-exome sequencing. The filters are applied to common attributes, including genotype quality, sequencing depth, allele balance, and population allele frequency. The resulting guidelines yield ~10 candidate SNP and INDEL variants per exome, and 18 per genome, for recessive and de novo dominant modes of inheritance, with substantially more candidates for autosomal dominant inheritance. For family-based, whole-genome sequencing studies, this number includes an average of three de novo, ten compound heterozygous, one autosomal recessive, and four X-linked variants, and roughly 100 candidate variants following autosomal dominant inheritance. The slivar software we developed to establish and rapidly apply these filters to VCF files is available at https://github.com/brentp/slivar under an MIT license, and includes documentation and recommendations for best practices for rare disease analysis.
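As a rough illustration of the four attributes named in the abstract (genotype quality, sequencing depth, allele balance, and population allele frequency), a de novo filter for one trio might be sketched as below. This is plain Python, not slivar's own expression syntax; the `Genotype` class, the `is_candidate_de_novo` function, and every threshold are hypothetical placeholders rather than the paper's recommended values.

```python
from dataclasses import dataclass

@dataclass
class Genotype:
    alts: int       # alternate allele count: 0 hom-ref, 1 het, 2 hom-alt
    depth: int      # total read depth at the site
    gq: int         # genotype quality
    alt_reads: int  # reads supporting the alternate allele

    @property
    def allele_balance(self) -> float:
        return self.alt_reads / self.depth if self.depth else 0.0

def is_candidate_de_novo(kid, mom, dad, pop_af, *,
                         min_gq=20, min_depth=10,
                         ab_range=(0.2, 0.8), max_af=0.001):
    """Return True if the site looks like a de novo het call in the child.

    All thresholds are illustrative assumptions.
    """
    # Every member of the trio needs a confident, well-covered genotype.
    if any(g.gq < min_gq or g.depth < min_depth for g in (kid, mom, dad)):
        return False
    # Child heterozygous, both parents homozygous reference.
    if kid.alts != 1 or mom.alts != 0 or dad.alts != 0:
        return False
    # Child's allele balance should be plausible for a true heterozygote.
    if not ab_range[0] <= kid.allele_balance <= ab_range[1]:
        return False
    # Any alternate read in a parent suggests an inherited variant.
    if mom.alt_reads > 0 or dad.alt_reads > 0:
        return False
    # The variant should be rare in the reference population.
    return pop_af < max_af

kid = Genotype(alts=1, depth=30, gq=99, alt_reads=14)
mom = Genotype(alts=0, depth=28, gq=99, alt_reads=0)
dad = Genotype(alts=0, depth=31, gq=99, alt_reads=0)
print(is_candidate_de_novo(kid, mom, dad, pop_af=0.0))
```

Requiring zero alternate reads in both parents is a deliberately strict choice for this sketch; a real pipeline would tune it against sequencing error rates.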
