Pathogenicity Prediction: Latest Research Papers

MvPPT: a highly efficient and sensitive pathogenicity prediction tool for missense variants

Next-generation sequencing technologies both boost the discovery of variants in the human genome and exacerbate the challenge of identifying pathogenic variants. In this study, we developed mvPPT (Pathogenicity Prediction Tool for missense variants), a highly sensitive and accurate missense variant classifier based on gradient boosting. mvPPT uses high-confidence training sets with a wide spectrum of variant profiles and extracts three categories of features: scores from existing prediction tools; allele, amino acid and genotype frequencies; and genomic context. Compared with established predictors, mvPPT achieved superior performance in all test sets, regardless of data source. Our study also provides guidance on training set and feature selection strategies and reveals highly relevant features, which may provide further biological insight into variant pathogenicity.
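To make the phrase "classifier based on gradient boosting" concrete, here is a minimal sketch of the general technique: least-squares boosting over one-split decision stumps, where each round fits a stump to the current residuals. It is not the mvPPT implementation. The abstract does not state the library, loss function, or hyperparameters, so the squared-error loss, the two feature columns, the toy data, and every function name below are assumptions made purely for illustration.

```python
# Toy gradient boosting (squared-error loss) over decision stumps.
# Hypothetical features per variant: [score from an existing tool, allele frequency].
TOY_X = [[0.9, 0.0001], [0.8, 0.0002], [0.7, 0.0],
         [0.2, 0.05], [0.1, 0.1], [0.3, 0.02]]
TOY_Y = [1, 1, 1, 0, 0, 0]  # 1 = pathogenic, 0 = benign (made-up labels)


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimising squared error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must leave samples on both sides
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if err < best_err:
                best, best_err = (j, t, lv, rv), err
    if best is None:  # every feature is constant: fall back to the mean residual
        mean = sum(residuals) / len(residuals)
        best = (0, float("inf"), mean, mean)
    return best


def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosting(X, y, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the residuals (the negative gradient of squared error)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, row)
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_score(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


if __name__ == "__main__":
    model = fit_boosting(TOY_X, TOY_Y)
    print(predict_score(model, [0.85, 0.0001]))  # resembles the pathogenic toy rows
    print(predict_score(model, [0.15, 0.08]))    # resembles the benign toy rows
```

Production classifiers typically use a logistic loss and deeper trees; the stump-and-residual loop is kept here only because it shows the additive structure in a few lines.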

Analysis of coding variants in the human FTO gene from the gnomAD database

PLoS ONE, 2022, Vol 17 (1), pp. e0248610

DOI: 10.1371/journal.pone.0248610

Author(s): Mauro Lúcio Ferreira Souza Junior, Jaime Viana de Sousa, João Farias Guerreiro

Keyword(s): Human Populations, Nucleotide Polymorphisms, Single Nucleotide, Functional Studies, Rare Mutations, Functional Variants, Pathogenicity Prediction, Complete Sequencing, Coding Variants

Single nucleotide polymorphisms (SNPs) in the first intron of the FTO gene, reported in 2007, remain the known variants with the greatest effect on adiposity in different human populations. Coding variants in the FTO gene, by contrast, have been little explored, although exome sequencing data from various populations are available in public databases and provide an excellent opportunity to investigate potential functional variants in FTO. This study therefore aimed to identify nonsynonymous variants in the exons of the FTO gene in different population groups using the gnomAD database, and to analyze the potential functional impact of these variants on the FTO protein using five publicly available pathogenicity prediction programs. The analysis revealed 345 rare mutations, of which 321 are missense (93%), 19 are stop gained (5.6%) and five are located in the splice region (1.4%). Of these, 134 (38.8%) were classified as pathogenic, 144 (41.7%) as benign and 67 (19.5%) as unknown. The available data, however, suggest that these variants are probably not associated with BMI and obesity but rather with other diseases. Functional studies are therefore required to identify the role of these variants in disease genesis.
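As a quick arithmetic check on the counts reported above, the sketch below recomputes each category's share of the 345 variants. The counts are taken from the abstract; the dictionary and function names are illustrative only. Plain rounding reproduces the reported percentages to within about 0.1 percentage point.

```python
# Counts as reported in the abstract; the names are illustrative, not from the paper.
CONSEQUENCE_COUNTS = {"missense": 321, "stop gained": 19, "splice region": 5}
PREDICTION_COUNTS = {"pathogenic": 134, "benign": 144, "unknown": 67}


def shares(counts):
    """Return each category's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for label, counts in (("consequence", CONSEQUENCE_COUNTS),
                          ("prediction", PREDICTION_COUNTS)):
        # Both groupings should account for all 345 variants.
        print(label, sum(counts.values()), shares(counts))
```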

A Rare Mutation in LMNB2 Associated with Lipodystrophy Drives Premature Cell Senescence

Cells, 2021, Vol 11 (1), pp. 50

DOI: 10.3390/cells11010050

Author(s): Alice-Anaïs Varlet, Camille Desgrouas, Cécile Jebane, Nathalie Bonello-Palot, Patrice Bourgeois, ...

Keyword(s): Nuclear Envelope, High Throughput Sequencing, Fat Distribution, Heterozygous Mutation, Lamin A, Premature Senescence, Partial Lipodystrophy, Pathogenicity Prediction, Android Fat, Cellular Phenotypes

Many proteins have been implicated in inherited partial lipodystrophies, including lamins, the essential constituents of the nuclear envelope scaffold called the lamina. By performing high-throughput sequencing on a panel of genes involved in lipodystrophies, we identified a heterozygous mutation in the LMNB2 gene (c.700C>T, p.(Arg234Trp)) in a female patient presenting with early-onset type II diabetes, hypertriglyceridemia, and android fat distribution. This mutation is rare in the general population (frequency 0.013% in gnomAD) and was predicted to be pathogenic by a set of pathogenicity prediction tools. Patient-derived fibroblasts showed nuclear shape abnormalities and features of premature senescence, two typical cellular phenotypes associated with laminopathies. Moreover, we observed an atypical aggregation of lamin B2 in the nucleoplasm, which co-distributes with emerin and lamin A/C, along with an abnormal distribution of lamin A/C at the nuclear envelope. Finally, reducing lamin B2 expression with siRNA targeting LMNB2 transcripts decreased nuclear anomalies and senescence-associated beta-galactosidase, suggesting a role for the mutated protein in the observed cellular phenotype. Altogether, these results suggest that mutations in lamin B2 could produce premature senescence and partial lipodystrophy features, as observed with certain mutants of lamin A/C.

Improved pathogenicity prediction for rare human missense variants

The American Journal of Human Genetics, 2021, Vol 108 (12), pp. 2389

DOI: 10.1016/j.ajhg.2021.11.010

Author(s): Yingzhou Wu, Hanqing Liu, Roujia Li, Song Sun, Jochen Weile, ...

Keyword(s): Missense Variants, Pathogenicity Prediction

Improved pathogenicity prediction for rare human missense variants

The American Journal of Human Genetics, 2021

DOI: 10.1016/j.ajhg.2021.08.012

Author(s): Yingzhou Wu, Roujia Li, Song Sun, Jochen Weile, Frederick P. Roth

Keyword(s): Missense Variants, Pathogenicity Prediction

X-CNV: genome-wide prediction of the pathogenicity of copy number variations

Genome Medicine, 2021, Vol 13 (1)

DOI: 10.1186/s13073-021-00945-4

Author(s): Li Zhang, Jingru Shi, Jian Ouyang, Riquan Zhang, Yiran Tao, ...

Keyword(s): Ethnic Groups, Copy Number, Association Studies, Gene Copy Number, Copy Number Variations, Gene Copy, Computational Framework, Genome Wide, Pathogenicity Prediction, A Genome

Background: Gene copy number variations (CNVs) contribute to genetic diversity and disease prevalence across populations. Substantial efforts have been made to decipher the relationship between CNVs and pathogenesis, but with limited success.

Results: We have developed a novel computational framework, X-CNV (www.unimd.org/XCNV), to predict the pathogenicity of CNVs by integrating more than 30 informative features such as allele frequency (AF), CNV length, CNV type, and some deleterious scores. Notably, over 14 million CNVs across various ethnic groups, covering nearly 93% of the human genome, were unified to calculate the AF. X-CNV, which yielded area under the curve (AUC) values of 0.96 and 0.94 in training and validation sets, was demonstrated to outperform other available tools in terms of CNV pathogenicity prediction. A meta-voting prediction (MVP) score, based on the probabilistic value generated by the XGBoost algorithm, was developed to quantitatively measure the pathogenic effect. The proposed MVP score demonstrated high discriminative power in determining pathogenic CNVs for inherited traits/diseases in different ethnic groups.

Conclusions: The ability of the X-CNV framework to quantitatively prioritize functional, deleterious, and disease-causing CNVs on a genome-wide basis outperformed current CNV-annotation tools and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
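The AUC values quoted above summarise how well a score separates pathogenic from benign CNVs. As a minimal sketch of what the metric measures, the function below computes AUC as the fraction of (pathogenic, benign) pairs in which the pathogenic item receives the higher score. The function name and the toy labels and scores are hypothetical and are not taken from the X-CNV paper or its data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = pathogenic CNV, 0 = benign CNV.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print(roc_auc(labels, scores))  # 3 of 4 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; rank-based formulas give the same value more efficiently on large sets.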

SvAnna: efficient and accurate pathogenicity prediction for coding and regulatory structural variants in long-read genome sequencing

2021

DOI: 10.1101/2021.07.14.452267

Author(s): Daniel Danis, Julius O.B. Jacobsen, Parithi Balachandran, Qihui Zhu, Feyza Yilmaz, ...

Keyword(s): Case Reports, Regulatory Sequences, Structural Variants, Variant Annotation, Mendelian Diseases, Topologically Associating Domains, Technological Advances, Pathogenicity Prediction, Phenotype Data, Long Read

Structural variants (SVs) are implicated in the etiology of Mendelian diseases but have been systematically underascertained owing to limitations of existing technology. Recent technological advances such as long-read sequencing (LRS) enable more comprehensive detection of SVs, but approaches for clinical prioritization of candidate SVs are needed. Existing computational approaches do not specifically target LRS data, thereby missing a substantial proportion of candidate SVs, and do not provide a unified computational model for assessing all types of SVs. Structural Variant Annotation and Analysis (SvAnna) assesses all classes of SV and their intersection with transcripts and regulatory sequences in the context of topologically associating domains, relating predicted effects on gene function with clinical phenotype data. We show with a collection of 182 published case reports with pathogenic SVs that SvAnna places over 90% of pathogenic SVs in the top ten ranks. The interpretable prioritizations provided by SvAnna will facilitate the widespread adoption of LRS in diagnostic genomics.
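The "top ten ranks" figure above is a top-k evaluation: for each case, the tool ranks candidate SVs, and the metric is the share of cases in which the known pathogenic SV lands at rank k or better. A minimal sketch of that calculation follows. The function name and the example ranks are hypothetical; they do not reproduce SvAnna's output or the 182 case reports.

```python
def top_k_fraction(ranks, k=10):
    """Fraction of cases whose known pathogenic SV is ranked at position k or better.

    `ranks` holds one 1-based rank per case; None means the SV was not ranked at all.
    """
    if not ranks:
        raise ValueError("need at least one case")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks for five cases (not real benchmark results).
    example_ranks = [1, 3, 12, 7, None]
    print(top_k_fraction(example_ranks, k=10))  # ranks 1, 3 and 7 count as hits
```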

Towards a New, Endophenotype-Based Strategy for Pathogenicity Prediction in BRCA1 and BRCA2: In Silico Modeling of the Outcome of HDR/SGE Assays for Missense Variants

International Journal of Molecular Sciences, 2021, Vol 22 (12), pp. 6226

DOI: 10.3390/ijms22126226

Author(s): Selen Özkan, Natàlia Padilla, Xavier de la Cruz

Keyword(s): Ovarian Cancer, Regression Models, Molecular Level, Negative Consequences, Breast And Ovarian Cancer, Brca1 And Brca2, Missense Variants, Functional Assays, Pathogenicity Prediction, In Silico Modeling

The present limitations in the pathogenicity prediction of BRCA1 and BRCA2 (BRCA1/2) missense variants constitute an important problem with negative consequences for the diagnosis of hereditary breast and ovarian cancer. However, it has been proposed that endophenotype predictions, i.e., computational estimates of the outcomes of functional assays, can be a good option for addressing this bottleneck. The application of this idea to the BRCA1/2 variants in the CAGI 5-ENIGMA international challenge has shown promising results. Here, we developed this approach, exploring the predictive performance of regression models applied to the BRCA1/2 variants for which values from the homology-directed DNA repair and saturation genome editing assays are available. First, our results show that we can generate endophenotype estimates using a few molecular-level properties. Second, the accuracy of these estimates is sufficient to obtain pathogenicity predictions comparable to those of many standard tools. Third, endophenotype-based predictions are complementary to, but do not outperform, those of a Random Forest model trained using variant pathogenicity annotations instead of endophenotype values. In summary, our results confirm the usefulness of the endophenotype approach for the pathogenicity prediction of BRCA1/2 missense variants and suggest different options for future improvements.

Protein structural consequences of DNA mutational signatures: A meta-analysis of somatic variants and deep mutational scanning data

2021

DOI: 10.1101/2021.05.27.445950

Author(s): Joseph Chi-Fung Ng, Franca Fraternali

Keyword(s): Meta Analysis, Protein Structures, Three Dimensional, Purifying Selection, Physicochemical Characteristics, Mutational Signatures, Protein Core, Dna Motif, Pathogenicity Prediction, The Stability

Signatures of DNA motifs associated with distinct mutagenic exposures have been defined for somatic variants, but little is known about the consequences that different mutational processes have for the cancer cell, particularly the distribution of the resulting variants in the affected proteins and their structural regions (surface, core, interacting interface). Here we first compare the protein-level consequences of six mutational signatures (Aging, APOBEC, POLE, UV, 5-FU and Platinum) characterised by clear DNA motif preferences. By mapping individual substitution events observed in tumours to three-dimensional protein structures, we show that these common somatic mutational signatures are biased against the protein core, consistent with the lower tolerability of substitutions in such structurally important regions. On the other hand, deep mutational scanning (DMS) data allow us to probe the "dark matter" of the somatic mutational landscape, exploring variants which are otherwise removed by purifying selection. A computational DMS analysis identifies mutational contexts (5'-G/C[T>G]A/G-3') which are associated with damaging mutations that alter the physicochemical characteristics of amino acids at the protein core. We argue that comprehensive DMS analysis can contribute to the classification of variants according to their true impact on the stability/activity of the affected protein, decoupling this from the pathogenicity prediction offered by conventional variant impact classifiers.
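The context notation 5'-G/C[T>G]A/G-3' reads as: a T>G substitution whose 5' neighbour is G or C and whose 3' neighbour is A or G. The sketch below spells that pattern out as a simple check. The function names are hypothetical, and the code assumes the caller has already oriented the substitution on the strand where the reference base is T; it does not perform reverse-complement normalisation.

```python
FIVE_PRIME_BASES = {"G", "C"}
THREE_PRIME_BASES = {"A", "G"}


def in_tg_context(five_prime, ref, alt, three_prime):
    """True if the substitution fits the pattern 5'-G/C[T>G]A/G-3'."""
    return (
        five_prime.upper() in FIVE_PRIME_BASES
        and ref.upper() == "T"
        and alt.upper() == "G"
        and three_prime.upper() in THREE_PRIME_BASES
    )


def matching_trinucleotides():
    """List the reference trinucleotides covered by the pattern."""
    return sorted(f + "T" + t for f in FIVE_PRIME_BASES for t in THREE_PRIME_BASES)


if __name__ == "__main__":
    print(matching_trinucleotides())          # the four reference contexts
    print(in_tg_context("G", "T", "G", "A"))  # fits the pattern
    print(in_tg_context("A", "T", "G", "A"))  # 5' base A is outside G/C
```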

pathogenicity prediction
Recently Published Documents

Machine learning techniques for pathogenicity prediction of non-synonymous single nucleotide polymorphisms in human body

MvPPT: a highly efficient and sensitive pathogenicity prediction tool for missense variants

Analysis of coding variants in the human FTO gene from the gnomAD database

A Rare Mutation in LMNB2 Associated with Lipodystrophy Drives Premature Cell Senescence

Improved pathogenicity prediction for rare human missense variants

Improved pathogenicity prediction for rare human missense variants

X-CNV: genome-wide prediction of the pathogenicity of copy number variations

SvAnna: efficient and accurate pathogenicity prediction for coding and regulatory structural variants in long-read genome sequencing

Towards a New, Endophenotype-Based Strategy for Pathogenicity Prediction in BRCA1 and BRCA2: In Silico Modeling of the Outcome of HDR/SGE Assays for Missense Variants

Protein structural consequences of DNA mutational signatures: A meta-analysis of somatic variants and deep mutational scanning data
