Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity

The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics in COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
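The Boolean-feature step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the frequency cut-offs, the function names, and the gene and patient labels are all hypothetical, since the abstract names the four frequency classes without giving thresholds. Each feature is true when an individual carries at least one variant of a given frequency class in a given gene.

```python
# Illustrative cut-offs on minor allele frequency (MAF); these are
# assumptions, not values taken from the abstract.
def frequency_class(maf):
    """Assign a variant to one of the four frequency classes."""
    if maf < 0.001:
        return "ultra_rare"
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low_frequency"
    return "common"


def boolean_features(variants_by_individual):
    """Build one Boolean column per (frequency class, gene) pair.

    variants_by_individual maps an individual ID to a list of
    (gene, maf) tuples. A cell is True when the individual carries
    at least one variant of that class in that gene.
    """
    carried = {
        individual: {(frequency_class(maf), gene) for gene, maf in variants}
        for individual, variants in variants_by_individual.items()
    }
    # Sorted union of all observed features gives a stable column order.
    columns = sorted(set().union(*carried.values())) if carried else []
    matrix = {
        individual: [feature in present for feature in columns]
        for individual, present in carried.items()
    }
    return columns, matrix


# Hypothetical toy input: three individuals, two genes.
example = {
    "patient_A": [("GENE1", 0.0002), ("GENE2", 0.2)],
    "patient_B": [("GENE2", 0.3)],
    "patient_C": [],
}
columns, matrix = boolean_features(example)
```

A matrix of this shape is the kind of input an L1-penalised (LASSO) logistic regression could take for feature selection; that step is omitted here because it would need a third-party library.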

Common, low-frequency, rare, and ultra-rare coding variants contribute to COVID-19 severity.

10.1101/2021.09.03.21262611 ◽

2021 ◽

Author(s):

Chiara Fallerini ◽

Nicola Picchiotti ◽

Margherita Baldassarri ◽

Kristina Zguro ◽

Sergio Daga ◽

...

Keyword(s):

Rare Variants ◽

Low Frequency ◽

Logistic Models ◽

Sequencing Data ◽

Host Genetics ◽

Logistic Regression Models ◽

Whole Exome ◽

Machine Learning Model ◽

Whole Exome Sequencing Data ◽

Coding Variants

The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole exome sequencing data of about 4,000 SARS-CoV-2-positive individuals were used to define an interpretable machine learning model for predicting COVID-19 severity. Firstly, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics in COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.

Quantifying the impact of rare and ultra-rare coding variation across the phenotypic spectrum

10.1101/148247 ◽

2017 ◽

Cited By ~ 1

Author(s):

Andrea Ganna ◽

Kyle F. Satterstrom ◽

Seyedeh M Zekavat ◽

Indraniel Das ◽

Mitja I. Kurki ◽

...

Keyword(s):

Complex Traits ◽

Sequencing Data ◽

Multiple Phenotypes ◽

Gene Sets ◽

Whole Exome ◽

Increased Risk ◽

Whole Exome Sequencing Data ◽

Lower Education ◽

Coding Variants

There is a limited understanding of the impact of rare protein truncating variants across multiple phenotypes. We explore the impact of this class of variants on 13 quantitative traits and 10 diseases using whole-exome sequencing data from 100,296 individuals. Protein truncating variants in genes intolerant to this class of mutations increased risk of autism, schizophrenia, bipolar disorder, intellectual disability, and ADHD. In individuals without these disorders, there was an association with shorter height, lower education, increased hospitalization and reduced age. Gene sets implicated from GWAS did not show a significant protein truncating variant burden beyond what was captured by established Mendelian genes. In conclusion, we provide the most thorough investigation to date of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk.

Main abbreviations: PTV = Protein Truncating Variants; PI = Protein Truncating Intolerant; PI-PTV = Protein Truncating Variant in genes that are Intolerant to Protein Truncating Variants

Biallelic novel mutations of the COL27A1 gene in a patient with Steel syndrome

Human Genome Variation ◽

10.1038/s41439-021-00149-7 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Jong Seop Kim ◽

Hyoungseok Jeon ◽

Hyeran Lee ◽

Jung Min Ko ◽

Yonghwan Kim ◽

...

Keyword(s):

Hip Dysplasia ◽

Large Deletion ◽

Compound Heterozygous ◽

Radial Head Dislocation ◽

Sequencing Data ◽

Novel Mutations ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Carpal Coalition

An 11-year-old Korean boy presented with short stature, hip dysplasia, radial head dislocation, carpal coalition, genu valgum, and fixed patellar dislocation and was clinically diagnosed with Steel syndrome. Scrutinizing the trio whole-exome sequencing data revealed novel compound heterozygous mutations of COL27A1 (c.[4229_4233dup]; [3718_5436del], p.[Gly1412Argfs*157];[Gly1240_Lys1812del]) in the proband, which were inherited from heterozygous parents. The maternal mutation was a large deletion encompassing exons 38–60, which was challenging to detect.

ETumorMetastasis: A Network-based Algorithm Predicts Clinical Outcomes Using Whole-exome Sequencing Data of Cancer Patients

Genomics Proteomics & Bioinformatics ◽

10.1016/j.gpb.2020.06.009 ◽

2021 ◽

Cited By ~ 1

Author(s):

Jean-Sébastien Milanese ◽

Chabane Tibiche ◽

Naif Zaman ◽

Jinfeng Zou ◽

Pengyong Han ◽

...

Keyword(s):

Exome Sequencing ◽

Cancer Patients ◽

Clinical Outcomes ◽

Whole Exome Sequencing ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

Long runs of homozygosity are associated with Alzheimer’s disease

Translational Psychiatry ◽

10.1038/s41398-020-01145-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Sonia Moreno-Grau ◽

Maria Victoria Fernández ◽

Itziar de Rojas ◽

Pablo Garcia-González ◽

...

Keyword(s):

Alzheimer's Disease ◽

European Ancestry ◽

Runs Of Homozygosity ◽

Sequencing Data ◽

Outbred Population ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Outbred Populations ◽

Recessive Effects

Long runs of homozygosity (ROH) are contiguous stretches of homozygous genotypes, which are a footprint of inbreeding and recessive inheritance. The presence of recessive loci is suggested for Alzheimer’s disease (AD); however, their search has been poorly assessed to date. To investigate homozygosity in AD, here we performed a fine-scale ROH analysis using 10 independent cohorts of European ancestry (11,919 AD cases and 9,181 controls). We detected an increase of homozygosity in AD cases compared to controls [β_AVROH (CI 95%) = 0.070 (0.037–0.104); P = 3.91 × 10⁻⁵; β_FROH (CI 95%) = 0.043 (0.009–0.076); P = 0.013]. ROHs increasing the risk of AD (OR > 1) were significantly overrepresented compared to ROHs increasing protection (p < 2.20 × 10⁻¹⁶). A significant ROH association with AD risk was detected upstream of the HS3ST1 locus (chr4:11,189,482–11,305,456; β (CI 95%) = 1.09 (0.48–1.48), p value = 9.03 × 10⁻⁴), previously related to AD. Next, to search for recessive candidate variants in ROHs, we constructed a homozygosity map of inbred AD cases extracted from an outbred population and explored ROH regions in whole-exome sequencing data (N = 1449). We detected a candidate marker, rs117458494, mapped in the SPON1 locus, which has been previously associated with amyloid metabolism. Here, we provide a research framework to look for recessive variants in AD using outbred populations. Our results showed that AD cases have enriched homozygosity, suggesting that recessive effects may explain a proportion of AD heritability.
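The definition of an ROH as a contiguous stretch of homozygous genotypes can be made concrete with a small sketch. This is a simplification under stated assumptions, not the method used in the study: genotypes are coded as alternate-allele counts (0, 1, 2), no heterozygous calls are tolerated inside a run, and the thresholds `min_snps` and `min_length_bp` as well as the summary names `av_roh` and `f_roh` are hypothetical stand-ins for the AVROH and FROH quantities mentioned in the abstract.

```python
from itertools import groupby


def find_roh(positions, genotypes, min_snps=3, min_length_bp=1000):
    """Return (start_bp, end_bp) for each run of homozygous genotypes.

    positions: sorted base-pair coordinates, one per marker.
    genotypes: alternate-allele counts (0 or 2 = homozygous, 1 = heterozygous).
    A run is kept only if it spans at least min_snps markers and
    at least min_length_bp base pairs.
    """
    runs = []
    index = 0
    for is_homozygous, group in groupby(genotypes, key=lambda g: g in (0, 2)):
        n_markers = len(list(group))
        if is_homozygous:
            start_bp = positions[index]
            end_bp = positions[index + n_markers - 1]
            if n_markers >= min_snps and end_bp - start_bp >= min_length_bp:
                runs.append((start_bp, end_bp))
        index += n_markers
    return runs


def roh_summary(runs, covered_bp):
    """Summarise runs: count, average length, and fraction of covered_bp in ROH."""
    total = sum(end - start for start, end in runs)
    return {
        "n_roh": len(runs),
        "av_roh": total / len(runs) if runs else 0.0,
        "f_roh": total / covered_bp,
    }


# Hypothetical toy data: eight markers on one chromosome.
positions = [100, 600, 1200, 1900, 2500, 3100, 3600, 4800]
genotypes = [0, 2, 2, 1, 0, 0, 1, 2]
runs = find_roh(positions, genotypes)
summary = roh_summary(runs, covered_bp=positions[-1] - positions[0])
```

Dedicated tools typically use sliding windows and allow a limited number of heterozygous or missing calls per run, so real ROH calls will differ from this strict version.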

A Survey of Computational Tools to Analyze and Interpret Whole Exome Sequencing Data

International Journal of Genomics ◽

10.1155/2016/7983236 ◽

2016 ◽

Vol 2016 ◽

pp. 1-16 ◽

Cited By ~ 16

Author(s):

Jennifer D. Hintzsche ◽

William A. Robinson ◽

Aik Choon Tan

Keyword(s):

Exome Sequencing ◽

Whole Exome Sequencing ◽

Sequencing Data ◽

Disease Treatment ◽

Computational Tools ◽

Whole Exome ◽

Data Production ◽

Whole Exome Sequencing Data ◽

Computationally Intensive ◽

Generation Technology

Whole Exome Sequencing (WES) is the application of next-generation sequencing technology to determine the variations in the exome and is becoming a standard approach in studying genetic variants in diseases. Understanding the exomes of individuals at single base resolution allows the identification of actionable mutations for disease treatment and management. WES technologies have shifted the bottleneck from experimental data production to computationally intensive informatics-based data analysis. Novel computational tools and methods have been developed to analyze and interpret WES data. Here, we review some of the current tools that are being used to analyze WES data. These tools range from the alignment of raw sequencing reads all the way to linking variants to actionable therapeutics. Strengths and weaknesses of each tool are discussed for the purpose of helping researchers make more informed decisions on selecting the best tools to analyze their WES data.

UTILIZATION OF WHOLE EXOME SEQUENCING DATA TO IDENTIFY CLINICALLY RELEVANT PHARMACOGENOMIC VARIANTS IN INFLAMMATORY BOWEL DISEASE

Gastroenterology ◽

10.1053/j.gastro.2021.01.119 ◽

2021 ◽

Vol 160 (3) ◽

pp. S43

Author(s):

Daniel Mulder ◽

Sam Khalouei ◽

Neil Warner ◽

Claudia Gonzaga-Jauregui ◽

Peter Church ◽

...

Keyword(s):

Inflammatory Bowel Disease ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Bowel Disease ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Inflammatory Bowel

EthSEQ: ethnicity annotation from whole exome sequencing data

Bioinformatics ◽

10.1093/bioinformatics/btx165 ◽

2017 ◽

Vol 33 (15) ◽

pp. 2402-2404 ◽

Cited By ~ 7

Author(s):

Alessandro Romanel ◽

Tuo Zhang ◽

Olivier Elemento ◽

Francesca Demichelis

Keyword(s):

Exome Sequencing ◽

Whole Exome Sequencing ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

Identification of modifier genes of Pompe disease phenotype by variant analysis of whole exome sequencing data

Molecular Genetics and Metabolism ◽

10.1016/j.ymgme.2015.12.368 ◽

2016 ◽

Vol 117 (2) ◽

pp. S83

Author(s):

Mari Mori ◽

Zoheb B. Kazi ◽

Xiaolin Zhu ◽

Katie Barrier ◽

Stephanie Austin ◽

...

Keyword(s):

Exome Sequencing ◽

Whole Exome Sequencing ◽

Pompe Disease ◽

Modifier Genes ◽

Disease Phenotype ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data ◽

Variant Analysis

Assessing the contribution of rare-to-common protein-coding variants to circulating metabolic biomarker levels via 412,394 UK Biobank exome sequences

10.1101/2021.12.24.21268381 ◽

2021 ◽

Author(s):

Abhishek Nag ◽

Lawrence Middleton ◽

Ryan S Dhindsa ◽

Dimitrios Vitsios ◽

Eleanor M Wigmore ◽

...

Keyword(s):

Gene Networks ◽

Rare Variants ◽

Association Studies ◽

Low Frequency ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Protein Coding ◽

Metabolic Biomarkers ◽

Coding Variants

Genome-wide association studies have established the contribution of common and low frequency variants to metabolic biomarkers in the UK Biobank (UKB); however, the role of rare variants remains to be assessed systematically. We evaluated rare coding variants for 198 metabolic biomarkers, including metabolites assayed by Nightingale Health, using exome sequencing in participants from four genetically diverse ancestries in the UKB (N=412,394). Gene-level collapsing analysis, which evaluated a range of genetic architectures, identified a total of 1,303 significant relationships between genes and metabolic biomarkers (p < 1 × 10⁻⁸), encompassing 207 distinct genes. These include associations between rare non-synonymous variants in GIGYF1 and glucose and lipid biomarkers, SYT7 and creatinine, and others, which may provide insights into novel disease biology. Compared to a previous microarray-based genotyping study in the same cohort, we observed that 40% of gene-biomarker relationships identified in the collapsing analysis were novel. Finally, we applied Gene-SCOUT, a novel tool that utilises the gene-biomarker association statistics from the collapsing analysis to identify genes having similar biomarker fingerprints and thus expand our understanding of gene networks.
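The idea behind a gene-level collapsing analysis can be sketched as follows: all qualifying rare variants in a gene are collapsed into a single carrier or non-carrier status per participant, and the biomarker is then compared between the two groups. The sketch below is an assumption-laden illustration, not the study's method: it reports only a difference in group means, whereas a real analysis would fit a regression with covariates and compute a p-value. The function names, sample IDs, and gene labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collapse_by_gene(qualifying_variants):
    """Collapse (individual, gene) variant records into carrier sets per gene."""
    carriers = defaultdict(set)
    for individual, gene in qualifying_variants:
        carriers[gene].add(individual)
    return dict(carriers)


def carrier_effect(gene_carriers, biomarker):
    """Mean biomarker level in carriers minus mean level in non-carriers.

    Returns None when either group is empty.
    """
    in_group = [v for ind, v in biomarker.items() if ind in gene_carriers]
    out_group = [v for ind, v in biomarker.items() if ind not in gene_carriers]
    if not in_group or not out_group:
        return None
    return mean(in_group) - mean(out_group)


# Hypothetical toy data: four participants, two genes, one biomarker.
variants = [("s1", "GENE_A"), ("s1", "GENE_A"), ("s2", "GENE_A"), ("s3", "GENE_B")]
biomarker = {"s1": 5.0, "s2": 7.0, "s3": 4.0, "s4": 4.0}

carriers = collapse_by_gene(variants)
effects = {gene: carrier_effect(ids, biomarker) for gene, ids in carriers.items()}
```

Note that participant s1 carries two variants in GENE_A but counts once, which is the point of collapsing: the test is on carrier status per gene rather than on each variant separately.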
