Systems Biology Guided Gene Enrichment Approaches Improve Prediction of Chronic Post-surgical Pain After Spine Fusion

2021 ◽  
Vol 12 ◽  
Author(s):  
Vidya Chidambaran ◽  
Valentina Pilipenko ◽  
Anil G. Jegga ◽  
Kristie Geisler ◽  
Lisa J. Martin

Objectives: Incorporation of genetic factors in psychosocial/perioperative models for predicting chronic postsurgical pain (CPSP) is key for personalization of analgesia. However, single variant associations with CPSP have small effect sizes, making polygenic risk assessment important. Unfortunately, pediatric CPSP studies are not sufficiently powered for unbiased genome-wide association studies (GWAS). We previously leveraged systems biology to identify candidate genes associated with CPSP. The goal of this study was to use systems biology prioritized gene enrichment to generate polygenic risk scores (PRS) for improved prediction of CPSP in a prospectively enrolled clinical cohort. Methods: In a prospectively recruited cohort of 171 adolescents (14.5 ± 1.8 years, 75.4% female) undergoing spine fusion, we collected data about anesthesia/surgical factors, childhood anxiety sensitivity (CASI), acute pain/opioid use, pain outcomes 6–12 months post-surgery and blood (for DNA extraction/genotyping). We previously prioritized candidate genes using computational approaches based on similarity for functional annotations with a literature-derived “training set.” In this study, we tested ranked deciles of 1336 prioritized genes for increased representation of variants associated with CPSP, compared to 10,000 randomly selected control sets. Penalized regression (LASSO) was used to select final variants from enriched variant sets for calculation of PRS. PRS-incorporated regression models were compared with previously published non-genetic models for predictive accuracy. Results: Incidence of CPSP in the prospective cohort was 40.4%. 33,104 case and 252,590 control variants were included for association analyses. The smallest gene set enriched for CPSP had 80/1010 variants associated with CPSP (p < 0.05), significantly higher than in 10,000 randomly selected control sets (p = 0.0004). LASSO selected 20 variants for calculating weighted PRS. The model adjusted for covariates including PRS had an AUROC of 0.96 (95% CI: 0.92–0.99) for CPSP prediction, compared to 0.70 (95% CI: 0.59–0.82) for the non-genetic model (p < 0.001). Odds ratios and positive regression coefficients for the final model were internally validated using bootstrapping: PRS [OR 1.98 (95% CI: 1.21–3.22); β 0.68 (95% CI: 0.19–0.74)] and CASI [OR 1.33 (95% CI: 1.03–1.72); β 0.29 (95% CI: 0.03–0.38)]. Discussion: Systems biology guided PRS improved predictive accuracy of CPSP risk in a pediatric cohort. These scores have potential to serve as biomarkers to guide risk stratification and tailored prevention. Findings highlight systems biology approaches for deriving PRS for phenotypes in cohorts less amenable to large scale GWAS.
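
The two computational ideas in this abstract, comparing a candidate variant set against randomly drawn control sets and summing weighted allele dosages into a PRS, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the variant identifiers, the 0/1 "nominally associated" flags, and the (count + 1)/(n + 1) empirical p-value convention are all assumptions made for illustration.

```python
import random


def weighted_prs(dosages, weights):
    """Weighted PRS for one individual.

    dosages: dict variant_id -> effect-allele count (0, 1 or 2); hypothetical IDs.
    weights: dict variant_id -> per-variant weight (e.g. a regression coefficient).
    Variants missing from `dosages` contribute zero.
    """
    return sum(weights[v] * dosages.get(v, 0) for v in weights)


def empirical_enrichment_p(observed_hits, pool_flags, set_size, n_sets=10_000, seed=0):
    """Empirical p-value for enrichment against random control sets.

    observed_hits: number of associated variants in the candidate set.
    pool_flags: list of 0/1 flags, one per control variant (1 = associated).
    set_size: number of variants drawn for each random control set.
    Returns (k + 1) / (n_sets + 1), where k counts control sets with at least
    as many associated variants as the candidate set (assumed convention).
    """
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_sets):
        control = rng.sample(pool_flags, set_size)  # draw without replacement
        if sum(control) >= observed_hits:
            at_least += 1
    return (at_least + 1) / (n_sets + 1)
```

For example, `weighted_prs({"v1": 2, "v2": 1}, {"v1": 0.5, "v2": -0.25})` sums 2 × 0.5 and 1 × (−0.25). In practice the weights would come from the fitted penalized regression rather than being chosen by hand.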

Neurogenetics ◽  
2020 ◽  
Vol 21 (3) ◽  
pp. 205-215
Author(s):  
Roel R. I. van Reij ◽  
Jan Willem Voncken ◽  
Elbert A. J. Joosten ◽  
Nynke J. van den Hoogen

2020 ◽  
Vol 11 ◽  
Author(s):  
Julianne Duhazé ◽  
Rodolphe Jantzen ◽  
Yves Payette ◽  
Thibault De Malliard ◽  
Catherine Labbé ◽  
...  

Author(s):  
Vivek Kaimal ◽  
Divya Sardana ◽  
Eric E. Bardes ◽  
Ranga Chandra Gudivada ◽  
Jing Chen ◽  
...  

2021 ◽  
Author(s):  
Paul O’Reilly ◽  
Shing Choi ◽  
Judit Garcia-Gonzalez ◽  
Yunfeng Ruan ◽  
Hei Man Wu ◽  
...  

Abstract: Polygenic risk scores (PRSs) have been among the leading advances in biomedicine in recent years. As a proxy of genetic liability, PRSs are utilised across multiple fields and applications. While numerous statistical and machine learning methods have been developed to optimise their predictive accuracy, all of these distil genetic liability to a single number based on aggregation of an individual’s genome-wide alleles. This results in a key loss of information about an individual’s genetic profile, which could be critical given the functional sub-structure of the genome and the heterogeneity of complex disease. Here we evaluate the performance of pathway-based PRSs, in which polygenic scores are calculated across genomic pathways for each individual, and we introduce a software tool, PRSet, for computing and analysing pathway PRSs. We find that pathway PRSs have similar power for evaluating pathway enrichment of GWAS signal as the leading methods, with the distinct advantage of providing estimates of pathway genetic liability at the individual level. Exemplifying their utility, we demonstrate that pathway PRSs can stratify diseases into subtypes in the UK Biobank with substantially greater power than genome-wide PRSs. Compared to genome-wide PRSs, we expect pathway-based PRSs to offer greater insights into the heterogeneity of complex disease and treatment response, generate more biologically tractable therapeutic targets, and provide a more powerful path to precision medicine.
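
The contrast between a single genome-wide score and per-pathway scores can be made concrete with a short Python sketch. This is not PRSet's interface or algorithm; the function name, the SNP and pathway labels, and the plain dosage-times-effect sum are assumptions chosen only to show how one individual ends up with one score per pathway instead of one number.

```python
def pathway_prs(dosages, effects, pathways):
    """Return one polygenic score per pathway plus a genome-wide score.

    dosages: dict snp_id -> effect-allele count for one individual.
    effects: dict snp_id -> effect size (e.g. from GWAS summary statistics).
    pathways: dict pathway_name -> list of snp_ids mapped to that pathway
              (the mapping itself is assumed to be given).
    """
    scores = {}
    for name, snps in pathways.items():
        # Restrict the usual weighted sum to SNPs that belong to this pathway.
        scores[name] = sum(
            effects[s] * dosages.get(s, 0) for s in snps if s in effects
        )
    # Conventional genome-wide score over every SNP with an effect estimate.
    scores["genome_wide"] = sum(effects[s] * dosages.get(s, 0) for s in effects)
    return scores
```

Two individuals with the same genome-wide score can differ across pathways, which is the individual-level information the abstract argues is lost in a single aggregate.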


2018 ◽  
Author(s):  
Florian Privé ◽  
Hugues Aschard ◽  
Michael G.B. Blum

Abstract: Polygenic Risk Scores (PRS) combine information across many single-nucleotide polymorphisms (SNPs) into a score reflecting the genetic risk of developing a disease. PRS might have a major impact on public health, possibly allowing for screening campaigns to identify high-genetic risk individuals for a given disease. The “Clumping+Thresholding” (C+T) approach is the most common method to derive PRS. C+T uses only univariate genome-wide association studies (GWAS) summary statistics, which makes it fast and easy to use. However, previous work showed that jointly estimating SNP effects for computing PRS has the potential to significantly improve the predictive performance of PRS as compared to C+T. In this paper, we present an efficient method to jointly estimate SNP effects, allowing for practical application of penalized logistic regression (PLR) on modern datasets including hundreds of thousands of individuals. Moreover, our implementation of PLR directly includes automatic choices for hyper-parameters. The choice of hyper-parameters for a predictive model is very important since it can dramatically impact its predictive performance. As an example, AUC values range from less than 60% to 90% in a model with 30 causal SNPs, depending on the p-value threshold in C+T. We compare the performance of PLR, C+T and a derivation of random forests using both real and simulated data. PLR consistently achieves higher predictive performance than the two other methods while being as fast as C+T. We find that improvement in predictive performance is more pronounced when there are few effects located in nearby genomic regions with correlated SNPs; for instance, AUC values increase from 83% with the best prediction of C+T to 92.5% with PLR. We confirm these results in a data analysis of a case-control study for celiac disease where PLR and the standard C+T method achieve AUC of 89% and of 82.5%, respectively. In conclusion, our study demonstrates that penalized logistic regression can achieve more discriminative polygenic risk scores, while being applicable to large-scale individual-level data thanks to the implementation we provide in the R package bigstatsr.
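
As a rough illustration of the Clumping+Thresholding baseline this abstract compares against, the following Python sketch selects SNPs greedily by p-value. It is a simplified assumption-laden version, not the bigstatsr implementation: the function name, the default thresholds, and the pairwise r² lookup keyed by `frozenset` are hypothetical, and real clumping also restricts comparisons to a genomic window.

```python
def clump_and_threshold(pvalues, r2, p_threshold=0.05, r2_threshold=0.2):
    """Greedy C+T selection of SNPs.

    pvalues: dict snp_id -> univariate GWAS p-value.
    r2: dict frozenset({snp_a, snp_b}) -> squared correlation between the two
        SNPs; pairs that are absent are treated as uncorrelated.
    Returns the retained SNPs, ordered from most to least significant.
    """
    kept = []
    for snp in sorted(pvalues, key=pvalues.get):  # most significant first
        if pvalues[snp] > p_threshold:
            break  # thresholding: every remaining SNP is less significant
        # Clumping: drop a SNP that is correlated with an already-kept SNP.
        if all(r2.get(frozenset((snp, k)), 0.0) <= r2_threshold for k in kept):
            kept.append(snp)
    return kept
```

The p-value threshold here is the hyper-parameter whose choice, according to the abstract, can move the AUC substantially. A PRS would then be the sum of GWAS effect sizes times allele counts over the retained SNPs.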


2021 ◽  
pp. 112067212199177
Author(s):  
Yang Liu ◽  
David Wei ◽  
Tao Bai ◽  
Jie Luo ◽  
Jennifer Wood ◽  
...  

Objective: To predict post-operative depth of focus (DoF) using machine learning techniques after cataract surgery with Tecnis Symfony implantation and determine associated impact factors. Methods: This was a retrospective cohort study among patients receiving Tecnis Symfony implantation, an extended-range-of-vision intraocular lens, during October 2016–January 2020 at Daqing Oilfield General Hospital, China. Four different predictive models were used to predict good post-operative DoF (⩾2.5 D): Extreme Gradient Boost (XGBoost), random forest (RF), LASSO penalized regression, and multivariable logistic regression (MLR). The Apriori algorithm was employed to further explore the association between patient attributes and DoF. Results: A total of 182 unique cases (143 patients) were included. The XGBoost model produced the best predictive accuracy compared to RF, LASSO, and MLR models. Overall performance of the best fitting XGBoost model was as follows: accuracy = 70.3%, AUC = 80.2%, sensitivity = 65.5%, and specificity = 87.5%. The Apriori algorithm identified six preoperative attributes with substantial effects on good post-operative DoF: low anterior chamber depth (ACD) (1.9 to <2.5 mm), smaller pupil size (1.7 to <2.5 mm), low-to-mid axial length (21 to <23 mm), minimum astigmatism degree (−0.2 to 0 diopter), low IOP (9 to <12 mmHg), and medium lens target refractive error (−0.5 to <−0.25 diopter). Conclusions: Machine learning models were able to predict good post-operative DoF among cataract patients receiving a Tecnis Symfony ocular lens implantation. The accuracy of the model was above 70%. The Apriori algorithm identified six preoperative attributes with a strong association with post-operative DoF.


2021 ◽  
Vol 17 (3) ◽  
pp. e1008831
Author(s):  
Denis A. Shah ◽  
Erick D. De Wolf ◽  
Pierce A. Paul ◽  
Laurence V. Madden

Ensembling combines the predictions made by individual component base models with the goal of achieving a predictive accuracy that is better than that of any one of the constituent member models. Diversity among the base models in terms of predictions is a crucial criterion in ensembling. However, there are practical instances when the available base models produce highly correlated predictions, because they may have been developed within the same research group or may have been built from the same underlying algorithm. We investigated, via a case study on Fusarium head blight (FHB) on wheat in the U.S., whether ensembles of simple yet highly correlated models for predicting the risk of FHB epidemics, all generated from logistic regression, provided any benefit to predictive performance, despite relatively low levels of base model diversity. Three ensembling methods were explored: soft voting, weighted averaging of smaller subsets of the base models, and penalized regression as a stacking algorithm. Soft voting and weighted model averages were generally better at classification than the base models, though not universally so. The performances of stacked regressions were superior to those of the other two ensembling methods we analyzed in this study. Ensembling simple yet correlated models is computationally feasible and is therefore worth pursuing for models of epidemic risk.
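
Two of the three ensembling methods named in this abstract, soft voting and weighted averaging, reduce to simple arithmetic on the base models' predicted probabilities. The Python sketch below is a generic illustration under assumed inputs, not the authors' code: the function names and the example weights are hypothetical, and the stacking approach (penalized regression fitted on base-model predictions) is not shown.

```python
def soft_vote(prob_lists):
    """Unweighted mean of predicted probabilities across base models.

    prob_lists: one list per base model, each holding that model's predicted
    probability for every observation (all lists the same length).
    """
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def weighted_average(prob_lists, weights):
    """Weighted mean of predicted probabilities; weights need not sum to 1."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*prob_lists)
    ]
```

With two models predicting `[0.25, 0.75]` and `[0.75, 0.25]`, soft voting gives `[0.5, 0.5]`, while weights of 3 and 1 pull the ensemble toward the first model. When base models are highly correlated, as in the case study, the averaged predictions stay close to each individual model's, which is why any gain from ensembling is an empirical question.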

