Using DNA to predict behaviour problems from preschool to adulthood

Author(s):  
Agnieszka Gidziela ◽  
Kaili Rimfeld ◽  
Margherita Malanchini ◽  
Andrea G. Allegrini ◽  
Andrew McMillan ◽  
...  

Abstract Background: One goal of the DNA revolution is to predict problems in order to prevent them. We tested here whether the prediction of behaviour problems from genome-wide polygenic scores (GPS) can be improved by creating composites across ages and across raters and by using a multi-GPS approach that includes GPS for adult psychiatric disorders as well as for childhood behaviour problems. Method: Our sample included 3,065 genotyped unrelated individuals from the Twins Early Development Study who were assessed longitudinally for hyperactivity, conduct, emotional problems and peer problems as rated by parents, teachers and children themselves. GPS created from 15 genome-wide association studies were used separately and jointly to test the prediction of behaviour problems composites (general behaviour problems, externalizing and internalizing) across ages (from age 2 to age 21) and across raters in penalized regression models. Based on the regression weights, we created multi-trait GPS reflecting the best prediction of behaviour problems. We compared GPS prediction to twin heritability using the same sample and measures. Results: Multi-GPS prediction of behaviour problems increased from less than 2% of the variance for observed traits to up to 6% for cross-age and cross-rater composites. Twin study estimates of heritability mirrored patterns of multi-GPS prediction as they increased from less than 40% to up to 83%. Conclusions: The ability of GPS to predict behaviour problems can be improved by using multiple GPS, cross-age composites and cross-rater composites, although the effect sizes remain modest, up to 6%. Our results can be used in any genotyped sample to create multi-trait GPS predictors of behaviour problems that will be more predictive than polygenic scores based on a single age, rater or GPS.
Key points: Genome-wide polygenic scores (GPS) can be used to predict behaviour problems in childhood, but the effect sizes are generally less than 3.5%. DNA-based prediction models achieve greater accuracy if holistic approaches are employed, that is, cross-trait, longitudinal and trans-situational approaches. The prediction of childhood behaviour problems can be improved by using multiple GPS to predict composites that aggregate behaviour problems across ages and across raters. Our results yield weights that can be applied to GPS in any study to create multi-trait GPS predictors of behaviour problems based on cross-age and cross-rater composites. As compared to individuals in the lowest multi-trait GPS decile, nearly three times as many individuals in the highest internalizing multi-trait GPS decile were diagnosed with anxiety disorder and 25% more individuals in the highest general behaviour problems and externalizing multi-trait GPS deciles have taken medication for mental health.
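
The penalized-regression step described in this abstract can be illustrated with a minimal sketch. It assumes ridge (L2) regression as a stand-in for whichever penalty the study used, GPS columns that are already standardized, no intercept, and a toy data set. The function names `ridge_weights` and `multi_trait_gps` are hypothetical and are not taken from the study.

```python
def ridge_weights(X, y, lam):
    """Solve (X'X + lam*I) w = X'y by Gauss-Jordan elimination.

    X is a list of rows (one row per person, one column per GPS),
    y is the behaviour-problem composite, lam is the ridge penalty.
    """
    p = len(X[0])
    # Normal-equation matrix, augmented with the right-hand side X'y.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def multi_trait_gps(X, w):
    """Weighted sum of the individual GPS for each person."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]


# Toy data (illustrative only): two GPS, outcome driven by the first one.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0 * row[0] for row in X]

w_ols = ridge_weights(X, y, lam=0.0)    # no penalty
w_ridge = ridge_weights(X, y, lam=1.0)  # penalized weights are shrunk
scores = multi_trait_gps(X, w_ridge)
```

In this sketch the fitted weights play the role of the regression weights that the abstract says can be carried over to other genotyped samples; in practice the penalty would be chosen by cross-validation.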

2021 ◽  
Vol 23 (8) ◽  
Author(s):  
Germán D. Carrasquilla ◽  
Malene Revsbech Christiansen ◽  
Tuomas O. Kilpeläinen

Abstract Purpose of Review: Hypertriglyceridemia is a common dyslipidemia associated with an increased risk of cardiovascular disease and pancreatitis. Severe hypertriglyceridemia may sometimes be a monogenic condition. However, in the vast majority of patients, hypertriglyceridemia is due to the cumulative effect of multiple genetic risk variants along with lifestyle factors, medications, and disease conditions that elevate triglyceride levels. In this review, we will summarize recent progress in the understanding of the genetic basis of hypertriglyceridemia. Recent Findings: More than 300 genetic loci have been identified for association with triglyceride levels in large genome-wide association studies. Studies combining the loci into polygenic scores have demonstrated that some hypertriglyceridemia phenotypes previously attributed to monogenic inheritance have a polygenic basis. The new genetic discoveries have opened avenues for the development of more effective triglyceride-lowering treatments and raised interest towards genetic screening and tailored treatments against hypertriglyceridemia. Summary: The discovery of multiple genetic loci associated with elevated triglyceride levels has led to improved understanding of the genetic basis of hypertriglyceridemia and opened new translational opportunities.


2021 ◽  
pp. 1-11
Author(s):  
Valentina Escott-Price ◽  
Karl Michael Schmidt

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question of which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
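
As a rough illustration of the Tikhonov-regularised adjustment discussed in this abstract, the sketch below assumes standardized genotypes, so that the marginal effect sizes b are approximately the LD correlation matrix R times the joint effects, and estimates the joint effects as (R + λI)⁻¹ b. The two-SNP LD block, the value of λ and the function name `ld_adjust` are illustrative assumptions, not the authors' implementation.

```python
def ld_adjust(R, b, lam):
    """Return (R + lam*I)^-1 b, a Tikhonov-regularised LD adjustment.

    R   : LD correlation matrix of an LD block (list of rows)
    b   : marginal (per-SNP) effect sizes from summary statistics
    lam : regularisation parameter; lam = 0 is the plain inverse
    """
    p = len(b)
    # Augmented matrix [R + lam*I | b].
    A = [[R[i][j] + (lam if i == j else 0.0) for j in range(p)] + [b[i]]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical two-SNP block: only SNP 1 has a true joint effect (0.1),
# but SNP 2 shows a marginal effect because it is in LD (r = 0.8).
R = [[1.0, 0.8], [0.8, 1.0]]
b_marginal = [0.1, 0.08]

adjusted_plain = ld_adjust(R, b_marginal, lam=0.0)
adjusted_mild = ld_adjust(R, b_marginal, lam=0.2)
```

In this toy block the unregularised inverse removes the LD-induced signal at SNP 2 entirely, whereas λ = 0.2 leaves a small residual effect there. Making λ explicit shows directly how much LD correction is traded for numerical stability.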


2015 ◽  
Author(s):  
Dominic Holland ◽  
Yunpeng Wang ◽  
Wesley K Thompson ◽  
Andrew Schork ◽  
Chi-Hua Chen ◽  
...  

Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large numbers of SNPs. The complexity of the datasets, however, poses a significant challenge to maximizing their utility. This is reflected in a need for better understanding the landscape of z-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities that does not require raw genotype data, relying only on summary statistics from GWAS substudies, and a scheme allowing for direct empirical validation. We show that modeling z-scores as a mixture of Gaussians is conceptually appropriate, in particular taking into account ubiquitous non-null effects that are likely in the datasets due to weak linkage disequilibrium with causal SNPs. The four-parameter model allows for estimating the degree of polygenicity of the phenotype -- the proportion of SNPs (after uniform pruning, so that large LD blocks are not over-represented) likely to be in strong LD with causal/mechanistically associated SNPs -- and predicting the proportion of chip heritability explainable by genome wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N=82,315) and additionally, for purposes of illustration, putamen volume (N=12,596), with approximately 9.3 million SNP z-scores in both cases. We show that, over a broad range of z-scores and sample sizes, the model accurately predicts expectation estimates of true effect sizes and replication probabilities in multistage GWAS designs. We estimate the degree to which effect sizes are over-estimated when based on linear regression association coefficients. We estimate the polygenicity of schizophrenia to be 0.037 and the putamen to be 0.001, while the respective sample sizes required to approach fully explaining the chip heritability are 10^6 and 10^5. The model can be extended to incorporate prior knowledge such as pleiotropy and SNP annotation. The current findings suggest that the model is applicable to a broad array of complex phenotypes and will enhance understanding of their genetic architectures.
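
The idea of treating z-scores as a mixture of Gaussians can be sketched in a much simpler form than the four-parameter model described in this abstract. The example below assumes only two components, a null N(0, 1) and a non-null N(0, 1 + σ²), and computes the posterior probability that a given z-score comes from the non-null component. The parameter values and function names are hypothetical and chosen purely for illustration.

```python
import math


def normal_pdf(z, var):
    """Density at z of a zero-mean Gaussian with variance `var`."""
    return math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_nonnull(z, pi1, sigma2):
    """Posterior probability that z was drawn from the non-null component.

    pi1    : assumed proportion of non-null SNPs (polygenicity-like parameter)
    sigma2 : assumed extra variance contributed by true effects
    """
    null = (1.0 - pi1) * normal_pdf(z, 1.0)
    nonnull = pi1 * normal_pdf(z, 1.0 + sigma2)
    return nonnull / (null + nonnull)


# Hypothetical parameter values, not estimates from the paper.
PI1, SIGMA2 = 0.05, 4.0
for z in (0.0, 2.0, 4.0, 6.0):
    print(z, round(posterior_nonnull(z, PI1, SIGMA2), 4))
```

Under these assumptions the posterior rises monotonically with |z|, which is the sense in which such a mixture yields replication-style probabilities from summary statistics alone.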


eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Christina Kiel ◽  
Hannah Benisty ◽  
Veronica Lloréns-Rico ◽  
Luis Serrano

Many driver mutations in cancer are specific in that they occur at significantly higher rates than – presumably – functionally alternative mutations. For example, V600E in the BRAF hydrophobic activation segment (AS) pocket accounts for >95% of all kinase mutations. While many hypotheses tried to explain such significant mutation patterns, conclusive explanations are lacking. Here, we use experimental and in silico structure-energy statistical analyses, to elucidate why the V600E mutation, but no other mutation at this, or any other positions in BRAF’s hydrophobic pocket, is predominant. We find that BRAF mutation frequencies depend on the equilibrium between the destabilization of the hydrophobic pocket, the overall folding energy, the activation of the kinase and the number of bases required to change the corresponding amino acid. Using a random forest classifier, we quantitatively dissected the parameters contributing to BRAF AS cancer frequencies. These findings can be applied to genome-wide association studies and prediction models.


2019 ◽  
Vol 28 (1) ◽  
pp. 82-90 ◽  
Author(s):  
Daniel W. Belsky ◽  
K. Paige Harden

Genome-wide association studies (GWASs) have identified specific genetic variants associated with complex human traits and behaviors, such as educational attainment, mental disorders, and personality. However, small effect sizes for individual variants, uncertainty regarding the biological function of discovered genotypes, and potential “outside-the-skin” environmental mechanisms leave a translational gulf between GWAS results and scientific understanding that will improve human health and well-being. We propose a set of social, behavioral, and brain-science research activities that map discovered genotypes to neural, developmental, and social mechanisms and call this research program phenotypic annotation. Phenotypic annotation involves (a) elaborating the nomological network surrounding discovered genotypes, (b) shifting focus from individual genes to whole genomes, and (c) testing how discovered genotypes affect life-span development. Phenotypic-annotation research is already advancing the understanding of GWAS discoveries for educational attainment and schizophrenia. We review examples and discuss methodological considerations for psychologists taking up the phenotypic-annotation approach.


2020 ◽  
Vol 36 (18) ◽  
pp. 4749-4756 ◽  
Author(s):  
Alexey A Shadrin ◽  
Oleksandr Frei ◽  
Olav B Smeland ◽  
Francesco Bettella ◽  
Kevin S O'Connell ◽  
...  

Abstract Motivation: Determining the relative contributions of functional genetic categories is fundamental to understanding the genetic etiology of complex human traits and diseases. Here, we present Annotation Informed-MiXeR, a likelihood-based method for estimating the number of variants influencing a phenotype and their effect sizes across different functional annotation categories of the genome using summary statistics from genome-wide association studies. Results: Extensive simulations demonstrate that the model is valid for a broad range of genetic architectures. The model suggests that complex human phenotypes substantially differ in the number of causal variants, their localization in the genome and their effect sizes. Specifically, the exons of protein-coding genes harbor more than 90% of variants influencing type 2 diabetes and inflammatory bowel disease, making them good candidates for whole-exome studies. In contrast, <10% of the causal variants for schizophrenia, bipolar disorder and attention-deficit/hyperactivity disorder are located in protein-coding exons, indicating a more substantial role of regulatory mechanisms in the pathogenesis of these disorders. Availability and implementation: The software is available at: https://github.com/precimed/mixer. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Christopher Hübel ◽  
Mohamed Abdulkadir ◽  
Moritz Herle ◽  
Ruth J.F. Loos ◽  
Gerome Breen ◽  
...  

Abstract Objective: Genome-wide association studies have identified multiple genomic regions associated with anorexia nervosa. Relatively few or no genome-wide studies of other eating disorders, such as bulimia nervosa and binge-eating disorder, have been performed, despite their substantial heritability. Exploratively, we aimed to identify traits that are genetically associated with binge-type eating disorders. Method: We calculated genome-wide polygenic scores for 269 trait and disease outcomes using PRSice v2.2 and their association with anorexia nervosa, bulimia nervosa, and binge-eating disorder in up to 640 cases and 17,050 controls from the UK Biobank. Significant associations were tested for replication in the Avon Longitudinal Study of Parents and Children (up to 217 cases and 3018 controls). Results: Individuals with binge-type eating disorders had higher polygenic scores than controls for other psychiatric disorders, including depression, schizophrenia, and attention deficit hyperactivity disorder, and higher polygenic scores for body mass index. Discussion: Our findings replicate some of the known comorbidities of eating disorders on a genomic level and motivate a deeper investigation of shared and unique genomic factors across the three primary eating disorders.


2015 ◽  
Author(s):  
Guo-Bo Chen ◽  
Sang Hong Lee ◽  
Matthew R Robinson ◽  
Maciej Trzaskowski ◽  
Zhi-Xiang Zhu ◽  
...  

Genome-wide association studies (GWASs) have been successful in discovering replicable SNP-trait associations for many quantitative traits and common diseases in humans. Typically the effect sizes of SNP alleles are very small and this has led to large genome-wide association meta-analyses (GWAMA) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study we propose a new set of metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose a pair of methods for examining the concordance between demographic information and summary statistics. In method I, we use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. In method II, we conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. In addition, we propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. Finally, to quantify unknown sample overlap across all pairs of cohorts we propose a method based on randomly generated genetic predictors, which does not require the sharing of individual-level genotype data and does not breach individual privacy.
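
The use of Fst on reported allele frequencies (method I in this abstract) can be sketched as follows. This is a simplified illustration, assuming Wright's Fst for two equally weighted groups, combined across SNPs as a ratio of sums, and two made-up reference panels. The function names, panel names and frequencies are hypothetical and do not reproduce the authors' procedure.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Wright's Fst between two groups from per-SNP allele frequencies.

    For each SNP, the variance of the allele frequency between the two
    groups is divided by p_bar * (1 - p_bar); numerator and denominator
    are summed over SNPs before taking the ratio.
    """
    num = 0.0
    den = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        p_bar = (pa + pb) / 2.0
        num += ((pa - p_bar) ** 2 + (pb - p_bar) ** 2) / 2.0
        den += p_bar * (1.0 - p_bar)
    return num / den if den > 0 else 0.0


def closest_reference(cohort_freqs, references):
    """Name of the reference panel with the smallest Fst to the cohort."""
    return min(references,
               key=lambda name: fst_two_populations(cohort_freqs, references[name]))


# Hypothetical reference panels and a cohort's reported allele frequencies.
references = {
    "ref_A": [0.10, 0.50, 0.80],
    "ref_B": [0.40, 0.20, 0.30],
}
cohort = [0.12, 0.48, 0.79]
best_match = closest_reference(cohort, references)
```

A cohort whose closest reference panel disagrees with its reported origin would be a candidate outlier in a check of this kind.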


2018 ◽  
Author(s):  
Ping Zeng ◽  
Xinjie Hao ◽  
Xiang Zhou

Abstract Motivation: Genome-wide association studies (GWASs) have identified many genetic loci associated with complex traits. A substantial fraction of these identified loci are associated with multiple traits, a phenomenon known as pleiotropy. Identification of pleiotropic associations can help characterize the genetic relationship among complex traits and can facilitate our understanding of disease etiology. Effective pleiotropic association mapping requires the development of statistical methods that can jointly model multiple traits with genome-wide SNPs together. Results: We develop a joint modeling method, which we refer to as the integrative MApping of Pleiotropic association (iMAP). iMAP models summary statistics from GWASs, uses a multivariate Gaussian distribution to account for phenotypic correlation, simultaneously infers genome-wide SNP association patterns using mixture modeling, and has the potential to reveal causal relationships between traits. Importantly, iMAP integrates a large number of SNP functional annotations to substantially improve association mapping power, and, with a sparsity-inducing penalty, is capable of selecting informative annotations from a large, potentially noninformative set. To enable scalable inference of iMAP to association studies with hundreds of thousands of individuals and millions of SNPs, we develop an efficient expectation maximization algorithm based on an approximate penalized regression algorithm. With simulations and comparisons to existing methods, we illustrate the benefits of iMAP both in terms of high association mapping power and in terms of accurate estimation of genome-wide SNP association patterns. Finally, we apply iMAP to perform a joint analysis of 48 traits from 31 GWAS consortia together with 40 tissue-specific SNP annotations generated from the Roadmap Project. iMAP is freely available at www.xzlab.org/software.html.

