Rare variants association testing for a binary outcome when pooling individual level data from heterogeneous studies

2020
Author(s): Tamar Sofer, Na Guo

Abstract: Whole genome and exome sequencing studies are used to test the association of rare genetic variants with health traits. Many existing WGS efforts now aggregate data from heterogeneous groups, e.g. combining sets of individuals of European and African ancestries. Here we investigate the statistical implications for rare variant association testing with a binary trait when combining heterogeneous studies, defined as studies with potentially different disease proportions and different frequencies of variant carriers. In simulations, we study and compare the type 1 error control and power of the naïve Score test, the saddlepoint approximation to the score test (SPA test), and the BinomiRare test in a range of settings, focusing on low numbers of variant carriers. We show that type 1 error control and power patterns depend on both the number of carriers of the rare allele and on disease prevalence in each of the studies. We develop recommendations for association analysis of rare genetic variants. (1) The Score test is preferred when the case proportion in the sample is 50%. (2) Do not down-sample controls to balance case-control ratio, because it reduces power. Rather, use a test that controls the type 1 error. (3) Conduct stratified analysis in parallel with combined analysis. Aggregated testing may have lower power when the variant effect size differs between strata.
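To make the "naïve Score test" mentioned above concrete, the following is a minimal sketch of a single-variant score test for a binary outcome. It assumes no covariates (an intercept-only null model) and uses a normal approximation for the p-value; the function name `score_test` and its arguments are illustrative and are not taken from the paper or from any particular software package. The normal approximation is the part that may become unreliable when there are few carriers or an unbalanced case proportion.

```python
import math

def score_test(genotypes, outcomes):
    """Score test for one variant and a binary trait, intercept-only null model.

    genotypes: carrier indicators or allele counts per individual (hypothetical input)
    outcomes:  0/1 disease status per individual
    Returns (z statistic, two-sided p-value from the normal approximation).
    """
    n = len(genotypes)
    p_hat = sum(outcomes) / n      # case proportion estimated under the null
    g_bar = sum(genotypes) / n     # mean genotype
    # Score: covariance-like sum between genotype and null residuals.
    score = sum(g * (y - p_hat) for g, y in zip(genotypes, outcomes))
    # Variance of the score under the null, with genotypes centered.
    variance = p_hat * (1 - p_hat) * sum((g - g_bar) ** 2 for g in genotypes)
    if variance == 0:
        raise ValueError("no variation in genotype or outcome; test undefined")
    z = score / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

For example, `score_test([1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 1])` evaluates two carriers, both of them cases, in a sample with a 50% case proportion.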

2017
Author(s): Douglas R. Smith, Christine M. Stanley, Theodore Foss, Richard G. Boles, Kevin McKernan

Abstract: Rare genetic variants in the core endocannabinoid system genes CNR1, CNR2, DAGLA, MGLL and FAAH were identified in molecular testing data from up to 6,032 patients with a broad spectrum of neurological disorders. The variants were evaluated for association with phenotypes similar to those observed in the orthologous gene knockouts in mice. Heterozygous rare coding variants in CNR1, which encodes the type 1 cannabinoid receptor (CB1), were found to be significantly associated with pain sensitivity (especially migraine), sleep and memory disorders - alone or in combination with anxiety - compared to a set of controls without such CNR1 variants. Similarly, heterozygous rare variants in DAGLA, which encodes diacylglycerol lipase alpha, were found to be significantly associated with seizures and developmental disorders, including abnormalities of brain morphology, compared to controls. Rare variants in MGLL, FAAH and CNR2 were not associated with any neurological phenotypes in the patients tested. Diacylglycerol lipase alpha synthesizes the endocannabinoid 2-AG in the brain, which interacts with CB1 receptors. The phenotypes associated with rare CNR1 variants are reminiscent of those implicated in the theory of clinical endocannabinoid deficiency syndrome. The severe phenotypes associated with rare DAGLA variants underscore the critical role of rapid 2-AG synthesis and the endocannabinoid system in regulating neurological function and development. Mapping of the variants to the 3D structure of the type 1 cannabinoid receptor, or to the primary structure of diacylglycerol lipase alpha, reveals clustering of variants in certain structural regions and is consistent with impacts on function.


2021, Vol 6 (1)
Author(s): Muhammad Aslam, Nirosiya Kandasamy, Anwar Ullah, Nagarajan Paramasivam, Mehmet Ali Öztürk, ...

Abstract: Rare variants in the beta-glucocerebrosidase gene (GBA1) are common genetic risk factors for alpha synucleinopathy, which often manifests clinically as GBA-associated Parkinson’s disease (GBA-PD). Clinically, GBA-PD closely mimics idiopathic PD, but it may present at a younger age and often aggregates in families. Most carriers of GBA variants are, however, asymptomatic. Moreover, symptomatic PD patients without a GBA variant have been reported in families with seemingly GBA-PD. These observations obscure the link between GBA variants and PD pathogenesis and point towards a role for unidentified additional genetic and/or environmental risk factors or second hits in GBA-PD. In this study, we explored whether rare genetic variants may be additional risk factors for PD in two families segregating the PD-associated GBA1 variants c.115+1G>A (ClinVar ID: 93445) and p.L444P (ClinVar ID: 4288). Our analysis identified rare genetic variants of the HSP70 co-chaperone DnaJ homolog subfamily B member 6 (DNAJB6) and lysosomal protein prosaposin (PSAP) as additional factors possibly influencing PD risk in the two families. In comparison to the wild-type proteins, variant DNAJB6 and PSAP proteins show altered functions in the context of cellular alpha-synuclein homeostasis when expressed in reporter cells. Furthermore, the segregation pattern of the rare variants in the genes encoding DNAJB6 and PSAP indicated a possible association with PD in the respective families. The occurrence of second hits or additional PD cosegregating rare variants has important implications for genetic counseling in PD families with GBA1 variant carriers and for the selection of PD patients for GBA targeted treatments.


2012, Vol 36 (6), pp. 642-651
Author(s): Niall J. Cardin, Joel A. Mefford, John S. Witte

2015, Vol 18 (2), pp. 117-125
Author(s): Michelle Luciano, Victoria Svinti, Archie Campbell, Riccardo E. Marioni, Caroline Hayward, ...

Variation in human cognitive ability is of consequence to a large number of health and social outcomes and is substantially heritable. Genetic linkage, genome-wide association, and copy number variant studies have investigated the contribution of genetic variation to individual differences in normal cognitive ability, but little research has considered the role of rare genetic variants. Exome sequencing studies have already met with success in discovering novel trait-gene associations for other complex traits. Here, we use exome sequencing to investigate the effects of rare variants on general cognitive ability. Unrelated Scottish individuals were selected for high scores on a general component of intelligence (g). The frequency of rare genetic variants (in n = 146) was compared with those from Scottish controls (total n = 486) who scored in the lower to middle range of the g distribution or on a proxy measure of g. Biological pathway analysis highlighted enrichment of the mitochondrial inner membrane component and apical part of cell gene ontology terms. Global burden analysis showed a greater total number of rare variants carried by high g cases versus controls, which is inconsistent with a mutation load hypothesis whereby mutations negatively affect g. The general finding of greater non-synonymous (vs. synonymous) variant effects is in line with evolutionary hypotheses for g. Given that this first sequencing study of high g was small, promising results were found, suggesting that the study of rare variants in larger samples would be worthwhile.


2018
Author(s): James Liley, Chris Wallace

Abstract: High-dimensional hypothesis testing is ubiquitous in the biomedical sciences, and informative covariates may be employed to improve power. The conditional false discovery rate (cFDR) is a widely used approach suited to the setting where the covariate is a set of p-values for the equivalent hypotheses for a second trait. Although related to the Benjamini-Hochberg procedure, it does not permit any easy control of type-1 error rate, and existing methods are over-conservative. We propose a new method for type-1 error rate control based on identifying mappings from the unit square to the unit interval defined by the estimated cFDR, and splitting observations so that each map is independent of the observations it is used to test. We also propose an adjustment to the existing cFDR estimator which further improves power. We show by simulation that the new method more than doubles potential improvement in power over unconditional analyses compared to existing methods. We demonstrate our method on transcriptome-wide association studies, and show that the method can be used in an iterative way, enabling the use of multiple covariates successively. Our methods substantially improve the power and applicability of cFDR analysis.
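For orientation, the Benjamini-Hochberg procedure that the abstract names as a relative of cFDR can be sketched in a few lines. This is only the standard unconditional step-up procedure, not the cFDR method proposed by the authors; the function name `benjamini_hochberg` and the default `alpha` are illustrative choices.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard step-up rule: find the largest rank k such that the k-th smallest
    p-value is <= alpha * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected
```

Note the step-up behavior: a p-value that misses its own threshold can still be rejected if a larger p-value further down the ranking meets its threshold.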


2020
Author(s): Janet Aisbett, Daniel Lakens, Kristin Sainani

Magnitude based inference (MBI) was widely adopted by sport science researchers as an alternative to null hypothesis significance tests. It has been criticized for lacking a theoretical framework, mixing Bayesian and frequentist thinking, and encouraging researchers to run small studies with high Type 1 error rates. MBI terminology describes the position of confidence intervals in relation to smallest meaningful effect sizes. We show these positions correspond to combinations of one-sided tests of hypotheses about the presence or absence of meaningful effects, and formally describe MBI as a multiple decision procedure. MBI terminology operates as if tests are conducted at multiple alpha levels. We illustrate how error rates can be controlled by limiting each one-sided hypothesis test to a single alpha level. To provide transparent error control in a Neyman-Pearson framework and encourage the use of standard statistical software, we recommend replacing MBI with one-sided tests against smallest meaningful effects, or pairs of such tests as in equivalence testing. Researchers should pre-specify their hypotheses and alpha levels, perform a priori sample size calculations, and justify all assumptions. Our recommendations show researchers what tests to use and how to design and report their statistical analyses to accord with standard frequentist practice.
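The recommendation to use pairs of one-sided tests against a smallest meaningful effect, as in equivalence testing, can be illustrated with a minimal sketch. It assumes the effect estimate is approximately normal with a known standard error (a z-test rather than the t-test that would usually be used with small samples); the function names and arguments are hypothetical and do not come from the paper.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def equivalence_test(estimate, std_error, smallest_effect, alpha=0.05):
    """Two one-sided tests against the bounds -smallest_effect and +smallest_effect.

    Returns (p_lower, p_upper, equivalent), where
      p_lower tests H0: effect <= -smallest_effect,
      p_upper tests H0: effect >= +smallest_effect,
    and equivalence is declared only if both one-sided tests reject at alpha.
    """
    if std_error <= 0 or smallest_effect <= 0:
        raise ValueError("std_error and smallest_effect must be positive")
    p_lower = upper_tail((estimate + smallest_effect) / std_error)
    p_upper = upper_tail((smallest_effect - estimate) / std_error)
    return p_lower, p_upper, max(p_lower, p_upper) < alpha
```

Because each one-sided test is run at a single pre-specified `alpha`, the error rate of the equivalence decision is bounded by that `alpha`, which is the kind of transparent error control the abstract argues for.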


Diabetes, 2020, Vol 69 (4), pp. 784-795
Author(s): Vincenzo Forgetta, Despoina Manousaki, Roman Istomine, Stephanie Ross, Marie-Catherine Tessier, ...

2021, Vol 12
Author(s): Hiroshi Okamura, Hirohisa Nakamae, Takero Shindo, Katsuki Ohtani, Yoshihiko Hidaka, ...

Transplant-associated thrombotic microangiopathy (TA-TMA) is a fatal complication after allogeneic hematopoietic stem cell transplantation (allo-HSCT). Previous reports suggest that TA-TMA is caused by complement activation by complement-related genetic variants; however, this needs to be verified, especially in adults. Here, we performed a nested case-control study of allo-HSCT-treated adults at a single center. Fifteen TA-TMA patients and 15 non-TA-TMA patients, matched according to the propensity score, were enrolled. Based on a previous report showing an association between complement-related genes and development of TA-TMA, we first sequenced these 17 genes. Both cohorts harbored several genetic variants with rare allele frequencies; however, there was no difference in the percentage of patients in the TA-TMA and non-TA-TMA groups with the rare variants, or in the average number of rare variants per patient. Second, we measured plasma concentrations of complement proteins. Notably, levels of Ba protein on Day 7 following allo-HSCT were abnormally and significantly higher in TA-TMA than in non-TA-TMA cases, suggesting that complement activation via the alternative pathway contributes to TA-TMA. All other parameters, including soluble C5b-9, on Day 7 were similar between the groups. The levels of C3, C4, CH50, and complement factors H and I in the TA-TMA group after Day 28 were significantly lower than those in the non-TA-TMA group. Complement-related genetic variants did not predict TA-TMA development. By contrast, abnormally high levels of Ba on Day 7 did predict development of TA-TMA and non-relapse mortality. Thus, Ba levels on Day 7 after allo-HSCT are a sensitive and prognostic biomarker of TA-TMA.


2021
Author(s): Essi Laajala, Viivi Halla-aho, Toni Grönroos, Ubaid Ullah, Mari Vähä-Mäkilä, ...

Background: The aim of this study was to detect differential methylation in umbilical cord blood that is associated with maternal and pregnancy-related variables, such as maternal age and gestational weight gain. These have been studied earlier with 450K microarrays but not with bisulfite sequencing. Methods: Reduced representation bisulfite sequencing (RRBS) analysis was performed on 200 umbilical cord blood samples. Altogether 24 clinical and technical covariates were included in a binomial mixed effects model, which was fit separately for each high-coverage CpG site, followed by spatial and multiple testing adjustment of P values. Inflation of spatially adjusted P values was discovered in a permutation analysis, which was then applied for empirical type 1 error control. Results: Empirical type 1 error control decreased the number of findings associated with each covariate to zero or a small fraction of the number that would have been discovered with standard cutoffs. In this collection of samples, some differential methylation was associated with sex, the usage of epidural anesthetic during delivery, 1 minute Apgar points, maternal age and height, gestational weight gain, maternal smoking, and maternal insulin-treated diabetes, but not with the birth weight of the newborn infant, maternal pre-pregnancy BMI, the number of earlier miscarriages, the mode of delivery, labor induction, or the cosine transformed month of birth. Conclusions: The autocorrelation-adjusted Z-test is a convenient tool for detecting differentially methylated regions, but the significance should either be determined empirically or before the spatial adjustment. With appropriate significance thresholds, the detected differentially methylated regions were reproducible across studies, technologies, and statistical models. Our RRBS data analysis workflow is available at https://github.com/EssiLaajala/RRBS_workflow.
Keywords: DNA methylation, bisulfite sequencing, RRBS, umbilical cord blood, pregnancy, sex, spatial correlation, type 1 error, differential methylation, analysis workflow
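The idea of using a permutation analysis for empirical type 1 error control can be sketched generically as follows. This is not the authors' RRBS workflow (which fits a binomial mixed effects model per CpG site); it is a minimal illustration that assumes a simple two-group comparison with an absolute difference in means as the test statistic. All function names, the default number of permutations, and the seed are hypothetical.

```python
import random

def empirical_p_value(observed_stat, permuted_stats):
    """Share of permuted statistics at least as extreme as the observed one.

    The +1 in numerator and denominator counts the observed data as one
    permutation, so the p-value can never be exactly zero.
    """
    exceed = sum(1 for s in permuted_stats if s >= observed_stat)
    return (exceed + 1) / (len(permuted_stats) + 1)

def mean_difference(values, labels):
    """Absolute difference in group means for 0/1 labels."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(group1) / len(group1) - sum(group0) / len(group0))

def permutation_test(values, labels, n_permutations=999, seed=0):
    """Empirical p-value obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = mean_difference(values, labels)
    shuffled = list(labels)
    null_stats = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real association with the values
        null_stats.append(mean_difference(values, shuffled))
    return empirical_p_value(observed, null_stats)
```

Comparing the observed statistic with its own permutation distribution, rather than with a theoretical cutoff, is what keeps the threshold valid when the nominal p-values are inflated.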


Author(s): Heather C. Mefford

Psychiatric disorders in children, including autism, intellectual disability, attention deficit hyperactivity disorder, childhood-onset schizophrenia and bipolar disorder, carry a significant financial and social burden for affected individuals and their families. It is clear that genetic factors play an important role in the etiology of many psychiatric illnesses. However, the inheritance pattern of each of these disorders is not straightforward, and therefore the identification of specific causative genes has been difficult. Recent technological advances facilitate genome-wide studies to identify both common and rare genetic variants in large numbers of individuals. In this chapter, we will review evidence that suggests that rare genetic variants, including both single nucleotide and copy number variants, contribute to the genetic risk of childhood-onset psychiatric disease.

