An exact, unifying framework for region-based association testing in family-based designs, including higher criticism approaches, SKATs, multivariate and burden tests

2019 ◽  
Author(s):  
Julian Hecker ◽  
F. William Townes ◽  
Priyadarshini Kachroo ◽  
Jessica Lasky-Su ◽  
John Ziniti ◽  
...  

Abstract Analysis of rare variants in family-based studies remains a challenge. To perform a region/set-based association analysis of rare variants in family-based studies, we propose a general methodological framework that integrates higher criticism, maximum, SKATs, and burden approaches into the family-based association testing (FBAT) framework. Using the haplotype algorithm for FBATs to compute the conditional genotype distribution under the null hypothesis of Mendelian transmissions, virtually any association test statistic can be implemented in our approach, and simulation-based or exact p-values can be computed without the need for asymptotic settings. Using simulations, we compare the features of the proposed test statistics in our framework with the existing region-based methodology for family-based studies under various scenarios. The tests of our framework outperform the existing approaches. We provide general guidelines on the scenarios, e.g., sparseness of the signals or local LD structure, in which a given test statistic will have distinct power advantages over the others. We also illustrate our approach in an application to a whole-genome sequencing dataset with 897 asthmatic trios.
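The idea of a simulation-based p-value under Mendelian transmission can be illustrated with a minimal Python sketch. It is not the FBAT haplotype algorithm: it assumes that phased parental haplotypes are known for every trio, that there is no recombination within the region, and that all offspring are affected. The function names, the array layout and the linear burden statistic are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_offspring(pat_haps, mat_haps, rng):
    """Draw offspring genotypes under Mendelian transmission.

    pat_haps, mat_haps: arrays of shape (n_trios, 2, n_variants) holding
    0/1 alleles on the two phased haplotypes of each parent (assumed known).
    Each parent transmits one of its two haplotypes with probability 1/2.
    """
    n = pat_haps.shape[0]
    idx = np.arange(n)
    pat_pick = rng.integers(0, 2, size=n)
    mat_pick = rng.integers(0, 2, size=n)
    return pat_haps[idx, pat_pick] + mat_haps[idx, mat_pick]

def burden_stat(offspring, expected, weights):
    """Weighted sum of observed minus expected allele counts over the region."""
    return float(((offspring - expected) @ weights).sum())

def mc_pvalue(offspring, pat_haps, mat_haps, weights, n_sim=2000, seed=0):
    """Two-sided Monte Carlo p-value for the burden statistic."""
    rng = np.random.default_rng(seed)
    # Expected offspring genotype given the parents: half of each parent's count.
    expected = 0.5 * (pat_haps.sum(axis=1) + mat_haps.sum(axis=1))
    observed = abs(burden_stat(offspring, expected, weights))
    hits = 0
    for _ in range(n_sim):
        sim = simulate_offspring(pat_haps, mat_haps, rng)
        if abs(burden_stat(sim, expected, weights)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)
```

Because the null distribution is generated by resampling transmissions, `burden_stat` could be swapped for any other region-based statistic without changing `mc_pvalue`.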

Author(s):  
Julian Hecker ◽  
F William Townes ◽  
Priyadarshini Kachroo ◽  
Cecelia Laurie ◽  
Jessica Lasky-Su ◽  
...  

Abstract Motivation Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. In addition, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown. Results In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional offspring genotype distribution, the significance of virtually any association test statistic can be evaluated based on simulations or exact computations, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to also evaluate other test statistics such as SKATs, higher criticism approaches, and maximum-single-variant-statistics, where asymptotic theory might be involved or does not provide accurate approximations for rare variant data. Based on the p-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica. Availability FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. Supplementary information Supplementary data are available at Bioinformatics online.
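As a rough illustration of how p-values from several test statistics might be combined, the following Python sketch implements the standard Cauchy combination formula that underlies ACAT. The function name `acat` and the default of equal weights are assumptions made here; the sketch is not taken from the FBAT software.

```python
import math

def acat(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule used by ACAT.

    Each p-value is mapped to a Cauchy variate via tan((0.5 - p) * pi);
    the weighted average is mapped back to a p-value with the Cauchy CDF.
    """
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights (an assumption)
    if len(weights) != len(pvalues) or not pvalues:
        raise ValueError("need one weight per p-value and at least one p-value")
    if any(not 0.0 < p < 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    total = sum(weights)
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues)) / total
    return 0.5 - math.atan(t) / math.pi
```

For example, `acat([p_burden, p_skat, p_max])` would return one combined p-value for three hypothetical per-statistic p-values; with a single input the function returns that p-value unchanged.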


1999 ◽  
Vol 9 (3) ◽  
pp. 234-241 ◽  
Author(s):  
Jun Teng ◽  
Neil Risch

In this paper we consider test statistics based on individual genotyping. For sibships without parents, but with unaffected as well as affected sibs, we introduce a new test statistic (referred to as TDS), which contrasts the allele frequency in affected sibs versus that estimated for the parents from the entire sibship. For sibships without parents, this test is analogous to the TDT and is completely robust to nonrandom mating patterns. The efficiency of the TDS test is comparable to that of the THS test (which compares affected vs. unaffected sibs and was based on DNA pooling), for sibships with one affected child. However, as the number of affected sibs in the sibship grows, the relative efficiency of the TDS test versus the THS test also increases. For example, for sibships with three affected, one-third fewer families are required; for families with four affected, nearly half as many are required. Thus, when sibships contain multiple affected individuals, the TDS test provides both an increase in power and robustness to nonrandom mating.


Author(s):  
Anna L Tyler ◽  
Baha El Kassaby ◽  
Georgi Kolishovski ◽  
Jake Emerson ◽  
Ann E Wells ◽  
...  

Abstract It is well understood that variation in relatedness among individuals, or kinship, can lead to false genetic associations. Multiple methods have been developed to adjust for kinship while maintaining power to detect true associations. However, the effects of kinship on genetic interaction test statistics remain relatively unstudied. Here we performed a survey of kinship effects on studies of six commonly used mouse populations. We measured inflation of main effect test statistics, genetic interaction test statistics, and interaction test statistics reparametrized by the Combined Analysis of Pleiotropy and Epistasis (CAPE). We also performed linear mixed model (LMM) kinship corrections using two types of kinship matrix: an overall kinship matrix calculated from the full set of genotyped markers, and a reduced kinship matrix, which left out markers on the chromosome(s) being tested. We found that test statistic inflation varied across populations and was driven largely by linkage disequilibrium. In contrast, there was no observable inflation in the genetic interaction test statistics. CAPE statistics were inflated at a level in between that of the main effects and the interaction effects. The overall kinship matrix overcorrected the inflation of main effect statistics relative to the reduced kinship matrix. The two types of kinship matrices had similar effects on the interaction statistics and CAPE statistics, although the overall kinship matrix trended toward a more severe correction. In conclusion, we recommend using an LMM kinship correction for both main effects and genetic interactions and further recommend that the kinship matrix be calculated from a reduced set of markers in which the chromosomes being tested are omitted from the calculation. This is particularly important in populations with substantial population structure, such as recombinant inbred lines in which genomic replicates are used.
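The difference between the overall and the reduced kinship matrix can be sketched in a few lines of Python. The kinship estimator used here (the cross-product of standardized marker genotypes) and the names `kinship` and `loco_kinship` are assumptions for illustration; the abstract does not state which estimator the authors used.

```python
import numpy as np

def kinship(geno):
    """Kinship from a genotype matrix of shape (n_individuals, n_markers).

    Markers are centered and scaled; monomorphic markers are dropped.
    """
    z = geno - geno.mean(axis=0)
    sd = z.std(axis=0)
    polymorphic = sd > 0
    if not polymorphic.any():
        raise ValueError("no polymorphic markers left")
    z = z[:, polymorphic] / sd[polymorphic]
    return z @ z.T / z.shape[1]

def loco_kinship(geno, chrom, test_chrom):
    """Reduced kinship: leave out all markers on the chromosome being tested.

    chrom: array of length n_markers giving each marker's chromosome label.
    """
    keep = np.asarray(chrom) != test_chrom
    return kinship(geno[:, keep])
```

In a genome scan, `kinship(geno)` would be computed once as the overall matrix, while `loco_kinship(geno, chrom, c)` would be recomputed for each tested chromosome `c`.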


2021 ◽  
pp. jmedgenet-2021-107747
Author(s):  
Riku Katainen ◽  
Iikki Donner ◽  
Maritta Räisänen ◽  
Davide Berta ◽  
Anna Kuosmanen ◽  
...  

Background Genes involved in epigenetic regulation are central for chromatin structure and gene expression. Specific mutations in these might promote carcinogenesis in several tissue types. Methods We used exome, whole-genome and Sanger sequencing to detect rare variants shared by seven affected individuals in a striking early-onset multi-cancer family. The only variant that segregated with malignancy resided in a histone demethylase KDM4C. Consequently, we went on to study the epigenetic landscape of the mutation carriers with ATAC, ChIP (chromatin immunoprecipitation) and RNA-sequencing from lymphoblastoid cell lines to identify possible pathogenic effects. Results A novel variant in KDM4C, encoding a H3K9me3 histone demethylase and transcription regulator, was found to segregate with malignancy in the family. Based on Roadmap Epigenomics Project data, differentially accessible chromatin regions between the variant carriers and controls enrich to normally H3K9me3-marked chromatin. We could not detect a difference in global H3K9 trimethylation levels. However, carriers of the variant seemed to have more trimethylated H3K9 at transcription start sites. Pathway analyses of ChIP-seq and differential gene expression data suggested that genes regulated through KDM4C interaction partner EZH2 and its interaction partner PLZF are aberrantly expressed in mutation carriers. Conclusions The apparent dysregulation of H3K9 trimethylation and KDM4C-associated genes in lymphoblastoid cells supports the hypothesis that the KDM4C variant is causative of the multi-cancer susceptibility in the family. As the variant is ultrarare, located in the conserved catalytic JmjC domain and predicted pathogenic by the majority of available in silico tools, further studies on the role of KDM4C in cancer predisposition are warranted.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Alessandro Gialluisi ◽  
Mafalda Giovanna Reccia ◽  
Nicola Modugno ◽  
Teresa Nutile ◽  
Alessia Lombardi ◽  
...  

Abstract Background Parkinson’s disease (PD) is a neurodegenerative movement disorder affecting 1–5% of the general population for which neither effective cure nor early diagnostic tools are available that could tackle the pathology in the early phase. Here we report a multi-stage procedure to identify candidate genes likely involved in the etiopathogenesis of PD. Methods The study includes a discovery stage based on the analysis of whole exome data from 26 dominant late onset PD families, a validation analysis performed on 1542 independent PD patients and 706 controls from different cohorts and the assessment of polygenic variants load in the Italian cohort (394 unrelated patients and 203 controls). Results Family-based approach identified 28 disrupting variants in 26 candidate genes for PD including PARK2, PINK1, DJ-1(PARK7), LRRK2, HTRA2, FBXO7, EIF4G1, DNAJC6, DNAJC13, SNCAIP, AIMP2, CHMP1A, GIPC1, HMOX2, HSPA8, IMMT, KIF21B, KIF24, MAN2C1, RHOT2, SLC25A39, SPTBN1, TMEM175, TOMM22, TVP23A and ZSCAN21. Sixteen of them have not been associated with PD before, were expressed in mesencephalon and were involved in pathways potentially deregulated in PD. Mutation analysis in independent cohorts disclosed a significant excess of highly deleterious variants in cases (p = 0.0001), supporting their role in PD. Moreover, we demonstrated that the co-inheritance of multiple rare variants (≥ 2) in the 26 genes may predict PD occurrence in about 20% of patients, both familial and sporadic cases, with high specificity (> 93%; p = 4.4 × 10⁻⁵). Moreover, our data highlight the fact that the genetic landmarks of late onset PD do not systematically differ between sporadic and familial forms, especially in the case of small nuclear families, and underline the importance of rare variants in the genetics of sporadic PD. Furthermore, patients carrying multiple rare variants showed higher risk of manifesting dyskinesia induced by levodopa treatment.
Conclusions Besides confirming the extreme genetic heterogeneity of PD, these data provide novel insights into the genetics of the disease and may be relevant for its prediction, diagnosis and treatment.

