A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests

Bioinformatics ◽

10.1093/bioinformatics/btaa1055 ◽

2020 ◽

Author(s):

Julian Hecker ◽

F William Townes ◽

Priyadarshini Kachroo ◽

Cecelia Laurie ◽

Jessica Lasky-Su ◽

...

Keyword(s):

Rare Variant ◽

Asymptotic Theory ◽

Association Test ◽

Supplementary Information ◽

Genotype Distribution ◽

Test Statistics ◽

Test Statistic ◽

Higher Criticism ◽

Offspring Genotype ◽

Family Based

Abstract. Motivation: Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. In addition, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown. Results: In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional offspring genotype distribution, the significance of virtually any association test statistic can be evaluated based on simulations or exact computations, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to also evaluate other test statistics such as SKATs, higher criticism approaches, and maximum-single-variant-statistics, where asymptotic theory might be involved or does not provide accurate approximations for rare variant data. Based on the p-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica. Availability: FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. Supplementary information: Supplementary data are available at Bioinformatics online.
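As a rough illustration of the simulation-based evaluation described in this abstract, the sketch below resamples offspring genotypes under Mendelian transmission and computes an empirical one-sided p-value for a weighted burden statistic. It also includes the standard ACAT formula for combining p-values with equal weights. This is a simplification and not the FBAT haplotype algorithm: it assumes that phased parental haplotypes are known, that each parent transmits one of its two haplotypes intact with probability 1/2, and that each trio has one affected offspring. All function names (`burden`, `simulate_offspring`, `mc_pvalue`, `acat`) are hypothetical.

```python
import math
import random


def burden(offspring_genotypes, weights):
    """Weighted sum of minor-allele counts over all offspring and variants."""
    return sum(w * g
               for geno in offspring_genotypes
               for w, g in zip(weights, geno))


def simulate_offspring(trio, rng):
    """Draw one offspring genotype under Mendelian transmission.

    trio = (father, mother); each parent is a pair of haplotypes, and each
    haplotype is a list of 0/1 alleles. Assumption: no recombination in the
    region, so a parent passes on one whole haplotype with probability 1/2.
    """
    father, mother = trio
    hap_f = father[rng.randrange(2)]
    hap_m = mother[rng.randrange(2)]
    return [a + b for a, b in zip(hap_f, hap_m)]


def mc_pvalue(trios, observed, weights, n_sim=2000, seed=0):
    """Empirical one-sided p-value: P(simulated burden >= observed burden)."""
    rng = random.Random(seed)
    t_obs = burden(observed, weights)
    hits = 0
    for _ in range(n_sim):
        simulated = [simulate_offspring(trio, rng) for trio in trios]
        if burden(simulated, weights) >= t_obs:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


def acat(pvalues):
    """Equal-weight ACAT combination of p-values (each strictly in (0, 1))."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / len(pvalues)
    return 0.5 - math.atan(t) / math.pi
```

A typical use would compute `mc_pvalue` for several statistics (only a burden statistic is shown here) and pass the resulting p-values to `acat`.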

An exact, unifying framework for region-based association testing in family-based designs, including higher criticism approaches, SKATs, multivariate and burden tests

10.1101/815290 ◽

2019 ◽

Author(s):

Julian Hecker ◽

F. William Townes ◽

Priyadarshini Kachroo ◽

Jessica Lasky-Su ◽

John Ziniti ◽

...

Keyword(s):

Rare Variants ◽

Genotype Distribution ◽

Test Statistics ◽

Test Statistic ◽

Higher Criticism ◽

Association Testing ◽

Simulation Based ◽

The Family ◽

Family Based ◽

Burden Tests

Abstract: Analysis of rare variants in family-based studies remains a challenge. To perform a region/set-based association analysis of rare variants in family-based studies, we propose a general methodological framework that integrates higher criticism, maximum, SKATs, and burden approaches into the family-based association testing (FBAT) framework. Using the haplotype algorithm for FBATs to compute the conditional genotype distribution under the null hypothesis of Mendelian transmissions, virtually any association test statistic can be implemented in our approach, and simulation-based or exact p-values can be computed without the need for asymptotic settings. Using simulations, we compare the features of the proposed test statistics in our framework with the existing region-based methodology for family-based studies under various scenarios. The tests of our framework outperform the existing approaches. We provide general guidelines on which test statistic will have distinct power advantages over the others in which scenarios, e.g., depending on the sparseness of the signals or the local LD structure. We also illustrate our approach in an application to a whole-genome sequencing dataset with 897 asthmatic trios.

Rare variant association test in family-based sequencing studies

Briefings in Bioinformatics ◽

10.1093/bib/bbw083 ◽

2016 ◽

pp. bbw083 ◽

Cited By ~ 1

Author(s):

Xuefeng Wang ◽

Zhenyu Zhang ◽

Nathan Morris ◽

Tianxi Cai ◽

Seunggeun Lee ◽

...

Keyword(s):

Rare Variant ◽

Association Test ◽

Rare Variant Association ◽

Sequencing Studies ◽

Family Based ◽

Rare Variant Association Test

A rare variant association test in family-based designs and non-normal quantitative traits

Statistics in Medicine ◽

10.1002/sim.6750 ◽

2015 ◽

Vol 35 (6) ◽

pp. 905-921 ◽

Cited By ~ 6

Author(s):

Lajmi Lakhal-Chaieb ◽

Karim Oualkacha ◽

Brent J. Richards ◽

Celia M.T. Greenwood

Keyword(s):

Rare Variant ◽

Quantitative Traits ◽

Association Test ◽

Rare Variant Association ◽

Family Based ◽

Rare Variant Association Test

FARVAT: a family-based rare variant association test

Bioinformatics ◽

10.1093/bioinformatics/btu496 ◽

2014 ◽

Vol 30 (22) ◽

pp. 3197-3205 ◽

Cited By ~ 19

Author(s):

Sungkyoung Choi ◽

Sungyoung Lee ◽

Sven Cichon ◽

Markus M. Nöthen ◽

Christoph Lange ◽

...

Keyword(s):

Rare Variant ◽

Association Test ◽

Rare Variant Association ◽

Family Based ◽

Rare Variant Association Test

The Relative Power of Family-Based and Case-Control Designs for Linkage Disequilibrium Studies of Complex Human Diseases. II. Individual Genotyping

Genome Research ◽

10.1101/gr.9.3.234 ◽

1999 ◽

Vol 9 (3) ◽

pp. 234-241 ◽

Cited By ~ 1

Author(s):

Jun Teng ◽

Neil Risch

Keyword(s):

Relative Efficiency ◽

Case Control ◽

Test Statistics ◽

Dna Pooling ◽

Test Statistic ◽

Nonrandom Mating ◽

Mating Patterns ◽

Individual Genotyping ◽

Family Based ◽

Control Designs

In this paper we consider test statistics based on individual genotyping. For sibships without parents, but with unaffected as well as affected sibs, we introduce a new test statistic (referred to as TDS), which contrasts the allele frequency in affected sibs versus that estimated for the parents from the entire sibship. For sibships without parents, this test is analogous to the TDT and is completely robust to nonrandom mating patterns. The efficiency of the TDS test is comparable to that of the THS test (which compares affected vs. unaffected sibs and was based on DNA pooling), for sibships with one affected child. However, as the number of affected sibs in the sibship grows, the relative efficiency of the TDS test versus the THS test also increases. For example, for sibships with three affected, one-third fewer families are required; for families with four affected, nearly half as many are required. Thus, when sibships contain multiple affected individuals, the TDS test provides both an increase in power and robustness to nonrandom mating.

FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes

Genetic Epidemiology ◽

10.1002/gepi.21979 ◽

2016 ◽

Vol 40 (6) ◽

pp. 475-485 ◽

Cited By ~ 2

Author(s):

Sungkyoung Choi ◽

Sungyoung Lee ◽

Dandi Qiao ◽

Megan Hardin ◽

Michael H. Cho ◽

...

Keyword(s):

Rare Variant ◽

Association Test ◽

Rare Variant Association ◽

Family Based ◽

Rare Variant Association Test

Faculty Opinions recommendation of Family-based association test using both common and rare variants and accounting for directions of effects for sequencing data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718882382.793500875 ◽

2014 ◽

Author(s):

Melanie Bahlo

Keyword(s):

Rare Variants ◽

Association Test ◽

Sequencing Data ◽

Family Based

Family-based gene-environment interaction using sequence kernel association test (FGE-SKAT) for complex quantitative traits

Scientific Reports ◽

10.1038/s41598-021-86871-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Chao-Yu Guo ◽

Reng-Hong Wang ◽

Hsin-Chou Yang

Keyword(s):

Complex Traits ◽

Association Studies ◽

Association Test ◽

Whole Genome Sequence ◽

Environment Interaction ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequence Kernel Association Test ◽

Gene Environment ◽

Family Based

Abstract: After the genome-wide association studies (GWAS) era, whole-genome sequencing is widely used to identify associations of complex traits with rare variants. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and the superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to the methods without considering gene by environment interactions.

Effects of kinship correction on inflation of genetic interaction statistics in commonly used mouse populations

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab131 ◽

2021 ◽

Author(s):

Anna L Tyler ◽

Baha El Kassaby ◽

Georgi Kolishovski ◽

Jake Emerson ◽

Ann E Wells ◽

...

Keyword(s):

Mixed Model ◽

Linear Mixed Model ◽

Genetic Interaction ◽

Recombinant Inbred Lines ◽

Test Statistics ◽

Test Statistic ◽

Kinship Matrix ◽

Main Effect ◽

Main Effects ◽

Interaction Test

Abstract: It is well understood that variation in relatedness among individuals, or kinship, can lead to false genetic associations. Multiple methods have been developed to adjust for kinship while maintaining power to detect true associations. However, the effects of kinship on genetic interaction test statistics remain relatively unstudied. Here we performed a survey of kinship effects on studies of six commonly used mouse populations. We measured inflation of main effect test statistics, genetic interaction test statistics, and interaction test statistics reparametrized by the Combined Analysis of Pleiotropy and Epistasis (CAPE). We also performed linear mixed model (LMM) kinship corrections using two types of kinship matrix: an overall kinship matrix calculated from the full set of genotyped markers, and a reduced kinship matrix, which left out markers on the chromosome(s) being tested. We found that test statistic inflation varied across populations and was driven largely by linkage disequilibrium. In contrast, there was no observable inflation in the genetic interaction test statistics. CAPE statistics were inflated at a level in between that of the main effects and the interaction effects. The overall kinship matrix overcorrected the inflation of main effect statistics relative to the reduced kinship matrix. The two types of kinship matrices had similar effects on the interaction statistics and CAPE statistics, although the overall kinship matrix trended toward a more severe correction. In conclusion, we recommend using an LMM kinship correction for both main effects and genetic interactions and further recommend that the kinship matrix be calculated from a reduced set of markers in which the chromosomes being tested are omitted from the calculation. This is particularly important in populations with substantial population structure, such as recombinant inbred lines in which genomic replicates are used.
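The distinction between the overall and the reduced kinship matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses one common construction (the average cross-product of mean-centered marker genotypes), and the function name `kinship` and its `exclude` parameter are hypothetical.

```python
def kinship(genotypes, marker_chroms, exclude=None):
    """Kinship matrix from mean-centered marker genotypes.

    genotypes: one list of marker values per individual.
    marker_chroms: chromosome label for each marker column.
    exclude: chromosome to leave out (None gives the overall matrix).
    """
    keep = [j for j, chrom in enumerate(marker_chroms) if chrom != exclude]
    if not keep:
        raise ValueError("no markers left after exclusion")
    n = len(genotypes)
    # Center each retained marker at its sample mean.
    means = [sum(row[j] for row in genotypes) / n for j in keep]
    centered = [[row[j] - mu for j, mu in zip(keep, means)]
                for row in genotypes]
    m = len(keep)
    # K[i][k] = average over markers of the centered cross-products.
    return [[sum(a * b for a, b in zip(centered[i], centered[k])) / m
             for k in range(n)]
            for i in range(n)]
```

When testing markers on a given chromosome, the reduced matrix would be obtained by passing that chromosome as `exclude`, so the tested markers do not contribute to the correction.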

Parallelized calculation of permutation tests

Bioinformatics ◽

10.1093/bioinformatics/btaa1007 ◽

2020 ◽

Author(s):

Markus Ekvall ◽

Michael Höhle ◽

Lukas Käll

Keyword(s):

Dynamic Programming ◽

Sample Size ◽

Permutation Test ◽

Statistical Tests ◽

Permutation Tests ◽

Supplementary Information ◽

Attractive Alternative ◽

Test Statistic ◽

Sample Distribution ◽

Running Time

Abstract. Motivation: Permutation tests offer a straightforward framework to assess the significance of differences in sample statistics. A significant advantage of permutation tests is that relatively few assumptions about the distribution of the test statistic are needed, as they rely on the assumption of exchangeability of the group labels. They have great value, as they allow a sensitivity analysis to determine the extent to which the assumed broad sample distribution of the test statistic applies. However, in this situation, permutation tests are rarely applied because the running time of naïve implementations is too slow and grows exponentially with the sample size. Nevertheless, continued development in the 1980s introduced dynamic programming algorithms that compute exact permutation tests in polynomial time. Despite this significant running-time reduction, the exact test has not yet become one of the predominant statistical tests for medium sample sizes. Here, we propose a computational parallelization of one such dynamic programming-based permutation test, the Green algorithm, which makes the permutation test more attractive. Results: Parallelization of the Green algorithm was found possible by non-trivial rearrangement of the structure of the algorithm. A speed-up, by orders of magnitude, is achievable by executing the parallelized algorithm on a GPU. We demonstrate that the execution time essentially becomes a non-issue, even for sample sizes as high as hundreds of samples. This improvement makes our method an attractive alternative to, e.g., the widely used asymptotic Mann-Whitney U-test. Availability and implementation: Python 3 code is available from the GitHub repository https://github.com/statisticalbiotechnology/parallelPermutationTest under an Apache 2.0 license. Supplementary information: Supplementary data are available at Bioinformatics online.
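The idea of computing an exact permutation p-value by dynamic programming rather than by enumerating all relabelings can be sketched as below. This is a simple subset-sum count for non-negative integer observations, assumed here for illustration; it is not the Green algorithm and not the authors' GPU implementation. The function name `exact_perm_pvalue` is hypothetical.

```python
from math import comb


def exact_perm_pvalue(group_a, group_b):
    """Exact one-sided permutation p-value for the sum of group_a.

    Assumes non-negative integer observations. counts[k][s] holds the
    number of ways to choose k pooled observations whose sum is s, so the
    cost is polynomial in sample size and total sum instead of exponential.
    """
    pooled = list(group_a) + list(group_b)
    m = len(group_a)
    observed = sum(group_a)
    total = sum(pooled)
    counts = [[0] * (total + 1) for _ in range(m + 1)]
    counts[0][0] = 1
    for x in pooled:
        # Go through k in descending order so each observation is used
        # at most once per selection.
        for k in range(m, 0, -1):
            for s in range(total, x - 1, -1):
                counts[k][s] += counts[k - 1][s - x]
    as_extreme = sum(counts[m][observed:])
    return as_extreme / comb(len(pooled), m)
```

For example, with `group_a = [3, 4]` and `group_b = [1, 2]`, only one of the six possible ways to choose two of the four values gives a sum of at least 7, so the function returns 1/6.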
