Mixed Model with Correction for Case-Control Ascertainment Increases Association Power

2014 ◽  
Author(s):  
Tristan Hayeck ◽  
Noah Zaitlen ◽  
Po-Ru Loh ◽  
Bjarni Vilhjalmsson ◽  
Samuela Pollack ◽  
...  

We introduce a Liability Threshold Mixed Linear Model (LTMLM) association statistic for ascertained case-control studies that increases power vs. existing mixed model methods, with a well-controlled false-positive rate. Recent work has shown that existing mixed model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem using a chi-square score statistic computed from posterior mean liabilities (PML) under the liability threshold model. Each individual’s PML is conditional not only on that individual’s case-control status, but also on every individual’s case-control status and on the genetic relationship matrix obtained from the data. The PML are estimated using a multivariate Gibbs sampler, with the liability-scale phenotypic covariance matrix based on the genetic relationship matrix (GRM) and a heritability parameter estimated via Haseman-Elston regression on case-control phenotypes followed by transformation to liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed model methods in all scenarios tested, with the magnitude of the improvement depending on sample size and severity of case-control ascertainment. In a WTCCC2 multiple sclerosis data set with >10,000 samples, LTMLM was correctly calibrated and attained a 4.1% improvement (P=0.007) in chi-square statistics (vs. existing mixed model methods) at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, an increase in power over existing mixed model methods is available for ascertained case-control studies of diseases with low prevalence.
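The heritability step described above (Haseman-Elston regression on case-control phenotypes, followed by transformation to the liability scale) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a regression-through-the-origin form of Haseman-Elston and the commonly used observed-to-liability conversion for ascertained samples. The function names and the inputs `grm`, `prevalence`, and `case_fraction` are hypothetical.

```python
from statistics import NormalDist


def haseman_elston_h2(grm, phenotypes):
    """Observed-scale h2: slope of y_i * y_j on GRM entry K_ij over pairs i < j.

    `grm` is a square list-of-lists; `phenotypes` are 0/1 case-control labels.
    Phenotypes are standardized to mean 0 and variance 1 before regression.
    """
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    var = sum((y - mean) ** 2 for y in phenotypes) / n
    y = [(v - mean) / var ** 0.5 for v in phenotypes]
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            num += grm[i][j] * y[i] * y[j]
            den += grm[i][j] ** 2
    return num / den


def liability_scale_h2(h2_obs, prevalence, case_fraction):
    """Convert observed-scale h2 to the liability scale under ascertainment.

    prevalence (K): population disease prevalence.
    case_fraction (P): proportion of cases in the study sample.
    """
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - prevalence)  # liability threshold t
    z = nd.pdf(threshold)                     # normal density at t
    k, p = prevalence, case_fraction
    return h2_obs * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))
```

When the sample case fraction equals the population prevalence, the conversion factor reduces to K(1 - K) / z², the usual correction for an unascertained binary trait.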

2009 ◽  
Vol 41 (1) ◽  
Author(s):  
Alison M Kelly ◽  
Brian R Cullis ◽  
Arthur R Gilmour ◽  
John A Eccleston ◽  
Robin Thompson

1980 ◽  
Vol 19 (04) ◽  
pp. 215-219 ◽  
Author(s):  
H. Scherg

The problem of obtaining significant differences by chance in case-control studies arises mostly when a long questionnaire is used or when a large number of characteristics are compared. The number of chance significances increases with the number of statistical tests carried out. Chance significance is demonstrated on a set of empirically collected data divided into two nonsense groups. The variation of chance significances is shown in several random samples from this data set. Chi-square analysis of 2 × 2 contingency tables and stepwise discriminant analysis are applied. Suggestions for avoiding chance significances are made. Attention is called to the hazards of stepwise discriminant analysis.
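The growth of chance significances with the number of tests can be illustrated with a small simulation. This is a sketch under stated assumptions, not the procedure used in the study: subjects are split into two random "nonsense" groups, each questionnaire item is an independent yes/no answer, and each item is tested with a 2 × 2 chi-square statistic at the 5% level. All function names and default sizes are hypothetical.

```python
import random

CHI2_CRIT_DF1_05 = 3.841  # upper 5% point of the chi-square distribution, 1 df


def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty margin carries no evidence
    return n * (a * d - b * c) ** 2 / denom


def count_chance_significances(num_subjects=200, num_items=100, seed=0):
    """Count items that look 'significant' between two randomly formed groups."""
    rng = random.Random(seed)
    groups = [rng.random() < 0.5 for _ in range(num_subjects)]
    hits = 0
    for _ in range(num_items):
        answers = [rng.random() < 0.5 for _ in range(num_subjects)]
        a = sum(1 for g, x in zip(groups, answers) if g and x)
        b = sum(1 for g, x in zip(groups, answers) if g and not x)
        c = sum(1 for g, x in zip(groups, answers) if not g and x)
        d = sum(1 for g, x in zip(groups, answers) if not g and not x)
        if chi2_2x2(a, b, c, d) > CHI2_CRIT_DF1_05:
            hits += 1
    return hits
```

With 100 independent items tested at the 5% level, about five chance significances are expected on average, and `familywise_error_rate(14)` already exceeds 0.5.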


2016 ◽  
Author(s):  
Tristan Hayeck ◽  
Noah A. Zaitlen ◽  
Po-Ru Loh ◽  
Samuela Pollack ◽  
Alexander Gusev ◽  
...  

Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where cases and controls are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ2 = 1.00), whereas Armitage Trend Test (ATT) and standard mixed model association (MLM) were mis-calibrated (e.g. average χ2 = 0.50-0.67 for MLM). LT-Fam also attained higher power in these simulations, with increases of up to 8% vs. ATT and 3% vs. MLM after correcting for mis-calibration. In 1,269 type 2 diabetes cases and 5,819 controls from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT and MLM were again mis-calibrated (e.g. average χ2 = 0.60-0.82 for MLM). Our results highlight the importance of modeling family sampling bias in case-control data sets with related samples.


2021 ◽  
Vol 12 ◽  
Author(s):  
Ting Xu ◽  
Guo-An Qi ◽  
Jun Zhu ◽  
Hai-Ming Xu ◽  
Guo-Bo Chen

Heritability estimation has long been an important question in statistical genetics. Owing to its clear mathematical properties, the modified Haseman–Elston regression has served as a bridge that connects and develops various parallel heritability estimation methods. As sample sizes increase, estimating heritability for biobank-scale data becomes computationally challenging, in particular because of the cost of calculating the genetic relationship matrix. Using the Haseman–Elston framework, in this study we explicitly analyzed the mathematical structure of the key term tr(KᵀK), the trace of a higher-order term of the genetic relationship matrix, which is a component of the estimation procedure. We proposed two estimators that estimate tr(KᵀK) with greatly reduced sampling variance compared to the existing method under the same computational complexity. We applied this method to 81 traits in UK Biobank data and compared the chromosome-wise partitioned heritability with the whole-genome heritability, also as an approach for testing polygenicity.
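To make the trace term concrete, the sketch below contrasts an exact computation with a generic Hutchinson-style randomized estimator. It is not either of the two estimators proposed in the study, whose details are not given in the abstract. It assumes the trace term is tr(KᵀK), with K a genetic relationship matrix stored as a list of lists. The function names are hypothetical.

```python
import random


def exact_trace_ktk(K):
    """tr(K^T K) equals the sum of squared entries of K (squared Frobenius norm)."""
    return sum(v * v for row in K for v in row)


def hutchinson_trace_ktk(K, num_probes=100, seed=0):
    """Randomized estimate of tr(K^T K) as the average of ||K z||^2.

    Each probe z has independent +1/-1 entries, so E[z^T K^T K z] = tr(K^T K).
    """
    rng = random.Random(seed)
    n = len(K)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Kz = [sum(K[i][j] * z[j] for j in range(n)) for i in range(n)]
        total += sum(v * v for v in Kz)
    return total / num_probes
```

The appeal of a probe-based estimator is that it only needs matrix-vector products with K. If K is defined as XXᵀ/m from a standardized genotype matrix X, those products can be formed without ever building K explicitly.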


2009 ◽  
Vol 12 (04n05) ◽  
pp. 513-531 ◽  
Author(s):  
WENTIAN LI ◽  
YANING YANG

Two noninteger parameters are defined for MAX statistics, which are maxima of d simpler test statistics. The first parameter, d_MAX, is the fractional number of tests, representing the equivalent number of independent tests in MAX. If the d tests are dependent, d_MAX < d. The second parameter is the fractional degrees of freedom k of the chi-square distribution [Formula: see text] that fits the MAX null distribution. These two parameters, d_MAX and k, can be independently defined, and k can be noninteger even if d_MAX is an integer. We illustrate these two parameters using the examples of MAX2 and MAX3 statistics in genetic case-control studies. We speculate that k is related to the amount of ambiguity of the model inferred by the test. In case-control genetic association, tests with low k (e.g. k = 1) are able to provide definitive information about the disease model, whereas tests with high k (e.g. k = 2) are completely uncertain about the disease model. As with Heisenberg's uncertainty principle, the ability to infer the disease model and the ability to detect significant association may not be simultaneously optimized, and k seems to measure the level of their balance.
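As an illustration of a MAX statistic, the sketch below computes a MAX3-type statistic as the largest absolute Cochran-Armitage trend statistic over recessive, additive, and dominant genotype scores. It assumes this common definition of MAX3 and genotype counts ordered as (0, 1, 2) copies of the risk allele. The function names are hypothetical, and the sketch does not compute d_MAX or k.

```python
from math import sqrt

# Genotype scores for 0, 1, 2 copies of the risk allele under each model.
SCORE_SETS = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic (approximately N(0, 1) under the null)."""
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    n = [r + s for r, s in zip(cases, controls)]
    num = sum(x * (s_total * r - r_total * s)
              for x, r, s in zip(scores, cases, controls))
    var = r_total * s_total * (
        n_total * sum(x * x * ni for x, ni in zip(scores, n))
        - sum(x * ni for x, ni in zip(scores, n)) ** 2
    )
    if var <= 0:
        return 0.0  # no variation in scores, so the test is uninformative
    return sqrt(n_total) * num / sqrt(var)


def max3(cases, controls):
    """Return (largest |Z|, name of the best-fitting model, all Z values)."""
    z = {name: trend_z(cases, controls, x) for name, x in SCORE_SETS.items()}
    best = max(z, key=lambda name: abs(z[name]))
    return abs(z[best]), best, z
```

Because the three trend statistics are correlated and the maximum is taken over them, the null distribution of the MAX3 value is not that of a single 1-df test, which is the situation the fractional parameters in the abstract are meant to describe.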


Rheumatology ◽  
2020 ◽  
Author(s):  
Jinghui Huang ◽  
Nien Yee Kow ◽  
Hui Yin Lee ◽  
Anna-Marie Fairhurst ◽  
Anselm Mak

Objectives: To identify and quantify the level of CD34+ CD133+ CD309+ circulating angiogenic cells (CAC) and explore factors associated with the level of CAC in patients with systemic lupus erythematosus (SLE). Methods: The peripheral blood mononuclear cells of consecutive SLE patients and demographically matched healthy controls (HC) were extracted and identified, enumerated and compared for CAC levels by multi-colour flow cytometry based on the European League Against Rheumatism Scleroderma Trials and Research (EUSTAR) recommendation. Meta-analyses by combining the current and previous case-control studies were performed, aiming to increase the statistical power to discern the difference in CAC level between SLE patients and HC. Mixed-model meta-regression was conducted to explore potential demographic and clinical factors which were associated with CAC level. Results: A lower level of CAC was found in 29 SLE patients compared with 24 HC (10.76 ± 13.9 vs 24.58 ± 25.4 cells/ml, p = 0.015). Random-effects meta-analyses of the current and 6 previously published case-control studies involving 401 SLE patients and 228 HC revealed a lower CAC level compared with HC (SMD = -2.439, p = 0.001). Meta-regression analysis demonstrated that hydroxychloroquine use was associated with a more discrepant CAC level between both groups (p = 0.01115). Conclusion: SLE patients had a significantly lower CD34+ CD133+ CD309+ CAC level than HC and hydroxychloroquine use was associated with a more discrepant CAC level between SLE patients and HC. This study triggers further observational, interventional and mechanistic studies to address the beneficial impact of hydroxychloroquine on the functionality of CAC in SLE patients.


2015 ◽  
Vol 22 (04) ◽  
pp. 460-465
Author(s):  
Muhammad Aslam ◽  
Muhammad Asif ◽  
Saima Altaf

Objective: To assess the risk of cancer at different sites among male smokers of Southern Punjab, Pakistan. Study Design: Case-control design. Period: March - July 2012. Setting: A data set of 596 males from Southern Punjab was collected from the Outdoor Ward of Cancer, the Oncology Ward of Nishtar Hospital, and the Multan Institute of Nuclear Medicine and Radiotherapy (MINAR) Hospital. Method: Through a self-administered questionnaire, smoking status and each respondent's history and medical record of various types of cancer were noted. The chi-square test was used to assess the association between tobacco smoking and cancer. For the risk analysis, odds ratios and attributable risk were computed. Results: Among the respondents, 49.0% smoked tobacco. From the medical records, 438 respondents were confirmed to have cancer. The average age at starting tobacco use was 23.41 ± 4.85 years, while the average age at tobacco cessation was 45.29 ± 12.24 years. The percentage of lung cancer among smokers was 24.01%, the highest among all the stated cancer sites. For every stated cancer type, the risk among smokers was at least three times that among non-smokers. The risk attributable to smoking was 17.65% for lung cancer and 50.7% for all the stated cancers. Conclusions: Smokers in Southern Punjab can greatly reduce their risk of cancer (by more than 50%) if they quit smoking.
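The risk measures named in the Method can be sketched for a single 2 × 2 table. The abstract does not say which definition of attributable risk was used, so the formulas below (the attributable fraction among the exposed and a case-based population attributable fraction) are assumptions. The function names and the counts in the usage note are hypothetical, not data from the study.

```python
from math import exp, log, sqrt


def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio for a 2 x 2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


def odds_ratio_ci(exposed_cases, exposed_controls, unexposed_cases,
                  unexposed_controls, z=1.96):
    """Approximate 95% confidence interval on the log-odds scale (Woolf method)."""
    or_ = odds_ratio(exposed_cases, exposed_controls,
                     unexposed_cases, unexposed_controls)
    se = sqrt(1 / exposed_cases + 1 / exposed_controls
              + 1 / unexposed_cases + 1 / unexposed_controls)
    return exp(log(or_) - z * se), exp(log(or_) + z * se)


def attributable_fraction_exposed(or_):
    """Fraction of risk among the exposed attributed to exposure: (OR - 1) / OR."""
    return (or_ - 1) / or_


def population_attributable_fraction(exposed_cases, unexposed_cases, or_):
    """Case-based formula: proportion of cases exposed times (OR - 1) / OR."""
    p_cases_exposed = exposed_cases / (exposed_cases + unexposed_cases)
    return p_cases_exposed * attributable_fraction_exposed(or_)
```

For example, with hypothetical counts of 60 smoking cases, 40 smoking controls, 20 non-smoking cases, and 40 non-smoking controls, the odds ratio is 3.0, the attributable fraction among smokers is 2/3, and the population attributable fraction is 0.5.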

