Collider bias undermines our understanding of COVID-19 disease risk and severity

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Gareth J. Griffith ◽  
Tim T. Morris ◽  
Matthew J. Tudball ◽  
Annie Herbert ◽  
Giulia Mancano ◽  
...  

Abstract: Numerous observational studies have attempted to identify risk factors for infection with SARS-CoV-2 and COVID-19 disease outcomes. Studies have used datasets sampled from patients admitted to hospital, people tested for active infection, or people who volunteered to participate. Here, we highlight the challenge of interpreting observational evidence from such non-representative samples. Collider bias can induce associations between two or more variables which affect the likelihood of an individual being sampled, distorting associations between these variables in the sample. Analysing UK Biobank data, we find that participants tested for COVID-19 were highly selected, compared with the wider cohort, for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. We discuss the mechanisms inducing these problems, and approaches that could help mitigate them. While collider bias should be explored in existing studies, the optimal way to mitigate the problem is to use appropriate sampling strategies at the study design stage.
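The mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no UK Biobank data: the two traits, the selection rule (a threshold on their sum plus noise), and the names `pearson` and `simulate` are all illustrative assumptions. It generates two independent traits, samples individuals with a probability that depends on both, and compares the correlation in the full population with the correlation in the selected sample.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Return (population r, selected-sample r, selected sample size)."""
    rng = random.Random(seed)
    # Two traits that are independent in the full population (assumption).
    trait_a = [rng.gauss(0, 1) for _ in range(n)]
    trait_b = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical selection rule: an individual enters the sample when
    # trait_a + trait_b + noise exceeds an arbitrary threshold of 1.0,
    # so both traits raise the chance of being sampled.
    selected = [
        (a, b)
        for a, b in zip(trait_a, trait_b)
        if a + b + rng.gauss(0, 1) > 1.0
    ]
    sel_a = [a for a, _ in selected]
    sel_b = [b for _, b in selected]
    return pearson(trait_a, trait_b), pearson(sel_a, sel_b), len(selected)


if __name__ == "__main__":
    pop_r, sel_r, n_sel = simulate()
    print(f"population r = {pop_r:.3f}")
    print(f"selected-sample r = {sel_r:.3f} (n = {n_sel})")
```

Under these assumptions the population correlation is close to zero, while the selected sample shows a clearly negative correlation, even though neither trait affects the other. The direction and size of the induced association depend entirely on the assumed selection rule.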

Author(s):  
Gareth Griffith ◽  
Tim T Morris ◽  
Matt Tudball ◽  
Annie Herbert ◽  
Giulia Mancano ◽  
...  

Standfirst: Observational data on COVID-19 including hypothesised risk factors for infection and progression are accruing rapidly. Here, we highlight the challenge of interpreting observational evidence from non-random samples of the population, which may be affected by collider bias. We illustrate these issues using data from the UK Biobank in which individuals tested for COVID-19 are highly selected for a wide range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. We discuss the sampling mechanisms that leave aetiological studies of COVID-19 infection and progression particularly susceptible to collider bias. We also describe several tools and strategies that could help mitigate the effects of collider bias in extant studies of COVID-19 and make available a web app for performing sensitivity analyses. While bias due to non-random sampling should be explored in existing studies, the optimal way to mitigate the problem is to use appropriate sampling strategies at the study design stage. Key messages: Collider bias can occur in studies that non-randomly sample people from the population of interest. This bias can distort associations between variables or induce spurious associations. It may be possible to estimate the underlying selection model or run sensitivity analyses to examine the credibility of the threat of collider bias, but it is difficult to prove that bias has been reduced or eliminated. Tested samples in the UK Biobank cohort are highly selected for a range of traits. Sampling strategies that are resilient to collider bias issues should be used at the design stage of data collection where possible. Where this is not possible, linkage or collection of data on the target population can help in sensitivity and validation analyses.


2019 ◽  
Author(s):  
Alexander J Mentzer ◽  
Nicole Brenner ◽  
Naomi Allen ◽  
Thomas J Littlejohns ◽  
Amanda Y Chong ◽  
...  

Abstract. Background: Certain infectious agents are recognised causes of cancer and potentially other chronic diseases. Identifying associations and understanding pathological mechanisms involving infectious agents and subsequent chronic disease risk will be possible through measuring exposure to multiple infectious agents in large-scale prospective cohorts such as UK Biobank. Methods: Following expert consensus we designed a Multiplex Serology platform capable of simultaneously measuring quantitative antibody responses against 45 antigens from 20 infectious agents implicated in non-communicable diseases, including human herpes, hepatitis, polyoma, papilloma, and retroviruses, as well as Chlamydia trachomatis, Helicobacter pylori and Toxoplasma gondii. This panel was assayed in a random subset of UK Biobank participants (n=9,695) to test associations between infectious agents and recognised demographic and genetic risk factors and disease outcomes. Findings: Seroprevalence estimates for each infectious agent were consistent with those expected from the literature. The data confirmed epidemiological associations of infectious agent antibody responses with sociodemographic characteristics (e.g. lifetime sexual partners with C. trachomatis; P=1·8×10^−149), genetic variants (e.g. rs6927022 with Epstein-Barr virus (EBV) EBNA1 antibodies, P=9·5×10^−91) and disease outcomes including human papillomavirus-16 seropositivity and cervical intraepithelial neoplasia (odds ratio 2·28, 95% confidence interval 1·38-3·63), and quantitative EBV viral capsid antigen responses and multiple sclerosis through genetic correlation (MHC rG=0·30, P=0·01). Interpretation: This dataset, intended as a pilot study to demonstrate applicability of Multiplex Serology in epidemiological studies, is itself one of the largest studies to date covering diverse infectious agents in a prospective UK cohort including those traditionally under-represented in population cohorts such as human immunodeficiency virus-1 and C. trachomatis. Our results emphasise the validity of our Multiplex Serology approach in large-scale epidemiological studies opening up opportunities for improving our understanding of host-pathogen-disease relationships. These data are available to researchers interested in examining the relationship between infectious agents and human health.


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 991
Author(s):  
Erik Widen ◽  
Timothy G. Raben ◽  
Louis Lello ◽  
Stephen D. H. Hsu

We use UK Biobank data to train predictors for 65 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, etc. from SNP genotype. For example, our Polygenic Score (PGS) predictor correlates ∼0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information); we call these predictors biomarker risk scores, BMRS. Individuals who are at high risk (e.g., odds ratio of >5× population average) can be identified for conditions such as coronary artery disease (AUC∼0.75), diabetes (AUC∼0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ∼10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (risk conditional on genotype: PRS) for common diseases to the risk predictors which result from the concatenation of learned functions BMRS and PGS, i.e., applying the BMRS predictors to the PGS output.


Nutrients ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 2218
Author(s):  
Shuai Yuan ◽  
Paul Carter ◽  
Amy M. Mason ◽  
Stephen Burgess ◽  
Susanna C. Larsson

Coffee consumption has been linked to a lower risk of cardiovascular disease in observational studies, but whether the associations are causal is not known. We conducted a Mendelian randomization investigation to assess the potential causal role of coffee consumption in cardiovascular disease. Twelve independent genetic variants were used to proxy coffee consumption. Summary-level data for the relations between the 12 genetic variants and cardiovascular diseases were taken from the UK Biobank with up to 35,979 cases and the FinnGen consortium with up to 17,325 cases. Genetic predisposition to higher coffee consumption was not associated with any of the 15 studied cardiovascular outcomes in univariable MR analysis. The odds ratio per 50% increase in genetically predicted coffee consumption ranged from 0.97 (95% confidence interval (CI), 0.63, 1.50) for intracerebral hemorrhage to 1.26 (95% CI, 1.00, 1.58) for deep vein thrombosis in the UK Biobank and from 0.86 (95% CI, 0.50, 1.49) for subarachnoid hemorrhage to 1.34 (95% CI, 0.81, 2.22) for intracerebral hemorrhage in FinnGen. The null findings remained in multivariable Mendelian randomization analyses adjusted for genetically predicted body mass index and smoking initiation, except for a suggestive positive association for intracerebral hemorrhage (odds ratio 1.91; 95% CI, 1.03, 3.54) in FinnGen. This Mendelian randomization study showed limited evidence that coffee consumption affects the risk of developing cardiovascular disease, suggesting that previous observational studies may have been confounded.
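To make the summary-data approach concrete, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate, a commonly used Mendelian randomization estimator. The abstract does not state which estimator the authors used, so IVW is an assumption here, and the function name `ivw_estimate` and the three-variant input values are invented purely for illustration; they are not results from UK Biobank or FinnGen.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant associations with the exposure
    beta_outcome:  per-variant associations with the outcome (log odds)
    se_outcome:    standard errors of the outcome associations
    Returns (estimate, standard error), both on the log-odds scale.
    """
    # Each variant's ratio estimate (by / bx) is weighted by bx^2 / se^2.
    weight_sum = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    estimate = numerator / weight_sum
    std_error = math.sqrt(1.0 / weight_sum)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical summary statistics for three variants (not real data).
    bx = [0.10, 0.08, 0.12]
    by = [0.010, 0.004, 0.015]
    se = [0.020, 0.025, 0.018]
    est, err = ivw_estimate(bx, by, se)
    low, high = est - 1.96 * err, est + 1.96 * err
    print(f"odds ratio = {math.exp(est):.2f} "
          f"(95% CI {math.exp(low):.2f}, {math.exp(high):.2f})")
```

Exponentiating the estimate and its interval bounds gives an odds ratio per unit of the genetically predicted exposure; rescaling to a "per 50% increase" unit, as reported in the abstract, would need an additional transformation that is not shown here.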


2017 ◽  
Vol 76 (3) ◽  
pp. 308-315 ◽  
Author(s):  
Jayne V. Woodside ◽  
John Draper ◽  
Amanda Lloyd ◽  
Michelle C. McKinley

A high intake of fruit and vegetables (FV) has been associated with reduced risk of a number of chronic diseases, including CVD. The aim of this review is to describe the potential use of biomarkers to assess FV intake. Traditional methods of assessing FV intake have limitations, and this is likely to impact on observed associations with disease outcomes and markers of disease risk. Nutritional biomarkers may offer a more objective and reliable method of assessing dietary FV intake. Some single blood biomarkers, such as plasma vitamin C and serum carotenoids, are well established as indicators of FV intake. Combining potential biomarkers of intake may more accurately predict overall FV intake within intervention studies than the use of any single biomarker. Another promising approach is metabolomic analysis of biological fluids using untargeted approaches to identify potential new biomarkers of FV intake. Using biomarkers to measure FV intake may improve the accuracy of dietary assessment.


2021 ◽  
Author(s):  
Melis Anatürk ◽  
Raihaan Patel ◽  
Georgios Georgiopoulos ◽  
Danielle Newby ◽  
Anya Topiwala ◽  
...  

INTRODUCTION: Current prognostic models of dementia have had limited success in consistently identifying at-risk individuals. We aimed to develop and validate a novel dementia risk score (DRS) using the UK Biobank cohort. METHODS: After randomly dividing the sample into a training (n=166,487, 80%) and test set (n=41,621, 20%), logistic LASSO regression and standard logistic regression were used to develop the UKB-DRS. RESULTS: The score consisted of age, sex, education, apolipoprotein E4 genotype, a history of diabetes, stroke, and depression, and a family history of dementia. The UKB-DRS had good-to-strong discrimination accuracy in the UKB hold-out sample (AUC [95%CI]=0.79 [0.77, 0.82]) and in an external dataset (Whitehall II cohort, AUC [95%CI]=0.83 [0.79, 0.87]). The UKB-DRS also significantly outperformed four published risk scores (i.e., Australian National University Alzheimer’s Disease Risk Index (ANU-ADRI), Cardiovascular Risk Factors, Aging, and Dementia score (CAIDE), Dementia Risk Score (DRS), and the Framingham Cardiovascular Risk Score (FRS)) across both test sets. CONCLUSION: The UKB-DRS represents a novel easy-to-use tool that could be used for routine care or targeted selection of at-risk individuals into clinical trials.
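The discrimination metric reported above, the AUC, can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as half. The sketch below computes it directly from that pairwise definition. It is not the authors' code; the function name `auc` and the example scores are hypothetical, and the quadratic-time loop is only suitable for small illustrative inputs.

```python
def auc(scores, labels):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    scores: predicted risk scores, higher meaning higher predicted risk
    labels: 1 for cases, 0 for non-cases
    """
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0   # case ranked above non-case
            elif case == control:
                concordant += 0.5   # ties count as half
    return concordant / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical risk scores for two non-cases and two cases.
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # prints 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of cases from non-cases.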


Nutrients ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 1361 ◽  
Author(s):  
Sonia Vega-López ◽  
Bernard Venn ◽  
Joanne Slavin

Despite initial enthusiasm, the relationship between glycemic index (GI) and glycemic response (GR) and disease prevention remains unclear. This review examines evidence from randomized, controlled trials and observational studies in humans for short-term (e.g., satiety) and long-term (e.g., weight, cardiovascular disease, and type 2 diabetes) health effects associated with different types of GI diets. A systematic PubMed search was conducted of studies published between 2006 and 2018 with key words glycemic index, glycemic load, diabetes, cardiovascular disease, body weight, satiety, and obesity. Criteria for inclusion for observational studies and randomized intervention studies were set. The search yielded 445 articles, of which 73 met inclusion criteria. Results suggest an equivocal relationship between GI/GR and disease outcome. The strongest intervention studies typically find little relationship between GI/GR and physiological measures of disease risk. Even for observational studies, the relationship between GI/GR and disease outcomes is limited. Thus, it is unlikely that the GI of a food or diet is linked to disease risk or health outcomes. Other measures of dietary quality, such as fiber or whole grains, may be more likely to predict health outcomes. Interest in food patterns as predictors of health benefits may be more fruitful for research to inform dietary guidance.


2019 ◽  
Vol 287 ◽  
pp. e92
Author(s):  
P. Ripatti ◽  
J.T. Rämö ◽  
S. Söderlund ◽  
I. Surakka ◽  
A.S. Havulinna ◽  
...  
