Comparison of the Missing-Indicator Method and Conditional Logistic Regression in 1:m Matched Case-Control Studies with Missing Exposure Values

2004 ◽  
Vol 159 (6) ◽  
pp. 603-610 ◽  
Author(s):  
X. Li
Author(s):  
Fei Wan ◽  
Graham A Colditz ◽  
Siobhan Sutcliffe

Abstract Although the need for addressing matching in the analysis of matched case-control studies is well established, debate remains as to the most appropriate analytic method when matching on at least one continuous factor. We compare the bias and efficiency of unadjusted and adjusted conditional logistic regression (CLR) and unconditional logistic regression (ULR) in the setting of both exact and non-exact matching. To demonstrate that case-control matching distorts the association between the matching variables and the outcome in the matched sample relative to the target population, we derive the logit model for the matched case-control sample under exact matching. We conduct simulations to validate our theoretical conclusions and to explore different ways of adjusting for the matching variables in CLR and ULR to reduce biases. When matching is exact, CLR is unbiased in all settings. When matching is not exact, unadjusted CLR tends to be biased and this bias increases with increasing matching caliper size. Spline smoothing of the matching variables in CLR can alleviate biases. Regardless of exact or non-exact matching, adjusted ULR is generally biased unless the functional form of the matched factors is modelled correctly. The validity of adjusted ULR is vulnerable to model specification error. CLR should remain the primary analytic approach.
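For reference, the conditional likelihood that CLR maximizes for 1:m matched sets can be written as below. This is the standard textbook form, not the authors' derivation for the matched sample, and the notation is an assumption made here: $x_{i0}$ is the covariate vector of the case in matched set $i$, $x_{ij}$ ($j = 1, \dots, m$) are those of its controls, and $\beta$ is the vector of log odds ratios.

```latex
L(\beta) \;=\; \prod_{i=1}^{n}
  \frac{\exp\!\left(\beta^{\top} x_{i0}\right)}
       {\sum_{j=0}^{m} \exp\!\left(\beta^{\top} x_{ij}\right)}
```

Any term that is constant within a matched set, including a set-specific intercept and any exactly matched variable, cancels between numerator and denominator. Under caliper (non-exact) matching the matched variable still varies within a set, so it does not cancel.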


2021 ◽  
Author(s):  
Joshua N. Sampson ◽  
Paul S. Albert ◽  
Mark P. Purdue

Abstract Background: We consider the analysis of nested, matched, case-control studies that have multiple biomarker measurements per individual. We propose a simple approach for estimating the marginal relationship between a biomarker measured at a single time point and the risk of an event. We know of no other standard software package that can perform such analyses while explicitly accounting for the matching. Results: We propose an application of conditional logistic regression (CLR) that can include all measurements and uses a robust variance estimator. We compare our approach to other methods such as performing CLR with only the first measurement, CLR with an average of all measurements, and Generalized Estimating Equations (GEE). In simulations, our approach is significantly more powerful than CLR with one measurement or an average of all measurements, and has similar power to GEE but correctly accounts for the matching. We then apply our approach to the CLUE cohort to show that an increased level of the immune marker sCD27 is associated with non‐Hodgkin lymphoma (NHL) and, by evaluating the strength of the association as a function of time until diagnosis, that the increased level is likely an effect of the disease as opposed to a cause of the disease. The approach can be implemented by the R function clogitRV available at https://github.com/sampsonj74/clogitRV. Conclusion: We offered an approach and software for analyzing matched case-control studies with multiple measurements. We demonstrated that these methods are accurate, precise, and statistically powerful.


Nutrients ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 523 ◽  
Author(s):  
Carmen Amezcua-Prieto ◽  
Juan Martínez-Galiano ◽  
Naomi Cano-Ibáñez ◽  
Rocío Olmedo-Requena ◽  
Aurora Bueno-Cavanillas ◽  
...  

The objective of this study was to assess the relationship between consumption of different types of carbohydrates (CHO) during pregnancy and the risk of having a small for gestational age (SGA) newborn. A retrospective matched case–control design was carried out with a total of 518 mother-offspring pairs. A total of 137 validated items were included in the food frequency questionnaire (FFQ). Conditional logistic regression models were used to calculate crude odds ratios (cORs) and adjusted odds ratios (aORs) with 95% confidence intervals (CIs). Having more than 75 g/day of brown bread showed an inverse association with SGA (aOR = 0.64, CI 0.43–0.96). In contrast, an intake of industrial sweets more than once a day (aOR = 2.70, CI 1.42–5.13), or even 2–6 times a week (aOR = 1.84, CI 1.20–2.82), increased the odds of having an SGA newborn. During pregnancy, the higher the increase of wholegrain cereal and bread, the lower the possibility of having an SGA newborn, but the opposite occurred with refined sugar products: just consuming industrial bakery products or pastries twice a week increased the odds of having an SGA infant. Case–control studies cannot verify causality and only show associations, which may reflect residual confounding due to the presence of unknown factors. It is possible that a high consumption of sugary foods is a marker of a generally poor lifestyle.
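The reported odds ratios and intervals can be sanity-checked with a little arithmetic. The sketch below assumes the 95% CIs are Wald-type intervals, symmetric on the log-odds scale (the abstract does not say how they were computed), and recovers the implied standard error of each log odds ratio. The function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Return (SE of the log OR, OR implied by the CI midpoint on the log scale).

    Assumes a Wald-type interval: exp(log(OR) +/- z * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    implied_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)
    return se, implied_or


# Adjusted odds ratios and 95% CIs as reported in the abstract above.
reported = {
    "brown bread > 75 g/day": (0.64, 0.43, 0.96),
    "industrial sweets > once a day": (2.70, 1.42, 5.13),
    "industrial sweets 2-6 times a week": (1.84, 1.20, 2.82),
}

for label, (or_point, low, high) in reported.items():
    se, implied = log_scale_summary(low, high)
    print(f"{label}: SE(log OR) ~ {se:.3f}, "
          f"CI-implied OR ~ {implied:.2f} (reported {or_point})")
```

If the CI-implied value sits close to the reported point estimate, the interval is consistent with a symmetric log-scale construction; a large gap would suggest a different interval method or a transcription error.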


Biostatistics ◽  
2020 ◽  
Author(s):  
Nadim Ballout ◽  
Cedric Garcia ◽  
Vivian Viallon

Summary The analysis of case–control studies with several disease subtypes is increasingly common, e.g. in cancer epidemiology. For matched designs, a natural strategy is based on a stratified conditional logistic regression model. Then, to account for the potential homogeneity among disease subtypes, we adapt the ideas of data shared lasso, which has been recently proposed for the estimation of stratified regression models. For unmatched designs, we compare two standard methods based on $L_1$-norm penalized multinomial logistic regression. We describe formal connections between these two approaches, from which practical guidance can be derived. We show that one of these approaches, which is based on a symmetric formulation of the multinomial logistic regression model, actually reduces to a data shared lasso version of the other. Consequently, the relative performance of the two approaches critically depends on the level of homogeneity that exists among disease subtypes: more precisely, when homogeneity is moderate to high, the non-symmetric formulation with controls as the reference is not recommended. Empirical results obtained from synthetic data are presented, which confirm the benefit of properly accounting for potential homogeneity under both matched and unmatched designs, in terms of estimation and prediction accuracy, variable selection and identification of heterogeneities. We also present preliminary results from the analysis of a case–control study nested within the EPIC (European Prospective Investigation into Cancer and nutrition) cohort, where the objective is to identify metabolites associated with the occurrence of subtypes of breast cancer.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e18579-e18579
Author(s):  
Joanna Zurko ◽  
Aniko Szabo ◽  
Yee Chung Cheng ◽  
Sailaja Kamaraju ◽  
John Burfeind ◽  
...  

e18579 Background: Patients with cancer have increased risk of developing SARS-CoV-2 (COVID-19) infection. It is unknown if characteristics related to breast cancer increase the risk of COVID-19 infection. In this retrospective matched case control study, we aim to identify breast cancer related risk factors associated with developing COVID-19 and describe outcomes of patients with breast cancer diagnosed with COVID-19. Methods: Women with breast cancer treated at the Medical College of Wisconsin and diagnosed with COVID-19 between March and December 2020 served as cases. Women with breast cancer without COVID-19 diagnosis within the same time frame were identified as potential controls. Controls were chosen by matching for age (≥60 vs <60), obesity (BMI <30 vs ≥30), county (Milwaukee vs suburban), race (white vs non-white) and diabetes mellitus (DM) with 3:1 matching planned. Univariate comparisons between cases and controls were done via Rao-Scott stratified chi-square test for categorical outcomes and stratified t-test for continuous variables. Conditional logistic regression was done to evaluate the joint effect of multiple characteristics on the odds of being a COVID-19 case. Results: Twenty-five cases and 77 controls were identified. All cases were fully matched by age, obesity, county, and race with 3 cases not able to be matched for DM. Mean age was 54.6 vs 54.9 (p=0.88), BMI 31.0 vs 31.6 (p=0.69), 48% lived in Milwaukee county and 68% were white (cases 24% black & 8% American Indian; controls 32% black). Regarding COVID outcomes, 24.0% (n=6) of cases were hospitalized, median length of stay was 2 days, 8% (n=2) needed oxygen, 4% (n=1) were intubated and 4% (n=1) died due to COVID-19. COVID-19 led to treatment delays in 40% of cases. On univariate analysis of cases vs controls, 64 vs 75% were ER/PR+ (p=0.31), 6.5 vs 5.2% HER2+ (p=0.34), and 9.0 vs 4.2% triple negative (p=0.10). There were no significant differences in breast cancer stage.
At time of COVID diagnosis (or last clinic contact if control), 16 vs 14% had active disease (p=0.81), 72 vs 74% were on active treatment (p=0.85), with 21 vs 4% being on chemotherapy (p=0.007), and 44 vs 52% on endocrine therapy (p=0.49). On conditional logistic regression, being on active chemotherapy (OR 5.8, p=0.043) significantly increased the likelihood of developing COVID with a trend seen for triple negative disease (OR 2.8, p=0.12). Conclusions: In this matched case control study of patients with breast cancer, active chemotherapy was significantly associated with an increased likelihood of developing COVID-19 with a trend seen for triple negative disease. Rates of death due to COVID-19 were overall low. Our analysis was limited by small numbers and an inability to fully match patients for DM. These findings support continued strict precautions for those on active chemotherapy and warrant further analysis in those with triple negative disease.


2020 ◽  
Author(s):  
Abu Sayeed ◽  
Satyajit Kundu ◽  
Hasan Al Banna ◽  
Enryka Christopher ◽  
M Tasdik Hasan ◽  
...  

Background: Individuals with certain pre-existing chronic health conditions have been identified as a high-risk group for fatalities of COVID-19. Therefore, it is likely that individuals with chronic diseases may worry during this pandemic to the detriment of their mental health. This study compares the mental health of Bangladeshi adults affected by chronic disease to a healthy, matched control group during the COVID-19 pandemic. Method: A matched case–control analysis was performed with data collected from 395 respondents with chronic diseases and 395 controls matched for age, gender and residence. Inclusion criteria for cases were respondents who self-reported having asthma, cardiovascular symptoms and/or diabetes. Respondents were recruited using an online survey, which included the DASS-21 measure to assess stress, anxiety, and depression. Chi-square tests, Fisher’s exact tests and a conditional logistic regression were performed to examine associations among variables. Results: The prevalence estimates of stress, anxiety and depression were significantly higher among cases (73.7%; 59%; 71.6%, respectively) than among controls (43.3%; 25.6%; 31.1%). Chi-square tests showed significant associations between having chronic diseases and mental health outcomes. A conditional logistic regression showed that respondents with asthma, diabetes, cardiovascular symptoms, or any combination of these diseases had higher odds of feeling stress, anxiety, and depression than healthy individuals. Conclusion: These results underscore a subpopulation vulnerable to mental health consequences during this pandemic and indicate the need for additional mental health resources to be available to those with chronic diseases.
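With 1:1 matching and a single binary characteristic, conditional logistic regression reduces to a comparison of discordant pairs: the conditional maximum-likelihood odds ratio is the ratio of the two discordant-pair counts, and concordant pairs contribute nothing. The sketch below illustrates that special case only. The counts are hypothetical (the abstract reports no pair-level data), and the function name `matched_pair_or` is invented for illustration.

```python
import math


def matched_pair_or(case_only, control_only, z=1.96):
    """Conditional MLE of the odds ratio for 1:1 matched pairs, binary characteristic.

    case_only:    pairs in which only the case has the characteristic
    control_only: pairs in which only the control has the characteristic
    Returns (odds ratio, lower limit, upper limit) using a Wald interval
    on the log scale with SE = sqrt(1/case_only + 1/control_only).
    """
    if case_only <= 0 or control_only <= 0:
        raise ValueError("both discordant-pair counts must be positive")
    odds_ratio = case_only / control_only
    se_log = math.sqrt(1 / case_only + 1 / control_only)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical discordant-pair counts, NOT taken from the study above.
or_hat, lower, upper = matched_pair_or(case_only=60, control_only=20)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An analysis with several covariates, as in the abstract, needs a full conditional logistic regression fit; this closed form covers only one binary variable at a time.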

