General Regression Models with Missing Values in One of Two Covariates

Author(s):  
Werner Vach
2021 ◽  
Author(s):  
Panagiotis Anagnostou ◽  
Sotiris Tasoulis ◽  
Aristidis G. Vrahatis ◽  
Spiros Georgakopoulos ◽  
Matthew Prina ◽  
...  

Abstract: Preventive healthcare is a crucial pillar of health, as it helps people stay healthy and receive immediate treatment when needed. Mining knowledge from longitudinal studies has the potential to contribute significantly to the improvement of preventive healthcare. Unfortunately, data originating from such studies are characterized by high complexity, huge volume, and a plethora of missing values. Machine Learning, Data Mining, and Data Imputation models are used to address these respective challenges. In this direction, we focus on developing a complete methodology for the ATHLOS (Ageing Trajectories of Health: Longitudinal Opportunities and Synergies) Project, funded by the European Union’s Horizon 2020 Research and Innovation Program, which aims to achieve a better interpretation of the impact of aging on health. The inherent complexity of the provided dataset lies in the fact that the project includes 15 independent European and international longitudinal studies of aging. In this work, we focus in particular on the HealthStatus (HS) score, an index that estimates a person’s state of health, and examine the effect of various data imputation models on the predictive power of classification and regression models. Our results are promising, indicating the critical importance of data imputation in enhancing preventive medicine’s crucial role.
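One common way to compare imputation models is to hide values that are actually known, impute them, and score the result against the truth. The following is a minimal sketch of that idea, not the ATHLOS methodology: the function names `impute` and `masked_rmse`, the two strategies, and the toy data are all assumptions made for illustration.

```python
import math
import statistics


def impute(values, strategy):
    """Fill None entries with the mean or median of the observed entries."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def masked_rmse(truth, mask_idx, strategy):
    """Hide the entries at mask_idx, impute them, and return the RMSE
    between the imputed and the true values at those positions."""
    masked = [None if i in mask_idx else v for i, v in enumerate(truth)]
    filled = impute(masked, strategy)
    sq_err = sum((filled[i] - truth[i]) ** 2 for i in mask_idx)
    return math.sqrt(sq_err / len(mask_idx))


# Hypothetical toy column with one outlier (100.0); index 2 is hidden.
truth = [1.0, 2.0, 3.0, 100.0, 4.0]
rmse_mean = masked_rmse(truth, {2}, "mean")
rmse_median = masked_rmse(truth, {2}, "median")
```

On this toy column the median is less affected by the outlier than the mean, so it recovers the hidden value more closely. In a real study the score of interest would be the downstream predictive performance rather than the imputation error alone.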


1998 ◽  
Vol 217 (1) ◽  
Author(s):  
Hans Schneeweiß

Summary: After a brief introduction to the theory of unbiased estimating equations for general regression models and of corrected estimating equations for regression models with error-prone covariates, the approximation quality of a series-expansion-based approach by Stefanski is discussed.
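As a brief illustration of what a corrected estimating equation looks like, consider the textbook case of simple linear regression with additive measurement error. The notation below ($\psi$, $\psi^{*}$, $\beta$, $W$, $U$, $\sigma_u^2$) is assumed for this sketch and is not taken from the paper itself.

```latex
% Unbiased estimating equation for the error-free covariate X:
\sum_{i=1}^{n} \psi(y_i, x_i, \beta) = 0,
\qquad \mathbb{E}\,\psi(Y, X, \beta) = 0 .

% Only W = X + U is observed, with E[U] = 0, Var(U) = \sigma_u^2,
% and U independent of (X, Y). A corrected score \psi^* satisfies
\mathbb{E}\bigl[\psi^{*}(Y, W, \beta) \mid Y, X\bigr] = \psi(Y, X, \beta).

% Linear regression without intercept: \psi(y, x, \beta) = (y - x\beta)\,x .
% Since E[(y - W\beta) W \mid y, x] = (y - x\beta)\,x - \sigma_u^2 \beta ,
% the corrected score is
\psi^{*}(y, w, \beta) = (y - w\beta)\,w + \sigma_u^{2}\,\beta .
```

In this linear case the correction is exact. For general regression models an exact corrected score may not be available in closed form, which is where approximation approaches such as the series expansion mentioned above come in.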


2010 ◽  
Vol 121-122 ◽  
pp. 346-349
Author(s):  
Yu Qin Sun ◽  
Yuan Ttao Jiang ◽  
Yong Ge Tian

One century ago (1910), the Hungarian mathematician Alfred Haar introduced the simplest wavelets in approximation theory, which are now known as the Haar wavelets. This type of wavelet can be used effectively to fit data in statistical applications. It is well known that for a general regression model, it is not easy to write the estimates of its parameters in analytical form. However, regression models generated from the Haar wavelets are easy to compute. In this article, we show how to use the Haar wavelets to formulate regression models and to fit data. In addition, we mention some variations of the Haar wavelets and their possible applications.
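The claim that Haar-based regression is easy to compute can be illustrated with a short sketch. It assumes an equally spaced design on [0, 1) whose number of points is a multiple of 2^max_level, in which case the Haar design columns are mutually orthogonal and each least squares coefficient is a simple ratio of sums. The function names (`haar_mother`, `haar_basis`, `fit_haar`, `predict_haar`) are hypothetical and this is not the article's own code.

```python
def haar_mother(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def haar_basis(max_level):
    """Constant function on [0, 1) plus psi_{j,k}(t) = psi(2^j t - k)
    for levels j = 0, ..., max_level - 1 and shifts k = 0, ..., 2^j - 1."""
    funcs = [lambda t: 1.0 if 0.0 <= t < 1.0 else 0.0]
    for j in range(max_level):
        for k in range(2 ** j):
            funcs.append(lambda t, j=j, k=k: haar_mother((2 ** j) * t - k))
    return funcs


def fit_haar(xs, ys, max_level):
    """Least squares coefficients under the orthogonal-design assumption:
    each coefficient is <column, y> / <column, column>."""
    coefs = []
    for f in haar_basis(max_level):
        col = [f(x) for x in xs]
        num = sum(c * y for c, y in zip(col, ys))
        den = sum(c * c for c in col)
        coefs.append(num / den)
    return coefs


def predict_haar(coefs, x, max_level):
    """Evaluate the fitted piecewise-constant regression function at x."""
    return sum(c * f(x) for c, f in zip(coefs, haar_basis(max_level)))


# Toy data: a step function sampled at 8 equally spaced midpoints.
n = 8
xs = [(i + 0.5) / n for i in range(n)]
ys = [1.0 if x < 0.5 else 3.0 for x in xs]
coefs = fit_haar(xs, ys, max_level=2)
```

For an irregular design the columns are generally no longer orthogonal, and the coefficients would have to come from solving the full normal equations instead.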


Author(s):  
Mohamed Reda Abonazel

This paper reviews two important problems in regression analysis (outliers and missing data), as well as some methods for handling them. Two applications are presented, with R code, to illustrate and study these methods, giving researchers practical guidance for dealing with these problems in regression modeling with R. Finally, we conduct a Monte Carlo simulation study to compare different methods of handling missing data in the regression model. Simulation results indicate that, under our simulation factors, the k-nearest neighbors method is the best method for estimating the missing values in regression models.
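The k-nearest neighbors idea can be sketched for the simplest setting, two covariates of which only the second has missing values. The paper works in R; the sketch below uses Python, and the function name `knn_impute`, the absolute-difference distance on the first covariate, and the toy rows are assumptions for illustration only.

```python
def knn_impute(rows, k=2):
    """Impute a missing second covariate from the k nearest complete rows.

    rows: list of (x1, x2) pairs where x2 may be None.
    Distance is the absolute difference in x1; the imputed value is the
    mean of x2 over the k nearest complete rows.
    """
    complete = [(x1, x2) for x1, x2 in rows if x2 is not None]
    if len(complete) < k:
        raise ValueError("not enough complete rows for the requested k")
    imputed = []
    for x1, x2 in rows:
        if x2 is None:
            nearest = sorted(complete, key=lambda r: abs(r[0] - x1))[:k]
            x2 = sum(r[1] for r in nearest) / k
        imputed.append((x1, x2))
    return imputed


# Hypothetical toy data: the third row is missing its second covariate.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, None), (10.0, 100.0)]
filled = knn_impute(rows, k=2)
```

Here the two nearest complete rows to x1 = 3.0 are those with x1 = 2.0 and x1 = 1.0, so the missing value is replaced by the mean of 20.0 and 10.0. With more covariates the distance would typically be computed over all jointly observed columns after scaling.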


Circulation ◽  
2012 ◽  
Vol 125 (suppl_10) ◽  
Author(s):  
Cari J Clark ◽  
Qi Wang ◽  
Hongfei Guo ◽  
Joyce T Bromberger ◽  
Peter Mancuso ◽  
...  

Introduction: Depressive symptoms have been linked to CVD risk factors, including metabolic dysregulation. One pathway by which depression may influence CVD risk is via alterations in adiponectin, an abundant adipocytokine with anti-inflammatory effects. This mechanism has not been studied in population-based samples. Hypothesis: The relationship of depressive symptoms with metabolic syndrome (MetSyn) and Framingham Risk Score (FRS) will be partly mediated by adiponectin. Methods: Participants were 581 women (61.3% white; 38.7% black) from the Chicago and Pittsburgh sites of the Study of Women’s Health Across the Nation. Adiponectin was measured from stored serum specimens and assayed in duplicate using a commercially available enzyme-linked immunosorbent assay and log transformed for analysis. Depressive symptoms were measured with the 20-item Center for Epidemiological Studies Depression Scale (CES-D); a standard cutoff (>16) was used to determine clinically significant symptoms. MetSyn was defined by ATP-III criteria and considered present if the participant had at least 3 of the following: waist circumference >88 cm; triglycerides >150 mg/dl; HDL cholesterol <50 mg/dl; blood pressure >130 mmHg systolic and/or 85 mmHg diastolic; impaired fasting glucose (>110 mg/dl) or diabetes. The FRS was defined by the participant’s age, smoking status, blood pressure, cholesterol, and use of anti-hypertensives. Logistic regression models were constructed to examine the cross-sectional relationship between depressive symptoms and MetSyn controlling for age, race and study site. A subsequent model included adiponectin to evaluate whether it attenuated the observed association. Linear regression models were used to conduct the same analysis with FRS as the outcome. Due to missing values, analytic sample sizes were 558 for MetSyn and 568 for FRS. Results: 147 women (25.3%) had elevated CES-D scores and 113 (20.7%) met criteria for MetSyn. Average FRS was 8.7 (sd=4.6) and the mean untransformed adiponectin value was 9.9 (sd=4.9) μg/mL. In models adjusted for age, race, and study site, women with high CES-D scores had increased odds of MetSyn (OR=1.64; 95% CI=1.03, 2.60) and a higher FRS (estimate=0.98; se=0.41, p<.02). Separate bivariate analyses showed that adiponectin was inversely related to CES-D scores (p=.03), MetSyn (p<.001) and FRS (p<.001). Subsequently including adiponectin in the regression models attenuated the associations between CES-D and MetSyn (OR=1.45; 95% CI=0.89, 2.36) and FRS (estimate=0.76; se=0.41; p=.06). Conclusions: Adiponectin may partially explain the relation between depressive symptoms and measures of cardiometabolic health. Longitudinal studies are needed to more fully understand the temporality of these associations. Supported by NIH/DHHS grants HL091290, AG012505, AG012546, MH59770, AG17719.
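The "at least 3 of 5" MetSyn rule described in the Methods can be written out as a small classifier. This is a sketch using the thresholds exactly as stated in the abstract; the class and field names are hypothetical, and reading the diastolic criterion as "> 85 mmHg" is an assumption, since the abstract gives the comparison sign only for the systolic value.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Hypothetical container for one participant's measurements."""
    waist_cm: float
    triglycerides_mg_dl: float
    hdl_mg_dl: float
    systolic_mmhg: float
    diastolic_mmhg: float
    fasting_glucose_mg_dl: float
    has_diabetes: bool


def metsyn_criteria_met(m):
    """Return one boolean per criterion, in the order listed in the abstract."""
    return [
        m.waist_cm > 88,
        m.triglycerides_mg_dl > 150,
        m.hdl_mg_dl < 50,
        # Assumption: the diastolic threshold is also a strict ">".
        m.systolic_mmhg > 130 or m.diastolic_mmhg > 85,
        m.fasting_glucose_mg_dl > 110 or m.has_diabetes,
    ]


def has_metsyn(m):
    """MetSyn is considered present when at least 3 of the 5 criteria hold."""
    return sum(metsyn_criteria_met(m)) >= 3


# Hypothetical participants: two criteria met versus three criteria met.
two_met = Measurements(90, 160, 55, 120, 80, 100, False)
three_met = Measurements(90, 160, 55, 120, 80, 100, True)
```

A participant with any of these measurements missing cannot be classified by this rule without further assumptions, which is consistent with the smaller analytic sample reported for MetSyn.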

