Spatial prediction of malaria prevalence in Papua New Guinea: a comparison of Bayesian decision network and multivariate regression modelling approaches for improved accuracy in prevalence prediction

2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Eimear Cleary ◽  
Manuel W. Hetzel ◽  
Paul Siba ◽  
Colleen L. Lau ◽  
Archie C. A. Clements

Abstract
Background Considerable progress towards controlling malaria has been made in Papua New Guinea through the national malaria control programme’s free distribution of long-lasting insecticidal nets, improved diagnosis with rapid diagnostic tests and improved access to artemisinin combination therapy. Predictive prevalence maps can help to inform targeted interventions and monitor changes in malaria epidemiology over time as control efforts continue. This study aims to compare the accuracy of malaria spatial risk prediction achieved by prevalence maps generated using Bayesian decision network (BDN) models and multilevel logistic regression models (a type of generalized linear model, GLM).
Methods Multilevel logistic regression models and BDN models were developed using 2010/2011 malaria prevalence survey data collected from 77 randomly selected villages to determine associations of Plasmodium falciparum and Plasmodium vivax prevalence with precipitation, temperature, elevation, slope (terrain aspect), enhanced vegetation index and distance to the coast. The predictive performance of the multilevel logistic regression and BDN models was compared using cross-validation.
Results Based on results obtained from GLMs, prevalence of P. falciparum was significantly associated with precipitation during the 3 driest months of the year, June to August (β = 0.015; 95% CI = 0.01–0.03), whereas P. vivax infection was associated with elevation (β = −0.26; 95% CI = −0.38 to −3.04), precipitation during the 3 driest months of the year (β = 0.01; 95% CI = −0.01–0.02) and slope (β = 0.12; 95% CI = 0.05–0.19). On cross-validation, BDNs predicted prevalence more accurately than GLMs for both P. falciparum (AUC = 0.49 for GLM versus 0.75 for BDN) and P. vivax (AUC = 0.56 for GLM versus 0.74 for BDN).
Conclusions BDNs provide a more flexible modelling framework than GLMs and may have a better predictive performance when developing malaria prevalence maps due to the multiple interacting factors that drive malaria prevalence in different geographical areas. When developing malaria prevalence maps, BDNs may be particularly useful in predicting prevalence where spatial variation in climate and environmental drivers of malaria transmission exists, as is the case in Papua New Guinea.
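The AUC figures quoted in the abstract above summarise how well each model ranks infected above uninfected observations. The following is a minimal sketch of that metric only, not the authors' cross-validation code; the function name `auc` and the toy labels and scores are hypothetical and chosen purely for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and predicted prevalence scores (not study data).
labels = [1, 1, 1, 0, 0, 0]
informative_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
uninformative_scores = [0.5] * 6

print(auc(labels, informative_scores))
print(auc(labels, uninformative_scores))
```

An AUC near 0.5 means the model ranks cases no better than chance, which is why the reported GLM values of 0.49 and 0.56 indicate weak discrimination compared with the BDN values of 0.75 and 0.74.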

2016 ◽  
Vol 19 (3) ◽  
pp. 385-397 ◽  
Author(s):  
Sepedeh Gholizadeh ◽  
Abbas Moghimbeigi ◽  
Jalal Poorolajal ◽  
Mohammadali Khjeian ◽  
Fatemeh Bahramian ◽  
...  

Stats ◽  
2021 ◽  
Vol 4 (3) ◽  
pp. 665-681
Author(s):  
Luca Insolia ◽  
Ana Kenney ◽  
Martina Calovi ◽  
Francesca Chiaromonte

High-dimensional classification studies have become widespread across various domains. The large dimensionality, coupled with the possible presence of data contamination, motivates the use of robust, sparse estimation methods to improve model interpretability and ensure the majority of observations agree with the underlying parametric model. In this study, we propose a robust and sparse estimator for logistic regression models, which simultaneously tackles the presence of outliers and/or irrelevant features. Specifically, we propose the use of L0-constraints and mixed-integer conic programming techniques to solve the underlying double combinatorial problem in a framework that allows one to pursue optimality guarantees. We use our proposal to investigate the main drivers of honey bee (Apis mellifera) loss through the annual winter loss survey data collected by the Pennsylvania State Beekeepers Association. Previous studies mainly focused on predictive performance, however our approach produces a more interpretable classification model and provides evidence for several outlying observations within the survey data. We compare our proposal with existing heuristic methods and non-robust procedures, demonstrating its effectiveness. In addition to the application to honey bee loss, we present a simulation study where our proposal outperforms other methods across most performance measures and settings.


PLoS ONE ◽  
2019 ◽  
Vol 14 (11) ◽  
pp. e0225427 ◽  
Author(s):  
Amjad Ali ◽  
Sabz Ali ◽  
Sajjad Ahmad Khan ◽  
Dost Muhammad Khan ◽  
Kamran Abbas ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Nicholas Siame Adam ◽  
Halima S. Twabi ◽  
Samuel O.M. Manda

Abstract
Background Multilevel logistic regression models are widely used in health sciences research to account for clustering in multilevel data when estimating the effects of individual-level and cluster-level covariates on binary subject outcomes. Several measures for quantifying between-cluster heterogeneity have been proposed. This study compared the performance of heterogeneity measures based on between-cluster variance (the Intra-class Correlation Coefficient (ICC) and the Median Odds Ratio (MOR)) and heterogeneity measures based on cluster-level covariates (the 80% Interval Odds Ratio (IOR-80) and the Sorting Out Index (SOI)).
Methods We used several simulated datasets from a two-level logistic regression model to assess the performance of the four clustering measures. We also compared the four measures empirically in an analysis of childhood anemia in Malawi, investigating the importance of unexplained heterogeneity between communities and the effect of community geographic type (rural vs urban).
Results Our findings showed that the estimates of SOI and ICC were generally unbiased with at least 10 clusters and a cluster size of at least 20. On the other hand, estimates of MOR and IOR-80 were less accurate with 50 or fewer clusters regardless of the cluster size. The performance of the four clustering measures improved with increasing number of clusters and cluster size at all cluster variances. In the analysis of childhood anemia, the estimate of the between-community variance was 0.455, and the effect of community geographic type (rural vs urban) had an odds ratio (OR) of 1.21 (95% CI: 0.97, 1.52). The resulting estimates were: ICC, 0.122 (indicative of low homogeneity of childhood anemia in the same community); MOR, 1.898 (indicative of large unexplained heterogeneity); IOR-80, 0.345–3.978; and SOI, 56.7% (the latter two implying that the between-community heterogeneity was more significant in explaining the variations in childhood anemia than the estimated effect of community geographic type (rural vs urban)).
Conclusion At least 300 clusters with sizes of at least 50 would be adequate to estimate the strength of clustering in multilevel logistic regression with negligible bias. We recommend using the SOI to assess unexplained heterogeneity between clusters when the interest also involves the effect of cluster-level covariates; otherwise, the usual intra-cluster correlation coefficient would suffice in multilevel logistic regression analyses.
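The two variance-based measures named in this abstract can be computed directly from the between-cluster variance. The sketch below uses the standard latent-variable ICC and the standard MOR formula for a two-level logistic model; it is an illustration under those assumptions rather than the authors' code, and the function names are hypothetical. The only input taken from the abstract is the reported between-community variance of 0.455.

```python
import math
from statistics import NormalDist


def icc_latent(var_between):
    """Latent-variable ICC for a two-level logistic model.

    The level-1 residual variance of the logistic distribution is pi^2 / 3.
    """
    return var_between / (var_between + math.pi ** 2 / 3)


def median_odds_ratio(var_between):
    """Median odds ratio: exp(sqrt(2 * variance) * z), with z the 75th
    percentile of the standard normal distribution."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2 * var_between) * z75)


sigma2 = 0.455  # between-community variance reported in the abstract
print(round(icc_latent(sigma2), 3))
print(round(median_odds_ratio(sigma2), 3))
```

The results land close to the reported ICC of 0.122 and MOR of 1.898; small differences in the last digit are to be expected, presumably because the variance is quoted to three decimals.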


Author(s):  
Moza S. Al-Balushi ◽  
Mohammed S. Ahmed ◽  
M. Mazharul Islam

In this paper, multilevel logistic regression models are developed to examine the hierarchical effects of contraceptive use and its selected determinants in Oman, using the 2008 Oman National Reproductive Health Survey (ONRHS). Single-level and multilevel logistic regression models are compared to assess the plausibility of multilevel effects on contraceptive use. The multilevel analysis found real multilevel variation among contraceptive users in Oman, and the results indicate that a multilevel logistic regression model fits better than ordinary multiple logistic regression models. Overall, the study found that women’s age, education, number of living children and region of residence are important factors affecting contraceptive use in Oman. The regional variation in the effects of women’s age, women’s education and number of living children further implies that there are considerable differences in modern contraceptive use among regions, and that a model with a random coefficient or slope is more appropriate for explaining the regional variation than a model with fixed coefficients or without random effects. The study suggests that researchers should use multilevel models rather than traditional regression methods when their data structure is hierarchical.
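The distinction the abstract draws between a model without random effects, a random-intercept model and a random-coefficient (slope) model can be written out as follows. This is the generic textbook form, not the specification used in the paper; the indices (woman $i$ in region $j$), the single covariate $x_{ij}$ and all symbols are assumptions introduced for illustration.

```latex
% Single-level logistic regression: no regional effects
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1 x_{ij}

% Random-intercept model: regions differ in baseline log-odds
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1 x_{ij} + u_{0j},
\qquad u_{0j} \sim N(0, \sigma_{u0}^2)

% Random-coefficient model: the covariate effect also varies by region
\operatorname{logit}(\pi_{ij}) = \beta_0 + (\beta_1 + u_{1j})\, x_{ij} + u_{0j},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{pmatrix}
\right)
```

Here $\pi_{ij}$ is the probability that woman $i$ in region $j$ uses contraception. A non-zero $\sigma_{u1}^2$ corresponds to the abstract's statement that the effect of a covariate such as age or education differs across regions.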


2017 ◽  
Vol 95 (10) ◽  
pp. 695-705B ◽  
Author(s):  
Manuel W Hetzel ◽  
Justin Pulford ◽  
Yangta Ura ◽  
Sharon Jamea-Maiasa ◽  
Anthony Tandrapah ◽  
...  

2021 ◽  
Vol 8 ◽  
Author(s):  
Mengsha Sun ◽  
Qiyu Bo ◽  
Bing Lu ◽  
Xiaodong Sun ◽  
Minwen Zhou

Objective: This study aims to investigate the association of sleep duration with vision impairment (VI) in middle-aged and elderly adults. Methods: This cross-sectional study used the data from the baseline survey of the China Health and Retirement Longitudinal Study (CHARLS) 2011–2012, a national survey of adults aged 45 years or older. Weighted multilevel logistic regression models were used to evaluate the association between self-reported sleep duration and VI. Results: Of the 13,959 survey respondents, a total of 4,776 (34.2%) reported VI. The prevalence of short (≤6 h/night) and long (>8 h/night) sleep durations was higher among respondents with VI than those without VI (P < 0.001). Multilevel logistic regression models showed that compared with a sleep duration of 6–8 h/night, a sleep duration of ≤6 h/night was associated with a 1.45-fold [95% confidence interval (CI) = 1.34–1.56] higher VI risk, and a sleep duration of >8 h/night was associated with a 1.18-fold (95% CI = 1.03–1.34) higher VI risk, after adjusting for sociodemographic data, lifestyle factors, and health conditions. Vision impairment was associated with short sleep duration in respondents from all age or gender categories. However, VI was associated with long sleep duration in respondents from the elderly or female categories. The association between VI and long sleep duration disappeared in respondents of middle-aged or male categories. Conclusions: The potential impact of sleep on the risk of visual functions requires further attention. A more comprehensive and integrated health care and rehabilitation system covering vision and sleep is also needed.

