XGBoost-Based Framework for Smoking-Induced Noncommunicable Disease Prediction

Author(s):  
Khishigsuren Davagdorj ◽  
Van Huy Pham ◽  
Nipon Theera-Umpon ◽  
Keun Ho Ryu

Smoking-induced noncommunicable diseases (SiNCDs) have become a significant threat to public health and a cause of death globally. In the last decade, numerous studies have used artificial intelligence techniques to predict the risk of developing SiNCDs. However, determining the most significant features and developing interpretable models remain challenging in such systems. In this study, we propose an efficient extreme gradient boosting (XGBoost) based framework combined with a hybrid feature selection (HFS) method for SiNCD prediction among the general population in South Korea and the United States. First, HFS is performed in three stages: (I) significant features are selected by t-test and chi-square test; (II) multicollinearity analysis is used to obtain dissimilar features; (III) the final selection of the most representative features is based on the least absolute shrinkage and selection operator (LASSO). The selected features are then fed into the XGBoost predictive model. The experimental results show that the proposed model outperforms several existing baseline models. In addition, the proposed model reports important features, which enhances the interpretability of SiNCD prediction. Consequently, the XGBoost-based framework is expected to contribute to the early diagnosis and prevention of SiNCDs in public health.
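The first two HFS stages can be illustrated with a minimal sketch, assuming binary (0/1) features and a binary outcome. It covers only the chi-square filter of stage (I) and a correlation filter for stage (II); the t-test, the LASSO stage, and the XGBoost model are omitted. The function names, the toy data, and the 0.8 correlation cutoff are hypothetical and are not taken from the study.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1 = 3.841


def chi2_2x2(feature, label):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]  # observed[feature value][label value]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(r) for r in observed]
    col = [observed[0][j] + observed[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def stage_one(features, label):
    """Stage (I): keep features significantly associated with the label."""
    return [name for name, values in features.items()
            if chi2_2x2(values, label) > CHI2_CRIT_DF1]


def stage_two(features, names, max_abs_corr=0.8):
    """Stage (II): drop a feature if it is highly correlated with one already kept."""
    kept = []
    for name in names:
        if all(abs(pearson(features[name], features[k])) < max_abs_corr for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical): 12 subjects, outcome 1 = disease present.
label = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
features = {
    "smoker":       [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "heavy_smoker": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "older_age":    [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    "noise":        [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
}

significant = stage_one(features, label)
selected = stage_two(features, significant)
print(significant, selected)
```

In this toy data the uninformative feature is removed in stage (I) and the near-duplicate of `smoker` is removed in stage (II); a real pipeline would pass the surviving features on to LASSO and then to the classifier.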


10.2196/20924 ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. e20924 ◽  
Author(s):  
James Francis Oehmke ◽  
Theresa B Oehmke ◽  
Lauren Nadya Singh ◽  
Lori Ann Post

Background: SARS-CoV-2, the novel coronavirus that causes COVID-19, is a global pandemic with higher mortality and morbidity than any other virus in the last 100 years. Without public health surveillance, policy makers cannot know where and how the disease is accelerating, decelerating, and shifting. Unfortunately, existing models of COVID-19 contagion rely on parameters such as the basic reproduction number and use static statistical methods that do not capture all the relevant dynamics needed for surveillance. Existing surveillance methods use data that are subject to significant measurement error and other contaminants. Objective: The aim of this study is to provide a proof of concept of the creation of surveillance metrics that correct for measurement error and data contamination to determine when it is safe to ease pandemic restrictions. We applied state-of-the-art statistical modeling to existing internet data to derive the best available estimates of the state-level dynamics of COVID-19 infection in the United States. Methods: Dynamic panel data (DPD) models were estimated with the Arellano-Bond estimator using the generalized method of moments. This statistical technique enables control of various deficiencies in a data set. The validity of the model and statistical technique was tested. Results: A Wald chi-square test of the explanatory power of the statistical approach indicated that it is valid (χ²₁₀=1489.84, P<.001), and a Sargan chi-square test indicated that the model identification is valid (χ²₉₄₆=935.52, P=.59). The 7-day persistence rate for the week of June 27 to July 3 was 0.5188 (P<.001), meaning that every 10,000 new cases in the prior week were associated with 5188 cases 7 days later. For the week of July 4 to 10, the 7-day persistence rate increased by 0.2691 (P=.003), indicating that every 10,000 new cases in the prior week were associated with 7879 new cases 7 days later. Applied to the reported number of cases, these results indicate an increase of almost 100 additional new cases per day per state for the week of July 4-10. This signifies an increase in the reproduction parameter in the contagion models and corroborates the hypothesis that economic reopening without applying best public health practices is associated with a resurgence of the pandemic. Conclusions: DPD models successfully correct for measurement error and data contamination and are useful to derive surveillance metrics. The opening of America involves two certainties: the country will be COVID-19–free only when there is an effective vaccine, and the “social” end of the pandemic will occur before the “medical” end. Therefore, improved surveillance metrics are needed to inform leaders of how to open sections of the United States more safely. DPD models can inform this reopening in combination with the extraction of COVID-19 data from existing websites.
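The persistence figures in the Results can be checked with a few lines of arithmetic. The sketch below only reproduces the two "per 10,000 cases" numbers from the reported rates; it does not estimate a dynamic panel model, and the function name is hypothetical.

```python
def cases_per_10k(persistence_rate, prior_cases=10_000):
    """New cases 7 days later associated with `prior_cases` cases in the prior week."""
    return round(persistence_rate * prior_cases)


base_rate = 0.5188   # 7-day persistence rate, week of June 27 to July 3
increase = 0.2691    # reported increase for the week of July 4 to 10
july_rate = base_rate + increase

print(cases_per_10k(base_rate))   # cases associated with 10,000 prior cases, first week
print(cases_per_10k(july_rate))   # the same quantity after the increase
```

The second figure follows because the increase is additive: 0.5188 + 0.2691 = 0.7879.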


Author(s):  
Irfan Ullah Khan ◽  
Nida Aslam ◽  
Malak Aljabri ◽  
Sumayh S. Aljameel ◽  
Mariam Moataz Aly Kamaleldin ◽  
...  

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, namely Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), together with a DL model (containing six layers with ReLU activation and an output layer with sigmoid activation), to predict the mortality rate in COVID-19 cases. Models were trained using data on confirmed COVID-19 patients from 146 countries. A comparative analysis of the ML and DL models was performed using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results show that, with the reduced feature set, the proposed model improves on the baseline study in the literature.


2022 ◽  
pp. 073112142110677
Author(s):  
Rebecca Farber ◽  
Joseph Harris

COVID-19 has focused global attention on disease spread across borders. But how has research on infectious and noncommunicable disease figured into the sociological imagination historically, and to what degree has American medical sociology examined health problems beyond U.S. borders? Our 35-year content analysis of 2,588 presentations in the American Sociological Association’s (ASA) Section on Medical Sociology and 922 articles within the section’s official journal finds less than 15 percent of total research examined contexts outside the United States. Research on three infectious diseases in the top eight causes of death in low-income countries (diarrheal disease, malaria, and tuberculosis [TB]) and emerging diseases—Ebola, Middle East Respiratory Syndrome (MERS), and Severe Acute Respiratory Syndrome (SARS)—was nearly absent, as was research on major noncommunicable diseases. Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS) received much more focus, although world regions hit hardest received scant attention. Interviews suggest a number of factors shape geographic foci of research, but this epistemic parochialism may ultimately impoverish sociological understanding of illness and disease.


Author(s):  
Matthew W Parker ◽  
Diana Sobieraj ◽  
Mary Beth Farrell ◽  
Craig I Coleman

Background: Little has been published on the practice of echocardiography (echo) in the United States. We used the Intersocietal Accreditation Commission-Echocardiography (IAC-Echo) applications database to describe the personnel in echo laboratories seeking accreditation. Methods: We used de-identified data provided on IAC-Echo applications to characterize facilities by hospital association, census region, annual volume, number of sites, previous accreditation, and numbers of physicians and sonographers, as well as National Board of Echocardiography (NBE) testamur status of physicians and registered credential status of sonographers. We categorized Medical Directors by board certification in cardiovascular diseases, internal medicine, other specialty, or none. Medical Director echo training could be formal Level 2 or 3 or experiential by ≥3 years of practice. Frequencies, means, and medians were compared between groups using the chi-square test, t-test, or Mann-Whitney test, respectively. Results: From 2011 to 2013, 1926 echo labs representing 10618 physicians and 6870 sonographers applied for IAC-Echo accreditation or re-accreditation. The majority of medical directors were board certified in cardiovascular diseases; 34.1% of medical directors and 27.2% of staff physicians held NBE testamur status, and 79.5% of sonographers held registered credentials. Most echo labs were in the Northeast or South census regions, had an average of 1.75 sites, and were based outside of hospitals (Table). Compared to nonhospital echo labs, medical directors of hospital-based echo labs were more likely to be Level 3 trained (19.8% versus 30.8%, p<0.01) and be NBE testamurs (28.9% versus 45.6%, p<0.01). Markers of echo lab size, region, previous accreditation, and credentialed sonographers were associated with accreditation versus delay decisions; there was a trend toward accreditation among facilities with NBE medical directors. Conclusion: Among facilities seeking IAC-Echo accreditation, a minority of echo physicians hold NBE testamur status. Hospital and nonhospital facilities differ in the credentials of their personnel.


Stroke ◽  
2015 ◽  
Vol 46 (suppl_1) ◽  
Author(s):  
Opeolu Adeoye ◽  
Dawn Kleindorfer

Background: In 2013, the NIH Stroke Trials Network (StrokeNET) was established to maximize efficiencies in stroke clinical trials. Successful recruitment in future trials was required for participating sites. A high volume of cases treated is a surrogate for the potential to recruit. Among Medicare-eligible acute ischemic stroke (AIS) cases, we estimated the IV rt-PA and endovascular embolectomy treatment rates at StrokeNET Regional Coordinating Centers and their partner hospitals compared with non-StrokeNET hospitals in the United States (US). Methods: We used demographics and IV rt-PA and embolectomy rates in the 2013 Medicare Provider and Analysis Review (MEDPAR) dataset. ICD-9 codes 433.xx, 434.xx and 436 identified AIS cases. ICD-9 code 99.10 defined rt-PA treatment and ICD-9 code 39.74 defined embolectomy. Demographics and treatment rates at StrokeNET and non-StrokeNET sites were compared using t-test for proportions and Chi-square test for categorical variables as appropriate. Results: Of 386,157 AIS primary diagnosis discharges, 5.1% received IV rt-PA and 0.8% had embolectomy (Table). By June 6, 2014, StrokeNET comprised 247 acute care hospitals that discharged 48,946 (13%) out of 386,157 AIS cases. rt-PA (7.4% vs 4.8%) and embolectomy (1.9% vs 0.6%) treatment rates were higher at StrokeNET hospitals. In 2013, 36% of StrokeNET hospitals treated more than 20 AIS cases with rt-PA or embolectomy compared with 6% of non-StrokeNET hospitals (P<0.0001). Conclusions: StrokeNET hospitals treat more AIS cases with acute reperfusion therapies. Thus, StrokeNET could successfully recruit in acute reperfusion clinical trials depending on study size, capture of eligible patients and the number of competing trials. We likely underestimated treatment rates due to not accounting for drip-and-ship and non-Medicare cases. To further enhance enrollments in large acute reperfusion phase 3 trials, partnership with high volume non-StrokeNET hospitals may be warranted.


Author(s):  
Brain Guntoro ◽  
Kasih Purwati

Hypertension is one of the leading causes of death and disability in the world, contributing to nearly 9.4 million deaths from cardiovascular disease each year. Because hypertension can cause undesirable effects, it needs good management, one form of which is a hypertension diet. Following a hypertension diet requires knowledge, and a lack of knowledge can increase risk factors for hypertension. This study aims to determine the relationship between the level of knowledge about the hypertension diet and the incidence of hypertension in the elderly at the Baloi Permai Public Health Center, Batam City. The study used an analytic observational design with a cross-sectional approach and was conducted at the Baloi Permai Public Health Center, Batam City, in 2018. Total sampling was used, yielding a sample of 64 people determined by inclusion and exclusion criteria. The results were analyzed with frequency distributions and then tested with the chi-square test. Of the 64 respondents, 41 (64.1%) had a good level of knowledge, 48 (75.0%) were aged 60-70 years, 27 (42.2%) had high school as their last level of education, and 40 (62.5%) worked as entrepreneurs. Forty respondents (62.5%) had normal blood pressure and 24 (37.5%) had hypertension. Twenty-one respondents (32.8%) had a family history of hypertension and 43 (67.2%) did not. The chi-square test gave a significance value of p = 0.009. Because this p-value is smaller than the significance level (α = 0.05), H0 is rejected and Ha is accepted. It can therefore be concluded that there is a significant relationship between the level of knowledge about the hypertension diet and the incidence of hypertension in the elderly at the Baloi Permai Public Health Center, Batam City, in 2018.


2015 ◽  
Vol 28 (3) ◽  
pp. 319-326 ◽  
Author(s):  
Vanessa Ribeiro dos Santos ◽  
Diego Giulliano Destro Christofaro ◽  
Igor Conterato Gomes ◽  
Ricardo Ribeiro Agostinete ◽  
Ismael Forte Freitas Júnior ◽  
...  

OBJECTIVE: To analyze whether sarcopenia is associated with sociodemographic factors and chronic noncommunicable diseases in adults aged 80 years and older. METHODS: The sample consisted of 120 adults aged 80 to 95 years (83.4±2.9 years) from the city of Presidente Prudente (São Paulo, Brazil), of which 76 were females (83.4±3.0 years) and 44 were males (83.4±2.6 years). The sociodemographic and epidemiological factors studied were: age stratum, gender, marital status, education level, chronic noncommunicable diseases, ethnicity, and nutritional status. Body composition was determined by Dual-Energy X-Ray Absorptiometry and sarcopenia was identified by the appendicular lean mass ratio ((upper limb lean mass + lower limb lean mass [kg]) / height² [m²]). The Chi-square test analyzed whether sarcopenia was associated with sociodemographic and epidemiological factors and binary logistic regression expressed the magnitude of the associations. The data were analyzed with the Statistical Package for the Social Sciences (17.0) at a significance level of 5%. RESULTS: The factors associated with sarcopenia were gender, age, nutritional status, and osteopenia/osteoporosis. CONCLUSION: The factors gender, age, nutritional status, and osteopenia/osteoporosis are independently associated with sarcopenia in adults aged 80 years and older.
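The appendicular lean mass ratio described in the Methods is the sum of upper and lower limb lean mass in kilograms divided by height in metres squared. A minimal sketch of that calculation follows; the function name and the example values are hypothetical, and no sarcopenia cutoff is applied because the abstract does not state one.

```python
def appendicular_lean_mass_index(upper_limb_kg, lower_limb_kg, height_m):
    """(upper limb lean mass + lower limb lean mass) [kg] / height^2 [m^2]."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return (upper_limb_kg + lower_limb_kg) / height_m ** 2


# Hypothetical participant: 5.0 kg upper limb and 15.0 kg lower limb lean mass, 1.60 m tall.
example_index = appendicular_lean_mass_index(5.0, 15.0, 1.60)
print(f"{example_index:.2f} kg/m^2")
```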


2021 ◽  
pp. 01-06
Author(s):  
Unnati Saxena ◽  
Debdipta Bose ◽  
Shruti Saha ◽  
Nithya J Gogtay ◽  
Urmila M Thatte

The present audit was carried out with the objective of evaluating warning letters (WLs) issued to trial sponsors, clinical investigators and institutional review boards (IRBs) by the United States Food and Drug Administration during a six-year period and comparing them with two similar earlier audits. WLs were reviewed and classified by stakeholder and further categorised by predefined violation themes. The chi-square test was performed for trend analysis of WLs. A total of 62 WLs were issued to the three stakeholders. The largest number of WLs was issued to clinical investigators (36/62, 58.06%), followed by sponsors (19/62, 30.64%), and the fewest to IRBs (7/62, 11.29%). Among sponsors, lack of standard operating procedures for the monitoring, receipt, evaluation and reporting of post-marketing adverse drug events was the most common violation theme (8/19, 42.1%). Among clinical investigators, deviation from the investigational plan was the most common violation theme (31/36, 86.11%). For IRBs, inadequate documentation was the most common violation theme (6/7, 85.71%). We saw an overall reduction in the number of WLs issued to the stakeholders. Thus, we identified multiple areas on which each stakeholder should work for improvement.


Author(s):  
Anon Khunakorncharatphong ◽  
Nareerut Pudpong ◽  
Rapeepong Suphanchaimat ◽  
Sataporn Julchoo ◽  
Mathudara Phaiyarom ◽  
...  

Global morbidity associated with noncommunicable diseases (NCDs) has increased over the years. In Thailand, NCDs are among the most prevalent of all health problems, and affect both Thai citizens and non-Thai residents, such as expatriates. Key barriers to NCD health service utilization among expatriates include cultural and language differences. This study aimed to describe the situation and factors associated with NCD service utilization among expatriate patients in Thailand. We employed a cross-sectional study design and used the service records of public hospitals from the Ministry of Public Health (MOPH) during the fiscal years 2014–2018. The focus of this study was on expatriates, defined as those who had stayed in Thailand for at least three months. The results showed that, after 2014, there was an increasing trend in NCD service utilization among expatriate patients for both outpatient (OP) and inpatient (IP) care. For OP care, expatriates from Cambodia, Lao PDR, Myanmar, and Vietnam (CLMV) had lower odds of NCD service utilization than non-CLMV expatriates (p-value < 0.001). For IP care, males tended to have greater odds of NCD service utilization compared with females (AdjOR = 1.35, 95% CI = 1.05–1.74, p-value = 0.019). Increasing age showed a significant association with NCD service utilization. In addition, there was a growing trend in NCD prevalence among expatriate patients. This issue points to a need for prompt public health action if Thailand aims to have all people on its soil protected with universal health coverage for their well-being, as stipulated in the Sustainable Development Goals. Future studies should collect primary evidence on expatriates at the household level. Additional research should also examine other societal factors that may provide better insight into access to healthcare for NCDs, such as socioeconomic status, beliefs, and attitudes.
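The adjusted odds ratio, its confidence interval, and the p-value reported for inpatient care can be checked against one another. The sketch below assumes the interval is a standard Wald interval on the log-odds scale (an assumption, since the abstract does not say how it was computed) and backs out the implied two-sided p-value; the function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Two-sided Wald p-value implied by an odds ratio and its 95% CI."""
    # On the log-odds scale a Wald interval is estimate +/- z_crit * SE,
    # so the interval width gives the standard error.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


p_value = wald_p_from_ci(1.35, 1.05, 1.74)
print(f"implied p-value: {p_value:.3f}")
```

Under this assumption the implied p-value comes out at roughly 0.02, in line with the reported 0.019; small differences are expected because the published odds ratio and interval are rounded.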

