P5710 Clinical applications of machine learning for prediction of incident atrial fibrillation from the general population: a nationwide cohort study

Abstract Background: To evaluate the ability of machine learning algorithms to predict incident atrial fibrillation (AF) in the general population using health examination items. Methods: We included 483,343 subjects who received national health examinations from the Korean National Health Insurance Service-based National Sample Cohort (NHIS-NSC). We trained a deep neural network (DNN) model, a deep learning method, and a decision tree (DT) model, a machine learning method, using clinical variables and health examination items (including age, sex, body mass index, history of heart failure, hypertension or diabetes, baseline creatinine, and smoking and alcohol intake habits) to predict incident AF on a training dataset of 341,771 subjects constructed from the NHIS-NSC database. The DNN and DT were validated on an independent test dataset of the remaining 141,572 subjects. The c-indices of the DNN and DT for prediction of incident AF were compared with that of a conventional logistic regression model. Results: During 1,874,789 person-years (mean ± standard deviation age 47.7 ± 14.4 years, 49.6% male), 3,282 subjects with incident AF were observed. In the validation dataset, 1,139 subjects with incident AF were observed. The c-indices of the DNN and DT for incident AF prediction were 0.828 [0.819–0.836] and 0.835 [0.825–0.844], respectively, both significantly higher (p<0.01) than that of the conventional logistic regression model (c-index = 0.789 [0.784–0.794]). Conclusions: Machine learning using simple clinical variables and health examination items was helpful for predicting incident AF in the general population. A prospective study is warranted to develop individualized precision medicine.
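The c-index reported above can be illustrated with a small sketch. It assumes a binary outcome (incident AF yes/no), in which case the c-index equals the area under the ROC curve; the abstract does not say whether a time-to-event version was used, and the function name `c_index` and the toy data are purely illustrative.

```python
def c_index(scores, outcomes):
    """Concordance index for a binary outcome.

    Fraction of (event, non-event) pairs in which the event case
    received the higher predicted score; tied scores count as 0.5.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 3 events, 3 non-events.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_outcomes = [1, 1, 1, 0, 0, 0]
toy_c = c_index(toy_scores, toy_outcomes)
print(round(toy_c, 3))
```

The pairwise loop is quadratic in the number of subjects, which is fine for a demonstration but would be replaced by a rank-based computation on a cohort of this size.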


Work absence and multimorbidity in Portugal: results from the 1st National Health Examination Survey

European Journal of Public Health ◽

10.1093/eurpub/ckaa166.1390 ◽

2020 ◽

Vol 30 (Supplement_5) ◽

Author(s):

J Matos ◽

C Matias Dias ◽

A Félix

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Chronic Diseases ◽

National Health ◽

Logistic Regression Model ◽

Health Examination ◽

Work Absence ◽

Health Examination Survey ◽

Absence From Work ◽

The Impact

Abstract Background: Studies on the impact of multimorbidity on work absence indicate that the number and type of chronic diseases may increase absenteeism and that the risk of absence from work is higher in people with two or more chronic diseases. This study analyzed the association between multimorbidity and greater frequency and duration of work absence in the Portuguese population aged 25 to 65 during 2015. Methods: This is an epidemiological, observational, cross-sectional study with an analytical component, using data from the 1st National Health Examination Survey. Univariate, bivariate and multivariate analyses were performed on the variables under study, and a multivariate logistic regression model was constructed. Results: The prevalence of absenteeism was 55.1%. Education was associated with absence from work (p = 0.0157), as was professional activity (p = 0.0086). No association could be verified between absence from work and the presence of chronic diseases (p = 0.9358) or of multimorbidity (p = 0.4309). The prevalence of multimorbidity was 31.8%. Multimorbidity was associated with age (p < 0.0001), education (p < 0.001) and yield (p = 0.0009). The number of days of absence from work did not increase with the number of chronic diseases. In the optimized logistic regression model, the only variables associated with work absence were age (p = 0.0391) and education (p = 0.0089). Conclusions: The scientific evidence generated will contribute to the current discussion on the need for the health and social security systems to develop policies for patients with multimorbidity. Key messages: The prevalence of absenteeism and multimorbidity in Portugal was 55.1% and 31.8%, respectively. In the optimized model, age and education were associated with work absence.


Prediction of all-cause mortality in coronary artery disease patients with atrial fibrillation based on machine learning models

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-02314-w ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Xinyun Liu ◽

Jicheng Jiang ◽

Lili Wei ◽

Wenlu Xing ◽

Hailong Shang ◽

...

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Coronary Artery Disease ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Support Vector ◽

Vector Machines ◽

All Cause Mortality ◽

Artery Disease

Abstract Background: Machine learning (ML) can include more diverse and more complex variables to construct models. This study aimed to develop models based on ML methods to predict all-cause mortality in coronary artery disease (CAD) patients with atrial fibrillation (AF). Methods: A total of 2037 CAD patients with AF were included in this study. Three ML methods were used: regularized logistic regression, random forest, and support vector machines. Fivefold cross-validation was used to evaluate model performance. Performance was quantified by calculating the area under the curve (AUC) with 95% confidence intervals (CI), sensitivity, specificity, and accuracy. Results: After univariate analysis, 24 variables with statistically significant differences were included in the models. The AUCs of the regularized logistic regression, random forest, and support vector machines models were 0.732 (95% CI 0.649–0.816), 0.728 (95% CI 0.642–0.813), and 0.712 (95% CI 0.630–0.794), respectively. The regularized logistic regression model presented the highest AUC (0.732 vs 0.728 vs 0.712), specificity (0.699 vs 0.663 vs 0.668), and accuracy (0.936 vs 0.935 vs 0.935) among the three models. However, no statistical differences were observed between the receiver operating characteristic (ROC) curves of the three models (all P > 0.05). Conclusion: Considering all aspects of model performance, the regularized logistic regression model was recommended for use in clinical practice.
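As a rough illustration of the evaluation described above, the sketch below splits 2037 samples into five folds and computes sensitivity, specificity, and accuracy from predicted labels. The contiguous, unshuffled split is a simplifying assumption (the abstract does not describe how folds were formed), and the helper names and toy labels are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Return k (train, test) index pairs with near-equal contiguous test folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


folds = k_fold_indices(2037, k=5)
fold_test_sizes = [len(test) for _, test in folds]

# Toy labels (hypothetical): 2 events among 10 patients.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)
print(fold_test_sizes, toy_metrics)
```

In practice the indices would usually be shuffled, and often stratified by outcome, before splitting, so that each fold has a similar event rate.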


Logistic Regression Model for Loan Prediction: A Machine Learning Approach

10.1109/eti4.051663.2021.9619201 ◽

2021 ◽

Author(s):

Richa Manglani ◽

Anuja Bokhare

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Learning Approach ◽

Machine Learning Approach


Machine-Learning vs. Expert-Opinion Driven Logistic Regression Modelling for Predicting 30-Day Unplanned Rehospitalisation in Preterm Babies: A Prospective, Population-Based Study (EPIPAGE 2)

Frontiers in Pediatrics ◽

10.3389/fped.2020.585868 ◽

2021 ◽

Vol 8 ◽

Author(s):

Robert A. Reed ◽

Andrei S. Morgan ◽

Jennifer Zeitlin ◽

Pierre-Henri Jarreau ◽

Héloïse Torchin ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Regression Model ◽

Expert Opinion ◽

Logistic Regression Model ◽

Population Based ◽

Regression Modelling ◽

Preterm Babies ◽

Logistic Regression Modelling

Introduction: Preterm babies are a vulnerable population that experience significant short and long-term morbidity. Rehospitalisations constitute an important, potentially modifiable adverse event in this population. Improving the ability of clinicians to identify those patients at the greatest risk of rehospitalisation has the potential to improve outcomes and reduce costs. Machine-learning algorithms can provide potentially advantageous methods of prediction compared to conventional approaches like logistic regression. Objective: To compare two machine-learning methods (least absolute shrinkage and selection operator (LASSO) and random forest) to expert-opinion driven logistic regression modelling for predicting unplanned rehospitalisation within 30 days in a large French cohort of preterm babies. Design, Setting and Participants: This study used data derived exclusively from the population-based prospective cohort study of French preterm babies, EPIPAGE 2. Only those babies discharged home alive and whose parents completed the 1-year survey were eligible for inclusion in our study. All predictive models used a binary outcome, denoting a baby's status for an unplanned rehospitalisation within 30 days of discharge. Predictors included those quantifying clinical, treatment, maternal and socio-demographic factors. The predictive abilities of models constructed using LASSO and random forest algorithms were compared with a traditional logistic regression model. The logistic regression model comprised 10 predictors, selected by expert clinicians, while the LASSO and random forest included 75 predictors. Performance measures were derived using 10-fold cross-validation. Performance was quantified using area under the receiver operator characteristic curve, sensitivity, specificity, Tjur's coefficient of determination and calibration measures. Results: The rate of 30-day unplanned rehospitalisation in the eligible population used to construct the models was 9.1% (95% CI 8.2–10.1) (350/3,841). The random forest model demonstrated both an improved AUROC (0.65; 95% CI 0.59–0.7; p = 0.03) and specificity vs. logistic regression (AUROC 0.57; 95% CI 0.51–0.62, p = 0.04). The LASSO performed similarly (AUROC 0.59; 95% CI 0.53–0.65; p = 0.68) to logistic regression. Conclusions: Compared to an expert-specified logistic regression model, random forest offered improved prediction of 30-day unplanned rehospitalisation in preterm babies. However, all models offered relatively low levels of predictive ability, regardless of modelling method.
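The rehospitalisation rate and its confidence interval can be checked from the reported counts (350/3,841). The sketch below uses the Wilson score interval, which is an assumption: the abstract does not state which interval method the authors used. The function name is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


rate = 350 / 3841
low, high = wilson_interval(350, 3841)
print(f"{rate:.1%} ({low:.1%} to {high:.1%})")
```

Under this assumption the interval comes out close to the reported 8.2–10.1%.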


Abstract 211: Association of Patient Characteristics With the Initiation of Dabigatran Versus Warfarin Among Anticoagulant Naive Patients With Non-valvular Atrial Fibrillation

Circulation Cardiovascular Quality and Outcomes ◽

10.1161/circoutcomes.6.suppl_1.a211 ◽

2013 ◽

Vol 6 (suppl_1) ◽

Author(s):

David R Walker ◽

Jasmina Ivanova ◽

Keith A Betts ◽

Sapna Rao ◽

Eric Q Wu

Keyword(s):

Atrial Fibrillation ◽

Logistic Regression ◽

Regression Model ◽

Primary Care Physicians ◽

Online Survey ◽

Logistic Regression Model ◽

Stroke Risk ◽

Dabigatran Etexilate ◽

Patient Characteristics ◽

Categorical Variables

Background and Objective: Dabigatran etexilate (DE) and warfarin, both oral anticoagulants used for stroke risk reduction in patients with non-valvular atrial fibrillation (NVAF), have been or are being compared in several comparative effectiveness studies. Understanding the characteristics of patients prescribed DE vs. warfarin is important for interpreting such studies. The objective of this study is to identify the characteristics that differentiate NVAF patients prescribed DE versus warfarin as first-line anticoagulation. Methods: An online survey was administered in October 2012 to an established panel of cardiologists and primary care physicians (PCPs) in the US. Physicians were asked to identify medical charts of their patients who were diagnosed with NVAF and had at least one prescription for DE or warfarin between 1/1/2011 and 6/30/2012. Patients were further required to be anticoagulant naïve prior to the first prescription of DE or warfarin. A computer-generated random die was applied to direct the random selection of patients. Patient characteristics, comorbidities and clinical risk measures were compared between DE and warfarin patients using Chi-square tests for categorical variables and t-tests for continuous variables. A logistic regression model was used to evaluate patient characteristics associated with DE vs. warfarin use among anticoagulant naïve NVAF patients. Results: A total of 288 physicians (144 cardiologists and 144 PCPs) completed the survey. 262 medical records for DE patients and 247 for warfarin patients were randomly selected. The mean ages of the DE and warfarin patients were 61.6 and 65.8 years, respectively (p < 0.01). The proportion of females was 20.6% and 41.7% in the DE and warfarin patients, respectively (p < 0.01). 86.3% of DE patients vs. 68.4% of warfarin patients were Caucasian (p < 0.01). Other differences between DE and warfarin patients, respectively, included: previous myocardial infarction (3.8%, 9.3%; p < 0.05), previous transient ischemic attack (8.4%, 16.2%; p < 0.01), and CHA2DS2-VASc stroke risk score (2.21, 2.98; p < 0.01). The logistic regression model found that age (OR = 0.96; p = 0.001), female gender (OR = 0.46; p = 0.002), Hispanic/Latino (OR = 0.33; p = 0.007), Black (OR = 0.37; p = 0.006), and > 6 months and < 1 year from first NVAF diagnosis to first prescription date (OR = 0.38; p = 0.02) were associated with initiation of DE vs. warfarin. However, CHA2DS2-VASc was not found to be a significant predictor of anticoagulant prescription. Conclusions: Patients who are younger, male, Caucasian, and recently diagnosed with NVAF were significantly more likely to be initiated by their physician on DE vs. warfarin. These findings should be considered when doing comparative analyses of outcomes between patients on DE vs. warfarin.
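To make the age odds ratio easier to read, the sketch below converts an odds ratio into the underlying logistic regression coefficient and rescales it to a ten-year difference. It assumes the reported OR of 0.96 is per one-year increase in age and that the effect is linear on the log-odds scale; the abstract states neither explicitly, and the function names are illustrative.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)


def rescale_odds_ratio(odds_ratio, units):
    """Odds ratio for a `units`-sized increase, assuming a linear log-odds effect."""
    return math.exp(log_odds_coefficient(odds_ratio) * units)


# Assumption: OR = 0.96 applies to each additional year of age.
beta_age = log_odds_coefficient(0.96)
or_per_decade = rescale_odds_ratio(0.96, 10)
print(round(beta_age, 4), round(or_per_decade, 2))
```

Read this way, a patient ten years older would have roughly two-thirds the odds of being started on DE rather than warfarin, which is in line with the abstract's conclusion that younger patients were more likely to receive DE.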


The impact of atrial fibrillation on hospitalization outcomes for patients with chronic lymphocytic leukemia: A large database analysis.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.e19508 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. e19508-e19508

Author(s):

Mohammad Ammad Ud Din ◽

Samarthkumar Thakkar ◽

Harsh P. Patel ◽

Syed Ather Hussain ◽

Aneeqa Zafar ◽

...

Keyword(s):

Heart Failure ◽

Atrial Fibrillation ◽

Logistic Regression ◽

Chronic Lymphocytic Leukemia ◽

Regression Model ◽

Logistic Regression Model ◽

Lymphocytic Leukemia ◽

Total Cost ◽

Patient Admissions ◽

All Cause Mortality

e19508 Background: With the increased use of novel agents like Bruton tyrosine kinase inhibitors (BTKi) for the treatment of chronic lymphocytic leukemia (CLL), the incidence of atrial fibrillation (AF) is on the rise in these patients. However, the excess burden added by AF to the morbidity and mortality of CLL patients is unclear. Methods: Using the appropriate ICD-9 and ICD-10 codes, the National Inpatient Sample (NIS) database was accessed to gather data on hospitalized CLL patients with AF from 2008 to 2019. Propensity-score matching (PSM) and a logistic regression model were used to control for baseline patient factors such as age, sex, income, and the relevant co-morbidities, matching 7265 CLL patient admissions with AF to 7265 CLL patient admissions without AF. The primary outcome was all-cause mortality (ACM), while secondary outcomes included stroke, acute heart failure (AHF), and total cost of hospital stay. Results: The mean age of the cohorts was 82 years. Females made up 44% of both groups. The AF group had a similar prevalence of systemic hypertension (62.38% vs 62.10%; p = 0.73), diabetes mellitus (5.09% vs 5.43%; p = 0.35), congestive heart failure (5.57% vs 5.36%; p = 0.58), valvular heart disease (1.17% vs 1.44%; p = 0.14), and pulmonary hypertension (0.21% vs 0.14%; p = 0.31) compared to the group without AF. PSM revealed that CLL patients with AF had a higher rate of ACM (6.06% vs 4.47%; p < 0.0001), AHF (7.50% vs 3.85%; p < 0.001), and stroke (3.09% vs 1.65%; p < 0.0001). Admissions in the AF group also had a higher median total cost of hospital stay ($9097 vs $7646). A logistic regression model fitted to adjust for confounders revealed similar results for the AF group, with increased adjusted odds ratios (aOR) for ACM (aOR: 1.39, 95% confidence interval (CI): 1.19-1.61; p < 0.001), AHF (aOR: 2.16, 95% CI: 1.85-2.52; p < 0.001), and stroke (aOR: 1.94, 95% CI: 1.54-2.44; p < 0.001) (Table). Conclusions: Our data suggest that hospitalized CLL patients with AF are at a significantly increased risk of all-cause mortality, AHF, and stroke. Several limitations, such as the inability to establish the temporal relationship between CLL and AF and the lack of data on the medications of individual patients, are important to keep in mind when interpreting the results. [Table: see text]
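As a sanity check on the matched results, crude (unadjusted) odds ratios can be computed directly from the event rates quoted in the abstract. This is only a back-of-the-envelope sketch: it ignores the covariate adjustment behind the reported aORs, and the helper names are illustrative.

```python
def odds(probability):
    """Convert a probability into odds."""
    return probability / (1 - probability)


def crude_odds_ratio(p_exposed, p_unexposed):
    """Unadjusted odds ratio from two event probabilities."""
    return odds(p_exposed) / odds(p_unexposed)


# Event rates from the matched cohorts (AF vs no AF), as quoted in the abstract.
crude_or_acm = crude_odds_ratio(0.0606, 0.0447)     # all-cause mortality
crude_or_ahf = crude_odds_ratio(0.0750, 0.0385)     # acute heart failure
crude_or_stroke = crude_odds_ratio(0.0309, 0.0165)  # stroke

for name, value in [("ACM", crude_or_acm), ("AHF", crude_or_ahf), ("stroke", crude_or_stroke)]:
    print(f"{name}: crude OR = {value:.2f}")
```

The crude values land near the reported adjusted odds ratios (1.39, 2.16, and 1.94), as would be expected when the matched groups are already well balanced on baseline characteristics.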


Prediction of perinatal death using machine learning models: a birth registry-based cohort study in northern Tanzania

BMJ Open ◽

10.1136/bmjopen-2020-040132 ◽

2020 ◽

Vol 10 (10) ◽

pp. e040132

Author(s):

Innocent B Mboya ◽

Michael J Mahande ◽

Mohanad Mohammed ◽

Joseph Obure ◽

Henry G Mwambi

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Perinatal Death ◽

Learning Models ◽

Net Benefit ◽

Birth Registry ◽

Perinatal Deaths ◽

Machine Learning Models

Objective: We aimed to determine the key predictors of perinatal deaths using machine learning models compared with the logistic regression model. Design: A secondary data analysis using the Kilimanjaro Christian Medical Centre (KCMC) Medical Birth Registry cohort from 2000 to 2015. We assessed the discriminative ability of models using the area under the receiver operating characteristics curve (AUC) and the net benefit using decision curve analysis. Setting: The KCMC is a zonal referral hospital located in Moshi Municipality, Kilimanjaro region, Northern Tanzania. The Medical Birth Registry is within the hospital grounds at the Reproductive and Child Health Centre. Participants: Singleton deliveries (n=42 319) with complete records from 2000 to 2015. Primary outcome measures: Perinatal death (composite of stillbirths and early neonatal deaths). These outcomes were only captured before mothers were discharged from the hospital. Results: The proportion of perinatal deaths was 3.7%. There were no statistically significant differences in the predictive performance of four machine learning models except for bagging, which had a significantly lower performance (AUC 0.76, 95% CI 0.74 to 0.79, p=0.006) compared with the logistic regression model (AUC 0.78, 95% CI 0.76 to 0.81). However, in the decision curve analysis, the machine learning models had a higher net benefit (ie, the correct classification of perinatal deaths considering a trade-off between false-negatives and false-positives) over the logistic regression model across a range of threshold probability values. Conclusions: In this cohort, there was no significant difference in the prediction of perinatal deaths between machine learning and logistic regression models, except for bagging. The machine learning models had a higher net benefit, as their predictive ability for perinatal death was considerably superior to that of the logistic regression model. The machine learning models, as demonstrated by our study, can be used to improve the prediction of perinatal deaths and triage for women at risk.
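The net benefit used in decision curve analysis can be sketched as follows. The formula is the standard one, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability; the function name and the toy data are hypothetical and unrelated to the registry.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    True positives are credited in full; false positives are penalised by
    the odds of the threshold probability.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy data (hypothetical): 2 events among 10 deliveries.
toy_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_prob = [0.8, 0.3, 0.1, 0.4, 0.05, 0.2, 0.6, 0.1, 0.05, 0.02]

nb_model = net_benefit(toy_true, toy_prob, threshold=0.25)
# "Treat all" reference strategy: every case is flagged as high risk.
nb_treat_all = net_benefit(toy_true, [1.0] * len(toy_true), threshold=0.25)
print(round(nb_model, 4), round(nb_treat_all, 4))
```

A decision curve repeats this calculation over a range of thresholds and plots each model against the "treat all" and "treat none" (net benefit 0) reference strategies.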


Non-participation modestly increased with distance to the examination clinic among adults in Finnish health examination surveys

Scandinavian Journal of Public Health ◽

10.1177/1403494817739502 ◽

2017 ◽

Vol 46 (7) ◽

pp. 752-754

Author(s):

Jaakko Reinikainen ◽

Perttu Saarsalmi ◽

Tommi Härkänen ◽

Pekka Jousilahti ◽

Juha Karvanen ◽

...

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Population Health ◽

Logistic Regression Model ◽

Additional Analysis ◽

Health Examination ◽

Participation Rates ◽

Related Factors ◽

Health Related ◽

At Home

Aims: Health examination surveys (HES) provide important information about population health and health-related factors, but declining participation rates threaten the representativeness of collected data. It is hard to conduct national HESs at examination clinics near to every sampled individual. Thus, it is interesting to look into the possible association between the distance from home to the examination clinic and non-participation, and whether there is a certain distance after which the participation activity decreases considerably. Methods: Data from two national HESs conducted in Finland in 2011 and 2012 were used and a logistic regression model was fitted to investigate how distance was related to non-participation. Results: We found out that non-participation modestly increased with distance to the examination clinic. An additional analysis indicated that the option of having an examination at home may decrease the effect of distance to participation. Conclusions: Long distances from home to the examination clinic are one reason for low participation activity. Possible bias caused by these differences in participation could be decreased by providing the option of a home examination.


Early Prediction of the Carbapenem Resistance Gram-negative Bacteria Carriage in Intensive Care Unit using Machine-Learning

10.21203/rs.3.rs-60222/v1 ◽

2020 ◽

Author(s):

Qiqiang Liang ◽

Qinyu Zhao ◽

Xin Xu ◽

Yu Zhou ◽

Man Huang

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Carbapenem Resistance ◽

Multivariate Logistic Regression Model ◽

Gram Negative Bacteria ◽

Multivariate Logistic Regression ◽

Gram Negative ◽

Better Than

Abstract Background: The prevention and control of carbapenem-resistant gram-negative bacteria (CR-GNB) is a challenge and a focus for clinicians in the intensive care unit (ICU). This study constructs a CR-GNB carriage prediction model in order to predict CR-GNB incidence within one week. Methods: The database comprises nearly 10,000 patients. The model was constructed using a multivariate logistic regression model and three machine learning algorithms. We then chose the optimal model and verified its accuracy by making daily predictions and recording the occurrence of CR-GNB in all patients admitted over 4 months. Results: There were 1385 patients with positive CR-GNB cultures and 1535 negative patients in this study. Forty-five variables showed statistically significant differences. We included 17 variables in the multivariate logistic regression model and built three machine learning models using all variables. In terms of accuracy and the area under the receiver operating characteristic (AUROC) curve, the random forest was better than XGBoost and the multivariate logistic regression model, which in turn were better than the decision tree model (accuracy: 84% > 82% > 81% > 72%; AUROC: 0.9089 > 0.8947 ≈ 0.8987 > 0.7845). In the 4-month prospective study, 81 cases were predicted to be positive in CR-GNB culture within 7 days, 146 cases were predicted to be negative, 86 cases were positive, and 120 cases were negative, with an overall accuracy of 84% and AUROC of 91.98%. Conclusions: Machine learning prediction models can predict the occurrence of CR-GNB colonization or infection within a one-week period, and can provide real-time predictions that guide medical staff in identifying high-risk groups more accurately.


Root for a Phishing Page using Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a5169.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 1692-1695

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model

Phishing refers to the mimicking of an original website. To perpetrate this sort of scam, the communication claims to come from an official representative of a website or another institution with which an individual is likely to do business (e.g. PayPal, Amazon, UPS, Bank of America, etc.). It targets vulnerabilities by means of pop-ups, ads, fake login pages and so on. Web users are lured by exploiting their trust in order to obtain their sensitive data, such as usernames, passwords, account numbers or other information, which is then used to open accounts, obtain loans or purchase goods through e-commerce sites. Up to 5% of users appear to be lured by these attacks, so phishing can remain quite profitable for scammers, many of whom send a large number of scam e-mails a day. In this system, we offer a solution to this problem by making the user aware of such phishing activities through detecting scam links and URLs using a combination of the most powerful machine learning algorithms. As a result, we conclude our paper with an accuracy of 98.8% and a combination of 26 features. The best algorithm was the logistic regression model.
