Predicting mortality among patients with liver cirrhosis in electronic health records with machine learning

PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0256428
Author(s):  
Aixia Guo ◽  
Nikhilesh R. Mazumder ◽  
Daniela P. Ladner ◽  
Randi E. Foraker

Objective Liver cirrhosis is a leading cause of death and affects millions of people in the United States. Early mortality prediction among patients with cirrhosis might give healthcare providers more opportunity to effectively treat the condition. We hypothesized that laboratory test results and other related diagnoses would be associated with mortality in this population. We also hypothesized that a deep learning model could outperform the current Model for End-Stage Liver Disease (MELD) score in predicting mortality. Materials and methods We utilized electronic health record data from 34,575 patients with a diagnosis of cirrhosis from a large medical center to study associations with mortality. Three time windows of mortality (365 days, 180 days, and 90 days) and two feature sets (all 41 available variables and the 4 MELD-Na variables) were studied. Missing values were imputed using multiple imputation for continuous variables and the mode for categorical variables. Deep learning and machine learning algorithms, i.e., deep neural networks (DNN), random forest (RF), and logistic regression (LR), were employed to study the associations between baseline features such as laboratory measurements and diagnoses and mortality for each time window, using 5-fold cross-validation. Metrics such as area under the receiver operating characteristic curve (AUC), overall accuracy, sensitivity, and specificity were used to evaluate the models. Results Models comprising all variables outperformed those with the 4 MELD-Na variables for all prediction cases, and the DNN model outperformed the LR and RF models. For example, the DNN model achieved AUCs of 0.88, 0.86, and 0.85 for 90-, 180-, and 365-day mortality, respectively, compared to the MELD score, which yielded corresponding AUCs of 0.81, 0.79, and 0.76. The DNN and LR models had significantly better F1 scores than MELD at all time points examined. Conclusion Besides the 4 MELD-Na variables, other variables such as alkaline phosphatase, alanine aminotransferase, and hemoglobin were among the top informative features. Machine learning and deep learning models outperformed the current standard of risk prediction among patients with cirrhosis. Advanced informatics techniques show promise for risk prediction in patients with cirrhosis.
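
A minimal sketch of the model comparison this abstract describes: three classifiers evaluated by 5-fold cross-validated AUC on either the full feature set or the MELD-Na subset. The feature matrix, labels, and the use of scikit-learn's MLPClassifier as a stand-in for the authors' DNN are assumptions, not the published pipeline.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

def cross_validated_auc(model, X, y, n_splits=5):
    """Return the mean AUC of `model` over stratified k-fold splits."""
    aucs = []
    for train_idx, test_idx in StratifiedKFold(n_splits, shuffle=True, random_state=0).split(X, y):
        model.fit(X[train_idx], y[train_idx])
        prob = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], prob))
    return float(np.mean(aucs))

# X: patient-by-feature matrix (all 41 baseline variables or the 4 MELD-Na variables);
# y: binary mortality label for a given window (90, 180, or 365 days).
# Both are assumed to be loaded elsewhere after imputation.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "DNN": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
}
# for name, model in models.items():
#     print(name, cross_validated_auc(model, X, y))
```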

2020 ◽  
Vol 15 (11) ◽  
pp. 1557-1565 ◽  
Author(s):  
Kumardeep Chaudhary ◽  
Akhil Vaid ◽  
Áine Duffy ◽  
Ishan Paranjpe ◽  
Suraj Jaladanki ◽  
...  

Background and objectives Sepsis-associated AKI is a heterogeneous clinical entity. We aimed to agnostically identify sepsis-associated AKI subphenotypes using deep learning on routinely collected data in electronic health records. Design, setting, participants, & measurements We used the Medical Information Mart for Intensive Care III database, which consists of electronic health record data from intensive care units in a tertiary care hospital in the United States. We included patients ≥18 years with sepsis who developed AKI within 48 hours of intensive care unit admission. We then applied deep learning to all available vital signs, laboratory measurements, and comorbidities to identify subphenotypes. Outcomes were mortality 28 days after AKI and dialysis requirement. Results We identified 4001 patients with sepsis-associated AKI. We used 2546 combined features for K-means clustering, identifying three subphenotypes. Subphenotype 1 had 1443 patients, subphenotype 2 had 1898 patients, and subphenotype 3 had 660 patients. Subphenotype 1 had the lowest proportion of liver disease and the lowest Simplified Acute Physiology Score II scores compared with subphenotypes 2 and 3. The proportions of patients with CKD were similar between subphenotypes 1 and 3 (15%) but highest in subphenotype 2 (21%). Subphenotype 1 had lower median bilirubin, aspartate aminotransferase, and alanine aminotransferase levels compared with subphenotypes 2 and 3. Patients in subphenotype 1 also had lower median lactate, lactate dehydrogenase, and white blood cell count than patients in subphenotypes 2 and 3. Subphenotype 1 also had lower creatinine and BUN than subphenotypes 2 and 3. Dialysis requirement was lowest in subphenotype 1 (4% versus 7% [subphenotype 2] versus 26% [subphenotype 3]). Mortality 28 days after AKI was lowest in subphenotype 1 (23% versus 35% [subphenotype 2] versus 49% [subphenotype 3]). After adjustment, the adjusted odds ratio for mortality for subphenotype 3, with subphenotype 1 as a reference, was 1.9 (95% confidence interval, 1.5 to 2.4). Conclusions Utilizing routinely collected laboratory variables, vital signs, and comorbidities, we were able to identify three distinct subphenotypes of sepsis-associated AKI with differing outcomes.
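
An illustrative sketch, not the authors' code, of the clustering step described above: standardize the combined EHR features and assign each patient to one of three subphenotypes with K-means. The `features` DataFrame and simple median imputation are assumptions; the study derived its 2546 features with deep learning before clustering.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def assign_subphenotypes(features: pd.DataFrame, k: int = 3) -> pd.Series:
    """Scale the feature matrix and return a K-means cluster label per patient."""
    scaled = StandardScaler().fit_transform(features.fillna(features.median()))
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    return pd.Series(labels, index=features.index, name="subphenotype")

# Usage with a patient-by-feature frame assumed to be built elsewhere:
# clusters = assign_subphenotypes(features)
# print(clusters.value_counts())   # e.g., three groups of 1443, 1898, and 660 patients
```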


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
I Korsakov ◽  
A Gusev ◽  
T Kuznetsova ◽  
D Gavrilov ◽  
R Novitskiy

Abstract Background Advances in precision medicine will require increasingly individualized prognostic evaluation of patients in order to provide each patient with appropriate therapy. The traditional statistical methods of predictive modeling, such as SCORE, PROCAM, and Framingham, recommended by the European guidelines for the prevention of cardiovascular disease, are not adapted to all patients and require significant human involvement in the selection, transformation, and imputation of predictive variables. In ROC analysis for prediction of significant cardiovascular disease (CVD), the areas under the curve are 0.62–0.72 for Framingham, 0.66–0.73 for SCORE, and 0.60–0.69 for PROCAM. To improve on this, we applied machine learning and deep learning models that rely on conventional risk factors to 10-year CVD event prediction using longitudinal electronic health record (EHR) data. Methods For machine learning, we applied logistic regression (LR); as a deep learning algorithm, we used recurrent neural networks with long short-term memory (LSTM) units. From the longitudinal EHR we extracted the following features: demographics, vital signs, diagnoses (ICD-10-CM: I21–I22.9, I61–I63.9), and medications. A challenge at this step is that nearly 80 percent of clinical information in the EHR is unstructured and contains errors and typos. Handling missing data correctly is also important for training machine learning and deep learning algorithms. The study cohort included patients between the ages of 21 and 75 with a dynamic observation window. In total, we had 31,517 individuals in the dataset, but only 3652 individuals had all features present or missing values that could easily be imputed. Among these 3652 individuals, 29.4% had a CVD event; the mean age was 49.4 years, and 68.2% were female. Evaluation We randomly divided the dataset into a training and a test set with an 80/20 split. The LR was implemented with Python Scikit-Learn, and the LSTM model was implemented with Keras using TensorFlow as the backend. Results We applied the machine learning and deep learning models using the same features as the traditional risk scales and using longitudinal EHR features, respectively, for CVD prediction. The machine learning model (LR) achieved an AUROC of 0.74–0.76 and the deep learning model (LSTM) 0.75–0.76. Using features from the EHR, the logistic regression and deep learning models improved the AUROC to 0.78–0.79. Conclusion The machine learning models outperformed traditional clinically used predictive models for CVD risk prediction (i.e., the SCORE, PROCAM, and Framingham equations). This approach was used to create a clinical decision support system (CDSS) that uses both traditional risk scales and models based on neural networks. Especially important is the fact that the system can calculate cardiovascular disease risk automatically and recalculate it immediately after new information is added to the EHR. The results are delivered to the user's personal account.
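
A minimal sketch, assuming preprocessed inputs, of the two model families named in this abstract: a scikit-learn logistic regression on tabular risk factors and a Keras LSTM on longitudinal EHR sequences. The input shapes, layer sizes, and training settings are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from tensorflow import keras

# Tabular baseline: X_tab is (n_patients, n_risk_factors); y is the 10-year CVD label.
def fit_lr(X_tab, y):
    lr = LogisticRegression(max_iter=1000)
    lr.fit(X_tab, y)
    return lr

# Sequence model: X_seq is (n_patients, n_visits, n_features) built from
# longitudinal EHR records, zero-padded to a fixed number of visits.
def build_lstm(n_visits, n_features):
    model = keras.Sequential([
        keras.layers.Input(shape=(n_visits, n_features)),
        keras.layers.LSTM(64),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC()])
    return model

# lstm = build_lstm(n_visits=20, n_features=30)
# lstm.fit(X_seq_train, y_train, epochs=10, validation_split=0.2)
# print(roc_auc_score(y_test, lstm.predict(X_seq_test).ravel()))
```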


2019 ◽  
Author(s):  
Michelle Odlum ◽  
Omar Sims ◽  
Sunmoo Yoon

BACKGROUND As people living with HIV age, it becomes increasingly important to understand aging-related outcomes. The analysis of electronic health record (EHR) data can further the understanding of such outcomes to support HIV aging phenotype development and improved overall health. OBJECTIVE For further insight, we evaluated the performance of two machine learning models, deep learning and logistic regression, on electronic health record data to identify predictors of medical resource utilization, represented by Charlson comorbidity scores. Diagnostic codes comprising the factors of individual characteristics, chronic conditions, treatment, and high-risk behaviors served as predictors. METHODS Diagnostic codes (ICD-9/10) were extracted for HIV-infected (N=786) and uninfected (N=100,000) patients. A data mining process was applied to build comorbidity prediction models with two machine learning algorithms: deep learning and logistic regression. Final models were based on the strength of the relationship between the outcome (Charlson score: high [>5] or low [≤5]) and the predictors (diagnostic codes). RESULTS Mean Charlson scores were 7.45±4.01 for HIV-infected and 3.18±3.3 for uninfected patients. Top diagnostic codes were chemotherapy, hypertension, heart failure, and acute kidney disease for HIV-infected patients, and substance abuse, length of hospital stay, and chemotherapy for uninfected patients. Deep learning model predictors for HIV-infected patients were age (16.16%), chemotherapy (13.17%), noncompliance with treatment/regimen (11.83%), and hypertension (10.52%); for uninfected patients, they were age (91.39%) and substance abuse (4.99%). Logistic regression predictors for HIV-infected patients were chemotherapy (30.3%; OR: 48.7), age (26.6%; OR: 1.03), malnutrition (15.8%; OR: 4.58), and heart failure (10.8%; OR: 4.18); for uninfected patients, they were age (88.7%; OR: 0.89) and length of hospital stay (9.74%; OR: 0.97). CONCLUSIONS Differences were observed in medical resource utilization by HIV status and across predictive models. Results contribute to the development of narrower HIV and aging phenotypes with greater clinical validity to improve interventions for optimal aging-related outcomes.
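
A hedged sketch of the logistic regression arm described above: predict a high (>5) versus low (≤5) Charlson score from binary diagnostic-code indicators and report per-code odds ratios. The `code_matrix` and `charlson` inputs are placeholder names, not the study's data structures.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def fit_charlson_classifier(code_matrix: pd.DataFrame, charlson: pd.Series):
    """code_matrix: patients x ICD-code indicator columns; charlson: numeric scores."""
    y = (charlson > 5).astype(int)                      # high vs. low comorbidity burden
    lr = LogisticRegression(max_iter=2000)
    lr.fit(code_matrix.values, y)
    odds_ratios = pd.Series(np.exp(lr.coef_[0]), index=code_matrix.columns)
    return lr, odds_ratios.sort_values(ascending=False)

# model, ors = fit_charlson_classifier(code_matrix, charlson_scores)
# print(ors.head(10))   # diagnostic codes most associated with high Charlson scores
```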


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1550
Author(s):  
Alexandros Liapis ◽  
Evanthia Faliagka ◽  
Christos P. Antonopoulos ◽  
Georgios Keramidas ◽  
Nikolaos Voros

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets were recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used for creating models that will effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. For the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved for both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were promising: the deep learning model achieved rather high agreement when a user-annotated dataset was used for validation.
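
An illustrative sketch of the kind of feed-forward network the abstract describes: continuous EDA/skin-temperature features concatenated with an entity embedding of a categorical input (here, hypothetically, the subject ID) for binary stress classification. The dimensions and the choice of categorical input are assumptions, not the paper's architecture.

```python
from tensorflow import keras

def build_stress_model(n_continuous: int, n_subjects: int, emb_dim: int = 4):
    # Continuous physiological features (e.g., EDA and ST statistics per window).
    cont_in = keras.layers.Input(shape=(n_continuous,), name="continuous_features")
    # Categorical entity (hypothetical subject ID) mapped through an embedding.
    subj_in = keras.layers.Input(shape=(1,), name="subject_id")
    emb = keras.layers.Flatten()(keras.layers.Embedding(n_subjects, emb_dim)(subj_in))
    x = keras.layers.Concatenate()([cont_in, emb])
    x = keras.layers.Dense(32, activation="relu")(x)
    out = keras.layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model([cont_in, subj_in], out)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# model = build_stress_model(n_continuous=12, n_subjects=15)
# model.fit([X_cont, X_subj], y_stress, epochs=20, validation_split=0.2)
```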


Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data. Objective We sought to develop and validate a computable phenotype for COVID-19 severity. Methods Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity to the 4CE phenotype at one site. Results The full 4CE severity phenotype had a pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity was highly variable, differing by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had a mean AUC of 0.903 (95% CI: 0.886, 0.921), compared to an AUC of 0.956 (95% CI: 0.952, 0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared to chart review. Discussion We developed a severity phenotype using six code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions. Conclusion We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
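
A minimal sketch, under stated assumptions, of how a code-class severity phenotype like the one described above could be evaluated at a single site: a patient is flagged severe if any qualifying code class appears, then sensitivity and specificity are computed against ICU admission and/or death. The code-class names and data layout are illustrative placeholders, not the 4CE definitions.

```python
import pandas as pd

# Hypothetical code-class labels; the actual 4CE phenotype defines six classes of its own.
SEVERITY_CODE_CLASSES = {"icu_procedure", "ventilation", "vasopressor",
                         "severe_lab", "severe_diagnosis", "ards_code"}

def severity_phenotype(patient_codes: pd.DataFrame) -> pd.Series:
    """patient_codes: rows of (patient_id, code_class); returns a 0/1 severity flag per patient."""
    severe_ids = set(patient_codes.loc[
        patient_codes["code_class"].isin(SEVERITY_CODE_CLASSES), "patient_id"])
    all_ids = patient_codes["patient_id"].unique()
    return pd.Series([int(pid in severe_ids) for pid in all_ids], index=all_ids)

def sensitivity_specificity(pred: pd.Series, outcome: pd.Series):
    """pred and outcome are 0/1 series aligned on patient ID (outcome = ICU and/or death)."""
    tp = ((pred == 1) & (outcome == 1)).sum()
    tn = ((pred == 0) & (outcome == 0)).sum()
    fp = ((pred == 1) & (outcome == 0)).sum()
    fn = ((pred == 0) & (outcome == 1)).sum()
    return tp / (tp + fn), tn / (tn + fp)
```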


2020 ◽  
Author(s):  
Haishuai Wang ◽  
Paul Avillach

BACKGROUND In the United States, about 3 million people have autism spectrum disorder (ASD), and around 1 in 59 children is diagnosed with ASD. People with ASD have characteristic social communication deficits and repetitive behaviors. The causes of this disorder remain unknown; however, in up to 25% of cases a genetic cause can be identified. Detecting ASD as early as possible is desirable because early detection enables timely interventions in children with ASD. Identification of ASD based on objective pathogenic mutation screening is a major first step toward early intervention and effective treatment of affected children. OBJECTIVE Recent investigations have interrogated genomic data for detecting and treating autism disorders, in addition to the conventional clinical interview used as a diagnostic test. Since deep neural networks perform better than shallow machine learning models on complex and high-dimensional data, in this study we sought to apply deep learning to genetic data obtained across thousands of simplex families at risk for ASD to identify contributory mutations and to create an advanced diagnostic classifier for autism screening. METHODS After preprocessing the genomics data from the Simons Simplex Collection, we extracted top-ranking common variants that may be protective or pathogenic for autism based on a chi-square test. A convolutional neural network–based diagnostic classifier was then designed using the identified significant common variants to predict autism. The performance was compared with shallow machine learning–based classifiers and with randomly selected common variants. RESULTS The selected contributory common variants were significantly enriched on chromosome X, while chromosome Y was also discriminatory in distinguishing autistic from nonautistic individuals. The ARSD, MAGEB16, and MXRA5 genes had the largest effects among the contributory variants, so the screening algorithms were adapted to include these common variants. The deep learning model yielded an area under the receiver operating characteristic curve of 0.955 and an accuracy of 88% for identifying autistic from nonautistic individuals. Our classifier demonstrated a significant improvement over standard autism screening tools, by an average of 13% in classification accuracy. CONCLUSIONS Common variants are informative for autism identification. Our findings also suggest that deep learning is a reliable method for distinguishing the diseased group from the control group based on the common variants of autism.
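
A hedged sketch of the variant-selection and classification pipeline described above: a chi-square test ranks common variants, and a small 1-D convolutional network classifies individuals from the top-ranked variants. Genotypes are assumed to be encoded as 0/1/2 allele counts; the number of selected variants and the network architecture are illustrative only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from tensorflow import keras

def select_top_variants(genotypes, labels, k=1000):
    """genotypes: (n_individuals, n_variants) matrix of 0/1/2 allele counts."""
    selector = SelectKBest(chi2, k=k).fit(genotypes, labels)
    return selector.transform(genotypes), selector.get_support(indices=True)

def build_cnn(n_variants):
    model = keras.Sequential([
        keras.layers.Input(shape=(n_variants, 1)),
        keras.layers.Conv1D(32, kernel_size=5, activation="relu"),
        keras.layers.MaxPooling1D(2),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC()])
    return model

# X_top, idx = select_top_variants(genotypes, autism_labels)
# cnn = build_cnn(X_top.shape[1])
# cnn.fit(X_top[..., np.newaxis], autism_labels, epochs=20, validation_split=0.2)
```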


2021 ◽  
Author(s):  
Aditya Nagori ◽  
Anushtha Kalia ◽  
Arjun Sharma ◽  
Pradeep Singh ◽  
Harsh Bandhey ◽  
...  

Shock is a major killer in the ICU, and machine learning based early predictions can potentially save lives. Generalization across age and geographical context is an unaddressed challenge. In this retrospective observational study, we built real-time shock prediction models generalized across age groups and continents. More than 1.5 million patient-hours of novel data from a pediatric ICU in New Delhi and 5 million patient-hours from the adult ICU MIMIC database were used to build models. We achieved model generalization through a novel fractal deep-learning approach and predicted shock up to 12 hours in advance. Our deep learning models showed a drop in area under the receiver operating characteristic curve (AUROC) from 78% (95% CI, 73-83) on MIMIC data to 66% (95% CI, 54-78) on New Delhi data, outperforming standard machine learning by nearly 10%. Therefore, better representations and deep learning can partly address the generalizability gap of ICU prediction models trained across geographies. Our data and algorithms are publicly available as a pre-configured Docker environment at https://github.com/SAFE-ICU/ShoQPred.
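
This is not the authors' fractal deep-learning method; it is only a hedged sketch of the evaluation they report: scoring a trained model on two cohorts and estimating AUROC with a bootstrap confidence interval. The cohort variable names are placeholders.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auroc(y_true, y_score, n_boot=1000, seed=0):
    """Return the 2.5th, 50th, and 97.5th percentiles of bootstrapped AUROC."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:      # skip resamples containing one class only
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 50, 97.5])

# Hypothetical usage with labels and predicted probabilities per cohort:
# for name, (y, scores) in {"MIMIC": (y_mimic, p_mimic),
#                           "New Delhi": (y_delhi, p_delhi)}.items():
#     lo, med, hi = bootstrap_auroc(y, scores)
#     print(f"{name}: AUROC {med:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```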


Author(s):  
Emily Kogan ◽  
Kathryn Twyman ◽  
Jesse Heap ◽  
Dejan Milentijevic ◽  
Jennifer H. Lin ◽  
...  

Abstract Background Stroke severity is an important predictor of patient outcomes and is commonly measured with National Institutes of Health Stroke Scale (NIHSS) scores. Because these scores are often recorded as free text in physician reports, structured real-world evidence databases seldom include the severity. The aim of this study was to use machine learning models to impute NIHSS scores for all patients with newly diagnosed stroke from multi-institution electronic health record (EHR) data. Methods NIHSS scores available in the Optum© de-identified Integrated Claims-Clinical dataset were extracted from physician notes by applying natural language processing (NLP) methods. The cohort analyzed in the study consists of 7149 patients with an inpatient or emergency room diagnosis of ischemic stroke, hemorrhagic stroke, or transient ischemic attack and a corresponding NLP-extracted NIHSS score. A subset of these patients (n = 1033, 14%) was held out for independent validation of model performance, and the remaining patients (n = 6116, 86%) were used for training the model. Several machine learning models were evaluated, with parameters optimized using cross-validation on the training set. The model with optimal performance, a random forest model, was ultimately evaluated on the holdout set. Results Leveraging machine learning, we identified the main factors in electronic health record data for assessing stroke severity, including death within the same month as stroke occurrence, length of hospital stay following stroke occurrence, aphagia/dysphagia diagnosis, hemiplegia diagnosis, and whether a patient was discharged to home or self-care. Comparing the imputed NIHSS scores to the NLP-extracted NIHSS scores on the holdout data set yielded an R2 (coefficient of determination) of 0.57, an R (Pearson correlation coefficient) of 0.76, and a root-mean-squared error of 4.5. Conclusions Machine learning models built on EHR data can be used to determine proxies for stroke severity. This enables severity to be incorporated in studies of stroke patient outcomes using administrative and EHR databases.
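
An illustrative sketch, with assumed variable names rather than the study code, of the imputation step described above: fit a random forest regressor on EHR-derived features and compare imputed scores to NLP-extracted NIHSS scores on a held-out set using R², Pearson r, and RMSE.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error
from scipy.stats import pearsonr

def evaluate_imputation(X_train, y_train, X_holdout, y_holdout):
    """Train a random forest on EHR features and score imputed NIHSS on the holdout set."""
    rf = RandomForestRegressor(n_estimators=500, random_state=0)
    rf.fit(X_train, y_train)
    pred = rf.predict(X_holdout)
    return {
        "R2": r2_score(y_holdout, pred),
        "Pearson_r": pearsonr(y_holdout, pred)[0],
        "RMSE": float(np.sqrt(mean_squared_error(y_holdout, pred))),
    }

# metrics = evaluate_imputation(X_train, nihss_train, X_holdout, nihss_holdout)
# print(metrics)   # the study reports R2 = 0.57, r = 0.76, RMSE = 4.5 on its holdout set
```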

