Predict the Risk of Dyslipidemia Using Electronic Health Records: Apply Deep Neural Networks for Survival Data (Preprint)

2021 ◽  
Author(s):  
Hailun Liang ◽  
Dongzuo Liang ◽  
Lei Tao ◽  
Xiaoshuai Zhang ◽  
Xiao Li ◽  
...  

BACKGROUND Dyslipidemia is an important risk factor for coronary artery disease and stroke. Early detection and prevention of dyslipidemia can markedly alter cardiovascular morbidity and mortality. The Cox proportional hazards model has commonly been employed to construct prediction models from survival datasets. Recently, data-driven learning algorithms have begun to be used to analyze right-censored survival data. However, deep neural networks have not yet been applied to dyslipidemia prediction. OBJECTIVE The objective of this study is to predict the risk of dyslipidemia via deep neural networks for survival data. METHODS The study cohort was based on routine health check-up data from 6,328 participants aged 19 to 90 years who were free of dyslipidemia at baseline. A deep neural network (DNN) was used to develop risk models for predicting dyslipidemia. Cox proportional hazards (Cox) and random survival forest (RSF) models were applied for comparison with the DNN model. The time-dependent concordance index (Ctd-index) was used as the performance metric. RESULTS The Ctd-index of the DNN prediction model was 0.802. The DNN model performed significantly better than the Cox and RSF models (Ctd-index: 0.735 and 0.770, respectively), and its improvement over the competing methods was statistically significant. Moreover, the DNN provided performance gains across time intervals compared with conventional survival models. CONCLUSIONS The DNN is a promising method for learning the estimated distribution of survival times and events while capturing the right-censored nature inherent in survival data. It achieves large and statistically significant performance improvements over the conventional regression model and state-of-the-art data-mining methods.
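
The headline metric here is the time-dependent concordance index. A minimal sketch of how such a Ctd-index can be computed from model outputs is shown below; the `risk_at_time(i, t)` callable is a hypothetical interface standing in for the fitted model, not the authors' code.

```python
import numpy as np

def ctd_index(risk_at_time, event_times, event_observed):
    """Time-dependent concordance index -- a minimal O(n^2) sketch.

    risk_at_time(i, t) is a hypothetical callable returning the model's
    predicted risk for subject i evaluated at time t (e.g. 1 - S_i(t));
    event_times and event_observed (1 = event, 0 = censored) are arrays.
    """
    event_times = np.asarray(event_times, dtype=float)
    event_observed = np.asarray(event_observed, dtype=bool)
    concordant, comparable = 0.0, 0.0
    for i in range(len(event_times)):
        if not event_observed[i]:
            continue  # only subjects with an observed event anchor comparisons
        for j in range(len(event_times)):
            if event_times[j] > event_times[i]:  # j is still event-free at t_i
                comparable += 1.0
                r_i = risk_at_time(i, event_times[i])
                r_j = risk_at_time(j, event_times[i])
                concordant += 1.0 if r_i > r_j else 0.5 if r_i == r_j else 0.0
    return concordant / comparable
```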

2021 ◽  
Author(s):  
Fabrizio Kuruc ◽  
Harald Binder ◽  
Moritz Hess

Abstract: Deep neural networks are now frequently employed to predict survival conditional on omics-type biomarkers, e.g. by employing the partial likelihood of the Cox proportional hazards model as the loss function. Because of the generally limited number of observations in clinical studies, combining different datasets has been proposed to improve learning of network parameters. However, if baseline hazards differ between the studies, the assumptions of the Cox proportional hazards model are violated. Based on high-dimensional transcriptome profiles from different tumor entities, we demonstrate how using a stratified partial likelihood as the loss function accounts for the different baseline hazards in a deep learning framework. Additionally, we compare the partial likelihood with the ranking loss, which is frequently employed as a loss function in machine learning approaches because of its seeming simplicity. Using RNA-seq data from The Cancer Genome Atlas (TCGA), we show that the use of stratified loss functions leads to overall better discriminatory power and lower prediction error compared with their non-stratified counterparts. We investigate which genes are identified as having the greatest marginal impact on prediction of survival when different loss functions are used. We find that while similar genes are identified, known prognostic genes in particular receive higher importance from stratified loss functions. Taken together, pooling data from different sources for improved parameter learning of deep neural networks benefits greatly from employing stratified loss functions that account for potentially varying baseline hazards. For easy application, we provide PyTorch code for stratified loss functions and an explanatory Jupyter notebook in a GitHub repository.
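
The authors' released PyTorch code is referenced but not reproduced here; the following is a minimal sketch of a stratified negative Cox partial log-likelihood, where `stratum` encodes the study or tumor entity of each sample. The function name and interface are illustrative, not the published API, and ties are handled only approximately.

```python
import torch

def stratified_cox_loss(log_hazard, time, event, stratum):
    """Negative stratified Cox partial log-likelihood -- a minimal sketch,
    not the authors' released implementation.

    log_hazard: (n,) network outputs eta_i
    time:       (n,) observed times
    event:      (n,) 1 = event observed, 0 = censored
    stratum:    (n,) integer label of the study / tumor entity
    """
    loss = log_hazard.new_zeros(())
    n_events = 0
    for s in torch.unique(stratum):
        idx = stratum == s
        eta, t, d = log_hazard[idx], time[idx], event[idx]
        order = torch.argsort(t, descending=True)   # risk sets become prefixes
        eta, d = eta[order], d[order]
        log_risk = torch.logcumsumexp(eta, dim=0)   # log sum_{j: t_j >= t_i} exp(eta_j)
        loss = loss - ((eta - log_risk) * d).sum()  # ties handled only approximately
        n_events += int(d.sum())
    return loss / max(n_events, 1)
```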


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Dipendra Jha ◽  
Vishu Gupta ◽  
Logan Ward ◽  
Zijiang Yang ◽  
Christopher Wolverton ◽  
...  

Abstract: The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, owing to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to build deeper neural networks in a bid to boost model performance, but in practice this leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data are available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet), composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy compared with plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
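
As a rough illustration of the individual-residual idea (each fully connected layer wrapped in its own skip connection so gradients can bypass it), here is a minimal PyTorch sketch for vector-based inputs. The class names, layer widths, and projection choice are assumptions for illustration, not the authors' published architecture.

```python
import torch.nn as nn

class IndividualResidualBlock(nn.Module):
    """One fully connected layer wrapped in its own skip connection."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.bn = nn.BatchNorm1d(out_dim)
        self.act = nn.ReLU()
        # Project the skip path when the layer changes the width.
        self.proj = nn.Linear(in_dim, out_dim) if in_dim != out_dim else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.fc(x))) + self.proj(x)

class IRNetSketch(nn.Module):
    """Stack of individual residual blocks feeding a regression head."""
    def __init__(self, in_dim, widths=(1024, 512, 256)):
        super().__init__()
        blocks, d = [], in_dim
        for w in widths:
            blocks.append(IndividualResidualBlock(d, w))
            d = w
        self.body = nn.Sequential(*blocks)
        self.head = nn.Linear(d, 1)   # predicted material property

    def forward(self, x):             # x: (batch, n_descriptors)
        return self.head(self.body(x))
```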


Author(s):  
Anh Hong Nguyen ◽  
Bethlehem Mekonnen ◽  
Eric Kim ◽  
Nisha R. Acharya

Abstract Background Macular edema (ME) is the most frequent cause of irreversible visual impairment in patients with uveitis. To date, few data exist on the clinical course of ME in pediatric patients. A retrospective, observational study was performed to examine the visual and macular thickness outcomes of ME associated with chronic, noninfectious uveitis in pediatric patients. Methods Pediatric patients with noninfectious uveitis complicated by ME seen in the University of California San Francisco Health System from 2012 to 2018 were identified using ICD-9 and ICD-10 codes. Data were collected from medical records, including demographics, diagnoses, ocular history, OCT imaging findings, complications, and treatments at the first encounter and at 3-, 6-, 9-, and 12-month follow-up visits. Cox proportional hazards regression was used to investigate the association between different classes of treatment (steroid drops, steroid injections, oral steroids, and other immunosuppressive therapies) and resolution of macular edema. Results The cohort comprised 21 children (26 eyes) with a mean age of 10.5 years (SD 3.3). Undifferentiated uveitis was the most common diagnosis, affecting 19 eyes (73.1%). The majority of observed macular edema was unilateral (16 patients, 76.2%), while 5 patients had bilateral macular edema. The mean duration of follow-up at UCSF was 35.3 months (SD 25.7). By 12 months, 18 eyes (69.2%) had achieved resolution of ME. The median time to resolution was 3 months (IQR 3–6 months). Median best-corrected visual acuity (BCVA) at baseline was 0.54 logMAR (Snellen 20/69, IQR 20/40 to 20/200). Median BCVA at 12 months was 0.1 logMAR (Snellen 20/25, IQR 20/20 to 20/50). Corticosteroid injections were associated with a 4.0-fold higher rate of macular edema resolution (95% CI 1.3–12.2, P = 0.01). Conclusions Although only 15% of the pediatric patients with uveitis in the study cohort had ME, it is clinically important to conduct OCTs to detect ME in this population. Treatment resulted in 69% of eyes achieving resolution of ME by 12 months, accompanied by improvement in visual acuity. Corticosteroid injections were significantly associated with resolution of macular edema.
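
The time-to-resolution analysis described in the Methods is a standard Cox proportional hazards regression; a minimal sketch using the lifelines library is shown below, with entirely hypothetical column names and toy values standing in for the study's treatment-class and follow-up variables.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical per-eye records: months until ME resolution, an event flag,
# and a treatment indicator (all column names and values are illustrative).
df = pd.DataFrame({
    "months_to_resolution": [3, 3, 6, 6, 9, 12, 12, 12],
    "resolved":             [1, 1, 1, 1, 1, 0, 1, 0],
    "steroid_injection":    [1, 1, 1, 0, 0, 0, 1, 0],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="months_to_resolution", event_col="resolved")
cph.print_summary()  # exp(coef) column gives the hazard ratio with its 95% CI
```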


Stroke ◽  
2012 ◽  
Vol 43 (suppl_1) ◽  
Author(s):  
Elsayed Z Soliman ◽  
George Howard ◽  
Mary Cushman ◽  
Brett Kissela ◽  
...  

Background: Prolongation of the heart rate-corrected QT interval (QTc) is a well-established predictor of cardiovascular morbidity and mortality. Little is known, however, about the relationship between this simple electrocardiographic (ECG) marker and the risk of stroke. Methods: A total of 27,411 participants aged >45 years without prior stroke from the REasons for Geographic and Racial Differences in Stroke (REGARDS) study were included in this analysis. QTc was calculated using the Framingham formula (QTcFram). Stroke cases were identified and adjudicated during up to 7 years of follow-up (median 2.7 years). Cox proportional hazards analysis was used to estimate the hazard ratios for incident stroke associated with prolonged QTcFram interval (vs. normal) and per 1 standard deviation (SD) increase, separately, in a series of incremental models. Results: The risk of incident stroke in the study participants with baseline prolonged QTcFram was almost 3 times the risk in those with normal QTcFram [HR (95% CI): 2.88 (2.12, 3.92), p<0.0001]. After adjustment for age, race, sex, antihypertensive medication use, systolic blood pressure, current smoking, diabetes, left ventricular hypertrophy, atrial fibrillation, prior cardiovascular disease, QRS duration, warfarin use, and QT-prolonging drugs (full model), the risk of stroke remained significantly elevated [HR (95% CI): 1.67 (1.16, 2.41), p=0.0060] and was consistent across several subgroups of REGARDS participants. When the risk of stroke was estimated per 1 SD increase in QTcFram, a 24% increased risk was observed [HR (95% CI): 1.24 (1.16, 1.33), p<0.0001]. This risk remained significant in the fully adjusted model [HR (95% CI): 1.12 (1.03, 1.21), p=0.0055]. Similar results were obtained when other QTc correction formulas, including Hodges', Bazett's, and Fridericia's, were used. Conclusions: QTc prolongation is associated with a significantly increased risk of incident stroke independently of known stroke risk factors. In light of our results, examining the risk of stroke associated with QT-prolonging drugs may be warranted.
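
For reference, the heart rate-corrected QT formulas named in the abstract can be computed as below. This is a generic sketch of the standard published corrections (Framingham, Bazett, Fridericia, Hodges), not code from the REGARDS analysis.

```python
def qtc_framingham(qt_ms, rr_s):
    """Framingham linear correction: QTc = QT + 154 * (1 - RR), QT in ms, RR in s."""
    return qt_ms + 154.0 * (1.0 - rr_s)

def qtc_bazett(qt_ms, rr_s):
    """Bazett: QTc = QT / sqrt(RR)."""
    return qt_ms / rr_s ** 0.5

def qtc_fridericia(qt_ms, rr_s):
    """Fridericia: QTc = QT / RR^(1/3)."""
    return qt_ms / rr_s ** (1.0 / 3.0)

def qtc_hodges(qt_ms, heart_rate_bpm):
    """Hodges: QTc = QT + 1.75 * (HR - 60)."""
    return qt_ms + 1.75 * (heart_rate_bpm - 60.0)

# Example: QT = 400 ms at a heart rate of 75 bpm (RR = 60/75 = 0.8 s)
rr = 60.0 / 75.0
print(qtc_framingham(400, rr), qtc_bazett(400, rr),
      qtc_fridericia(400, rr), qtc_hodges(400, 75))
```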


2018 ◽  
Vol 5 (suppl_1) ◽  
pp. S426-S426
Author(s):  
Christopher M Rubino ◽  
Lukas Stulik ◽  
Harald Rouha ◽  
Zehra Visram ◽  
Adriana Badarau ◽  
...  

Abstract Background ASN100 is a combination of two co-administered fully human monoclonal antibodies (mAbs), ASN-1 and ASN-2, that together neutralize the six cytotoxins critical to S. aureus pneumonia pathogenesis. ASN100 is in development for prevention of S. aureus pneumonia in mechanically ventilated patients. A pharmacometric approach to dose discrimination in humans was taken in order to bridge from dose-ranging survival studies in rabbits to anticipated human exposures using an mPBPK model derived from data from rabbits (infected and noninfected) and noninfected humans [IDWeek 2017, Poster 1849]. Survival in rabbits was assumed to be indicative of a protective effect through ASN100 neutralization of S. aureus toxins. Methods Data from studies in rabbits (placebo through 20 mg/kg single doses of ASN100; four strains representing MRSA and MSSA isolates with different toxin profiles) were pooled with data from a PK and efficacy study in infected rabbits (placebo and 40 mg/kg ASN100) [IDWeek 2017, Poster 1844]. A Cox proportional hazards model was used to relate survival to both strain and mAb exposure. Monte Carlo simulation was then applied to generate ASN100 exposures for simulated patients given a range of ASN100 doses and infection with each strain (n = 500 per scenario), using the mPBPK model. Using the Cox model, the probability of full protection from toxins (i.e., predicted survival) was estimated for each simulated patient. Results Cox models showed that survival in rabbits depends on both strain and ASN100 exposure in lung epithelial lining fluid (ELF). At the human doses simulated (360–10,000 mg of ASN100), full or substantial protection is expected for all four strains tested. For the most virulent strain tested in the rabbit pneumonia study (a PVL-negative MSSA, Figure 1), the clinical dose of 3,600 mg of ASN100 provides a substantially higher predicted effect relative to lower doses, while doses above 3,600 mg are not predicted to provide significant additional protection. Conclusion A pharmacometric approach allowed for the translation of rabbit survival data to infected patients as well as discrimination of potential clinical doses. These results support the ASN100 dose of 3,600 mg currently being evaluated in a Phase 2 S. aureus pneumonia prevention trial. Disclosures C. M. Rubino, Arsanis, Inc.: Research Contractor, Research support. L. Stulik, Arsanis Biosciences GmbH: Employee, Salary. H. Rouha, Arsanis Biosciences GmbH: Employee, Salary. Z. Visram, Arsanis Biosciences GmbH: Employee, Salary. A. Badarau, Arsanis Biosciences GmbH: Employee, Salary. S. A. Van Wart, Arsanis, Inc.: Research Contractor, Research support. P. G. Ambrose, Arsanis, Inc.: Research Contractor, Research support. M. M. Goodwin, Arsanis, Inc.: Employee, Salary. E. Nagy, Arsanis Biosciences GmbH: Employee, Salary.
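
The workflow described (simulate exposures per dose, then score each simulated patient with the fitted Cox model) can be sketched as follows. The baseline survival, coefficient values, and exposure distribution below are placeholders chosen for illustration, not the published model parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder Cox relation: S(t | x) = S0(t) ** exp(beta_exposure * ELF + beta_strain)
BETA_EXPOSURE = -0.05   # hypothetical log-hazard per unit ELF exposure (protective)
BETA_STRAIN = 0.8       # hypothetical extra log-hazard for a more virulent strain
S0_END_OF_STUDY = 0.30  # hypothetical baseline survival at end of follow-up

def predicted_protection(elf_exposure, virulent_strain):
    log_hr = BETA_EXPOSURE * elf_exposure + BETA_STRAIN * virulent_strain
    return S0_END_OF_STUDY ** np.exp(log_hr)

# Monte Carlo: draw hypothetical ELF exposures for n simulated patients per dose.
n = 500
for dose_mg in (360, 3600, 10000):
    exposures = rng.lognormal(mean=np.log(dose_mg / 100.0), sigma=0.4, size=n)
    protection = predicted_protection(exposures, virulent_strain=1)
    print(dose_mg, "mg: mean predicted protection =", round(protection.mean(), 3))
```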


2018 ◽  
Vol 8 (12) ◽  
pp. 2416 ◽  
Author(s):  
Ansi Zhang ◽  
Honglei Wang ◽  
Shaobo Li ◽  
Yuxin Cui ◽  
Zhonghao Liu ◽  
...  

Prognostics, such as remaining useful life (RUL) prediction, is a crucial task in condition-based maintenance. A major challenge in data-driven prognostics is the difficulty of obtaining a sufficient number of samples of failure progression, yet for traditional machine learning methods and deep neural networks, enough training data is a prerequisite for training good prediction models. In this work, we propose a transfer learning algorithm based on bidirectional long short-term memory (BLSTM) recurrent neural networks for RUL estimation, in which the models are first trained on different but related datasets and then fine-tuned on the target dataset. Extensive experimental results show that transfer learning can in general improve the prediction models on datasets with a small number of samples. One exception is that when transferring from multiple operating conditions to a single operating condition, transfer learning led to worse results.
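
A minimal PyTorch sketch of the pretrain-then-fine-tune pattern described above is given below: the recurrent feature extractor learned on the source dataset is frozen and only the regression head is retrained on the small target dataset. The model layout, layer sizes, and the choice to freeze the LSTM are assumptions for illustration, not the authors' exact training recipe.

```python
import torch
import torch.nn as nn

class BLSTMRegressor(nn.Module):
    """Bidirectional LSTM followed by a small regression head for RUL."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Sequential(nn.Linear(2 * hidden, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):                 # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict RUL from the last time step

model = BLSTMRegressor(n_features=14)
# ... pretrain `model` on the large, related source dataset here ...

# Fine-tuning on the small target dataset: freeze the LSTM, retrain the head only.
for p in model.lstm.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
```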


Author(s):  
Vishal Babu Siramshetty ◽  
Dac-Trung Nguyen ◽  
Natalia J. Martinez ◽  
Anton Simeonov ◽  
Noel T. Southall ◽  
...  

The rise of novel artificial intelligence methods necessitates a comparison of this wave of new approaches with classical machine learning for a typical drug discovery project. Inhibition of the potassium ion channel whose alpha subunit is encoded by the human Ether-à-go-go-Related Gene (hERG) leads to a prolonged QT interval of the cardiac action potential and is a significant safety pharmacology target for the development of new medicines. Several computational approaches have been employed to develop prediction models for the assessment of hERG liabilities of small molecules, including recent work using deep learning methods. Here we perform a comprehensive comparison of prediction models based on classical (random forests and gradient boosting) and modern (deep neural networks and recurrent neural networks) artificial intelligence methods. The training set (~9,000 compounds) was compiled by integrating hERG bioactivity data from the ChEMBL database with experimental data generated from an in-house, high-throughput thallium flux assay. We utilized different molecular descriptors, including latent descriptors, which are real-valued continuous vectors derived from chemical autoencoders trained on a large chemical space (>1.5 million compounds). The models were prospectively validated on ~840 in-house compounds screened in the same thallium flux assay. The deep neural networks performed significantly better than the classical methods when using the latent descriptors. The recurrent neural networks that operate on SMILES provided the highest model sensitivity. The best models were merged into a consensus model that offered superior performance compared to reference models from academic and commercial domains. Further, we shed light on the potential of artificial intelligence methods to exploit big chemistry data and to generate novel chemical representations useful in predictive modeling and in tailoring new chemical space.
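
The consensus model mentioned at the end can be as simple as averaging the predicted hERG-blocker probabilities of the individual models; a minimal scikit-learn sketch is shown below, with a random forest and a multilayer perceptron standing in for the classical and deep models. The descriptor matrix and labels are random placeholders, not the study's data, and the averaging rule is an assumption rather than the authors' exact consensus scheme.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Placeholder descriptor matrix (e.g. latent vectors from an autoencoder) and labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 64)), rng.integers(0, 2, size=200)
X_test = rng.normal(size=(20, 64))

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
mlp = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500,
                    random_state=0).fit(X_train, y_train)

# Simple consensus: average the predicted probabilities of the individual models.
consensus = (rf.predict_proba(X_test)[:, 1] + mlp.predict_proba(X_test)[:, 1]) / 2.0
print(consensus[:5])
```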


Author(s):  
Yun-Peng Liu ◽  
Ning Xu ◽  
Yu Zhang ◽  
Xin Geng

The performance of deep neural networks (DNNs) crucially relies on the quality of labeling. In some situations, labels are easily corrupted, and some labels therefore become noisy labels. Designing algorithms that deal with noisy labels is thus of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address this problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distributions; the boundary between clean labels and noisy labels then becomes clear according to the confidence scores. To verify the effectiveness of the method, LDCE is combined with an existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm over state-of-the-art methods.
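
The abstract does not spell out the estimator, but the general pattern it describes (score each observed label by the model's predicted label distribution, then split the training set on that score) can be sketched as follows. The probability-mass scoring and the fixed threshold here are simplifications for illustration, not the LDCE method itself.

```python
import torch
import torch.nn.functional as F

def label_confidence(logits, observed_labels):
    """Confidence of each observed label = predicted probability mass it receives."""
    probs = F.softmax(logits, dim=1)
    return probs[torch.arange(len(observed_labels)), observed_labels]

def split_clean_noisy(logits, observed_labels, threshold=0.5):
    """Mark labels whose confidence falls below the threshold as likely noisy."""
    conf = label_confidence(logits, observed_labels)
    clean_mask = conf >= threshold
    return clean_mask, ~clean_mask
```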


2018 ◽  
Vol 28 (5) ◽  
pp. 1523-1539
Author(s):  
Simon Bussy ◽  
Agathe Guilloux ◽  
Stéphane Gaïffas ◽  
Anne-Sophie Jannot

We introduce a supervised learning mixture model for censored durations (C-mix) that simultaneously detects subgroups of patients with different prognoses and orders them based on their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. We penalize the negative log-likelihood with the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the covariates relevant for survival prediction. Inference is achieved using an efficient quasi-Newton expectation-maximization algorithm, for which we provide convergence properties. The statistical performance of the method is examined in an extensive Monte Carlo simulation study and finally illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t), and survival prediction. We thus propose a powerful tool for personalized medicine in oncology.
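
The Elastic-Net penalty referenced above combines an L1 and an L2 term on the coefficient vector; a minimal sketch of the penalized objective is given below. The tuning parameters gamma and alpha are generic placeholders, not the values used in the paper.

```python
import numpy as np

def elastic_net_penalty(beta, gamma, alpha):
    """gamma * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)."""
    return gamma * (alpha * np.abs(beta).sum()
                    + 0.5 * (1.0 - alpha) * np.square(beta).sum())

def penalized_objective(neg_log_likelihood, beta, gamma=0.1, alpha=0.9):
    """Penalized negative log-likelihood minimized during model fitting."""
    return neg_log_likelihood + elastic_net_penalty(beta, gamma, alpha)
```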


Circulation ◽  
2008 ◽  
Vol 118 (suppl_18) ◽  
Author(s):  
Cesare Russo ◽  
Zhezhen Jin ◽  
Ralph L Sacco ◽  
Shunichi Homma ◽  
Tatjana Rundek ◽  
...  

BACKGROUND: Aortic arch plaques (AAP) are a risk factor for cardiovascular embolic events. However, the risk of vascular events associated with AAP in the general population is unclear. AIM: To assess whether AAP detected by transesophageal echocardiography (TEE) are associated with an increased risk of vascular events in a stroke-free cohort. METHODS: The study cohort consisted of stroke-free subjects over age 50 from the Aortic Plaques and Risk of Ischemic Stroke (APRIS) study. AAP were assessed by multiplane TEE and considered large if ≥4 mm in thickness. Vascular events, including myocardial infarction, ischemic stroke, and vascular death, were recorded during follow-up. The association between AAP and outcomes was assessed by univariate and multivariate Cox proportional hazards models. RESULTS: A group of 209 subjects was studied (mean age 67±9 years; 45% women; 14% whites, 30% blacks, 56% Hispanics). AAP of any size were present in 130 subjects (62%); large AAP in 50 (24%). Subjects with AAP were older (69±8 vs. 63±7 years), had higher systolic BP (146±21 vs. 139±20 mmHg), were more often white (19% vs. 8%) and smokers (20% vs. 9%), and more frequently had a history of coronary artery disease (26% vs. 14%) than those without AAP (all p<0.05). Lipid parameters and the prevalence of atrial fibrillation and diabetes mellitus were not significantly different between the two groups. During follow-up (94±29 months), 30 events occurred (13 myocardial infarctions, 11 ischemic strokes, 6 vascular deaths). After adjustment for other risk factors, AAP of any size were not associated with an increased risk of combined vascular events (HR 1.07, 95% CI 0.44 to 2.56). The same result was observed for large AAP (HR 0.94, CI 0.34 to 2.64). Age (HR 1.05, CI 1.01 to 1.10), body mass index (HR 1.08, CI 1.01 to 1.15), and atrial fibrillation (HR 3.52, CI 1.07 to 11.61) showed independent associations with vascular events. In a sub-analysis with ischemic stroke as the outcome, neither AAP of any size nor large AAP were associated with an increased risk. CONCLUSIONS: In this cohort without prior stroke, the incidental detection of AAP was not associated with an increased risk of future vascular events. Associated co-factors may affect the AAP-related risk of vascular events reported in previous studies.
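
The hazard ratios and confidence intervals quoted above follow directly from the fitted Cox coefficients; as a reminder of that relationship, a small generic sketch (not the study's code or numbers) is shown below.

```python
import math

def hazard_ratio_with_ci(beta, se, z=1.96):
    """Convert a Cox log-hazard coefficient and its standard error into
    a hazard ratio with a 95% confidence interval:
    HR = exp(beta), CI = exp(beta ± z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical example: beta = 1.258, se = 0.61 yields an HR near 3.5 with a
# wide CI, similar in form to the atrial fibrillation estimate reported above.
print(hazard_ratio_with_ci(1.258, 0.61))
```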

