BACKGROUND
Acute diseases have severe complications that develop rapidly, exhibit distinct phenotypes, and have profound effects on patient outcomes. Predictive analytics can enhance physicians’ care and management of acute disease patients by predicting crucial complication phenotypes in time for diagnosis and treatment. However, effective phenotype prediction requires overcoming several challenges. First, patient data collected in the early stages of an acute disease (e.g., clinical data, laboratory results) carry limited information for predicting phenotypic outcomes. Second, patient data are temporal and heterogeneous; for example, patients receive laboratory tests at different time intervals and frequencies. Third, imbalanced distributions of patient outcomes further complicate complication phenotype predictions.
OBJECTIVE
To predict crucial complication phenotypes among patients suffering from acute diseases, we propose a novel, deep learning–based method that uses recurrent neural network–based sequence embedding to represent disease progression while accounting for temporal heterogeneities in patient data. Our method incorporates a latent regulator to alleviate data insufficiency constraints by accounting for underlying mechanisms that are not observed in patient data. The proposed method also includes cost-sensitive learning to address imbalanced outcome distributions in patient data and thereby improve predictions.
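As an illustration of the temporal heterogeneity described above, one common way to prepare irregularly sampled measurements for a recurrent model is to pair each value with the time elapsed since the previous observation. The sketch below is not the proposed method; the function name `with_time_gaps`, the choice of days as the time unit, and the laboratory values are all hypothetical and serve only to show the idea.

```python
from datetime import date


def with_time_gaps(observations):
    """Turn (date, value) pairs into (value, days_since_previous) steps.

    The first step gets a gap of 0 because no earlier observation exists.
    """
    ordered = sorted(observations, key=lambda obs: obs[0])
    steps = []
    previous_day = None
    for day, value in ordered:
        gap = 0 if previous_day is None else (day - previous_day).days
        steps.append((value, gap))
        previous_day = day
    return steps


# Hypothetical laboratory values for one patient, measured at irregular intervals.
labs = [
    (date(2020, 1, 6), 2.4),
    (date(2020, 1, 1), 1.2),
    (date(2020, 1, 2), 1.5),
]
steps = with_time_gaps(labs)
print(steps)
```

Each step then carries both the measurement and its spacing, so a sequence model can, in principle, distinguish a value recorded one day after the last test from one recorded several days later.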
METHODS
We obtain a sample of 10,354 electronic health records, pertaining to 6,545 peritonitis patients, from a major health care organization in Taiwan. The proposed method projects these temporal, heterogeneous clinical data into a substantially reduced feature space, then incorporates a latent regulator (a latent parameter matrix) to mitigate data insufficiencies and account for variations in phenotypic expressions. In addition, our method employs cost-sensitive learning to further improve predictive performance.
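Cost-sensitive learning can take several forms, and the abstract does not specify which one is used. A minimal sketch of one common variant, a class-weighted log loss, is shown below. The weighting rule (total count divided by twice the class count, the widely used "balanced" heuristic for two classes), the function names, and the toy labels are assumptions for illustration only.

```python
import math


def class_weights(labels):
    """Weight each class inversely to its frequency: n / (2 * class_count).

    Assumes binary labels (0 or 1) with at least one example of each class.
    """
    n = len(labels)
    positives = sum(labels)
    negatives = n - positives
    return {1: n / (2 * positives), 0: n / (2 * negatives)}


def weighted_log_loss(labels, probabilities, weights):
    """Mean binary cross-entropy, with each example scaled by its class weight."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)
        loss = -math.log(p) if y == 1 else -math.log(1 - p)
        total += weights[y] * loss
    return total / len(labels)


# Hypothetical imbalanced outcomes: 1 complication case among 10 patients.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
weights = class_weights(labels)

# Compare an equally confident mistake on each class.
missed_case = weighted_log_loss([1], [0.1], weights)
false_alarm = weighted_log_loss([0], [0.9], weights)
print(weights, missed_case, false_alarm)
```

Under this weighting, missing the rare complication case is penalized more heavily than an equally confident false alarm, which pushes a model toward higher recall on the minority class.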
RESULTS
We evaluate the proposed method’s efficacy for predicting two hepatic complication phenotypes for peritonitis patients: acute hepatic encephalopathy (A-HE) and hepatorenal syndrome (HRS). The evaluation includes three benchmark techniques: temporal case-based reasoning (T-MMCBR), temporal long short-term memory (T-LSTM) networks, and time fusion convolutional neural network (CNN). For A-HE predictions, our method attains an area under the curve (AUC) of 0.82, which outperforms T-MMCBR by 64%, T-LSTM by 26%, and time fusion CNN by 26%. For HRS predictions, our method achieves an AUC of 0.64, which is 29% better than that of T-MMCBR (0.54). Overall, the evaluation results show that the proposed method significantly outperforms all the benchmarks, as measured by recall, F-measure, and AUC, while maintaining comparable precision values.
CONCLUSIONS
The proposed method learns a short-term temporal representation from patient data to predict complication phenotypes, and it offers greater predictive utility than prevalent data-driven techniques. This method is generalizable and can be applied to other acute disease (illness) scenarios that are characterized by insufficient patient clinical data, temporal heterogeneities, and imbalanced distributions of important patient outcomes.