Applying Machine Learning Across Sites: External Validation of a Surgical Site Infection Detection Algorithm

Background: Surgical site infection (SSI) is one of the most common types of health care–associated infection. It increases mortality, prolongs hospital length of stay, and raises health care costs. Many institutions have developed risk assessment models for SSI to help surgeons identify high-risk patients preoperatively and guide clinical intervention. However, most of these models have had low accuracy.

Objective: We aimed to provide a solution in the form of an Artificial intelligence–based Multimodal Risk Assessment Model for Surgical site infection (AMRAMS) for inpatients undergoing operations, using routinely collected clinical data. We internally and externally validated the discrimination of the models, which combined various machine learning and natural language processing techniques, and compared them with the National Nosocomial Infections Surveillance (NNIS) risk index.

Methods: We retrieved inpatient records between January 1, 2014, and June 30, 2019, from the electronic medical record (EMR) system of Rui Jin Hospital, Luwan Branch, Shanghai, China. We used data from before July 1, 2018, as the development set for internal validation and the remaining data as the test set for external validation. We included patient demographics, preoperative lab results, and free-text preoperative notes as our features. We used word-embedding techniques to encode text information, and we trained the LASSO (least absolute shrinkage and selection operator) model, random forest model, gradient boosting decision tree (GBDT) model, convolutional neural network (CNN) model, and self-attention network model using the combined data. Surgeons manually scored the NNIS risk index values.

Results: For internal bootstrapping validation, CNN yielded the highest mean area under the receiver operating characteristic curve (AUROC) of 0.889 (95% CI 0.886-0.892), and the paired-sample t test showed a statistically significant advantage over the other models (P<.001). The self-attention network yielded the second-highest mean AUROC of 0.882 (95% CI 0.878-0.886), but this was only numerically, not significantly, higher than the AUROC of the third-best model, GBDT with text embeddings (mean AUROC 0.881, 95% CI 0.878-0.884, P=.47). The AUROCs of the LASSO, random forest, and GBDT models using text embeddings were significantly higher than the AUROCs of the same models without text embeddings (P<.001). For external validation, the self-attention network yielded the highest AUROC of 0.879. CNN was the second-best model (AUROC 0.878), and GBDT with text embeddings was the third-best model (AUROC 0.872). The NNIS risk index scored by surgeons had an AUROC of 0.651.

Conclusions: Our AMRAMS, based on EMR data and deep learning methods (CNN and self-attention network), had significant advantages in accuracy over conventional machine learning methods and the NNIS risk index. Moreover, the semantic embeddings of preoperative notes further improved model performance. Our models could replace the NNIS risk index to provide personalized guidance for the preoperative intervention of SSIs. This case offers an easy-to-implement approach to building multimodal risk assessment models (RAMs) for similar scenarios.
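The internal validation above reports a mean AUROC with a 95% CI from bootstrapping. A minimal sketch of how such a summary might be computed is given below. It is not the authors' implementation: the data are synthetic, the function names `auroc` and `bootstrap_auroc` are hypothetical, and the interval is a simple percentile interval, which may differ from the interval method used in the study.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(labels, scores, n_boot=200, seed=0):
    """Return (mean, lower, upper) of AUROC over bootstrap resamples (percentile interval)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample cases with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(0.025 * n_boot)]
    upper = stats[round(0.975 * n_boot) - 1]
    return statistics.mean(stats), lower, upper


# Synthetic stand-in data (assumption): about 20% positives, noisy risk scores.
_rng = random.Random(42)
labels = [1 if _rng.random() < 0.2 else 0 for _ in range(300)]
scores = [y + _rng.gauss(0, 1) for y in labels]

mean_auc, ci_low, ci_high = bootstrap_auroc(labels, scores)
print(f"mean AUROC {mean_auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

Comparing two models on the same resamples, as the paired-sample t test in the abstract implies, would require scoring both models on each bootstrap draw and testing the paired differences.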


Variable Case Detection and Many Unreported Cases of Surgical-Site Infection Following Colon Surgery and Abdominal Hysterectomy in a Statewide Validation

Infection Control and Hospital Epidemiology, 2017, Vol 38 (9), pp. 1091-1097. DOI: 10.1017/ice.2017.134. Cited By ~ 5

Author(s): Michael S. Calderwood, Susan S. Huang, Vicki Keller, Christina B. Bruce, N. Neely Kazerouni, ...

Keyword(s): Surgical Site Infection, Chart Review, External Validation, Colon Surgery, Abdominal Hysterectomy, Site Infection, California Department, Added Benefit, Infection Preventionists, Traditional Surveillance

OBJECTIVE: To assess hospital surgical-site infection (SSI) identification and reporting following colon surgery and abdominal hysterectomy via a statewide external validation.

METHODS: Infection preventionists (IPs) from the California Department of Public Health (CDPH) performed on-site SSI validation for surgical procedures performed in hospitals that voluntarily participated. Validation involved chart review of SSI cases previously reported by hospitals plus review of patient records flagged for review by claims codes suggestive of SSI. We assessed the sensitivity of traditional surveillance and the added benefit of claims-based surveillance. We also evaluated the positive predictive value of claims-based surveillance (ie, workload efficiency).

RESULTS: Upon validation review, CDPH IPs identified 239 SSIs following colon surgery at 42 hospitals and 76 SSIs following abdominal hysterectomy at 34 hospitals. For colon surgery, traditional surveillance had a sensitivity of 50% (47% for deep incisional or organ/space [DI/OS] SSI), compared to 84% (88% for DI/OS SSI) for claims-based surveillance. For abdominal hysterectomy, traditional surveillance had a sensitivity of 68% (67% for DI/OS SSI) compared to 74% (78% for DI/OS SSI) for claims-based surveillance. Claims-based surveillance was also efficient, with 1 SSI identified for every 2 patients flagged for review who had undergone abdominal hysterectomy and for every 2.6 patients flagged for review who had undergone colon surgery. Overall, CDPH identified previously unreported SSIs in 74% of validation hospitals performing colon surgery and 35% of validation hospitals performing abdominal hysterectomy.

CONCLUSIONS: Claims-based surveillance is a standardized approach that hospitals can use to augment traditional surveillance methods and health departments can use for external validation.

Infect Control Hosp Epidemiol 2017;38:1091–1097
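The workload-efficiency figures in the results can be restated as positive predictive values, since finding 1 SSI for every k flagged charts corresponds to a PPV of 1/k. The sketch below shows only that arithmetic; the function name is hypothetical, and the two ratios are the ones quoted in the abstract.

```python
def ppv_from_charts_per_case(charts_per_case):
    """PPV implied by reviewing `charts_per_case` flagged charts to confirm one true case."""
    if charts_per_case < 1:
        raise ValueError("at least one chart must be reviewed per confirmed case")
    return 1.0 / charts_per_case


# Ratios quoted in the abstract: flagged patients reviewed per SSI identified.
for procedure, charts in [("abdominal hysterectomy", 2.0), ("colon surgery", 2.6)]:
    ppv = ppv_from_charts_per_case(charts)
    print(f"{procedure}: PPV of claims-based flagging = {ppv:.1%}")
```

Read this way, roughly half of the flagged hysterectomy charts and somewhat more than a third of the flagged colon surgery charts led to a confirmed SSI.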


Using clinical variables to guide surgical site infection detection: A novel surveillance strategy

American Journal of Infection Control, 2014, Vol 42 (12), pp. 1291-1295. DOI: 10.1016/j.ajic.2014.08.013. Cited By ~ 13

Author(s): Westyn Branch-Elliman, Judith Strymish, Kamal M.F. Itani, Kalpana Gupta

Keyword(s): Surgical Site Infection, Site Infection, Surveillance Strategy, Clinical Variables, Infection Detection


Artificial Intelligence–Based Multimodal Risk Assessment Model for Surgical Site Infection (AMRAMS): Development and Validation Study (Preprint)

2020. DOI: 10.2196/preprints.18186

Author(s): Weijia Chen, Zhijun Lu, Lijue You, Lingling Zhou, Jie Xu, ...

Keyword(s): Risk Assessment, Surgical Site Infection, Risk Index, External Validation, The Self, Assessment Model, Risk Assessment Model, Site Infection, Attention Network, The Third

