Predicting postoperative surgical site infection with administrative data: a random forests algorithm

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yelena Petrosyan ◽  
Kednapa Thavorn ◽  
Glenys Smith ◽  
Malcolm Maclure ◽  
Roanne Preston ◽  
...  

Abstract Background: Since primary data collection can be time-consuming and expensive, surgical site infections (SSIs) could ideally be monitored using routinely collected administrative data. We derived and internally validated efficient algorithms to identify SSIs within 30 days after surgery with health administrative data, using machine learning algorithms. Methods: All patients enrolled in the National Surgical Quality Improvement Program from the Ottawa Hospital were linked to administrative datasets in Ontario, Canada. Machine learning approaches, including a random forests algorithm and high-performance logistic regression, were used to derive parsimonious models to predict SSI status. Finally, a risk score methodology was used to transform the final models into a risk score system. The SSI risk models were validated in the validation datasets. Results: Of 14,351 patients, 795 (5.5%) had an SSI. First, separate predictive models were built for three distinct administrative datasets. The final model, including hospitalization diagnostic, physician diagnostic and procedure codes, demonstrated excellent discrimination (C statistic, 0.91; 95% CI, 0.90–0.92) and calibration (Hosmer-Lemeshow χ² statistic, 4.531; p = 0.402). Conclusion: We demonstrated that health administrative data can be effectively used to identify SSIs. Machine learning algorithms have shown a high degree of accuracy in predicting postoperative SSIs and can integrate and utilize a large amount of administrative data. External validation of this model is required before it can be routinely used to identify SSIs.
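The discrimination measure quoted in this abstract, the C statistic, can be illustrated with a short sketch. For a binary outcome it is the share of (event, non-event) pairs in which the event received the higher predicted risk, with ties counted as half. The function name `c_statistic` and the toy labels and scores below are assumptions made for illustration; they are not the authors' code or data.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance (C) statistic for a binary outcome.

    Counts, over all (event, non-event) pairs, how often the event has
    the higher predicted risk; tied scores contribute 0.5.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): 1 = SSI, 0 = no SSI, with predicted risks.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.05, 0.10, 0.35, 0.40, 0.60, 0.80, 0.20, 0.40]
toy_c = c_statistic(y_true, y_score)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be preferable for a cohort of this size.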

2020 ◽  
Author(s):  
Yelena Petrosyan ◽  
Kednapa Thavorn ◽  
Glenys Smith ◽  
Malcolm Maclure ◽  
Roanne Preston ◽  
...  

Abstract Background: Since primary data collection can be time-consuming and expensive, surgical site infections (SSIs) could ideally be monitored using routinely collected administrative data. We derived and internally validated efficient algorithms to identify SSIs within 30 days after surgery with health administrative data, using machine learning algorithms. Methods: All patients enrolled in the National Surgical Quality Improvement Program from the Ottawa Hospital were linked to administrative datasets in Ontario, Canada. Machine learning approaches, including a random forests algorithm and high-performance logistic regression, were used to derive parsimonious models to predict SSI status. Finally, a risk score methodology was used to transform the final models into a risk score system. The SSI risk models were validated in the validation datasets. Results: Of 14,351 patients, 795 (5.5%) had an SSI. First, separate predictive models were built for three distinct administrative datasets. The final model, including hospitalization diagnostic, physician diagnostic and procedure codes, demonstrated excellent discrimination (C statistic, 0.91; 95% CI, 0.90–0.92) and calibration (Hosmer-Lemeshow χ² statistic, 4.531; p = 0.402). Conclusion: We demonstrated that health administrative data can be effectively used to identify SSIs. Machine learning algorithms have shown a high degree of accuracy in predicting postoperative SSIs and can integrate and utilize a large amount of administrative data. External validation of this model is required before it can be routinely used to identify SSIs.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist were not met by more than 50% of the published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.


Cancers ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 3817
Author(s):  
Shi-Jer Lou ◽  
Ming-Feng Hou ◽  
Hong-Tai Chang ◽  
Chong-Chi Chiu ◽  
Hao-Hsien Lee ◽  
...  

No studies have discussed machine learning algorithms to predict recurrence within 10 years after breast cancer surgery. This study aimed to compare the accuracy of forecasting models in predicting recurrence within 10 years after breast cancer surgery and to identify significant predictors of recurrence. Registry data for breast cancer surgery patients were allocated to a training dataset (n = 798) for model development, a testing dataset (n = 171) for internal validation, and a validating dataset (n = 171) for external validation. Global sensitivity analysis was then performed to evaluate the significance of the selected predictors. Demographic characteristics, clinical characteristics, quality of care, and preoperative quality of life were significantly associated with recurrence within 10 years after breast cancer surgery (p < 0.05). Artificial neural networks had the highest prediction performance indices. Additionally, surgeon volume was the best predictor of recurrence within 10 years after breast cancer surgery, followed by hospital volume and tumor stage. Accurate prediction of recurrence within 10 years by machine learning algorithms may improve precision in managing patients after breast cancer surgery and improve understanding of the risk factors for recurrence.
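The three dataset sizes in this abstract (798, 171 and 171 of 1,140 records) correspond to a 70/15/15 split. The sketch below shows one way such an allocation could be produced. The abstract does not say how records were assigned, so the seeded random shuffle, the function name `three_way_split` and the integer stand-ins for patient records are all assumptions.

```python
import random


def three_way_split(records, n_train, n_test, n_valid, seed=0):
    """Shuffle records reproducibly and cut them into three disjoint sets."""
    if n_train + n_test + n_valid != len(records):
        raise ValueError("split sizes must add up to the number of records")

    shuffled = list(records)               # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]
    return train, test, valid


# Hypothetical record identifiers standing in for registry patients.
patients = list(range(1140))
train, test, valid = three_way_split(patients, 798, 171, 171)
```

Fixing the seed makes the allocation repeatable, which matters when several forecasting models are compared on the same held-out sets.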


Author(s):  
Dazhong Wu ◽  
Connor Jennings ◽  
Janis Terpenny ◽  
Robert X. Gao ◽  
Soundar Kumara

Manufacturers have faced an increasing need for predictive models that predict mechanical failures and the remaining useful life (RUL) of manufacturing systems or components. Classical model-based or physics-based prognostics often require an in-depth physical understanding of the system of interest to develop closed-form mathematical models. However, prior knowledge of system behavior is not always available, especially for complex manufacturing systems and processes. To complement model-based prognostics, data-driven methods have been increasingly applied to machinery prognostics and maintenance management, transforming legacy manufacturing systems into smart manufacturing systems with artificial intelligence. While previous research has demonstrated the effectiveness of data-driven methods, most of these prognostic methods are based on classical machine learning techniques, such as artificial neural networks (ANNs) and support vector regression (SVR). With the rapid advancement in artificial intelligence, various machine learning algorithms have been developed and widely applied in many engineering fields. The objective of this research is to introduce a random forests (RFs)-based prognostic method for tool wear prediction and to compare the performance of RFs with that of feed-forward back propagation (FFBP) ANNs and SVR. Specifically, the performance of FFBP ANNs, SVR, and RFs is compared using experimental data collected from 315 milling tests. Experimental results have shown that RFs can generate more accurate predictions than FFBP ANNs with a single hidden layer and SVR.
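Comparing regression models such as these usually comes down to scoring each model's predictions against the measured values. The abstract does not name its accuracy measures, so the sketch below assumes two common ones, mean squared error and the coefficient of determination (R²). The model names and numbers are made up for illustration and are not results from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical tool wear measurements and predictions from two models.
observed_wear = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [1.5, 2.5, 2.5, 4.5],
}

scores = {
    name: {"mse": mse(observed_wear, pred), "r2": r_squared(observed_wear, pred)}
    for name, pred in predictions.items()
}
best_model = min(scores, key=lambda name: scores[name]["mse"])
```

Scores like these are only comparable when every model is evaluated on the same held-out tests.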


2019 ◽  
Vol 31 (4) ◽  
pp. 568-578 ◽  
Author(s):  
Anshit Goyal ◽  
Che Ngufor ◽  
Panagiotis Kerezoudis ◽  
Brandon McCutcheon ◽  
Curtis Storlie ◽  
...  

OBJECTIVE: Nonhome discharge and unplanned readmissions represent important cost drivers following spinal fusion. The authors sought to utilize different machine learning algorithms to predict discharge to rehabilitation and unplanned readmissions in patients receiving spinal fusion. METHODS: The authors queried the 2012–2013 American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) for patients undergoing cervical or lumbar spinal fusion. Outcomes assessed included discharge to nonhome facility and unplanned readmissions within 30 days after surgery. A total of 7 machine learning algorithms were evaluated. Predictive hierarchical clustering of procedure codes was used to increase model performance. Model performance was evaluated using overall accuracy and area under the receiver operating characteristic curve (AUC), as well as sensitivity, specificity, and positive and negative predictive values. These performance metrics were computed for both the imputed and unimputed (missing values dropped) datasets. RESULTS: A total of 59,145 spinal fusion cases were analyzed. The incidence rates of discharge to nonhome facility and 30-day unplanned readmission were 12.6% and 4.5%, respectively. All classification algorithms showed excellent discrimination (AUC > 0.80, range 0.85–0.87) for predicting nonhome discharge. The generalized linear model showed comparable performance to other machine learning algorithms. By comparison, all models showed poorer predictive performance for unplanned readmission, with AUC ranging between 0.63 and 0.66. Better predictive performance was noted with models using imputed data. CONCLUSIONS: In an analysis of patients undergoing spinal fusion, multiple machine learning algorithms were found to reliably predict nonhome discharge, with modest performance noted for unplanned readmissions. These results provide early evidence regarding the feasibility of modern machine learning classifiers in predicting these outcomes and serve as possible clinical decision support tools to facilitate shared decision making.
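The threshold-based measures listed in the methods (overall accuracy, sensitivity, specificity, and positive and negative predictive values) all follow from the four cells of a confusion matrix. The sketch below shows this under stated assumptions: the function name `classification_metrics` and the toy labels are illustrative and are not taken from the study.

```python
def _ratio(numerator, denominator):
    """Safe division: returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
    }


# Toy data (hypothetical): 3 events among 10 cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

With a rare outcome such as a 4.5% readmission rate, accuracy alone can look high even for a model that rarely flags an event, which is one reason to report sensitivity and predictive values alongside it.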


2018 ◽  
Vol 10 (10) ◽  
pp. 1513 ◽  
Author(s):  
Julio Duarte-Carvajalino ◽  
Diego Alzate ◽  
Andrés Ramirez ◽  
Juan Santa-Sepulveda ◽  
Alexandra Fajardo-Rojas ◽  
...  

This work presents quantitative prediction of the severity of the disease caused by Phytophthora infestans in potato crops using machine learning algorithms such as multilayer perceptron, deep learning convolutional neural networks, support vector regression, and random forests. The machine learning algorithms are trained using datasets extracted from multispectral data captured at the canopy level with an unmanned aerial vehicle carrying an inexpensive digital camera. The results indicate that deep learning convolutional neural networks, random forests and multilayer perceptron using band differences can predict the level of Phytophthora infestans infection in potato crops with acceptable accuracy.

