MO353 MACHINE LEARNING-BASED PREDICTION OF ACUTE KIDNEY INJURY AFTER NEPHRECTOMY IN PATIENTS WITH RENAL CELL CARCINOMA

2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Sejoong Kim ◽  
Yeonhee Lee ◽  
Seung Seok Han

Abstract Background and Aims The precise prediction of acute kidney injury (AKI) after nephrectomy for renal cell carcinoma (RCC) is an important issue because of its relationship with subsequent kidney dysfunction and high mortality. Herein we addressed whether machine learning algorithms could predict postoperative AKI risk better than conventional logistic regression (LR) models. Method A total of 4,104 RCC patients who had undergone unilateral nephrectomy from January 2003 to December 2017 were reviewed. Machine learning models such as support vector machine, random forest, extreme gradient boosting, and light gradient boosting machine (LightGBM) were developed, and their performance based on the area under the receiver operating characteristic curve, accuracy, and F1 score was compared with that of the LR-based scoring model. Results Postoperative AKI developed in 1,167 patients (28.4%). All the machine learning models had higher performance index values than the LR-based scoring model. Among them, the LightGBM model had the highest value of 0.810 (0.783–0.837). The decision curve analysis demonstrated a greater net benefit of the machine learning models than the LR-based scoring model over all the ranges of threshold probabilities. The LightGBM and random forest models, but not others, were well calibrated. Conclusion The application of machine learning algorithms improves the predictability of AKI after nephrectomy for RCC, and these models perform better than conventional LR-based models.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yeonhee Lee ◽  
Jiwon Ryu ◽  
Min Woo Kang ◽  
Kyung Ha Seo ◽  
Jayoun Kim ◽  
...  

Abstract The precise prediction of acute kidney injury (AKI) after nephrectomy for renal cell carcinoma (RCC) is an important issue because of its relationship with subsequent kidney dysfunction and high mortality. Herein we addressed whether machine learning (ML) algorithms could predict postoperative AKI risk better than conventional logistic regression (LR) models. A total of 4104 RCC patients who had undergone unilateral nephrectomy from January 2003 to December 2017 were reviewed. ML models such as support vector machine, random forest, extreme gradient boosting, and light gradient boosting machine (LightGBM) were developed, and their performance based on the area under the receiver operating characteristic curve, accuracy, and F1 score was compared with that of the LR-based scoring model. Postoperative AKI developed in 1167 patients (28.4%). All the ML models had higher performance index values than the LR-based scoring model. Among them, the LightGBM model had the highest value of 0.810 (0.783–0.837). The decision curve analysis demonstrated a greater net benefit of the ML models than the LR-based scoring model over all the ranges of threshold probabilities. The application of ML algorithms improves the predictability of AKI after nephrectomy for RCC, and these models perform better than conventional LR-based models.
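The decision curve analysis referenced above compares models by net benefit across threshold probabilities, where NB(p_t) = TP/N − (FP/N) · p_t/(1 − p_t). A minimal sketch of that calculation on synthetic data with an event rate near the reported 28.4% (the formula is standard, but the data and model scores here are invented):

```python
# Sketch of a decision curve: net benefit of a risk model vs. the
# "treat everyone" policy across threshold probabilities. Synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
y = rng.random(n) < 0.28                 # ~28% event rate, as in the abstract
proba = 0.5 * y + 0.5 * rng.random(n)    # crude but informative risk scores

def net_benefit(y_true, proba, p_t):
    """Net benefit of treating everyone with predicted risk >= p_t."""
    treat = proba >= p_t
    tp = np.sum(treat & y_true)
    fp = np.sum(treat & ~y_true)
    return tp / len(y_true) - fp / len(y_true) * p_t / (1 - p_t)

thresholds = np.linspace(0.05, 0.55, 11)
nb_model = [net_benefit(y, proba, t) for t in thresholds]
nb_all = [net_benefit(y, np.ones(n), t) for t in thresholds]  # treat-everyone
for t, m, a in zip(thresholds, nb_model, nb_all):
    print(f"p_t={t:.2f}  model={m:+.3f}  treat-all={a:+.3f}")
```

A model with genuine discrimination shows a higher net benefit than treat-everyone as the threshold grows, which is what the abstract reports for the ML models.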


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Cheng Qu ◽  
Lin Gao ◽  
Xian-qiang Yu ◽  
Mei Wei ◽  
Guo-quan Fang ◽  
...  

Background. Acute kidney injury (AKI) has long been recognized as a common and important complication of acute pancreatitis (AP). In this study, machine learning (ML) techniques were used to establish predictive models for AKI in AP patients during hospitalization. This is a retrospective review of prospectively collected data on AP patients admitted to our department within one week after the onset of abdominal pain, from January 2014 to January 2019. Eighty patients developed AKI during hospitalization (AKI group) and 254 did not (non-AKI group). Using features that can be easily obtained at admission, such as demographic characteristics and laboratory data, support vector machine (SVM), random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost) models were built to predict AKI, and their performance was compared with that of the classic logistic regression (LR) model. Among the machine learning models, XGBoost performed best in predicting AKI, with an AUC of 91.93%; the AUC of the logistic regression analysis was 87.28%. These findings suggest that, compared with the classical logistic regression model, machine learning models using features easily obtained at admission perform better in predicting AKI in AP patients.


Author(s):  
Wachiranun Sirikul ◽  
Nida Buawangpong ◽  
Ratana Sapbamrer ◽  
Penprapa Siviroj

Background: Alcohol-related road-traffic injury is the leading cause of premature death in middle- and lower-income countries, including Thailand. Applying machine-learning algorithms can improve the effectiveness of driver-impairment screening strategies based on legal alcohol limits. Methods: Using 4794 RTI drivers from secondary cross-sectional data from the Thai Governmental Road Safety Evaluation project in 2002–2004, machine-learning models (Gradient Boosting Classifier: GBC; Multi-Layer Perceptron: MLP; Random Forest: RF; K-Nearest Neighbors: KNN) and a parsimonious logistic regression (Logit) were developed for predicting the mortality risk from road-traffic injury in drunk drivers. The predictors included alcohol concentration in blood or breath, driver characteristics, and environmental factors. Results: Of the 4794 drivers in the derived dataset, 4365 (92%) were surviving drivers and 429 (8%) were dead drivers. The class imbalance was rebalanced to a 1:1 ratio by the Synthetic Minority Oversampling Technique (SMOTE). All models obtained good-to-excellent discrimination performance. The AUCs of the GBC, RF, KNN, MLP, and Logit models were 0.95 (95% CI 0.90 to 1.00), 0.92 (95% CI 0.87 to 0.97), 0.86 (95% CI 0.83 to 0.89), 0.83 (95% CI 0.78 to 0.88), and 0.81 (95% CI 0.75 to 0.87), respectively. MLP and GBC also showed good model calibration, as visualized by the calibration plot. Conclusions: Our machine-learning models can predict road-traffic mortality risk with good model discrimination and calibration. External validation using current data is recommended before future implementation.
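The SMOTE rebalancing step described above synthesizes minority-class samples by interpolating between a minority sample and one of its k nearest minority-class neighbours. A minimal NumPy re-implementation for illustration only (not the imbalanced-learn library); the counts and features are invented, loosely mirroring the 92%/8% survivor/fatality split:

```python
# Minimal SMOTE-style oversampling: new minority points are drawn on the
# line segment between a minority sample and a random nearest neighbour.
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Synthesize n_new samples by interpolating each chosen minority
    sample toward one of its k nearest minority-class neighbours."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), n_new)
    nb = neighbours[base, rng.integers(0, k, n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nb] - X_min[base])

rng = np.random.default_rng(42)
X_major = rng.normal(0, 1, (920, 4))   # e.g. surviving drivers (majority)
X_minor = rng.normal(2, 1, (80, 4))    # e.g. dead drivers (minority)
X_synth = smote(X_minor, len(X_major) - len(X_minor), rng=rng)
X_bal = np.vstack([X_major, X_minor, X_synth])   # now a 1:1 class ratio
print(X_synth.shape, X_bal.shape)
```

In practice one would use a maintained implementation and apply the oversampling only inside the training folds, never to the validation data.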


2020 ◽  
Author(s):  
Albert Morera ◽  
Juan Martínez de Aragón ◽  
José Antonio Bonet ◽  
Jingjing Liang ◽  
Sergio de-Miguel

Abstract Background The prediction of biogeographical patterns from a large number of driving factors with complex interactions, correlations and non-linear dependences requires advanced analytical methods and modelling tools. This study compares different statistical and machine learning models for predicting fungal productivity biogeographical patterns, as a case study for the thorough assessment of the performance of alternative modelling approaches to provide accurate and ecologically consistent predictions. Methods We evaluated and compared the performance of two statistical modelling techniques, namely generalized linear mixed models and geographically weighted regression, and four machine learning models, namely random forest, extreme gradient boosting, support vector machine and deep learning, to predict fungal productivity. We used a systematic methodology based on substitution, random, spatial and climatic blocking combined with principal component analysis, together with an evaluation of the ecological consistency of spatially explicit model predictions. Results Fungal productivity predictions were sensitive to the modelling approach and complexity. Moreover, the importance assigned to different predictors varied between machine learning modelling approaches. Decision tree-based models increased prediction accuracy by ~7% compared with other machine learning approaches and by more than 25% compared with statistical ones, and resulted in higher ecological consistency at the landscape level. Random forest was the best approach for prediction, both in sampling-like environments and in extrapolation beyond the spatial and climatic range of the modelling data. Conclusions Whereas a large number of predictors are often used in machine learning algorithms, in this study we show that proper variable selection is crucial to create robust models for extrapolation in biophysically differentiated areas. When dealing with spatio-temporal data in the analysis of biogeographical patterns, climatic blocking is postulated as a highly informative technique to be used in cross-validation to assess the prediction error over larger scales.
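The climatic blocking used above for cross-validation can be approximated with grouped folds, so that validation samples come from blocks never seen during training. A sketch with scikit-learn's `GroupKFold`; the block labels, model, and data are assumptions for illustration, not the study's fungal productivity dataset:

```python
# Blocked cross-validation: each fold holds out entire "climatic blocks",
# estimating prediction error under extrapolation rather than interpolation.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

X, y = make_regression(n_samples=600, n_features=8, noise=10, random_state=0)
# pretend each sample belongs to one of 6 climatic blocks
blocks = np.repeat(np.arange(6), 100)

cv = GroupKFold(n_splits=6)
scores = cross_val_score(RandomForestRegressor(n_estimators=50, random_state=0),
                         X, y, cv=cv, groups=blocks, scoring="r2")
print(scores.round(3))   # one R^2 per held-out block
```

Scores from grouped folds are typically lower and more honest than randomly shuffled folds when the goal is prediction beyond the sampled conditions.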


2021 ◽  
Vol 4 ◽  
Author(s):  
Bedil Karimov ◽  
Piotr Wójcik

Following the emergence of cryptocurrencies, the field of digital assets experienced a sudden explosion of interest among institutional investors. However, initial coin offerings (ICOs) also saw many scams, with firms disappearing after they had collected significant amounts of funds. We study how well one can predict, based on characteristics known ex ante, whether an offering will turn out to be a scam. We also examine which of these characteristics are the most important predictors of a scam, and how they influence the probability of a scam. We use detailed data with 160 features from about 300 ICOs that took place before March 2018 and succeeded in raising most of their required capital. Various machine learning algorithms are applied together with novel XAI (explainable artificial intelligence) tools to identify the most important predictors of an offering's failure and to understand the shape of the relationships. It turns out that, based on the features known ex ante, one can predict a scam with an accuracy of about 65–70%, and that nonlinear machine learning models perform better than traditional logistic regression and its regularized extensions.
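One XAI tool of the kind mentioned above is permutation importance, which ranks predictors by how much a fitted model's score drops when each feature is shuffled. A sketch on synthetic data (the ICO features themselves are not reproduced here, and the study may have used different explainability tools):

```python
# Permutation importance: shuffle one feature at a time on held-out data
# and measure the resulting drop in model score.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)

imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=1)
ranking = imp.importances_mean.argsort()[::-1]   # most important first
print("most important features:", ranking[:3])
```

Unlike impurity-based importances, this is computed on held-out data and is model-agnostic, which makes it a common first step before tools such as partial dependence or SHAP values.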


2020 ◽  
Vol 22 (10) ◽  
pp. 694-704 ◽  
Author(s):  
Wanben Zhong ◽  
Bineng Zhong ◽  
Hongbo Zhang ◽  
Ziyi Chen ◽  
Yan Chen

Aim and Objective: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate this side effect. However, the identification and development of new anti-cancer peptides remain challenging. Materials and Methods: In our study, a multi-classifier system combining multiple machine learning models was used to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and form a multi-classifier system by voting. Results and Conclusion: The experiments show that the overall prediction rate of each individual learner is above 80%, and that the overall accuracy of the multi-classifier system for anti-cancer peptide prediction can reach 95.93%, which is better than existing prediction models.
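The voting multi-classifier described above can be sketched with scikit-learn's `VotingClassifier`, in which heterogeneous base learners each cast one vote. The base models and data below are illustrative stand-ins, not the peptide feature sets or learners of the study:

```python
# Hard-voting ensemble: the predicted class is the majority vote of
# several differently-biased base classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=7)

vote = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=7)),
                ("svm", SVC(random_state=7))],
    voting="hard")               # each base learner casts one vote
vote.fit(X_tr, y_tr)
acc = vote.score(X_te, y_te)
print(f"ensemble accuracy: {acc:.3f}")
```

Voting tends to help when the base learners make uncorrelated errors, which is the motivation for combining different feature encodings and algorithms as the abstract describes.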


2021 ◽  
pp. 1-15
Author(s):  
O. Basturk ◽  
C. Cetek

ABSTRACT In this study, prediction of aircraft Estimated Time of Arrival (ETA) using machine learning algorithms is proposed. Accurate prediction of ETA is important for the management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of the arrival sequence. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features, and new features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
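The regression setup above (predicting a continuous arrival time and scoring it by MAE) can be sketched as follows. The data are synthetic stand-ins for the flight, trajectory, and weather features, and the target scale here is arbitrary rather than real minutes:

```python
# Regression with MAE scoring, the shape of the ETA task: features in,
# a continuous time-to-arrival out. Data are synthetic.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# X stands in for flight/trajectory/weather features; y for time-to-arrival
X, y = make_regression(n_samples=2000, n_features=12, noise=5, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

rf = RandomForestRegressor(n_estimators=100, random_state=3).fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, rf.predict(X_te))
print(f"MAE: {mae:.2f} (synthetic units)")
```

MAE is the natural headline metric here because, unlike RMSE, it reads directly as "average minutes of error", which is what controllers and ground staff care about.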


Viruses ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 252
Author(s):  
Laura M. Bergner ◽  
Nardus Mollentze ◽  
Richard J. Orton ◽  
Carlos Tello ◽  
Alice Broos ◽  
...  

The contemporary surge in metagenomic sequencing has transformed knowledge of viral diversity in wildlife. However, evaluating which newly discovered viruses pose sufficient risk of infecting humans to merit detailed laboratory characterization and surveillance remains largely speculative. Machine learning algorithms have been developed to address this imbalance by ranking the relative likelihood of human infection based on viral genome sequences, but are not yet routinely applied to viruses at the time of their discovery. Here, we characterized viral genomes detected through metagenomic sequencing of feces and saliva from common vampire bats (Desmodus rotundus) and used these data as a case study in evaluating zoonotic potential using molecular sequencing data. Of 58 detected viral families, including 17 which infect mammals, the only known zoonosis detected was rabies virus; however, additional genomes were detected from the families Hepeviridae, Coronaviridae, Reoviridae, Astroviridae and Picornaviridae, all of which contain human-infecting species. In phylogenetic analyses, novel vampire bat viruses most frequently grouped with other bat viruses that are not currently known to infect humans. In agreement, machine learning models built from only phylogenetic information ranked all novel viruses similarly, yielding little insight into zoonotic potential. In contrast, genome composition-based machine learning models estimated different levels of zoonotic potential, even for closely related viruses, categorizing one out of four detected hepeviruses and two out of three picornaviruses as having high priority for further research. We highlight the value of evaluating zoonotic potential beyond ad hoc consideration of phylogeny and provide surveillance recommendations for novel viruses in a wildlife host which has frequent contact with humans and domestic animals.
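Genome composition-based models of the kind described above typically start from k-mer frequency vectors computed directly from nucleotide sequences. A minimal sketch of that feature extraction on a toy sequence (not a real viral genome, and the study's actual feature set may differ):

```python
# k-mer composition: the frequency of every possible length-k substring,
# a sequence-only feature that needs no phylogenetic context.
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k=2):
    """Return frequencies over all 4**k possible DNA k-mers."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    return {"".join(m): counts["".join(m)] / total
            for m in product("ACGT", repeat=k)}

freqs = kmer_frequencies("ATGCGCGATATGCGC", k=2)   # toy sequence
print(sorted(freqs.items(), key=lambda kv: -kv[1])[:4])
```

Because such vectors capture host-shaped biases in genome composition, they can separate closely related viruses that phylogeny alone ranks identically, which is the contrast the abstract draws.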


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Moojung Kim ◽  
Young Jae Kim ◽  
Sung Jin Park ◽  
Kwang Gi Kim ◽  
Pyung Chun Oh ◽  
...  

Abstract Background Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially during the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" when asked whether they had received a seasonal influenza vaccination in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracy, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.
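The age-stratified modelling described above (separate models for the < 65 and ≥ 65 groups) can be sketched as below. The data, the feature set, and any resulting accuracy figures are synthetic placeholders, not KNHANES V:

```python
# Stratified modelling: fit and evaluate a separate classifier per age
# group, since the policy context (free immunization for the elderly)
# differs between groups. All data below are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 815
age = rng.integers(20, 90, n)
X = np.column_stack([age, rng.normal(size=(n, 15))])  # age + 15 other features
y = (rng.random(n) < 0.614).astype(int)               # ~61.4% "vaccinated"

models = {}
for name, mask in {"<65": age < 65, ">=65": age >= 65}.items():
    clf = RandomForestClassifier(random_state=5)
    scores = cross_val_score(clf, X[mask], y[mask], cv=5, scoring="accuracy")
    models[name] = scores.mean()
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```

Stratifying rather than adding age as one more feature lets each model learn group-specific relationships, at the cost of smaller training sets per model.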


2021 ◽  
Author(s):  
Alejandro Celemín ◽  
Diego A. Estupiñan ◽  
Ricardo Nieto

Abstract Electrical submersible pump (ESP) reliability and run-life analysis has been studied extensively since the technology's development. Current machine learning algorithms make it possible to correlate operational conditions with ESP run-life in order to generate predictions for active and new wells. Four machine learning models are compared with a linear proportional hazards model, used as a baseline for comparison purposes. Accuracy metrics appropriate for survival analysis problems are calculated on run-life predictions versus actual values over the training and validation data subsets. The results demonstrate that, for small datasets, the baseline model produces more consistent predictions than the current machine learning models, with only a slight reduction in accuracy. This study demonstrates that the quality of the data and its pre-processing supports the current shift from model-centric to data-centric approaches to machine and deep learning problems.
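One accuracy metric appropriate for survival analysis, of the kind the study above needs, is Harrell's concordance index, which scores how well predicted risks rank observed run-lives while respecting censoring. A pure-Python sketch on toy ESP run-life numbers (all values invented, and this is only one of several metrics the study may have used):

```python
# Harrell's concordance index: among comparable pairs (one unit observed
# to fail before the other's time), count how often the model assigns the
# earlier failure the higher risk.
def concordance_index(times, events, predictions):
    """times: observed durations; events: 1 = failure, 0 = censored;
    predictions: risk scores (higher = expected to fail sooner)."""
    concordant = comparable = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # pair is comparable only if unit i actually failed first
            if events[i] and times[i] < times[j]:
                comparable += 1
                if predictions[i] > predictions[j]:
                    concordant += 1
                elif predictions[i] == predictions[j]:
                    concordant += 0.5
    return concordant / comparable

# toy run-life data: days until failure (event=1) or censoring (event=0)
times = [100, 250, 300, 420, 500]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.7, 0.5, 0.2]   # hypothetical predicted risk scores
print(f"c-index: {concordance_index(times, events, risk):.3f}")
```

A c-index of 0.5 is chance-level ranking and 1.0 is perfect; unlike plain MAE on run-life, it handles censored pumps that were still running when observation ended.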

