Automated Performance Metrics and Machine Learning Algorithms to Measure Surgeon Performance and Anticipate Clinical Outcomes in Robotic Surgery

Cardiovascular diseases (CVDs) are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Because effective and accurate diagnosis is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to distinguish patients suffering from a CVD from those without any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC) using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes in the CVD datasets and uses the accuracy metric to determine which are best for predicting and diagnosing CVD. As our main findings, the ten ML classifiers exhibited appropriate classification and predictive performance in terms of accuracy with the top-two attributes, and three main attributes were identified for the diagnosis and prediction of CVDs such as arrhythmia and tachycardia; hence, the classifiers can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff is lacking.

Haar Wavelet Pyramid-Based Melanoma Skin Cancer Identification With Ensemble of Machine Learning Algorithms

International Journal of Healthcare Information Systems and Informatics ◽

10.4018/ijhisi.20211001.oa24 ◽

2021 ◽

Vol 16 (4) ◽

pp. 1-15

Author(s):

Sudeep D. Thepade ◽

Gaurav Ramnani

Keyword(s):

Machine Learning ◽

Skin Cancer ◽

Health Informatics ◽

Performance Metrics ◽

Learning Algorithms ◽

Haar Wavelet ◽

Machine Learning Algorithms ◽

Computer Assisted ◽

Marginal Improvement ◽

Wavelet Pyramid

Melanoma is a deadly type of skin cancer. Early detection of melanoma significantly improves the patient’s chances of survival. Detecting melanoma at an early stage requires expert doctors, and the scarcity of such experts is a major issue for healthcare systems globally. Computer-assisted diagnostics may prove helpful in this case. This paper proposes a health informatics system for melanoma identification using machine learning with dermoscopy skin images. In the proposed method, the features of dermoscopy skin images are extracted at various levels of the Haar wavelet pyramid. These features are employed to train machine learning algorithms and ensembles for melanoma identification. Considering higher levels of the Haar wavelet pyramid helps speed up the identification process. It is observed that the performance gradually improves from Haar wavelet pyramid level 4x4 to 16x16 and shows only marginal improvement beyond that. The ensembles of machine learning algorithms have shown a boost in performance metrics compared to the use of individual machine learning algorithms.
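
As a rough illustration of the feature extraction described above, the sketch below reduces a grayscale image through successive Haar approximation levels until it reaches a small size such as 4x4 or 16x16, then flattens it into a feature vector. It is not the authors' implementation: the function names are hypothetical, only the approximation (low-pass) sub-band is kept, each 2x2 block is averaged rather than scaled orthonormally, and the image is assumed to be square with a power-of-two side.

```python
def haar_approximation(image):
    """One Haar level: average each non-overlapping 2x2 block.

    This equals the Haar approximation (LL) sub-band up to a constant scale factor.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def haar_pyramid_features(image, target_size):
    """Halve a square, power-of-two image until it is target_size x target_size,
    then flatten it row by row into a feature vector (an assumed feature layout)."""
    level = image
    while len(level) > target_size:
        level = haar_approximation(level)
    return [value for row in level for value in row]


if __name__ == "__main__":
    # Tiny 4x4 stand-in for a dermoscopy image (illustrative values only).
    toy_image = [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4],
    ]
    print(haar_pyramid_features(toy_image, target_size=2))
```

A smaller target size yields a shorter feature vector, which is one plausible reading of why coarser pyramid levels speed up identification.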

Surgical skill assessment using machine learning algorithms

British Journal of Surgery ◽

10.1093/bjs/znab202.093 ◽

2021 ◽

Vol 108 (Supplement_4) ◽

Author(s):

J L Lavanchy ◽

J Zindel ◽

K Kirtac ◽

I Twick ◽

E Hosgor ◽

...

Keyword(s):

Machine Learning ◽

Clinical Outcomes ◽

Learning Algorithm ◽

Learning Algorithms ◽

Movement Patterns ◽

Surgical Skills ◽

Machine Learning Algorithms ◽

Skill Assessment ◽

Surgical Skill ◽

Surgical Skill Assessment

Objective: Surgical skill is correlated with clinical outcomes. Therefore, the assessment of surgical skill is of major importance to improve clinical outcomes and increase patient safety. However, surgical skill assessment often lacks objectivity and reproducibility. Furthermore, it is time-consuming and expensive. Therefore, we developed an automated surgical skill assessment using machine learning algorithms. Methods: Surgical skills were assessed in videos of laparoscopic cholecystectomy using a three-step machine learning algorithm. First, a three-dimensional convolutional neural network was trained to localize and classify the instruments within the videos. Second, movement patterns of the instruments were recorded over time and extracted. Third, the movement patterns were correlated with human surgical skill ratings using a linear regression model to predict surgical skill ratings automatically. Machine ratings were compared against human ratings of four board-certified surgeons using a score ranging from 1 (poor skills) to 5 (excellent skills). Results: Human raters and machine learning algorithms assessed surgical skills in 242 videos. Inter-rater reliability for human raters was excellent (79%, 95% CI 72-85%). Instrument detection showed an average precision of 78% and an average recall of 82%. Machine learning algorithms showed 87% accuracy in predicting good or poor surgical skills when compared to human raters. Conclusion: Machine learning algorithms can be trained to distinguish good and poor surgical skills with high accuracy. This work was published in Sci Rep 11, 5197 (2021). https://doi.org/10.1038/s41598-021-84295-6

Performance Evaluation of Different Machine Learning Classification Algorithms for Disease Diagnosis

International Journal of E-Health and Medical Communications ◽

10.4018/ijehmc.20211101.oa5 ◽

2021 ◽

Vol 12 (6) ◽

pp. 1-28

Author(s):

Munder Abdulatef Al-Hashem ◽

Ali Mohammad Alqudah ◽

Qasem Qananwah

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Confusion Matrix ◽

Learning Algorithms ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Machine Learning Classification

Knowledge extraction in the healthcare field is a very challenging task because of problems such as noisy and imbalanced datasets, which are obtained from clinical studies where uncertainty and variability are common. Lately, a large number of machine learning algorithms have been considered and evaluated to check their validity for use in the medical field. Usually, classification algorithms are compared against medical experts who specialize in diagnosing certain diseases, and classifiers are evaluated methodically by applying performance metrics. The performance metrics include accuracy, sensitivity, and specificity, which are derived from the confusion matrix of each algorithm. We have utilized eight different well-known machine learning algorithms to evaluate their performance on six different medical datasets. Based on the experimental results, we conclude that the XGBoost and K-Nearest Neighbor classifiers were the best overall across the datasets used and can be used for diagnosing various diseases.
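
To make the relationship between the confusion matrix and these metrics concrete, the following minimal sketch counts the four cells of a binary confusion matrix and derives accuracy, sensitivity, and specificity from them. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem with the given positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Derive accuracy, sensitivity (recall of positives), and specificity."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Illustrative labels only: 1 = disease present, 0 = disease absent.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(diagnostic_metrics(*confusion_counts(y_true, y_pred)))
```

On imbalanced datasets such as those the abstract mentions, sensitivity and specificity are usually more informative than accuracy alone, since a classifier that always predicts the majority class can still score a high accuracy.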

Classifying lymphoma and tuberculosis case reports using machine learning algorithms

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i5.3132 ◽

2021 ◽

Vol 10 (5) ◽

pp. 2857-2865

Author(s):

Moanda Diana Pholo ◽

Yskandar Hamam ◽

Abdel Baset Khalaf ◽

Chunling Du

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Performance Metrics ◽

Case Reports ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Tuberculosis Case ◽

Starting Point

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis is due to the fact that the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using Natural Language Processing tools and evaluate the use of machine learning to differentiate between the two diseases. As a starting point in the study, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming other algorithms in terms of correctly classifying the different case reports.

Identifying the Main Risk Factors for CVD Prediction Using Machine Learning Algorithms

10.20944/preprints202108.0471.v1 ◽

2021 ◽

Author(s):

Luis Rolando Guarneros-Nolasco ◽

Nancy Aracely Cruz-Ramos ◽

Giner Alor-Hernández ◽

Lisbeth Rodríguez-Mazahua ◽

José Luis Sánchez-Cervantes

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Performance Metrics ◽

Learning Algorithms ◽

Predictive Performance ◽

Machine Learning Algorithms ◽

Algorithm Performance ◽

Body Regions ◽

Risks Factors ◽

Fold Cross Validation

CVDs are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Since effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to distinguish patients suffering from a CVD from those who do not suffer from any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC) using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes from each CVD diagnosis/prediction dataset. As our main findings, the ten MLAs exhibited appropriate diagnostic and predictive performance; hence, they can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff is lacking.
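
The evaluation protocol mentioned above can be sketched in a few lines. The example below shows one simple way to generate k-fold train/test index splits and to compute the F1-score from predictions. It is an assumption-laden sketch rather than the authors' pipeline: the helper names are hypothetical, the folds are contiguous, and no shuffling or stratification is applied, although both are commonly added in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Fold sizes differ by at most one sample; every sample is tested exactly once.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def f1_score(y_true, y_pred, positive=1):
    """F1 is the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for train_idx, test_idx in k_fold_indices(n_samples=10, k=3):
        print(len(train_idx), test_idx)
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a full experiment, a model would be fitted on each `train_idx` subset and scored on the matching `test_idx` subset, with the per-fold scores averaged at the end.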

Analysis of Machine Learning Algorithms for Anomaly Detection on Edge Devices

Sensors ◽

10.3390/s21144946 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4946

Author(s):

Aleks Huč ◽

Jakob Šalej ◽

Mira Trebar

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Dataset ◽

Raspberry Pi ◽

Comparable Performance ◽

Reduction Methods ◽

Performance Results ◽

The Internet Of Things

The Internet of Things (IoT) consists of small devices or a network of sensors, which permanently generate huge amounts of data. Usually, they have limited resources, either computing power or memory, which means that raw data are transferred to central systems or the cloud for analysis. Lately, the idea of moving intelligence to the IoT is becoming feasible, with machine learning (ML) moved to edge devices. The aim of this study is to provide an experimental analysis of processing a large imbalanced dataset (DS2OS), split into a training dataset (80%) and a test dataset (20%). The training dataset was reduced by randomly selecting a smaller number of samples to create new datasets Di (i = 1, 2, 5, 10, 15, 20, 40, 60, 80%). Afterwards, they were used with several machine learning algorithms to identify the size at which the performance metrics show saturation and classification results stop improving with an F1 score equal to 0.95 or higher, which happened at 20% of the training dataset. Further on, two solutions for reducing the number of samples to provide a balanced dataset are given. In the first, datasets DRi consist of all anomalous samples in seven classes and a reduced majority class (‘NL’) with i = 0.1, 0.2, 0.5, 1, 2, 5, 10, 15, 20 percent of randomly selected samples. In the second, datasets DCi are generated from the representative samples determined with clustering from the training dataset. All three dataset reduction methods showed comparable performance results. Further evaluation of training times and memory usage on a Raspberry Pi 4 shows that it is possible to run ML algorithms with limited-size datasets on edge devices.
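
The first balancing approach (keeping every anomalous sample and a random percentage of the majority class) might look roughly like the sketch below. The function name, the fixed seed, and the toy data are assumptions made for illustration. Only the 'NL' label and the idea of keeping i percent of the majority class come from the abstract.

```python
import random


def reduce_majority_class(samples, labels, majority_label, keep_percent, seed=0):
    """Keep all non-majority samples and a random keep_percent of the majority class.

    The seed is fixed only to make this sketch reproducible (an assumption).
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Keep at least one majority sample when the class is non-empty.
    n_keep = max(1, round(len(majority) * keep_percent / 100.0)) if majority else 0
    kept = rng.sample(majority, n_keep)
    chosen = sorted(minority + kept)  # preserve the original ordering
    return [samples[i] for i in chosen], [labels[i] for i in chosen]


if __name__ == "__main__":
    # Toy data: 1000 normal ('NL') samples and 10 anomalous ones.
    labels = ["NL"] * 1000 + ["anomaly"] * 10
    samples = list(range(len(labels)))
    reduced_x, reduced_y = reduce_majority_class(samples, labels, "NL", keep_percent=10)
    print(len(reduced_x), reduced_y.count("NL"), reduced_y.count("anomaly"))
```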

The use and applicability of machine learning algorithms in predicting the surgical outcome for patients with benign prostatic enlargement. Which model to use?

Archivio Italiano di Urologia e Andrologia ◽

10.4081/aiua.2021.4.418 ◽

2021 ◽

Vol 93 (4) ◽

pp. 418-424

Author(s):

Panagiotis Mourmouris ◽

Lazaros Tzelves ◽

Georgios Feretzakis ◽

Dimitris Kalles ◽

Ioannis Manolitsis ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Clinical Outcomes ◽

Learning Algorithms ◽

Small Sample ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Benign Prostatic Enlargement ◽

Statistical Measures ◽

Prostatic Enlargement

Objectives: Artificial intelligence (AI) is increasingly used in medicine, but data on benign prostatic enlargement (BPE) management are lacking. This study aims to test the performance of several machine learning algorithms in predicting clinical outcomes during BPE surgical management. Methods: Clinical data were extracted from a prospectively collected database for 153 men with BPE, treated with transurethral resection (monopolar or bipolar) or vaporization of the prostate. Due to the small sample size, we applied a method for increasing our dataset, the Synthetic Minority Oversampling Technique (SMOTE). The new dataset created with SMOTE has been expanded by 453 synthetic instances, in addition to the original 153. The WEKA Data Mining Software was used for constructing predictive models, while several appropriate statistical measures, such as the correlation coefficient (R), Mean Absolute Error (MAE), and Root Mean-Squared Error (RMSE), were calculated for several supervised regression algorithms (Linear Regression, Multilayer Perceptron, SMOreg, k-Nearest Neighbors, Bagging, M5Rules, M5P - Pruned Model Tree, and Random Forest). Results: The baseline characteristics of patients were extracted, with age, prostate volume, method of operation, baseline Qmax and baseline IPSS being used as independent variables. Using the Random Forest algorithm resulted in values of R, MAE, and RMSE that indicate the ability of these models to better predict % Qmax increase. The Random Forest model also demonstrated the best results in R, MAE, and RMSE for predicting % IPSS reduction. Conclusions: Machine learning techniques can be used for making predictions regarding clinical outcomes of surgical BPE management. Wider-scale validation studies are necessary to strengthen our results in choosing the best model.
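
For readers unfamiliar with the three regression measures named above, the sketch below computes them from their standard definitions. The study itself used WEKA, so this is only an illustrative re-statement, and the sample values are made up.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def correlation_coefficient(y_true, y_pred):
    """Pearson R between observed and predicted values (0.0 if either is constant)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        return 0.0
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up values standing in for an observed and a predicted outcome.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 3.0, 4.0, 5.0]
    print(mean_absolute_error(observed, predicted))
    print(root_mean_squared_error(observed, predicted))
    print(correlation_coefficient(observed, predicted))
```

The toy values show why the measures are reported together: a constant offset leaves R at its maximum while MAE and RMSE still register the error.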

Prediction of addiction to drugs and alcohol using machine learning: A case study on Bangladeshi population

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i5.pp4471-4480 ◽

2021 ◽

Vol 11 (5) ◽

pp. 4471

Author(s):

Md. Ariful Islam Arif ◽

Saiful Islam Sany ◽

Farah Sharmin ◽

Md. Sadekur Rahman ◽

Md. Tarek Habib

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Performance Metrics ◽

Learning Algorithms ◽

Principal Component ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Data Set ◽

Adaptive Boosting ◽

Drugs And Alcohol

Addiction to drugs and alcohol has become a significant threat to young people in Bangladesh. As conscientious members of society, we must act to protect these young minds from life-threatening addiction. In this paper, we propose a machine-learning-based approach to forecast the risk of becoming addicted to drugs. First, we identify significant factors for addiction by talking to doctors and drug-addicted people and by reading relevant articles and write-ups. Then we collect data from both addicted and non-addicted people. After preprocessing the data set, we apply nine prominent machine learning algorithms, namely k-nearest neighbors, logistic regression, SVM, naïve Bayes, classification and regression trees (CART), random forest, multilayer perceptron, adaptive boosting, and gradient boosting machine, to our processed data set and measure the performance of each classifier in terms of several prominent performance metrics. Logistic regression is found to outperform all other classifiers on all metrics used, attaining an accuracy approaching 97.91%. In contrast, CART shows poor results, with an accuracy approaching 59.37% after applying principal component analysis.
