Apply Machine Learning Methods to Predict Failure of Glaucoma Drainage

Paul Morrison; Maxwell Dixon; Arsham Sheybani; Bahareh Rahmani

doi:10.5121/ijdkp.2021.11101

Apply Machine Learning Methods to Predict Failure of Glaucoma Drainage

International Journal of Data Mining & Knowledge Management Process ◽

10.5121/ijdkp.2021.11101 ◽

2021 ◽

Vol 11 (1) ◽

pp. 1-12

Author(s):

Paul Morrison ◽

Maxwell Dixon ◽

Arsham Sheybani ◽

Bahareh Rahmani

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Intraocular Pressure ◽

Random Forest ◽

Glaucoma Drainage Device ◽

Recursive Feature Elimination ◽

Support Vector ◽

Demographic Information ◽

Drainage Device ◽

Machine Learning Methods

The purpose of this retrospective study is to measure machine learning models' ability to predict glaucoma drainage device failure based on demographic information and preoperative measurements. The medical records of 165 patients were used. Potential predictors included the patients' race, age, sex, preoperative intraocular pressure (IOP), preoperative visual acuity, number of IOP-lowering medications, and number and type of previous ophthalmic surgeries. Failure was defined as final IOP greater than 18 mm Hg, reduction in intraocular pressure less than 20% from baseline, or need for reoperation unrelated to normal implant maintenance. Five classifiers were compared: logistic regression, artificial neural network, random forest, decision tree, and support vector machine. Recursive feature elimination was used to shrink the number of predictors and grid search was used to choose hyperparameters. To prevent leakage, nested cross-validation was used throughout. With a small amount of data, the best classifier was logistic regression, but with more data, the best classifier was the random forest.
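
The three-part failure definition in the abstract above can be read as a simple rule. The following is a minimal sketch, not the authors' code: the function name `is_failure` and its parameters are hypothetical, and it assumes that "reduction" means the fractional drop from baseline IOP.

```python
def is_failure(baseline_iop: float, final_iop: float, reoperated: bool) -> bool:
    """Return True if an implant meets any of the three failure criteria.

    Hypothetical helper: the name and signature are illustrative only.
    IOP values are in mm Hg; `reoperated` means a reoperation unrelated
    to normal implant maintenance.
    """
    if baseline_iop <= 0:
        raise ValueError("baseline IOP must be positive")
    # Fractional reduction from baseline, e.g. 30 -> 15 gives 0.5.
    reduction = (baseline_iop - final_iop) / baseline_iop
    return final_iop > 18 or reduction < 0.20 or reoperated


print(is_failure(30, 15, False))  # meets none of the criteria
print(is_failure(20, 17, False))  # final IOP is fine, but only a 15% reduction
```

Because the criteria are joined with "or", a single one is enough to label a case as a failure, which is how the abstract's wording reads.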

Predicting Failures of Molteno and Baerveldt Glaucoma Drainage Devices Using Machine Learning Models

10.5121/csit.2020.101610 ◽

2020 ◽

Author(s):

Paul Morrison ◽

Maxwell Dixon ◽

Arsham Sheybani ◽

Bahareh Rahmani

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Glaucoma Drainage Device ◽

Recursive Feature Elimination ◽

Support Vector ◽

Learning Models ◽

Demographic Information ◽

Drainage Device ◽

Glaucoma Drainage Devices ◽

Machine Learning Models

The purpose of this retrospective study is to measure machine learning models' ability to predict glaucoma drainage device (GDD) failure based on demographic information and preoperative measurements. The medical records of sixty-two patients were used. Potential predictors included the patient's race, age, sex, preoperative intraocular pressure (IOP), preoperative visual acuity, number of IOP-lowering medications, and number and type of previous ophthalmic surgeries. Failure was defined as final IOP greater than 18 mm Hg, reduction in IOP less than 20% from baseline, or need for reoperation unrelated to normal implant maintenance. Five classifiers were compared: logistic regression, artificial neural network, random forest, decision tree, and support vector machine. Recursive feature elimination was used to shrink the number of predictors and grid search was used to choose hyperparameters. To prevent leakage, nested cross-validation was used throughout. Overall, the best classifier was logistic regression.
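
The leakage argument behind nested cross-validation can be illustrated with index bookkeeping alone. This is a sketch under stated assumptions: the helper names `k_fold` and `nested_splits` and the fold counts (5 outer, 3 inner) are hypothetical and are not given in the abstract; only the sample size of 62 comes from the text.

```python
import random


def k_fold(indices, k, seed=0):
    """Shuffle `indices` and yield (train, test) index lists for k folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def nested_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_test, inner_splits) for nested cross-validation.

    Inner folds are built only from the outer training indices, so feature
    selection and hyperparameter search never see the outer test fold.
    """
    for outer_train, outer_test in k_fold(range(n_samples), outer_k):
        inner = list(k_fold(outer_train, inner_k, seed=1))
        yield outer_test, inner


for outer_test, inner in nested_splits(62):
    leaked = any(set(outer_test) & set(tr + va) for tr, va in inner)
    print(len(outer_test), "held-out samples; leakage:", leaked)
```

In a full pipeline, recursive feature elimination and grid search would run inside the inner splits, and the outer test fold would be used once to score the selected model.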

Predicting failures of Molteno and Baerveldt glaucoma drainage devices using machine learning models

10.1101/646885 ◽

2019 ◽

Author(s):

Paul Morrison ◽

Maxwell Dixon ◽

Arsham Sheybani ◽

Bahareh Rahmani

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Intraocular Pressure ◽

Glaucoma Drainage Device ◽

Recursive Feature Elimination ◽

Support Vector ◽

Learning Models ◽

Glaucoma Drainage Devices ◽

Pressure Lowering ◽

Machine Learning Models

Abstract The purpose of this retrospective study is to measure machine learning models’ ability to predict glaucoma drainage device failure based on demographic information and preoperative measurements. The medical records of sixty-two patients were used. Potential predictors included the patient’s race, age, sex, preoperative intraocular pressure, preoperative visual acuity, number of intraocular pressure-lowering medications, and number and type of previous ophthalmic surgeries. Failure was defined as final intraocular pressure greater than 18 mm Hg, reduction in intraocular pressure less than 20% from baseline, or need for reoperation unrelated to normal implant maintenance. Five classifiers were compared: logistic regression, artificial neural network, random forest, decision tree, and support vector machine. Recursive feature elimination was used to shrink the number of predictors and grid search was used to choose hyperparameters. To prevent leakage, nested cross-validation was used throughout. Overall, the best classifier was logistic regression.

Detecting Face Touching Using Smartwatches to Mitigate the Spread of COVID-19: Pilot Study (Preprint)

10.2196/preprints.28799 ◽

2021 ◽

Author(s):

Chen Bai ◽

Yu-Peng Chen ◽

Adam Wolach ◽

Lisa Anthony ◽

Mamoun Mardini

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Respiratory Diseases ◽

Window Size ◽

Support Vector ◽

Accelerometer Data ◽

Respiratory Illnesses ◽

Motion Data ◽

Machine Learning Methods

BACKGROUND Frequent spontaneous facial self-touches, predominantly during outbreaks, have the theoretical potential to be a mechanism of contracting and transmitting diseases. Despite the recent advent of vaccines, behavioral approaches remain an integral part of reducing the spread of COVID-19 and other respiratory illnesses. Real-time biofeedback of face touching can potentially mitigate the spread of respiratory diseases. The gap addressed in this study is the lack of an on-demand platform that utilizes motion data from smartwatches to accurately detect face touching. OBJECTIVE The aim of this study was to utilize the functionality and the spread of smartwatches to develop a smartwatch application to identify motion signatures that are mapped accurately to face touching. METHODS Participants (n=10, 50% women, aged 20-83) performed 10 physical activities classified into face touching (FT) and non-face touching (NFT) categories in a standardized laboratory setting. We developed a smartwatch application on Samsung Galaxy Watch to collect raw accelerometer data from participants. Then, data features were extracted from consecutive non-overlapping windows varying from 2 to 16 seconds. We examined the performance of state-of-the-art machine learning methods on face-touching movement recognition (FT vs NFT) and individual activity recognition (IAR): logistic regression, support vector machine, decision trees, and random forest. RESULTS Machine learning models were accurate in recognizing face touching categories; logistic regression achieved the best performance across all metrics (Accuracy: 0.93 +/- 0.08, Recall: 0.89 +/- 0.16, Precision: 0.93 +/- 0.08, F1-score: 0.90 +/- 0.11, AUC: 0.95 +/- 0.07) at the window size of 5 seconds. IAR models resulted in lower performance; the random forest classifier achieved the best performance across all metrics (Accuracy: 0.70 +/- 0.14, Recall: 0.70 +/- 0.14, Precision: 0.70 +/- 0.16, F1-score: 0.67 +/- 0.15) at the window size of 9 seconds. CONCLUSIONS Wearable devices, powered by machine learning, are effective in detecting facial touches. This is highly significant during respiratory infection outbreaks, as it has great potential to discourage people from touching their faces and to mitigate the possibility of transmitting COVID-19 and future respiratory diseases.
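
The windowing step described in the methods above can be sketched as follows. This is an illustration only: the function name `window_features` is hypothetical, the toy signal is made up, and mean and standard deviation are assumed stand-ins because the abstract does not state which features were extracted.

```python
from statistics import mean, pstdev


def window_features(samples, rate_hz, window_s):
    """Split a 1-D signal into consecutive non-overlapping windows and
    return (mean, standard deviation) for each complete window."""
    size = int(rate_hz * window_s)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    features = []
    # Step by the window size so windows never overlap; a trailing
    # partial window is dropped.
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        features.append((mean(window), pstdev(window)))
    return features


# Toy single-axis signal: 12 samples at 2 Hz with 2-second windows.
signal = [0, 1, 0, 1, 2, 2, 2, 2, 5, 5, 7, 7]
print(window_features(signal, rate_hz=2, window_s=2))
```

Varying `window_s` from 2 to 16 and retraining at each value would correspond to the window-size comparison the abstract reports.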

Comparative Analysis of Intellectual Methods for Muscular Contraction Interpretation for Gesture Interface Implementation

Journal of Physics Conference Series ◽

10.1088/1742-6596/2096/1/012190 ◽

2021 ◽

Vol 2096 (1) ◽

pp. 012190

Author(s):

E V Bunyaeva ◽

I V Kuznetsov ◽

Y V Ponomarchuk ◽

P S Timosh

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Single Channel ◽

Muscular Contraction ◽

Support Vector ◽

Machine Learning Methods

Abstract The paper presents a comparative analysis of machine learning methods for gesture recognition based on single-channel surface electromyography (sEMG) data. The data were processed using a multilayer perceptron, a support vector machine, a decision tree ensemble (Random Forest), and logistic regression for the four chosen gesture types. Conclusions about the effectiveness of these methods were drawn using commonly recommended accuracy metrics.

Perbandingan Algoritma Machine Learning dalam Menilai Sebuah Lokasi Toko Ritel (Comparison of Machine Learning Algorithms for Assessing a Retail Store Location)

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v7i1.3182 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Kristiawan Kristiawan ◽

Andreas Widjaja

Keyword(s):

Neural Network ◽

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Pearson Correlation ◽

Recursive Feature Elimination ◽

Support Vector ◽

Learning Technology ◽

K Nearest Neighbor ◽

Store Location

Abstract: The application of machine learning technology in various industrial fields is currently developing rapidly, including in the retail industry. This study aims to find the most accurate algorithmic model so that it can be used to help retailers choose a store location more precisely. Features (predictive variables) are selected using several methods: Pearson Correlation, Chi-Square Features, Recursive Feature Elimination, and tree-based selection. These features are then used to train and build models using six classification algorithms (Logistic Regression, K Nearest Neighbor (KNN), Decision Tree, Random Forest, Support Vector Machine (SVM), and Neural Network) to classify whether a location is recommended or not as a new store location. Keywords: Application of Machine Learning, Pearson Correlation, Random Forest, Neural Network, Logistic Regression.
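
As an illustration of the first selection method named above, the sketch below ranks features by the absolute Pearson correlation with the class label. It is an assumption-laden example: the feature names, the toy values, and the helper names `pearson` and `rank_features` are all hypothetical and do not come from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(columns, target):
    """Sort feature names by absolute correlation with the target."""
    return sorted(columns, key=lambda name: abs(pearson(columns[name], target)),
                  reverse=True)


# Toy data: made-up location features and a 0/1 "recommended" label.
columns = {
    "foot_traffic": [1, 2, 3, 4, 5],
    "rent": [5, 3, 4, 1, 2],
    "noise": [2, 2, 3, 2, 3],
}
recommended = [0, 0, 1, 1, 1]
print(rank_features(columns, recommended))
```

A filter like this scores each feature independently of any classifier, which is what distinguishes it from wrapper methods such as recursive feature elimination.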

Using Smartwatches to Detect Face Touching

Sensors ◽

10.3390/s21196528 ◽

2021 ◽

Vol 21 (19) ◽

pp. 6528

Author(s):

Chen Bai ◽

Yu-Peng Chen ◽

Adam Wolach ◽

Lisa Anthony ◽

Mamoun T. Mardini

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

State Of The Art ◽

Window Size ◽

Support Vector ◽

Accelerometer Data ◽

Respiratory Illnesses ◽

Machine Learning Methods ◽

Movement Recognition

Frequent spontaneous facial self-touches, predominantly during outbreaks, have the theoretical potential to be a mechanism of contracting and transmitting diseases. Despite the recent advent of vaccines, behavioral approaches remain an integral part of reducing the spread of COVID-19 and other respiratory illnesses. The aim of this study was to utilize the functionality and the spread of smartwatches to develop a smartwatch application to identify motion signatures that are mapped accurately to face touching. Participants (n = 10, five women, aged 20–83) performed 10 physical activities classified into face touching (FT) and non-face touching (NFT) categories in a standardized laboratory setting. We developed a smartwatch application on Samsung Galaxy Watch to collect raw accelerometer data from participants. Data features were extracted from consecutive non-overlapping windows varying from 2 to 16 s. We examined the performance of state-of-the-art machine learning methods on face-touching movement recognition (FT vs. NFT) and individual activity recognition (IAR): logistic regression, support vector machine, decision trees, and random forest. While all machine learning models were accurate in recognizing FT categories, logistic regression achieved the best performance across all metrics (accuracy: 0.93 ± 0.08, recall: 0.89 ± 0.16, precision: 0.93 ± 0.08, F1-score: 0.90 ± 0.11, AUC: 0.95 ± 0.07) at the window size of 5 s. IAR models resulted in lower performance, where the random forest classifier achieved the best performance across all metrics (accuracy: 0.70 ± 0.14, recall: 0.70 ± 0.14, precision: 0.70 ± 0.16, F1-score: 0.67 ± 0.15) at the window size of 9 s. In conclusion, wearable devices, powered by machine learning, are effective in detecting facial touches. This is highly significant during respiratory infection outbreaks as it has the potential to limit face touching as a transmission vector.
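
For readers less familiar with the metrics reported above, the sketch below shows how accuracy, recall, precision, and F1-score follow from binary predictions. The function name `binary_metrics` and the toy labels are hypothetical; the code does not reproduce the study's evaluation procedure or its numbers.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, recall, precision, and F1 for labels where 1 = face touching."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("no predictions to score")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "recall": recall,
            "precision": precision, "f1": f1}


# Made-up window labels: 1 = face touching (FT), 0 = non-face touching (NFT).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```

Recall matters most when missed face touches are costly, while precision matters when false alerts would annoy the wearer.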

Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods

Animals ◽

10.3390/ani10050771 ◽

2020 ◽

Vol 10 (5) ◽

pp. 771

Author(s):

Toshiya Arakawa

Keyword(s):

Neural Network ◽

Machine Learning ◽

Random Forest ◽

Markov Models ◽

Tracking System ◽

Video Tracking ◽

Training Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is large or if the observation is conducted for a prolonged period. In this study, machine learning methods such as hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks were applied to estimate whether a goat is in estrus based on the goat’s behavior, and the adequacy of each method was verified. Goat tracking data were obtained using a video tracking system and used to estimate whether the goats, which were in “estrus” or “non-estrus”, were in either of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, its PC for goats other than those whose data were used in the training data sets is relatively low, which suggests that random forest tends to overfit the training data. Besides random forest, the PC of HMMs and SVMs is high. Considering the calculation time and the HMM’s advantage of being a time series model, the HMM is the better method. The PC of the neural network is low overall; however, if data from more goats were acquired, the neural network would be an adequate method for estimation.

A Framework for Effective Application of Machine Learning to Microbiome-Based Classification Problems

mBio ◽

10.1128/mbio.00434-20 ◽

2020 ◽

Vol 11 (3) ◽

Cited By ~ 9

Author(s):

Begüm D. Topçuoğlu ◽

Nicholas A. Lesniak ◽

Mack T. Ruffin ◽

Jenna Wiens ◽

Patrick D. Schloss

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Sequence Data ◽

Characteristic Curve ◽

Predictive Performance ◽

Model Complexity ◽

Support Vector ◽

Classification Problems ◽

Microbial Biomarkers

ABSTRACT Machine learning (ML) modeling of the human microbiome has the potential to identify microbial biomarkers and aid in the diagnosis of many diseases such as inflammatory bowel disease, diabetes, and colorectal cancer. Progress has been made toward developing ML models that predict health outcomes using bacterial abundances, but inconsistent adoption of training and evaluation methods calls the validity of these models into question. Furthermore, there appears to be a preference by many researchers to favor increased model complexity over interpretability. To overcome these challenges, we trained seven models that used fecal 16S rRNA sequence data to predict the presence of colonic screen relevant neoplasias (SRNs) (n = 490 patients, 261 controls and 229 cases). We developed a reusable open-source pipeline to train, validate, and interpret ML models. To show the effect of model selection, we assessed the predictive performance, interpretability, and training time of L2-regularized logistic regression, L1- and L2-regularized support vector machines (SVM) with linear and radial basis function kernels, a decision tree, random forest, and gradient boosted trees (XGBoost). The random forest model performed best at detecting SRNs with an area under the receiver operating characteristic curve (AUROC) of 0.695 (interquartile range [IQR], 0.651 to 0.739) but was slow to train (83.2 h) and not inherently interpretable. Despite its simplicity, L2-regularized logistic regression followed random forest in predictive performance with an AUROC of 0.680 (IQR, 0.625 to 0.735), trained faster (12 min), and was inherently interpretable. Our analysis highlights the importance of choosing an ML approach based on the goal of the study, as the choice will inform expectations of performance and interpretability. IMPORTANCE Diagnosing diseases using machine learning (ML) is rapidly being adopted in microbiome studies. However, the estimated performance associated with these models is likely overoptimistic. Moreover, there is a trend toward using black box models without a discussion of the difficulty of interpreting such models when trying to identify microbial biomarkers of disease. This work represents a step toward developing more-reproducible ML practices in applying ML to microbiome research. We implement a rigorous pipeline and emphasize the importance of selecting ML models that reflect the goal of the study. These concepts are not particular to the study of human health but can also be applied to environmental microbiology studies.
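
The AUROC and interquartile range figures quoted above can be understood through a small sketch. It assumes, without confirmation from the abstract, that the IQR summarizes AUROC values collected over repeated data splits; the function names `auroc` and `median_and_iqr` and the toy scores are hypothetical.

```python
from statistics import quantiles


def auroc(y_true, scores):
    """AUROC as the fraction of (case, control) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def median_and_iqr(values):
    """Return (median, first quartile, third quartile) of a list of scores."""
    q1, q2, q3 = quantiles(values, n=4)
    return q2, q1, q3


# Made-up predicted probabilities for two controls (0) and two cases (1).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
# Made-up AUROC values, standing in for results from repeated splits.
print(median_and_iqr([0.62, 0.66, 0.68, 0.70, 0.74]))
```

Reporting a median with an IQR, rather than a single number, makes the spread across splits visible, which is relevant to the abstract's concern about overoptimistic estimates.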

Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Journal of Translational Medicine ◽

10.1186/s12967-020-02550-2 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Kerry E. Poppenberg ◽

Vincent M. Tutino ◽

Lu Li ◽

Muhammad Waqas ◽

Armond June ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Prediction Models ◽

Model Performance ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Training Cohort ◽

Network Analyses ◽

Machine Learning Methods

Abstract Background Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs and trained machine learning models to predict the presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods. Methods Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly-selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction. Results Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under ROC curve (AUC) = 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. In our data, demographics and comorbidities did not affect model performance. Conclusions We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and further investigate the effect of covariates.
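
The cohort split described in the methods above (94 training samples, 40 held out, from 134 in total) can be sketched in a few lines. The function name `split_cohort`, the integer sample IDs, and the seed are hypothetical; the abstract does not say how the random selection was implemented.

```python
import random


def split_cohort(sample_ids, n_train, seed=0):
    """Randomly assign `n_train` samples to training; the rest are held out."""
    ids = list(sample_ids)
    if not 0 < n_train < len(ids):
        raise ValueError("n_train must leave at least one test sample")
    rng = random.Random(seed)
    train = set(rng.sample(ids, n_train))
    # Keep the original order for the held-out samples.
    test = [i for i in ids if i not in train]
    return sorted(train), test


train_ids, test_ids = split_cohort(range(134), n_train=94)
print(len(train_ids), len(test_ids))
```

As in the abstract, any feature selection (there, LASSO) would be fitted on the training cohort only, so the held-out samples remain an unbiased check.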

Machine Learning–Based Signal Quality Evaluation of Single-Period Radial Artery Pulse Waves: Model Development and Validation (Preprint)

10.2196/preprints.18134 ◽

2020 ◽

Author(s):

Xiaodong Ding ◽

Feng Cheng ◽

Robert Morris ◽

Cong Chen ◽

Yiqin Wang

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Radial Artery ◽

Pulse Wave ◽

Disease Diagnosis ◽

Recursive Feature Elimination ◽

Physiological Parameter ◽

Support Vector ◽

External Interference ◽

Pulse Waves

BACKGROUND The radial artery pulse wave is a widely used physiological signal for disease diagnosis and personal health monitoring because it provides insight into the overall health of the heart and blood vessels. Periodic radial artery pulse signals are subsequently decomposed into single pulse wave periods (segments) for physiological parameter evaluations. However, abnormal periods frequently arise due to external interference, the inherent imperfections of current segmentation methods, and the quality of the pulse wave signals. OBJECTIVE The objective of this paper was to develop a machine learning model to detect abnormal pulse periods in real clinical data. METHODS Various machine learning models, such as k-nearest neighbor, logistic regression, and support vector machines, were applied to classify the normal and abnormal periods in 8561 segments extracted from the radial pulse waves of 390 outpatients. The recursive feature elimination method was used to simplify the classifier. RESULTS It was found that a logistic regression model with only four input features can achieve a satisfactory result. The area under the receiver operating characteristic curve from the test set was 0.9920. In addition, these classifiers can be easily interpreted. CONCLUSIONS We expect that this model can be applied in smart sport watches and watchbands to accurately evaluate human health status.
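
The recursive feature elimination step mentioned above follows a simple loop: fit, drop the weakest feature, refit. The sketch below shows only that loop. The callback `fit_importances`, the feature names, and the fixed importance values are all hypothetical stand-ins; a real run would refit a classifier each round and use, for example, the absolute values of its coefficients.

```python
def recursive_feature_elimination(feature_names, fit_importances, n_keep):
    """Drop the least important feature, refit, and repeat until
    `n_keep` features remain.

    `fit_importances(names)` is a hypothetical callback that fits a model
    on the named features and returns {name: importance}.
    """
    remaining = list(feature_names)
    eliminated = []
    while len(remaining) > n_keep:
        importances = fit_importances(remaining)
        weakest = min(remaining, key=lambda name: importances[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


# Stand-in for a real model fit: fixed, made-up importances.
toy_weights = {"f1": 0.9, "f2": 0.1, "f3": 0.5, "f4": 0.7, "f5": 0.3, "f6": 0.2}


def toy_fit(names):
    return {name: toy_weights[name] for name in names}


kept, dropped = recursive_feature_elimination(toy_weights, toy_fit, n_keep=4)
print("kept:", kept, "dropped:", dropped)
```

Setting `n_keep=4` mirrors the four-feature logistic regression model reported in the results; in practice the number of features to keep is usually chosen by cross-validation.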