Accuracy of Machine Learning for Differentiation between Optic Neuropathies and Pseudopapilledema

10.21203/rs.2.318/v1 ◽

2019 ◽

Author(s):

Jin Mo Ahn ◽

Sangsoo Kim ◽

Kwang-Sung Ahn ◽

Sung-Hoon Cho ◽

Ungsoo Kim

Keyword(s):

Machine Learning ◽

Visual Acuity ◽

Optic Disc ◽

Optic Neuropathy ◽

Characteristic Curve ◽

Machine Learning Techniques ◽

Fundus Photography ◽

Ischemic Optic Neuropathy ◽

Machine Learning Classifiers ◽

Learning Classifiers

Abstract Background: This study aimed to evaluate the accuracy of machine learning for differentiating between optic neuropathies and pseudopapilledema (PPE). Methods: Two hundred and ninety-five images of optic neuropathies, 295 images of PPE, and 779 control images were used. Pseudopapilledema was defined as an elevated optic nerve head and blurred disc margin with normal visual acuity (>0.8 Snellen visual acuity), visual field, color vision, and pupillary reflex. The optic neuropathy group included cases of ischemic optic neuropathy (177), optic neuritis (48), diabetic optic neuropathy (17), papilledema (22), and retinal disorders (31). We compared four machine learning classifiers: our model, GoogleNet Inception v3, the 19-layer Very Deep Convolutional Network from the Visual Geometry Group (VGG), and the 50-layer Deep Residual Learning network (ResNet). Accuracy and area under the receiver operating characteristic curve (AUROC) were analyzed. Results: The accuracy of the machine learning classifiers ranged from 95.89% to 98.63% (our model: 95.89%, Inception v3: 96.45%, ResNet: 98.63%, and VGG: 96.80%). A high AUROC score was noted for both ResNet and VGG (0.999). Conclusions: Machine learning techniques can be combined with fundus photography as an effective approach to distinguish between PPE and elevated optic disc associated with optic neuropathies. Keywords: Machine Learning; Pseudopapilledema; Optic neuropathy; Optic disc swelling.
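The two metrics reported above can be illustrated with a minimal sketch. It assumes a binary task (1 = optic neuropathy, 0 = PPE) and uses hypothetical scores; the function names `accuracy` and `auroc` and the toy data are illustrative and not taken from the study, whose evaluation code is not described in the abstract.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = optic neuropathy, 0 = pseudopapilledema.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

print(accuracy(labels, preds))
print(auroc(labels, scores))
```

Accuracy depends on the chosen threshold, while AUROC summarizes ranking quality across all thresholds, which is one reason studies often report both.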

Physician-Friendly Machine Learning: A Case Study with Cardiovascular Disease Risk Prediction

Journal of Clinical Medicine ◽

10.3390/jcm8071050 ◽

2019 ◽

Vol 8 (7) ◽

pp. 1050 ◽

Cited By ~ 3

Author(s):

Meghana Padmanabhan ◽

Pengyu Yuan ◽

Govind Chada ◽

Hien Van Nguyen

Keyword(s):

Machine Learning ◽

Graduate Student ◽

Disease Risk ◽

Cardiovascular Disease Risk ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Learning Techniques ◽

Standard Code ◽

Time Required

Machine learning is often perceived as a sophisticated technology accessible only to highly trained experts. This prevents many physicians and biologists from using this tool in their research. The goal of this paper is to eliminate this outdated perception. We argue that the recent development of auto machine learning techniques enables biomedical researchers to quickly build competitive machine learning classifiers without requiring in-depth knowledge of the underlying algorithms. We study the case of predicting the risk of cardiovascular diseases. To support our claim, we compare auto machine learning techniques against a graduate student using several important metrics, including the total amount of time required for building machine learning models and the final classification accuracies on unseen test datasets. In particular, the graduate student manually builds multiple machine learning classifiers and tunes their parameters for one month using scikit-learn, a popular machine learning library, to obtain the ones that perform best on two given, publicly available datasets. We run an auto machine learning library called auto-sklearn on the same datasets. Our experiments find that automatic machine learning takes 1 h to produce classifiers that perform better than the ones built by the graduate student in one month. More importantly, building this classifier requires only a few lines of standard code. Our findings are expected to change the way physicians see machine learning and encourage wide adoption of Artificial Intelligence (AI) techniques in clinical domains.

Towards Predicting Student’s Dropout in University Courses Using Different Machine Learning Techniques

Applied Sciences ◽

10.3390/app11073130 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3130

Author(s):

Janka Kabathova ◽

Martin Drlik

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Classification Model ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Unseen Data ◽

E Learning ◽

The Impact

Early and precise prediction of student dropout from available educational data is a widely studied topic in the learning analytics research field. Despite the amount of research already conducted, progress has not been significant, and this holds at all educational data levels. Even though various features have already been researched, it remains an open question which features can be considered appropriate for different machine learning classifiers applied to the typically scarce set of educational data at the e-learning course level. Therefore, the main goal of this research is to emphasize the importance of the data understanding and data gathering phases, stress the limitations of the available educational datasets, compare the performance of several machine learning classifiers, and show that even a limited set of features, available to teachers in the e-learning course, can predict student dropout with sufficient accuracy if the performance metrics are thoroughly considered. The data collected from four academic years were analyzed. The features selected in this study proved to be applicable in predicting course completers and non-completers. The prediction accuracy varied between 77 and 93% on unseen data from the next academic year. In addition to the frequently used performance metrics, the homogeneity of the machine learning classifiers was compared to overcome the impact of the limited dataset size on the high values obtained for the performance metrics. The results showed that several machine learning algorithms could be successfully applied to a scarce dataset of educational data. At the same time, classification performance metrics should be thoroughly considered before deciding to deploy the best-performing classification model to predict potential dropout cases and design beneficial intervention mechanisms.

Machine Learning Classifiers for Twitter Surveillance of Vaping: Comparative Machine Learning Study

Journal of Medical Internet Research ◽

10.2196/17478 ◽

2020 ◽

Vol 22 (8) ◽

pp. e17478 ◽

Cited By ~ 1

Author(s):

Shyam Visweswaran ◽

Jason B Colditz ◽

Patrick O’Halloran ◽

Na-Rae Han ◽

Sanya B Taneja ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Surveillance System ◽

Short Term Memory ◽

Characteristic Curve ◽

Superior Performance ◽

Support Vector ◽

Data Set ◽

Machine Learning Classifiers ◽

Learning Classifiers

Background Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. Objective This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. Methods We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network. The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. 
Results LSTM-CNN performed the best with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98) for relevance, all deep learning classifiers including LSTM-CNN performed better than the traditional classifiers with an AUC of 0.99 (95% CI 0.98-0.99) for distinguishing commercial from noncommercial tweets, and BiLSTM performed the best with an AUC of 0.83 (95% CI 0.78-0.89) for provape sentiment. Overall, LSTM-CNN performed the best across all 3 classification tasks. Conclusions We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-related relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system.

Classification of the glioma grading using radiomics analysis

PeerJ ◽

10.7717/peerj.5982 ◽

2018 ◽

Vol 6 ◽

pp. e5982 ◽

Cited By ~ 27

Author(s):

Hwan-ho Cho ◽

Seung-hak Lee ◽

Jonghoon Kim ◽

Hyunjin Park

Keyword(s):

Machine Learning ◽

Random Forest ◽

Characteristic Curve ◽

Support Vector ◽

Tumor Segmentation ◽

Low Grade ◽

Training Cohort ◽

Machine Learning Classifiers ◽

Glioma Grading ◽

Learning Classifiers

Background Grading of gliomas is critical information related to prognosis and survival. We aimed to apply a radiomics approach using various machine learning classifiers to determine the glioma grading. Methods We considered 285 (high grade n = 210, low grade n = 75) cases obtained from the Brain Tumor Segmentation 2017 Challenge. Manual annotations of enhancing tumors, non-enhancing tumors, necrosis, and edema were provided by the database. Each case was multi-modal with T1-weighted, T1-contrast enhanced, T2-weighted, and FLAIR images. A five-fold cross validation was adopted to separate the training and test data. A total of 468 radiomics features were calculated for three types of regions of interest. The minimum redundancy maximum relevance algorithm was used to select features useful for classifying glioma grades in the training cohort. The selected features were used to build three classifier models: logistic regression, support vector machine, and random forest. The classification performance of the models was measured in the training cohort using accuracy, sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic curve. The trained classifier models were applied to the test cohort. Results Five significant features were selected for the machine learning classifiers, and the three classifiers showed an average AUC of 0.9400 for training cohorts and 0.9030 (logistic regression 0.9010, support vector machine 0.8866, and random forest 0.9213) for test cohorts. Discussion Glioma grading could be accurately determined using machine learning and feature selection techniques in conjunction with a radiomics approach. The results of our study might contribute to a high-throughput computer-aided diagnosis system for gliomas.
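The five-fold split described in the Methods can be sketched as follows. The class sizes (210 high grade, 75 low grade) come from the abstract; stratifying the folds by grade, the function name `stratified_k_fold`, and the fixed seed are assumptions made here for illustration, since the abstract does not say how the folds were drawn.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs that keep class proportions per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Class sizes from the abstract: 210 high-grade and 75 low-grade cases.
labels = ["high"] * 210 + ["low"] * 75
splits = list(stratified_k_fold(labels, k=5))
for train, test in splits:
    n_low = sum(1 for idx in test if labels[idx] == "low")
    print(len(train), len(test), n_low)
```

With this kind of split, feature selection would be run on the training indices of each fold only, in line with the abstract's statement that features were selected in the training cohort.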

Comparative Analysis of Intrusion Detection Attack Based on Machine Learning Classifiers

Indian Journal of Artificial Intelligence and Neural Networking ◽

10.35940/ijainn.b1025.041221 ◽

2021 ◽

Vol 1 (2) ◽

pp. 22-28

Author(s):

Surafel Mehari Atnafu ◽

Anuja Kumar Acharya

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Nearest Neighbor ◽

Detection System ◽

Machine Learning Techniques ◽

K Nearest Neighbor ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Learning Techniques

Today, information is transmitted from one place to another using network communication technology. Because of this, networked systems require a high-security environment. The main strategy for securing this environment is to correctly identify each packet and detect whether it contains malicious content or whether any illegal activity has occurred in the network. To accomplish this, we use an intrusion detection system (IDS). Intrusion detection is a security technology designed to detect intrusions and automatically alert or notify a responsible person. However, creating an efficient intrusion detection system faces a number of challenges, namely false detections and data with a high number of features. Currently, many researchers use machine learning techniques to overcome the limitations of intrusion detection and to increase its efficiency in correctly identifying whether a packet is normal or malicious. Many machine learning techniques are used in intrusion detection. However, the question is which machine learning classifiers have the potential to address the intrusion detection problem in a network security environment. Choosing appropriate machine learning techniques is required to improve the accuracy of an intrusion detection system. In this work, three machine learning classifiers are analyzed: Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor. These algorithms were tested on the NSL-KDD dataset using a combination of Chi-square and Extra Tree feature selection methods, and Python was used to implement, analyze, and evaluate the classifiers. Experimental results show that the K-Nearest Neighbor classifier outperforms the other methods in categorizing packets as normal or malicious.
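As an illustration of the K-Nearest Neighbor idea named in the abstract, here is a minimal sketch in plain Python. The function `knn_predict`, the two-feature toy packets, and the choice of Euclidean distance with k = 3 are all assumptions for illustration; the abstract does not specify the authors' features, distance metric, or value of k.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical packets described by two already-scaled numeric features.
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.95)]
train_y = ["normal", "normal", "normal",
           "malicious", "malicious", "malicious"]

print(knn_predict(train_X, train_y, (0.20, 0.20)))
print(knn_predict(train_X, train_y, (0.90, 0.90)))
```

Because every prediction compares the query against all training points, reducing the number of features beforehand (as the abstract does with feature selection) lowers the cost of each distance computation.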

Machine Learning Classifiers for Twitter Surveillance of Vaping: Comparative Machine Learning Study (Preprint)

10.2196/preprints.17478 ◽

2019 ◽

Author(s):

Shyam Visweswaran ◽

Jason B Colditz ◽

Patrick O’Halloran ◽

Na-Rae Han ◽

Sanya B Taneja ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Surveillance System ◽

Short Term Memory ◽

Characteristic Curve ◽

Superior Performance ◽

Support Vector ◽

Data Set ◽

Machine Learning Classifiers ◽

Learning Classifiers

BACKGROUND Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. OBJECTIVE This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. METHODS We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network. The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. 
RESULTS LSTM-CNN performed the best with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98) for relevance, all deep learning classifiers including LSTM-CNN performed better than the traditional classifiers with an AUC of 0.99 (95% CI 0.98-0.99) for distinguishing commercial from noncommercial tweets, and BiLSTM performed the best with an AUC of 0.83 (95% CI 0.78-0.89) for provape sentiment. Overall, LSTM-CNN performed the best across all 3 classification tasks. CONCLUSIONS We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-related relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system.

Streaming classification of variable stars

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz3426 ◽

2019 ◽

Vol 492 (2) ◽

pp. 2897-2909 ◽

Cited By ~ 2

Author(s):

L Zorich ◽

K Pichara ◽

P Protopapas

Keyword(s):

Machine Learning ◽

Gravitational Lensing ◽

Automatic Classification ◽

Variable Stars ◽

Light Curves ◽

Classification Model ◽

Machine Learning Techniques ◽

Machine Learning Classifiers ◽

Learning Classifiers

ABSTRACT In recent years, automatic classification of variable stars has received substantial attention. Using machine learning techniques for this task has proven to be quite useful. Typically, machine learning classifiers used for this task require a fixed training set, and the training process is performed offline. Upcoming surveys such as the Large Synoptic Survey Telescope will generate new observations daily, so an automatic classification system able to create alerts online will be mandatory. A system with those characteristics must be able to update itself incrementally. Unfortunately, after training, most machine learning classifiers do not support the inclusion of new observations in light curves; they need to be re-trained from scratch. Naively re-training from scratch is not an option in streaming settings, mainly because of the expensive pre-processing routines required to obtain a vector representation of light curves (features) each time we include new observations. In this work, we propose a streaming probabilistic classification model; it uses a set of newly designed features that work incrementally. With this model, we can have a machine learning classifier that updates itself in real time with new observations. To test our approach, we simulate a streaming scenario with light curves from the Convection, Rotation and planetary Transits (CoRoT), Optical Gravitational Lensing Experiment (OGLE), and Massive Compact Halo Object (MACHO) catalogues. Results show that our model achieves high classification performance while staying an order of magnitude faster than traditional classification approaches.
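To make the notion of a feature that "works incrementally" concrete, the sketch below maintains the mean and variance of a light curve's magnitudes with Welford's online algorithm, so each new observation updates the statistics without revisiting earlier points. This is a generic illustration, not one of the paper's newly designed features, which the abstract does not describe; the class name `RunningStats` and the magnitudes are hypothetical.

```python
import statistics


class RunningStats:
    """Mean and variance of a stream, updated one observation at a time (Welford)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; reported as 0.0 when fewer than two points were seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


# Hypothetical magnitudes arriving one at a time.
magnitudes = [14.2, 14.5, 13.9, 14.8, 14.1]
stats = RunningStats()
for m in magnitudes:
    stats.update(m)

# The streaming values agree with a batch recomputation over all points.
print(stats.mean, statistics.mean(magnitudes))
print(stats.variance, statistics.variance(magnitudes))
```

Each update costs constant time and memory, which is the property that avoids recomputing features over the whole light curve.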

Comparative Analysis of Intrusion Detection Attack Based on Machine Learning Classifiers

Indian Journal of Artificial Intelligence and Neural Networking ◽

10.54105/ijainn.b1025.041221 ◽

2021 ◽

pp. 22-28

Author(s):

Surafel Mehari Atnafu ◽

Prof (Dr.) Anuja Kumar Acharya

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Nearest Neighbor ◽

Detection System ◽

Machine Learning Techniques ◽

K Nearest Neighbor ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Learning Techniques

Today, information is transmitted from one place to another using network communication technology. Because of this, networked systems require a high-security environment. The main strategy for securing this environment is to correctly identify each packet and detect whether it contains malicious content or whether any illegal activity has occurred in the network. To accomplish this, we use an intrusion detection system (IDS). Intrusion detection is a security technology designed to detect intrusions and automatically alert or notify a responsible person. However, creating an efficient intrusion detection system faces a number of challenges, namely false detections and data with a high number of features. Currently, many researchers use machine learning techniques to overcome the limitations of intrusion detection and to increase its efficiency in correctly identifying whether a packet is normal or malicious. Many machine learning techniques are used in intrusion detection. However, the question is which machine learning classifiers have the potential to address the intrusion detection problem in a network security environment. Choosing appropriate machine learning techniques is required to improve the accuracy of an intrusion detection system. In this work, three machine learning classifiers are analyzed: Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor. These algorithms were tested on the NSL-KDD dataset using a combination of Chi-square and Extra Tree feature selection methods, and Python was used to implement, analyze, and evaluate the classifiers. Experimental results show that the K-Nearest Neighbor classifier outperforms the other methods in categorizing packets as normal or malicious.

Performance Evaluation of Supervised Machine Learning Techniques for Efficient Detection of Emotions from Online Content

10.20944/preprints201908.0019.v1 ◽

2019 ◽

Cited By ~ 9

Author(s):

Muhammad Zubair Asghar ◽

Fazli Subhan ◽

Muhammad Imran ◽

Fazal Masud Kundi ◽

Shahab Shamshirband ◽

...

Keyword(s):

Machine Learning ◽

Online Community ◽

Opinion Mining ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Emotion Detection ◽

Emotion Classification ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Efficient Detection

Emotion detection from text is an important and challenging problem in text analytics. Opinion-mining experts are focusing on the development of emotion detection applications, as these have received considerable attention from the online community, including users and business organizations, for collecting and interpreting public emotions. However, most existing work on emotion detection has used less efficient machine learning classifiers with limited datasets, resulting in performance degradation. To overcome this issue, this work evaluates the performance of different machine learning classifiers on a benchmark emotion dataset. The experimental results show the performance of the different machine learning classifiers in terms of evaluation metrics such as precision, recall, and F-measure. Finally, the classifier with the best performance is recommended for emotion classification.
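The three metrics named above can be computed per emotion class as in this minimal sketch. The function name `precision_recall_f1`, the emotion labels, and the toy predictions are hypothetical; the F-measure shown is the balanced F1 variant, which is an assumption because the abstract does not state the weighting.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical gold labels and classifier outputs for six short texts.
y_true = ["joy", "joy", "anger", "joy", "anger", "fear"]
y_pred = ["joy", "anger", "anger", "joy", "joy", "fear"]

for emotion in ("joy", "anger", "fear"):
    print(emotion, precision_recall_f1(y_true, y_pred, emotion))
```

Reporting the metrics per class, rather than a single accuracy figure, shows whether a classifier does well only on the most frequent emotions.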
