Preliminary Screening of COVID-19 Infection Employing Machine Learning Techniques From Simple Blood Profile

Frequent testing of the entire population would help identify individuals with active COVID-19, including concealed carriers. Molecular, antigen, and antibody tests are widely used to confirm COVID-19 in the population. Molecular tests such as the real-time reverse transcription-polymerase chain reaction (rRT-PCR) test take from 3 hours to 4 days to return results. To overcome this delay, the authors suggest using machine learning and data mining tools to screen large populations at a preliminary level. The ML tools could reduce the size of the population to be tested by 20 to 30%. The study uses a subset of features from the full blood profiles of patients at the Israelita Albert Einstein hospital in Brazil. The authors trained the classification models KNN, logistic regression, XGBoost, naïve Bayes, decision tree, random forest, support vector machine, and multilayer perceptron, and validated them with k-fold cross-validation. Naïve Bayes, KNN, and random forest stand out as the most predictive, with 88% accuracy each.
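
The abstract mentions k-fold cross-validation without stating the value of k or the tooling used. As a rough illustration only, the sketch below shows how k-fold splitting partitions sample indices so that each sample is used for testing exactly once. The function names, the seed, and the choice of 10 samples and 5 folds are assumptions made for this example, not details from the study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled, disjoint, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Illustrative sizes only (hypothetical, not from the paper).
splits = list(train_test_splits(n_samples=10, k=5))
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the k scores averaged to give the cross-validated estimate.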

Sentiment Analysis using various Machine Learning and Deep Learning Techniques

Journal of the Nigerian Society of Physical Sciences ◽

10.46481/jnsps.2021.308 ◽

2021 ◽

pp. 385-394

Author(s):

V Umarani ◽

A Julian ◽

J Deepa

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Process ◽

Learning Techniques

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyze text patterns faster. Supervised machine learning is the most widely used approach for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods on a standard data set. The experimental results compare the classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross-validation. They help in appreciating the novelty of the several deep learning techniques and give the user an overview for choosing the right technique for their application.
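
To make one of the listed techniques concrete, here is a minimal sketch of a multinomial naive Bayes sentiment classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the whitespace tokenizer, the function names, and the four toy reviews are all assumptions invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs):
    """docs: list of (text, label). Returns log-priors, per-class word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total = sum(class_counts.values())
    log_prior = {c: math.log(n / total) for c, n in class_counts.items()}
    return log_prior, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    log_prior, word_counts, vocab = model
    scores = {}
    for label in log_prior:
        # Add-one smoothing: every vocabulary word gets one extra pseudo-count.
        denom = sum(word_counts[label].values()) + len(vocab)
        score = log_prior[label]
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this example.
toy_reviews = [
    ("great service and friendly staff", "positive"),
    ("loved the great food", "positive"),
    ("terrible service and rude staff", "negative"),
    ("awful food and terrible wait", "negative"),
]
model = train_multinomial_nb(toy_reviews)
```

With this toy data, `predict(model, "great friendly food")` returns `"positive"` and `predict(model, "terrible rude wait")` returns `"negative"`.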

Prediction of Breast Cancer Using Machine Learning

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190617160834 ◽

2020 ◽

Vol 13 (5) ◽

pp. 901-908

Author(s):

Somil Jain ◽

Puneet Kumar

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Prediction Accuracy ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Classification Algorithms ◽

Breast Cancer Dataset

Background: Breast cancer is one of the diseases that cause many deaths every year across the globe. Early detection and diagnosis of this type of disease is a challenging task, but it is needed to reduce the number of deaths. Nowadays, various machine learning and data mining techniques that have proven their mettle are used for medical diagnosis; they can predict chronic diseases like cancer and so help save the lives of patients suffering from them. The major concern of this study is to find the prediction accuracy of classification algorithms like Support Vector Machine, J48, Naïve Bayes and Random Forest and to suggest the best algorithm. Objective: The objective of this study is to assess the prediction accuracy of the classification algorithms in terms of efficiency and effectiveness. Methods: This paper provides a detailed analysis of classification algorithms like Support Vector Machine, J48, Naïve Bayes and Random Forest in terms of their prediction accuracy by applying the 10-fold cross-validation technique to the Wisconsin Diagnostic Breast Cancer dataset using the open-source WEKA tool. Results: The study finds that Support Vector Machine achieved the highest prediction accuracy of 97.89% with a low error rate of 0.14%. Conclusion: This paper provides a clear view of the performance of the classification algorithms in terms of their predictive ability, which can help medical practitioners diagnose chronic diseases like breast cancer effectively.

Recognition of gasoline in fire debris using machine learning: Part I, Application of Random Forest, Gradient Boosting, Support Vector Machine and Naïve Bayes

Forensic Science International ◽

10.1016/j.forsciint.2021.111146 ◽

2021 ◽

pp. 111146 ◽

Cited By ~ 1

Author(s):

C. Bogdal ◽

R. Schellenberg ◽

O. Höpli ◽

M. Bovens ◽

M. Lory

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Naive Bayes ◽

Naïve Bayes ◽

Gradient Boosting ◽

Support Vector ◽

Fire Debris ◽

In Fire

Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0991-9 ◽

2019 ◽

Vol 19 (1) ◽

Cited By ~ 6

Author(s):

Elizabeth Ford ◽

Philip Rooney ◽

Seb Oliver ◽

Richard Hoile ◽

Peter Hurley ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Health Service ◽

Naive Bayes ◽

Case Control ◽

Naïve Bayes ◽

Support Vector ◽

Clinical Practice Research Datalink ◽

Patient Records

Background: Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP. Methods: We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65y with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination. Results: The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing. Conclusions: Our model could aid GPs or health service planners with the early detection of dementia. Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.
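
For readers unfamiliar with the AUROC figure quoted above, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The function name and the six toy scores are assumptions for illustration and are unrelated to the CPRD data.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical model scores: label 1 = case, label 0 = control.
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)  # 8 of 9 pairs ranked correctly
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, so a rank-based formulation would be preferable for large cohorts.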

On the Analysis of Machine Learning Classifiers to Detect Traffic Congestion in Vehicular Networks

10.5753/eniac.2019.9290 ◽

2019 ◽

Author(s):

Lucas Carvalho ◽

Maycon Silva ◽

Edimilson Santos ◽

Daniel Guidoni

Keyword(s):

Machine Learning ◽

Traffic Congestion ◽

Vehicular Networks ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbors ◽

Applied Machine Learning ◽

Routing Methods

Problems related to traffic congestion and management have become common in many cities. Thus, vehicle re-routing methods have been proposed to minimize the congestion. Some of these methods have applied machine learning techniques, more specifically classifiers, to verify road conditions and detect congestion. However, better results may be obtained by applying a classifier more suitable to the domain. In this sense, this paper presents an evaluation of different classifiers applied to the identification of the level of road congestion. Our main goal is to analyze the characteristics of each classifier in this task. The classifiers involved in the experiments are: Multiple Layer Neural Network (MLP), K-Nearest Neighbors (KNN), Decision Trees (J48), Support Vector Machines (SVM), Naive Bayes and Tree Augmented Naive Bayes.

Machine Learning Based Coronary Artery Disease Prediction

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9007 ◽

2020 ◽

Vol 17 (9) ◽

pp. 3999-4002

Author(s):

A. C. Bhavani ◽

K. Aditya Shastry ◽

K. Deepika ◽

Nithya N. Shanbag ◽

G. C. Akshatha

Keyword(s):

Machine Learning ◽

Coronary Artery ◽

Medical Diagnosis ◽

Performance Metrics ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

World Health ◽

Support Vector ◽

Health Organization

The World Health Organization (WHO) has estimated that around 12 million people across the globe die each year from cardiovascular diseases. The dangers associated with cardiovascular disease can be identified effectively using machine learning techniques. According to a survey, around 30% of patients experience no symptoms during a heart attack, but the bloodstream carries distinctive indications of the attack for days. The medical diagnosis of a patient remains a complex task due to several factors. Accurate diagnosis of a patient’s heart disease is critical, as it can save millions of lives. In this regard, automating medical diagnosis is significant. The goal of this work is to develop a system that predicts coronary artery disease in a patient with high accuracy using machine learning (ML) techniques. Several algorithms, such as Naïve Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT) classifiers, were implemented for predicting the disease. Extensive experiments demonstrated that naïve Bayes outperformed DT and SVM on the accuracy, precision, F-measure, recall, and receiver operating characteristic (ROC) performance metrics.

Machine Learning Approaches Applied to GC-FID Fatty Acid Profiles to Discriminate Wild from Farmed Salmon

Foods ◽

10.3390/foods9111622 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1622

Author(s):

Liliana Grazina ◽

P. J. Rodrigues ◽

Getúlio Igrejas ◽

Maria A. Nunes ◽

Isabel Mafra ◽

...

Keyword(s):

Machine Learning ◽

Fatty Acid ◽

Random Forest ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Learning Approaches ◽

Machine Learning Classifiers ◽

Farmed Salmon ◽

Learning Classifiers

In the last decade, there has been an increasing demand for wild-captured fish, which fetch higher prices than farmed species and are thus prone to mislabeling practices. In this work, fatty acid composition coupled to advanced chemometrics was used to discriminate wild from farmed salmon. The lipids extracted from salmon muscles of different production methods and origins (26 wild from Canada, 25 farmed from Canada, 24 farmed from Chile and 25 farmed from Norway) were analyzed by gas chromatography with flame ionization detector (GC-FID). All the tested chemometric approaches, namely principal components analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and seven machine learning classifiers, namely k-nearest neighbors (kNN), decision tree, support vector machine (SVM), random forest, artificial neural networks (ANN), naïve Bayes and AdaBoost, allowed for differentiation between farmed and wild salmon using the 17 features obtained from chemical analysis. PCA did not clearly distinguish salmon by geographical origin, since farmed samples from Canada and Chile overlapped. Nevertheless, using the 17 features in the models, six out of the seven tested machine learning classifiers achieved a classification accuracy of ≥99%, with ANN, naïve Bayes, random forest, SVM and kNN presenting 100% accuracy on the test dataset. The classification models were also assayed using only the best features selected by a reduction algorithm and the best input features mapped by t-SNE. The kNN classifier provided the best discrimination results because it correctly classified all samples according to production method and origin, ultimately using only the three most important features (16:0, 18:2n6c and 20:3n3 + 20:4n6). In general, the classifiers generalized well, and the proposed approach is simple and has the advantage of requiring only common equipment found in most labs.

Classification of Aggressive Movements Using Smartwatches

Sensors ◽

10.3390/s20216377 ◽

2020 ◽

Vol 20 (21) ◽

pp. 6377

Author(s):

Franck Tchuente ◽

Natalie Baddour ◽

Edward D. Lemaire

Keyword(s):

Machine Learning ◽

Random Forest ◽

Aggressive Behavior ◽

Naive Bayes ◽

Poor Performance ◽

Naïve Bayes ◽

Classification Model ◽

Support Vector ◽

Care Providers ◽

K Nearest Neighbors

Recognizing aggressive movements is a challenging task in human activity recognition. Wearable smartwatch technology with machine learning may be a viable approach for human aggressive behavior classification. This research identified a viable classification model and feature selector (CM-FS) combination for separating aggressive from non-aggressive movements using smartwatch data and determined if only one smartwatch is sufficient for this task. A ranking method was used to select relevant CM-FS models across accuracy, sensitivity, specificity, precision, F-score, and Matthews correlation coefficient (MCC). The Waikato environment for knowledge analysis (WEKA) was used to run 6 machine learning classifiers (random forest, k-nearest neighbors (kNN), multilayer perceptron neural network (MP), support vector machine, naïve Bayes, decision tree) coupled with three feature selectors (ReliefF, InfoGain, Correlation). Microsoft Band 2 accelerometer and gyroscope data were collected during an activity circuit that included aggressive (punching, shoving, slapping, shaking) and non-aggressive (clapping hands, waving, handshaking, opening/closing a door, typing on a keyboard) tasks. A combination of kNN and ReliefF was the best CM-FS model for separating aggressive actions from non-aggressive actions, with 99.6% accuracy, 98.4% sensitivity, 99.8% specificity, 98.9% precision, 0.987 F-score, and 0.984 MCC. kNN and random forest classifiers, combined with any of the feature selectors, generated the top models. Models with naïve Bayes or support vector machines had poor performance for sensitivity, F-score, and MCC. Wearing the smartwatch on the dominant wrist produced the best single-watch results. The kNN and ReliefF combination demonstrated that this smartwatch-based approach is a viable solution for identifying aggressive behavior. 
This wrist-based wearable sensor approach could be used by care providers in settings where people suffer from dementia or mental health disorders, where random aggressive behaviors often occur.
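
The six ranking criteria named in this abstract (accuracy, sensitivity, specificity, precision, F-score, and MCC) can all be derived from a binary confusion matrix. The sketch below shows the standard formulas. The function name and the example counts are assumptions chosen for illustration and do not reproduce the study's results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics; assumes both classes occur and some positives are predicted."""
    sensitivity = tp / (tp + fn)  # recall on the positive (aggressive) class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when its denominator is 0.
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts, not taken from the paper.
toy = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

MCC uses all four cells of the confusion matrix, which is why it can remain low for a model whose accuracy looks acceptable on imbalanced data.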

A Comparative Analysis to Visualize the Behavior of Different Machine Learning Algorithms for Normalized and Un-Normalized Data in Predicting Alzheimer’s Disease

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8259 ◽

2019 ◽

Vol 16 (9) ◽

pp. 3840-3848

Author(s):

Neeraj Kumar ◽

Jatinder Manhas ◽

Vinod Sharma

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Neurodegenerative Disorder ◽

Learning Algorithms ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Linear Discriminant ◽

Age Related

Advances in technology have helped people live longer and better lives. But increased life expectancy has also elevated the risk of age-related disorders, especially neurodegenerative disorders. Alzheimer’s is one such neurodegenerative disorder, and it is also the leading contributor to dementia in elderly people. Despite extensive research in this field, scientists have not found a cure for the disease to date. This makes early diagnosis of Alzheimer’s crucial for delaying its progression and improving the condition of the patient. Various techniques are employed for diagnosing Alzheimer’s, including neuropsychological tests, medical imaging, and blood-based biomarkers. In addition, various machine learning algorithms have been employed to diagnose Alzheimer’s in its early stages. In the current research, the authors compared the performance of various machine learning techniques, i.e., Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Naïve Bayes (NB), Support Vector Machines (SVM), Decision Trees (DT), Random Forests (RF) and Multi Layer Perceptron (MLP), on an Alzheimer’s dataset. The paper experimentally demonstrated that normalization plays a predominant role in enhancing the efficiency of some machine learning algorithms. It is therefore imperative to choose algorithms according to the available data. The efficiency of the given machine learning methods was compared in terms of accuracy and F1-score. Naïve Bayes gave the better overall performance for both accuracy and F1-score, and, along with LDA, DT and RF, it remained unaffected by normalization of the data. KNN, SVM and MLP, by contrast, showed a drastic (17% to 86%) improvement in performance when given normalized rather than un-normalized data from the Alzheimer’s dataset.
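
One reason distance-based methods such as KNN are sensitive to normalization is that a feature measured on a large scale dominates the Euclidean distance. The sketch below illustrates this with min-max scaling. The abstract does not say which normalization the authors used, so min-max scaling, the function names, and the four toy rows are all assumptions for this example.

```python
def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_index(rows, query_idx):
    """Index of the row closest to rows[query_idx] by squared Euclidean distance."""
    query = rows[query_idx]
    best, best_dist = None, float("inf")
    for i, row in enumerate(rows):
        if i == query_idx:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(query, row))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


# Hypothetical features: [score on a 0-1 scale, measurement in large arbitrary units].
rows = [
    [0.10, 5000.0],  # row 0: the query
    [0.90, 5010.0],  # row 1: very different score, almost identical measurement
    [0.12, 5400.0],  # row 2: almost identical score, moderately different measurement
    [0.95, 9000.0],  # row 3: far away on both features
]
raw_neighbour = nearest_index(rows, 0)                    # 1: the large-scale column dominates
scaled_neighbour = nearest_index(min_max_scale(rows), 0)  # 2: both columns now contribute
```

Tree-based models split on one feature at a time, so rescaling a column does not change their decisions, which is consistent with the reported stability of DT and RF.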

Performance Analysis of Supervised Machine Learning Algorithms on Medical Dataset

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f7908.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 1637-1642

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Learning System ◽

Supervised Machine Learning ◽

Support Vector ◽

Heart Problem

Machine learning (ML) algorithms are designed to make predictions based on features. With the help of machine learning, a system can automatically learn and improve from experience. Machine learning falls under artificial intelligence. It is broadly categorized into two types: supervised and unsupervised. Supervised ML performs classification, and unsupervised ML is used for clustering. At present, machine learning is used in various areas, including biometric recognition, handwriting recognition, and medical diagnosis. In the medical field, machine learning plays an important role in identifying diseases based on a patient’s features. Doctors currently use software applications based on machine learning algorithms to diagnose various diseases such as cancer, cardiac arrest and many more. In this paper we used an ensemble learning method to predict heart problems. Our study describes the performance of ML algorithms by comparing evaluation parameters such as F-measure, recall, ROC, precision and accuracy. The study was carried out with various combinations of ML classifiers, such as Decision Tree (DT), Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF), to predict heart problems. The results showed that combining two ML algorithms, DT with NB, achieved 81.1% accuracy. The Support Vector Machine (SVM), Decision Tree, Naïve Bayes, and Random Forest models were also trained and tested individually.
