Prediction of COVID-19 Risk in Public Areas Using IoT and Machine Learning

Ersin Elbasi; Ahmet E. Topcu; Shinu Mathew

doi:10.3390/electronics10141677

Electronics ◽

10.3390/electronics10141677 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1677

Author(s):

Ersin Elbasi ◽

Ahmet E. Topcu ◽

Shinu Mathew

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Naive Bayes ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Bayes Classifier ◽

Social Distancing ◽

Public Areas ◽

IoT Devices

COVID-19 is a community-acquired infection with symptoms that resemble those of influenza and bacterial pneumonia. Creating an infection control policy involving isolation, disinfection of surfaces, and identification of contagions is crucial in eradicating such pandemics. Incorporating social distancing could also help stop the spread of community-acquired infections like COVID-19. Social distancing entails maintaining certain distances between people and reducing the frequency of contact between them. Meanwhile, the development of Internet of Things (IoT) devices has increased significantly, together with cyber-physical systems that connect with physical environments. Machine learning is strengthening current technologies by adding new approaches that solve problems quickly and correctly using this surge of available IoT devices. We propose a new approach using machine learning algorithms for monitoring the risk of COVID-19 in public areas. Features extracted from IoT sensors are used as input for several machine learning algorithms, such as decision tree, neural network, naïve Bayes classifier, support vector machine, and random forest, to predict the risks of the COVID-19 pandemic and calculate the risk probability of public places. This research aims to find vulnerable populations and reduce the impact of the disease on certain groups using machine learning models. We build a model to calculate and predict the risk factors of populated areas. This model generates automated alerts for security authorities when an abnormality is detected. Experimental results show high accuracy: 97.32% with random forest, 94.50% with decision tree, and 99.37% with the naïve Bayes classifier. These algorithms indicate great potential for crowd risk prediction in public areas.
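
As a rough illustration of how a naïve Bayes classifier can turn sensor features into a risk probability, the following minimal sketch fits a Gaussian naïve Bayes model in plain Python. The abstract does not list the features or the exact model variant, so the two features (people count, mean distance in metres), the labels "high" and "low", and all data values are hypothetical assumptions.

```python
import math
from statistics import mean, pvariance

def fit(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        columns = list(zip(*members))
        model[label] = {
            "prior": len(members) / len(rows),
            "means": [mean(c) for c in columns],
            # A small constant keeps the variance strictly positive.
            "vars": [pvariance(c) + 1e-9 for c in columns],
        }
    return model

def risk_probabilities(model, row):
    """Return normalized posterior probabilities for one observation."""
    log_scores = {}
    for label, p in model.items():
        score = math.log(p["prior"])
        for x, mu, var in zip(row, p["means"], p["vars"]):
            # Log of the Gaussian density, summed under the independence assumption.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Hypothetical training data: (people count, mean distance in metres).
rows = [(40, 0.8), (55, 0.6), (48, 0.7), (8, 2.5), (12, 2.1), (10, 2.8)]
labels = ["high", "high", "high", "low", "low", "low"]
model = fit(rows, labels)
crowded = risk_probabilities(model, (50, 0.7))
sparse = risk_probabilities(model, (9, 2.4))
```

In a setting like the one described, an alert could be raised when the "high" probability exceeds a chosen threshold; the threshold itself would be a design decision not specified in the abstract.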

Data Driven Approach for Eye Disease Classification with Machine Learning

Applied Sciences ◽

10.3390/app9142789 ◽

2019 ◽

Vol 9 (14) ◽

pp. 2789 ◽

Cited By ~ 3

Author(s):

Sadaf Malik ◽

Nadia Kanwal ◽

Mamoona Naveed Asghar ◽

Mohammad Ali A. Sadiq ◽

Irfan Karamat ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Naive Bayes ◽

Learning Algorithms ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Multiple Features ◽

Standard Format ◽

Free Data

Medical health systems have been concentrating on artificial intelligence techniques for speedy diagnosis. However, the recording of health data in a standard form still requires attention so that machine learning can be more accurate and reliable by considering multiple features. The aim of this study is to develop a general framework for recording diagnostic data in an international standard format to facilitate prediction of disease diagnosis based on symptoms using machine learning algorithms. Efforts were made to ensure error-free data entry by developing a user-friendly interface. Furthermore, multiple machine learning algorithms, including Decision Tree, Random Forest, Naive Bayes and Neural Network algorithms, were used to analyze patient data based on multiple features, including age, illness history and clinical observations. These data were formatted according to structured hierarchies designed by medical experts, whereas diagnosis was made as per the ICD-10 coding developed by the American Academy of Ophthalmology. Furthermore, the system is designed to evolve through self-learning by adding new classifications for both diagnosis and symptoms. The classification results from tree-based methods demonstrated that the proposed framework performs satisfactorily, given a sufficient amount of data. Owing to the structured data arrangement, the random forest and decision tree algorithms achieve a prediction rate of more than 90%, higher than that of more complex methods such as neural networks and the naïve Bayes algorithm.

Towards Near-Real-Time Intrusion Detection for IoT Devices using Supervised Learning and Apache Spark

Electronics ◽

10.3390/electronics9030444 ◽

2020 ◽

Vol 9 (3) ◽

pp. 444 ◽

Cited By ~ 1

Author(s):

Valerio Morfino ◽

Salvatore Rampone

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Hybrid Approach ◽

Cyber Attacks ◽

Machine Learning Algorithms ◽

Apache Spark ◽

Identification Accuracy ◽

Supervised Machine Learning ◽

IoT Devices

In the field of Internet of Things (IoT) infrastructures, attack and anomaly detection are rising concerns. With the increased use of IoT infrastructure in every domain, threats and attacks on these infrastructures are also growing proportionally. In this paper, the performance of several machine learning algorithms in identifying cyber-attacks (namely SYN-DOS attacks) on IoT systems is compared, both in terms of application performance and in terms of training/application times. We use supervised machine learning algorithms included in the MLlib library of Apache Spark, a fast and general engine for big data processing. We show the implementation details and the performance of those algorithms on public datasets using a training set of up to 2 million instances. We adopt a Cloud environment, emphasizing the importance of scalability and elasticity of use. Results show that all the Spark algorithms used achieve a very good identification accuracy (>99%). One of them, Random Forest, achieves an accuracy of 1. We also report a very short training time (23.22 sec for Decision Tree with 2 million rows). The experiments also show a very low application time (0.13 sec for more than 600,000 instances for Random Forest) using Apache Spark in the Cloud. Furthermore, the explicit model generated by Random Forest is very easy to implement using high- or low-level programming languages. In light of the results obtained, both in terms of computation times and identification performance, a hybrid approach for the detection of SYN-DOS cyber-attacks on IoT devices is proposed: the application of an explicit Random Forest model, implemented directly on the IoT device, along with a second-level analysis (training) performed in the Cloud.
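
To illustrate what an "explicit" random forest model running on a device might look like, here is a minimal sketch in which each tree is a hand-written rule and the forest takes a majority vote. The feature names (`syn_rate`, `syn_ack_ratio`, `half_open`), the thresholds, and the three-tree forest are hypothetical and are not taken from the paper; in the hybrid approach described, such rules would be exported from a model trained in the Cloud.

```python
from collections import Counter

# Each tree is an explicit rule over a feature dict; thresholds are hypothetical.
def tree_1(sample):
    if sample["syn_rate"] > 100:
        return "attack"
    return "normal"

def tree_2(sample):
    if sample["syn_ack_ratio"] > 3.0 and sample["syn_rate"] > 50:
        return "attack"
    return "normal"

def tree_3(sample):
    if sample["half_open"] > 200:
        return "attack"
    return "normal"

# An odd number of trees avoids ties between the two classes.
FOREST = (tree_1, tree_2, tree_3)

def classify(sample):
    """Return the majority vote of all trees in the forest."""
    votes = Counter(tree(sample) for tree in FOREST)
    return votes.most_common(1)[0][0]

# Hypothetical traffic summaries.
flood = {"syn_rate": 400, "syn_ack_ratio": 9.5, "half_open": 150}
quiet = {"syn_rate": 12, "syn_ack_ratio": 1.1, "half_open": 3}
```

Here `classify(flood)` returns `"attack"` (two of three trees vote for it) and `classify(quiet)` returns `"normal"`. Because the model is only comparisons and a vote count, it ports readily to low-level languages, which is the property the abstract highlights.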

Comparative Study of Machine Learning Algorithms for Breast Cancer Prediction - A Review

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952278 ◽

2019 ◽

pp. 979-985

Author(s):

Akshya Yadav ◽

Imlikumla Jamir ◽

Raj Rajeshwari Jain ◽

Mayank Sohani

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Accurate Diagnosis ◽

Machine Learning Algorithms ◽

Bayes Classifier ◽

Cancer Prediction ◽

Short Span ◽

Cancerous Cells

Cancer has been characterized as one of the leading diseases that cause death in humans. Breast cancer, a subtype of cancer, causes death in one out of every eight women worldwide. The way to counter this is early and accurate diagnosis, which enables faster treatment. Achieving such accuracy in a short span of time proves difficult with existing techniques. In this paper, different machine learning algorithms that physicians can use as tools for early and effective detection and prediction of cancerous cells are studied and introduced. The algorithms introduced here are ANN, DT, Random Forest (RF), Naive Bayes Classifier (NBC), SVM and KNN. These algorithms are trained with a dataset that contains parameters describing the tumor of a person having breast cancer and are then used to classify and predict whether the cell is cancerous.

Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data

10.20944/preprints202002.0108.v1 ◽

2020 ◽

Author(s):

Jiarui Yin ◽

Inikuro Afa Michael ◽

Iduabo John Afa

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Crime Data ◽

Detection Analysis ◽

Supervised Learning Algorithms ◽

Supervised Methods

Machine learning plays a key role in present-day crime detection, analysis and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We implemented visualization and analysis of crime data statistics from recent years in the city of Boston. We then carried out a comparative study of two supervised learning algorithms, decision tree and random forest, based on the accuracy and processing time of the models when making predictions from geographical and temporal information, with the data split into training and test sets. The results show that random forest, as expected, performs better, with 1.54% higher accuracy than decision tree, although this comes at a cost of at least 4.37 times the processing time. The study opens doors to the application of similar supervised methods in crime data analytics and other fields of data science.

Optimized Naive-Bayes and Decision Tree Approaches for fMRI Smoking Cessation Classification

Complexity ◽

10.1155/2018/2740817 ◽

2018 ◽

Vol 2018 ◽

pp. 1-24 ◽

Cited By ~ 8

Author(s):

Amirhessam Tahmassebi ◽

Amir H. Gandomi ◽

Mieke H. J. Schulte ◽

Anna E. Goudriaan ◽

Simon Y. Foo ◽

...

Keyword(s):

Machine Learning ◽

Smoking Cessation ◽

Decision Tree ◽

Data Reduction ◽

Resting State ◽

Naive Bayes ◽

Resting State fMRI ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Bayes Classifier

This paper aims at developing new theory-driven biomarkers by implementing and evaluating novel techniques from resting-state scans that can be used in relapse prediction for nicotine-dependent patients and future treatment efficacy. Two classes of patients were studied. One class took the drug N-acetylcysteine and the other class took a placebo. Then, the patients underwent a double-blind smoking cessation treatment, and the resting-state fMRI scans of their brains before and after treatment were recorded. The scientific research goal of this study was to interpret the fMRI connectivity maps based on machine learning algorithms to predict which patients will relapse and which will not. In this regard, the feature matrix was extracted from the image slices of the brain employing voxel selection schemes and data reduction algorithms. Then, the feature matrix was fed into machine learning classifiers, including an optimized CART decision tree and a Naive-Bayes classifier with standard and optimized implementations, employing 10-fold cross-validation. Out of all the data reduction techniques and the machine learning algorithms employed, the best accuracy was obtained using the singular value decomposition along with the optimized Naive-Bayes classifier. This gave an accuracy of 93% with sensitivity-specificity of 99%, which suggests that relapse in nicotine-dependent patients can be predicted based on resting-state fMRI images. The use of these approaches may result in clinical applications in the future.

Statistical Analysis for Selective Identifications of VOCs by Using Surface Functionalized MoS2 Based Sensor Array

Chemistry Proceedings ◽

10.3390/csac2021-10451 ◽

2021 ◽

Vol 5 (1) ◽

pp. 35

Author(s):

Uttam Narendra Thakur ◽

Radha Bhardwaj ◽

Arnab Hazra

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Sensor Array ◽

Multinomial Logistic Regression ◽

Learning Algorithms ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

Human Breath

Disease diagnosis through breath analysis has attracted significant attention in recent years due to its noninvasive nature, rapid testing ability, and applicability for patients of all ages. More than 1000 volatile organic components (VOCs) exist in human breath, but only selected VOCs are associated with specific diseases. Selective identification of those disease marker VOCs using an array of multiple sensors is highly desirable in the current scenario. The use of efficient sensors and suitable classification algorithms is essential for the selective and reliable detection of those disease markers in complex breath. In the current study, we fabricated a noble metal (Au, Pd and Pt) nanoparticle-functionalized MoS2 (Chalcogenides, Sigma Aldrich, St. Louis, MO, USA)-based sensor array for the selective identification of different VOCs. Four sensors, i.e., pure MoS2, Au/MoS2, Pd/MoS2, and Pt/MoS2, were tested under exposure to different VOCs, such as acetone, benzene, ethanol, xylene, 2-propenol, methanol and toluene, at 50 °C. Initially, principal component analysis (PCA) and linear discriminant analysis (LDA) were used to discriminate those seven VOCs. Compared to PCA, LDA discriminated well between the seven VOCs. Four machine learning algorithms, namely k-nearest neighbors (kNN), decision tree, random forest, and multinomial logistic regression, were used to further identify those VOCs. The classification accuracy for those seven VOCs using kNN, decision tree, random forest, and multinomial logistic regression was 97.14%, 92.43%, 84.1%, and 98.97%, respectively. These results confirmed that multinomial logistic regression performed best among the four machine learning algorithms in discriminating and differentiating the multiple VOCs that generally exist in human breath.

An Effective Stratified K-Fold Algorithm with Logistic Regression for Drug Feedback Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f8166.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 1964-1968

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Pharmaceutical Industry ◽

Naive Bayes ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

New Approach ◽

Feedback Data ◽

Drug Review

Drug reviews are commonly used in the pharmaceutical industry to improve the medications given to patients. Generally, a drug review contains the drug name, usage, ratings and comments by the patients. However, these reviews are not clean, and their cleanliness needs to be improved so that they can benefit both pharmacists and patients. To do this, we propose a new approach that includes several steps. First, we add extra parameters to the review data by applying VADER sentiment analysis to clean the review data. Then, we apply different machine learning algorithms, namely linear SVC, logistic regression, SVM, random forest, and Naive Bayes, to the drug review datasets. However, we found that the accuracy of these algorithms on these datasets is limited. To improve this, we apply the stratified K-fold algorithm in combination with logistic regression. With this approach, the accuracy is increased to 96%.
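
The idea behind stratified K-fold splitting can be sketched as follows: each fold receives roughly the same share of every class, so evaluation folds mirror the overall label distribution. This is a minimal plain-Python illustration, not the authors' implementation; the labels are hypothetical, and the sketch omits the shuffling that library implementations usually offer.

```python
from collections import defaultdict

def stratified_k_fold(labels, k):
    """Yield (train_indices, test_indices) pairs with class ratios kept per fold."""
    if k < 2:
        raise ValueError("k must be at least 2")
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # Deal each class's indices round-robin into the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test

# Hypothetical labels: 6 positive and 3 negative reviews.
labels = ["pos"] * 6 + ["neg"] * 3
splits = list(stratified_k_fold(labels, 3))
```

With these labels, every test fold contains two positive reviews and one negative review, matching the 2:1 ratio of the full set. A classifier such as logistic regression would then be trained on each `train` list and scored on the matching `test` list.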

Wind Power Prediction Based on Three Machine-Learning Algorithms: Decision Tree, K-Nearest Neighbors and Random Forest

Proceedings of the Fifteenth International Conference on Management Science and Engineering Management - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-3-030-79203-9_38 ◽

2021 ◽

pp. 490-499

Author(s):

Tingting Liu ◽

Lurong Fan

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Wind Power ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Power Prediction ◽

K Nearest Neighbors ◽

Wind Power Prediction

Detecting malicious software using machine learning

Issues of radio electronics ◽

10.21778/2218-5453-2019-11-42-45 ◽

2019 ◽

pp. 42-45

Author(s):

A. V. Chevychelov ◽

A. V. Burmistrov ◽

K. Yu. Voyshhev

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Learning Algorithms ◽

Malware Detection ◽

Malicious Code ◽

Machine Learning Algorithms ◽

Fisher Criterion ◽

Malicious Software ◽

Informative Parameters

Today, most tools for detecting malware (trojans, spyware, adware, worms, viruses, and ransomware) are based on a signature approach that is ineffective for detecting polymorphic malware and malware whose signatures have not been recorded in antivirus databases. This article explores methods for detecting opcodes in malware using machine learning algorithms. The study is carried out on a Microsoft dataset containing 21653 examples of malicious code. The 20 most informative parameters are identified based on the Fisher criterion, and methods for selecting parameters and various classifiers (logistic decision tree, random forest, naive Bayesian classifier, random tree) are compared, as a result of which an accuracy close to 100% is achieved.
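
Ranking parameters by a Fisher criterion can be sketched as below. The abstract does not state the exact formula, so this assumes one common two-class form, the squared difference of class means divided by the sum of class variances; the opcode names and frequency values are hypothetical.

```python
from statistics import mean, pvariance

def fisher_score(class_a, class_b):
    """Two-class Fisher score: squared mean gap over summed variances."""
    spread = pvariance(class_a) + pvariance(class_b)
    if spread == 0:
        # Degenerate case: constant features in both classes.
        return float("inf") if mean(class_a) != mean(class_b) else 0.0
    return (mean(class_a) - mean(class_b)) ** 2 / spread

def rank_features(malicious, benign, top_n):
    """Return the names of the top_n features by Fisher score, best first."""
    scores = {name: fisher_score(malicious[name], benign[name]) for name in malicious}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical opcode frequencies, one value per sample.
malicious = {"mov": [0.30, 0.32, 0.31], "xor": [0.20, 0.22, 0.21], "nop": [0.05, 0.02, 0.09]}
benign = {"mov": [0.31, 0.29, 0.33], "xor": [0.05, 0.04, 0.06], "nop": [0.04, 0.08, 0.03]}
top = rank_features(malicious, benign, 2)
```

A feature scores highly when the two classes have well-separated means and little spread within each class, which is why `xor` ranks first in this toy data. Selecting the 20 best parameters, as in the study, would correspond to `top_n=20`.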

Plasma d-glutamate levels for detecting mild cognitive impairment and Alzheimer’s disease: Machine learning approaches

Journal of Psychopharmacology ◽

10.1177/0269881120972331 ◽

2021 ◽

Vol 35 (3) ◽

pp. 265-272 ◽

Cited By ~ 1

Author(s):

Chun-Hung Chang ◽

Chieh-Hsin Lin ◽

Chieh-Yu Liu ◽

Chih-Sheng Huang ◽

Shaw-Ji Chen ◽

...

Keyword(s):

Machine Learning ◽

Cognitive Impairment ◽

Mild Cognitive Impairment ◽

Random Forest ◽

Naive Bayes ◽

MMSE Score ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Healthy Controls ◽

Peripheral Plasma

Background: d-glutamate, which is involved in N-methyl-d-aspartate receptor modulation, may be associated with cognitive ageing. Aims: This study aimed to use peripheral plasma d-glutamate levels to differentiate patients with mild cognitive impairment (MCI) and Alzheimer’s disease (AD) from healthy individuals and to evaluate its prediction ability using machine learning. Methods: Overall, 31 healthy controls, 21 patients with MCI and 133 patients with AD were recruited. Serum d-glutamate levels were measured using high-performance liquid chromatography (HPLC). Cognitive deficit severity was assessed using the Clinical Dementia Rating scale and the Mini-Mental Status Examination (MMSE). We employed four machine learning algorithms (support vector machine, logistic regression, random forest and naïve Bayes) to build an optimal predictive model to distinguish patients with MCI or AD from healthy controls. Results: The MCI and AD groups had lower plasma d-glutamate levels (1097.79 ± 283.99 and 785.10 ± 720.06 ng/mL, respectively) compared to healthy controls (1620.08 ± 548.80 ng/mL). The naïve Bayes model and random forest model appeared to be the best models for determining MCI and AD susceptibility, respectively (area under the receiver operating characteristic curve: 0.8207 and 0.7900; sensitivity: 0.8438 and 0.6997; and specificity: 0.8158 and 0.9188, respectively). The total MMSE score was positively correlated with d-glutamate levels (r = 0.368, p < 0.001). Multivariate regression analysis indicated that d-glutamate levels were significantly associated with the total MMSE score (B = 0.003, 95% confidence interval 0.002–0.005, p < 0.001). Conclusions: Peripheral plasma d-glutamate levels were associated with cognitive impairment and may therefore be a suitable peripheral biomarker for detecting MCI and AD. Rapid and cost-effective HPLC for biomarkers and machine learning algorithms may assist physicians in diagnosing MCI and AD in outpatient clinics.
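
Sensitivity and specificity, as reported above, are both derived from the confusion matrix of a binary classifier. The following minimal sketch shows the calculation; the label vectors are hypothetical and unrelated to the study's data, with 1 standing for a patient and 0 for a healthy control.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 is positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives detected
    specificity = tn / (tn + fp)  # share of actual negatives cleared
    return sensitivity, specificity

# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In this toy example three of four positives are detected (sensitivity 0.75) and four of five negatives are correctly cleared (specificity 0.8). The sketch assumes both classes are present; otherwise a denominator would be zero.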
