An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection

2021 ◽  
Vol 2021 ◽  
pp. 1-18
Author(s):  
Aurelle Tchagna Kouanou ◽  
Thomas Mih Attia ◽  
Cyrille Feudjio ◽  
Anges Fleurio Djeumo ◽  
Adèle Ngo Mouelas ◽  
...  

Background and Objective. To mitigate the spread of the virus responsible for COVID-19, known as SARS-CoV-2, there is an urgent need for massive population testing. Due to the constant shortage of reagents for PCR (polymerase chain reaction) tests, the reference standard for COVID-19 diagnosis, several medical centers have opted for immunological tests that look for antibodies produced against this virus. However, these tests have high rates of false positives (positive results that are actually negative) and false negatives (negative results that are actually positive) and are therefore not always reliable. In this paper, we propose a solution based on data analysis and machine learning to detect COVID-19 infections. Methods. Our analysis and machine learning algorithm are based on two of the most cited clinical datasets in the literature: one from San Raffaele Hospital, Milan, Italy, and the other from Hospital Israelita Albert Einstein, São Paulo, Brazil. The datasets were processed to select the features that most influence the target, and it turned out that almost all of them are blood parameters. EDA (exploratory data analysis) methods were applied to the datasets, and a comparative study of supervised machine learning models was conducted, after which the support vector machine (SVM) was selected as the one with the best performance. Results. As the best performer, the SVM was used as our proposed supervised machine learning algorithm. An accuracy of 99.29%, sensitivity of 92.79%, and specificity of 100% were obtained with the dataset from Kaggle (https://www.kaggle.com/einsteindata4u/covid19) after applying optimization to the SVM. The same procedure was performed with the dataset from San Raffaele Hospital (https://zenodo.org/record/3886927#.YIluB5AzbMV).
Once more, the SVM presented the best performance among the machine learning algorithms considered, with 92.86% accuracy, 93.55% sensitivity, and 90.91% specificity. Conclusion. The obtained results, when compared with others from the literature based on these same datasets, are superior, leading us to conclude that our proposed solution is reliable for COVID-19 diagnosis.
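The workflow described above (select the most influential features, then fit a hyperparameter-optimized SVM) can be sketched with scikit-learn. This is a minimal illustration on synthetic stand-in data, not the authors' pipeline or datasets; the feature counts and grid values are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a clinical dataset: 20 candidate features
# (playing the role of blood parameters), binary infection-status target.
X, y = make_classification(n_samples=500, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Pipeline: keep the k most discriminative features, scale, then fit an SVM.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),
    ("scale", StandardScaler()),
    ("svm", SVC()),
])

# Hyperparameter optimization analogous to the tuned SVM reported above.
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10],
                           "svm__kernel": ["rbf", "linear"]}, cv=5)
grid.fit(X_train, y_train)
print(f"test accuracy: {grid.score(X_test, y_test):.3f}")
```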

Author(s):  
Shahadat Uddin ◽  
Arif Khan ◽  
Md Ekramul Hossain ◽  
Mohammad Ali Moni

Abstract Background Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently emerged as a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms and their performance and usage for disease risk prediction. Methods In this study, extensive research efforts were made to identify studies that applied more than one supervised machine learning algorithm to the prediction of a single disease. Two databases (Scopus and PubMed) were searched using different search terms. In total, 48 articles were selected for the comparison of supervised machine learning algorithm variants for disease prediction. Results We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies), followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed comparatively superior accuracy: of the 17 studies in which it was applied, RF showed the highest accuracy in 9 of them (53%). This was followed by SVM, which achieved the highest accuracy in 41% of the studies in which it was considered. Conclusion This study provides a broad overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This information on relative performance can aid researchers in selecting an appropriate supervised machine learning algorithm for their studies.
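The kind of head-to-head comparison this review surveys (SVM vs. Naïve Bayes vs. Random Forest on one disease-prediction dataset) can be sketched as cross-validated accuracy scoring in scikit-learn; the dataset here is synthetic, and accuracy is only one of several possible comparison criteria:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a disease-prediction dataset.
X, y = make_classification(n_samples=400, n_features=15, n_informative=6,
                           random_state=1)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=1),
}

# 5-fold cross-validated accuracy for each candidate algorithm.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```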


2018 ◽  
Vol 7 (4.15) ◽  
pp. 400 ◽  
Author(s):  
Thuy Nguyen Thi Thu ◽  
Vuong Dang Xuan

The exchange rate of a currency pair can be predicted by framing the problem as a classification task for a machine learning algorithm. With the help of a supervised machine learning model, the predicted uptrend or downtrend of the FoRex rate can help traders make the right decisions on FoRex transactions. Deploying machine learning algorithms in the online FoRex trading market can automate buying and selling transactions. All transactions in the experiment were performed by scripts added on to the transaction application. The capital and profit results of the support vector machine (SVM) models are higher than those of the baseline (without the use of SVM).
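Framing trend prediction as classification, as described above, can be sketched as an SVM trained on lagged returns to label the next move up or down. This is a toy illustration on a simulated price series (a random walk), not the authors' trading setup; the window size and split are illustrative assumptions:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(6)

# Synthetic stand-in for a FoRex price series; real data would come
# from the trading platform's history.
prices = 1.10 + np.cumsum(rng.normal(0, 0.001, 1000))
returns = np.diff(prices) / prices[:-1]

# Features: the last 5 returns; label: 1 if the next return is positive.
window = 5
X = np.array([returns[i:i + window] for i in range(len(returns) - window)])
y = (returns[window:] > 0).astype(int)

# Chronological split: train on the past, evaluate on the future.
split = int(0.8 * len(X))
clf = make_pipeline(StandardScaler(), SVC())
clf.fit(X[:split], y[:split])
print(f"directional accuracy: {clf.score(X[split:], y[split:]):.3f}")
```

On a pure random walk the directional accuracy hovers around chance; the point of the sketch is the uptrend/downtrend framing, not the score.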


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1557 ◽  
Author(s):  
Ilaria Conforti ◽  
Ilaria Mileti ◽  
Zaccaria Del Prete ◽  
Eduardo Palermo

Ergonomics evaluation through measurements of biomechanical parameters in real time has great potential for reducing non-fatal occupational injuries, such as work-related musculoskeletal disorders. Maintaining a correct posture avoids high stress on the back and the lower extremities, while an incorrect posture increases spinal stress. Here, we propose a solution for the recognition of postural patterns through wearable sensors and machine-learning algorithms fed with kinematic data. Twenty-six healthy subjects equipped with eight wireless inertial measurement units (IMUs) performed manual material handling tasks, such as lifting and releasing small loads, with two postural patterns: correct and incorrect. Kinematic parameters, such as the range of motion of the lower limb and lumbosacral joints, along with the displacement of the trunk with respect to the pelvis, were estimated from IMU measurements through a biomechanical model. Statistical differences were found for all kinematic parameters between the correct and the incorrect postures (p < 0.01). Moreover, as the load weight increased in the lifting task, changes in hip and trunk kinematics were observed (p < 0.01). To automatically identify the two postures, a supervised machine-learning algorithm, a support vector machine, was trained, and an accuracy of 99.4% (specificity of 100%) was reached by using the measurements of all kinematic parameters as features. Meanwhile, an accuracy of 76.9% (specificity of 76.9%) was reached by using only the kinematic parameters related to the trunk body segment.
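The posture classifier described above (an SVM on kinematic features, evaluated by accuracy and specificity) can be sketched as follows. The data here is synthetic; the feature count stands in for joint ranges of motion and trunk displacement, and label 1 plays the role of the incorrect posture:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for kinematic features (joint ranges of motion,
# trunk displacement); label 1 = incorrect posture.
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=2)

clf = make_pipeline(StandardScaler(), SVC())
clf.fit(X_tr, y_tr)

# Accuracy and specificity, the two metrics reported in the study.
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
print(f"accuracy:    {(tp + tn) / (tp + tn + fp + fn):.3f}")
print(f"specificity: {tn / (tn + fp):.3f}")
```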


2021 ◽  
Author(s):  
Omar Alfarisi ◽  
Zeyar Aung ◽  
Mohamed Sassi

Choosing the optimal machine learning algorithm is not an easy decision. To help future researchers, in this paper we describe the optimal among the best-performing algorithms. We built a synthetic dataset and performed supervised machine learning runs for five different algorithms. For heterogeneity, we identified Random Forest, among others, as the best algorithm.


2020 ◽  
pp. 45-49
Author(s):  
Gajendra Sharma

Fault tolerance is an important issue in the field of cloud computing, concerned with the techniques or mechanisms needed to enable a system to tolerate the faults it may encounter during its operation. Fault tolerance policies can be categorized into three types: proactive, reactive, and adaptive. By providing a systematic solution, losses can be minimized and the availability and reliability of critical services guaranteed. The purpose and scope of this study is to recommend the Support Vector Machine, a supervised machine learning algorithm, for proactively monitoring faults so as to increase availability and reliability by combining the strength of machine learning with cloud computing.


2013 ◽  
Vol 10 (2) ◽  
pp. 1376-1383
Author(s):  
Dr.Vijay Pal Dhaka ◽  
Swati Agrawal

Maintainability is an important quality attribute and a difficult concept, as it involves a number of measurements. Quality estimation here means estimating the maintainability of software. Maintainability is a set of attributes that bear on the effort needed to make specified modifications. The main goal of this paper is to propose the use of several machine learning algorithms to predict software maintainability and to evaluate them. The proposed models are Gaussian process regression networks (GPRN), probably approximately correct (PAC) learning, and the genetic algorithm (GA). This paper predicts maintenance effort. The QUES (Quality Evaluation System) dataset, which contains 71 classes, is used in this study. To measure maintainability, the number of "CHANGE"s is observed over a period of several years. We define CHANGE as the number of lines of code that were added, deleted, or modified during the maintenance period. These machine learning algorithms were then compared with models such as the GRNN (general regression neural network), RT (regression tree), MARS (multiple adaptive regression splines), SVM (support vector machine), and MLR (multiple linear regression) models. Based on the experiments, it was found that GPRN can predict maintainability more accurately and precisely than prevailing models. We also include object-oriented software metrics to measure software maintainability. Using machine learning algorithms to establish the relationship between metrics and maintainability is a much better approach, as these are based on quality as well as quantity.
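The regression task described above (predicting a CHANGE-style effort measure from software metrics) can be sketched with scikit-learn's Gaussian process regressor, used here as a generic stand-in for the paper's GPRN model. The 71-sample shape mirrors the QUES dataset, but the features and target values are synthetic:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for object-oriented metrics of 71 classes; the
# target plays the role of CHANGE (lines added/deleted/modified).
X, y = make_regression(n_samples=71, n_features=5, noise=5.0, random_state=7)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

# Gaussian process regression: RBF kernel plus a noise term, with the
# target normalized so the default kernel amplitude is appropriate.
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                               normalize_y=True, random_state=7)
gpr.fit(X_tr, y_tr)
pred, std = gpr.predict(X_te, return_std=True)
print(f"MAE on held-out classes: {mean_absolute_error(y_te, pred):.1f}")
```

The `return_std=True` output is one reason to prefer a Gaussian process here: each effort prediction comes with an uncertainty estimate.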


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Aliaskar Z Hasani ◽  
Kusha Rahgozar ◽  
Aaron Wengrofsky ◽  
Narasimha Kuchimanchi ◽  
Mohammad Hashim Mustehsan ◽  
...  

Introduction: Aortic stenosis is the most common valvular disorder, with a predominance in the elderly. Trans-Aortic Valve Replacement (TAVR) has been an effective procedure with marked improvement in quality of life for patients, but it carries a small yet clinically significant risk of stroke. The Neutrophil-Lymphocyte Ratio (NLR) and Platelet-Lymphocyte Ratio (PLR) have been gaining attention as novel markers of systemic inflammation. We investigated the ability of a machine learning algorithm (LightGBM) to predict and weigh these ratios, along with other clinical parameters, for the prediction of stroke after TAVR. Objective: To demonstrate the efficacy of the supervised machine learning algorithm LightGBM in identifying important variables to predict stroke after TAVR. Methods: We performed a retrospective analysis of 291 patients who underwent TAVR from 2015-2019 at Montefiore Medical Center (age 80±8 years, 50.2% female, BMI 28.7±6.3). Clinical data were collected through our hospital EMR. NLR and PLR averages were obtained as the mean of the baseline (prior to surgery), immediate post-operative, and post-operative day 1 values. LightGBM, a supervised machine learning algorithm, used decision tree algorithms with both level-wise and leaf-wise growth. The algorithm was trained on 80% of the data and internally validated on the remaining 20%. Results: We identified NLR and PLR as the second and third most important features (Table 1); clinical and demographic features of importance included BMI, age, and sex. Our model, when internally validated, yielded a sensitivity of 75.0%, specificity of 91.5%, accuracy of 91.5%, and F1 of 0.75. The AUC for the model was 0.84. Conclusions: Using novel hematological parameters in conjunction with machine learning algorithms has highlighted important variables for predicting stroke after TAVR. Extrapolated, average NLR and PLR could serve as an inexpensive tool for stratifying those patients most at risk.


2020 ◽  
Vol 39 (4) ◽  
pp. 5687-5698
Author(s):  
Chunfeng Guo

There are currently few studies on the stress of athletes, so it has not been possible to provide effective in-stadium guidance for them. This study therefore applies machine learning algorithms to recognize athletes' pre-game emotions. Data were gathered through a survey questionnaire, and the athletes' physiological parameters under stress were obtained experimentally and processed with a machine learning algorithm. To improve the efficiency of data processing, the study improves on the traditional machine learning approach by combining a particle swarm optimization algorithm with a support vector machine to achieve effective recognition of the athlete's physiological state. In addition, the performance of the improved algorithm is compared experimentally with that of the traditional algorithm, and the test results are analyzed. Finally, the effectiveness of the proposed algorithm is demonstrated through an example analysis. The research shows that the proposed algorithm performs better than the traditional algorithm, has practical significance, and can provide a theoretical reference for subsequent related research.
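The core idea above, combining particle swarm optimization with an SVM, typically means using the swarm to tune the SVM's hyperparameters. A minimal sketch under that assumption follows: a small swarm searches log2(C) and log2(gamma) to maximize cross-validated accuracy on synthetic stand-in data (the physiological measurements and PSO constants are illustrative, not the paper's):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for physiological measurements under stress.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=4)

rng = np.random.default_rng(4)

def fitness(p):
    """Cross-validated SVM accuracy at log2(C), log2(gamma) = p."""
    clf = SVC(C=2.0 ** p[0], gamma=2.0 ** p[1])
    return cross_val_score(clf, X, y, cv=3).mean()

# Minimal particle swarm over the two SVM hyperparameters.
n_particles, n_iter = 8, 10
pos = rng.uniform(-5, 5, (n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, 1))
    # Inertia plus pulls toward personal and global bests.
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, -5, 5)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"best CV accuracy: {pbest_fit.max():.3f}")
```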


10.2196/20840 ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. e20840
Author(s):  
Aaqib Shehzad ◽  
Kenneth Rockwood ◽  
Justin Stanley ◽  
Taylor Dunn ◽  
Susan E Howlett

Background SymptomGuide Dementia (DGI Clinical Inc) is a publicly available online symptom tracking tool to support caregivers of persons living with dementia. The value of such data is enhanced when the specific dementia stage is identified. Objective We aimed to develop a supervised machine learning algorithm to classify dementia stages based on tracked symptoms. Methods We employed clinical data from 717 people from 3 sources: (1) a memory clinic; (2) long-term care; and (3) an open-label trial of donepezil in vascular and mixed dementia (VASPECT). Symptoms were captured with SymptomGuide Dementia. A clinician classified participants into 4 groups using either the Functional Assessment Staging Test or the Global Deterioration Scale as mild cognitive impairment, mild dementia, moderate dementia, or severe dementia. Individualized symptom profiles from the pooled data were used to train machine learning models to predict dementia severity. Models trained with 6 different machine learning algorithms were compared using nested cross-validation to identify the best performing model. Model performance was assessed using measures of balanced accuracy, precision, recall, Cohen κ, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). The best performing algorithm was used to train a model optimized for balanced accuracy. Results The study population was mostly female (424/717, 59.1%), older adults (mean 77.3 years, SD 10.6, range 40-100) with mild to moderate dementia (332/717, 46.3%). Age, duration of symptoms, 37 unique dementia symptoms, and 10 symptom-derived variables were used to distinguish dementia stages. A model trained with a support vector machine learning algorithm using a one-versus-rest approach showed the best performance. The correct dementia stage was identified with 83% balanced accuracy (Cohen κ=0.81, AUPRC 0.91, AUROC 0.96).
The best performance was seen when classifying severe dementia (AUROC 0.99). Conclusions A supervised machine learning algorithm exhibited excellent performance in identifying dementia stages based on dementia symptoms reported in an online environment. This novel dementia staging algorithm can be used to describe dementia stage based on user-reported symptoms. This type of symptom recording offers real-world data that reflect important symptoms in people with dementia.
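The winning configuration above, a one-versus-rest SVM scored by balanced accuracy over four stages, can be sketched in scikit-learn. The features and the four classes below are synthetic stand-ins for the symptom profiles and dementia stages:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for symptom-profile features; the four classes play
# the role of the stages (MCI, mild, moderate, severe dementia).
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=4, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=5)

# One-versus-rest SVM: one binary SVM per stage, highest score wins.
clf = OneVsRestClassifier(make_pipeline(StandardScaler(), SVC()))
clf.fit(X_tr, y_tr)
bal_acc = balanced_accuracy_score(y_te, clf.predict(X_te))
print(f"balanced accuracy: {bal_acc:.3f}")
```

Balanced accuracy averages per-class recall, which is why it suits staging problems where the stages are unevenly represented.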


2012 ◽  
Vol 9 (73) ◽  
pp. 1934-1942 ◽  
Author(s):  
Philip J. Hepworth ◽  
Alexey V. Nefedov ◽  
Ilya B. Muchnik ◽  
Kenton L. Morgan

Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with an accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems, could offer significant improvements in broiler health and welfare worldwide.
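The comparison above, an SVM against standard logistic regression scored by area under the ROC curve on unseen data, can be sketched as follows. The farm records and class imbalance here are synthetic assumptions, not the study's data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for farm management records; label 1 marks farms
# with high hock burn prevalence (the minority class).
X, y = make_classification(n_samples=400, n_features=10, n_informative=5,
                           weights=[0.7, 0.3], random_state=8)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=8)

# Compare an SVM with standard logistic regression by AUROC on unseen data.
for name, model in [
    ("SVM", make_pipeline(StandardScaler(), SVC(probability=True))),
    ("logistic regression",
     make_pipeline(StandardScaler(), LogisticRegression())),
]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auc:.2f}")
```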

