COVID-19 Future Predictions Using 4 Supervised Machine Learning Models

Author(s):  
Aditi Vadhavkar ◽  
Pratiksha Thombare ◽  
Priyanka Bhalerao ◽  
Utkarsha Auti

Forecasting mechanisms such as machine learning (ML) models have proven their significance in anticipating perioperative outcomes to enhance decision making on future courses of action. Many application domains have witnessed the use of ML models for identifying and prioritizing the adverse factors of a threat. The spread of COVID-19 has proven to be a great threat to mankind, and it has been declared a worldwide pandemic. Communities throughout the world have faced the enormous infectivity and contagiousness of this illness. To forecast the threatening factors of COVID-19, we used four machine learning models: linear regression (LR), least absolute shrinkage and selection operator (LASSO), support vector machine (SVM) and exponential smoothing (ES). The results depict that ES performs best among the four models employed in this study, followed by LR and LASSO, which perform well in forecasting newly confirmed cases, death rates and recovery rates, whereas SVM performs poorly in all prediction scenarios given the available dataset.
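The four-model comparison described above can be sketched as follows. This is a minimal illustration on a synthetic case series, not the authors' code: the toy data, split point and simple-exponential-smoothing forecast (a flat forecast from the smoothed level) are all assumptions.

```python
# Illustrative comparison of LR, LASSO, SVM and ES on a synthetic cumulative-case
# series; the data and train/test split are stand-ins for the study's dataset.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
days = np.arange(60).reshape(-1, 1)
cases = 50 * np.exp(0.08 * days.ravel()) + rng.normal(0, 20, 60)  # toy series

X_train, X_test = days[:50], days[50:]
y_train, y_test = cases[:50], cases[50:]

models = {
    "LR": LinearRegression(),
    "LASSO": Lasso(alpha=0.1),
    "SVM": SVR(kernel="rbf"),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

# Simple exponential smoothing: recursively smooth the level, then issue a
# flat forecast from the final smoothed value.
alpha = 0.8
level = y_train[0]
for y in y_train[1:]:
    level = alpha * y + (1 - alpha) * level
scores["ES"] = mean_absolute_error(y_test, np.full_like(y_test, level))

for name, mae in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: MAE = {mae:.1f}")
```

On real epidemic data the ranking would depend on the series; this sketch only shows the mechanics of fitting and scoring the four model families side by side.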

Author(s):  
Brijesh Patel ◽  
Dr. Sheshang Degadwala

Several outbreak prediction models for COVID-19 are being used by officials all over the world to make informed decisions and maintain necessary control measures. Machine learning (ML)-based forecasting mechanisms have proven their worth in anticipating perioperative outcomes to enhance decision making on the predicted course of action. ML models have long been used in a variety of application areas that require identification and prioritization of adverse factors for a threat. Several prediction strategies are commonly used to deal with forecasting problems. This study demonstrates the ability of ML models to predict the number of future patients affected by COVID-19, which is now regarded as a potential threat to humanity. In particular, standard forecasting models, namely linear regression, support vector machine, LASSO, exponential smoothing and decision tree, were used in this investigation to forecast the threatening factors of COVID-19. Each of the models makes three types of predictions: the number of newly confirmed cases, the number of deaths, and the number of recoveries, each before and after lockdown. The outcomes are reported using metrics such as R² score, adjusted R² score, MSE, MAE and RMSE on Indian datasets.
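The metrics named in this abstract are standard regression scores. As a minimal sketch (the prediction values below are illustrative, and adjusted R² is computed by hand since scikit-learn does not provide it directly):

```python
# Computing R², adjusted R², MSE, MAE and RMSE for one model's predictions.
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

y_true = np.array([120.0, 150.0, 200.0, 260.0, 330.0])  # illustrative targets
y_pred = np.array([115.0, 158.0, 190.0, 270.0, 320.0])  # illustrative predictions
n, p = len(y_true), 1  # n samples, p predictors

r2 = r2_score(y_true, y_pred)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # penalises extra predictors
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mse)
print(f"R2={r2:.3f} adjR2={adj_r2:.3f} MSE={mse:.1f} MAE={mae:.1f} RMSE={rmse:.1f}")
```

Adjusted R² is always at or below R² and is the fairer comparison when the models use different numbers of predictors.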


2020 ◽  
Vol 11 (40) ◽  
pp. 8-23
Author(s):  
Pius MARTHIN ◽  
Duygu İÇEN

Online product reviews have become a valuable source of information which facilitates customer decisions about a particular product. With the wealth of information regarding users' satisfaction and experiences with a particular drug, pharmaceutical companies make use of online drug reviews to improve the quality of their products. Machine learning has enabled scientists to train more efficient models which facilitate decision making in various fields. In this manuscript we applied a drug review dataset used by (Gräßer, Kallumadi, Malberg, & Zaunseder, 2018), freely available from the machine learning repository of the University of California Irvine (UCI), to identify the machine learning model that best predicts overall drug performance with respect to users' reviews. Apart from several manipulations done to improve model accuracy, all procedures required for text analysis were followed, including text cleaning and transformation of texts to numeric format to ease the training of machine learning models. Prior to modeling, we obtained overall sentiment scores for the reviews. Customers' reviews were summarized and visualized using a bar plot and word cloud to explore the most frequent terms. Due to scalability issues, we were able to use only a sample of the dataset: we randomly sampled 15,000 observations from the 161,297-observation training set and 10,000 observations from the 53,766-observation testing set. Several machine learning models were trained using 10-fold cross-validation performed under stratified random sampling. The trained models include Classification and Regression Trees (CART), classification tree by C5.0, logistic regression (GLM), Multivariate Adaptive Regression Splines (MARS), support vector machine (SVM) with both radial and linear kernels, and a classification tree ensemble using random forest (Random Forest). Model selection was done through a comparison of accuracies and computational efficiency.
The support vector machine (SVM) with linear kernel performed significantly better than the rest, with an accuracy of 83%. Using only a small portion of the dataset, we managed to attain reasonable accuracy in our models by applying the TF-IDF transformation and Latent Semantic Analysis (LSA) to our term-document matrix (TDM).
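The winning pipeline described here (TF-IDF on the term-document matrix, LSA via truncated SVD, then a linear-kernel SVM) can be sketched in scikit-learn. The tiny corpus and sentiment labels below are invented stand-ins for the UCI drug-review data:

```python
# Minimal TF-IDF -> LSA -> linear SVM pipeline on a toy review corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

reviews = [
    "this drug worked well no side effects",
    "excellent results pain gone quickly",
    "terrible nausea and headaches stopped taking it",
    "awful experience made symptoms worse",
]
labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative

clf = make_pipeline(
    TfidfVectorizer(stop_words="english"),  # weighted term-document matrix
    TruncatedSVD(n_components=2, random_state=42),  # LSA: low-rank topic space
    LinearSVC(),  # linear-kernel SVM on the LSA components
)
clf.fit(reviews, labels)
pred = clf.predict(["worked great highly recommend"])
```

The LSA step is what makes the sampled-data approach tractable: the classifier sees a handful of dense components instead of the full sparse vocabulary.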


Author(s):  
Pratyush Kaware

In this paper a cost-effective sensor has been implemented to read finger-bend signals: the sensor is attached to a finger so that the signals can be classified by the degree of bend as well as the joint about which the finger is bent. Various machine learning algorithms were tested to find the most accurate and consistent classifier. We found that the support vector machine was the algorithm best suited to classifying our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted by a NodeMCU microcontroller, converted to digital form, and uploaded to a database for analysis.
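The classification step can be sketched as below. The voltage clusters, window features, and class labels are synthetic assumptions for illustration; the real system streamed ADC readings from the NodeMCU:

```python
# Illustrative SVM classification of flex-sensor readings into bend classes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two features per sample: mean voltage and voltage range over a short window.
straight  = rng.normal([1.0, 0.05], 0.02, (30, 2))
half_bend = rng.normal([1.8, 0.10], 0.02, (30, 2))
full_bend = rng.normal([2.6, 0.15], 0.02, (30, 2))

X = np.vstack([straight, half_bend, full_bend])
y = np.array([0] * 30 + [1] * 30 + [2] * 30)  # 0/1/2 = bend class

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([[2.55, 0.14]]))  # classify a new windowed reading
```

In the live system each incoming voltage window would be summarised the same way and passed to `predict` to report the current bend state.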


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7096
Author(s):  
Julianna P. Kadar ◽  
Monique A. Ladds ◽  
Joanna Day ◽  
Brianne Lyall ◽  
Culum Brown

Movement ecology has traditionally focused on the movements of animals over large time scales, but, with advancements in sensor technology, the focus can become increasingly fine scale. Accelerometers are commonly applied to quantify animal behaviours and can elucidate fine-scale (<2 s) behaviours. Machine learning methods are commonly applied to animal accelerometry data; however, they require the trial of multiple methods to find an ideal solution. We used tri-axial accelerometers (10 Hz) to quantify four behaviours in Port Jackson sharks (Heterodontus portusjacksoni): two fine-scale behaviours (<2 s)—(1) vertical swimming and (2) chewing as proxy for foraging, and two broad-scale behaviours (>2 s–mins)—(3) resting and (4) swimming. We used validated data to calculate 66 summary statistics from tri-axial accelerometry and assessed the most important features that allowed for differentiation between the behaviours. One and two second epoch testing sets were created consisting of 10 and 20 samples from each behaviour event, respectively. We developed eight machine learning models to assess their overall accuracy and behaviour-specific accuracy (one classification tree, five ensemble learners and two neural networks). The support vector machine model classified the four behaviours better when using the longer 2 s time epoch (F-measure 89%; macro-averaged F-measure: 90%). Here, we show that this support vector machine (SVM) model can reliably classify both fine- and broad-scale behaviours in Port Jackson sharks.
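The feature-extraction step described above (summary statistics per epoch from tri-axial accelerometry) can be sketched as follows. The handful of statistics here is a small assumed subset of the 66 features the study computed, and the random window stands in for real shark data:

```python
# Summarise one tri-axial accelerometer epoch into classifier-ready features.
import numpy as np

def epoch_features(window):
    """window: (n_samples, 3) array of x/y/z acceleration for one epoch."""
    feats = []
    for axis in range(3):
        a = window[:, axis]
        feats += [a.mean(), a.std(), a.min(), a.max()]  # 4 stats per axis
    # A dynamic-body-acceleration-style magnitude feature across all axes.
    feats.append(np.abs(window - window.mean(axis=0)).sum(axis=1).mean())
    return np.array(feats)

rng = np.random.default_rng(7)
two_second_epoch = rng.normal(0, 1, (20, 3))  # 10 Hz x 2 s = 20 samples
features = epoch_features(two_second_epoch)
print(features.shape)  # (13,)
```

Stacking one such feature vector per labelled epoch yields the training matrix on which the eight models, including the winning SVM, would be compared.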


Author(s):  
Tsehay Admassu Assegie

Machine learning approaches have become widely applicable in disease diagnosis and prediction because of their accuracy and precision. However, different machine learning models achieve different accuracy and precision on disease prediction, and selecting the model that yields better prediction accuracy and precision is an open research problem. In this study, we propose a machine learning model for liver disease prediction using the support vector machine (SVM) and K-nearest neighbors (KNN) learning algorithms, and we evaluate the accuracy and precision of the models using the Indian liver disease data repository. The analysis of results showed 82.90% accuracy for SVM and 72.64% accuracy for the KNN algorithm. Based on these experimental test results, SVM performs better than KNN on liver disease prediction.
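A minimal sketch of this SVM-versus-KNN comparison is below. The breast-cancer dataset bundled with scikit-learn is a stand-in for the Indian liver disease repository, and the split and hyper-parameters are assumptions:

```python
# Compare SVM and KNN accuracy on a held-out test split of a medical dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Scaling matters for both models: SVM margins and KNN distances are
# scale-sensitive, so each pipeline standardises features first.
svm = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)).fit(X_tr, y_tr)

print(f"SVM accuracy: {svm.score(X_te, y_te):.4f}")
print(f"KNN accuracy: {knn.score(X_te, y_te):.4f}")
```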


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Morshedul Bari Antor ◽  
A. H. M. Shafayet Jamil ◽  
Maliha Mamtaz ◽  
Mohammad Monirujjaman Khan ◽  
Sultan Aljahdali ◽  
...  

Alzheimer’s disease has been one of the major health concerns in recent years. Around 45 million people are suffering from this disease. Alzheimer’s is a degenerative brain disease with an unspecified cause and pathogenesis which primarily affects older people; it is the most common cause of dementia and progressively damages brain cells. People lose their thinking ability, reading ability, and much more as the disease progresses. A machine learning system can reduce this problem by predicting the disease. The main aim is to recognize dementia among various patients. This paper presents the results and analysis of detecting dementia with various machine learning models. The Open Access Series of Imaging Studies (OASIS) dataset has been used for the development of the system. The dataset is small, but it has some significant values. The dataset has been analyzed and applied in several machine learning models: support vector machine, logistic regression, decision tree, and random forest were used for prediction. First, the system was run without fine-tuning and then with fine-tuning. Comparing the results, we found that the support vector machine provides the best results among the models, with the best accuracy in detecting dementia among numerous patients. The system is simple and can easily help people by detecting dementia among them.
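The "without fine-tuning, then with fine-tuning" procedure can be sketched as a default fit followed by a grid search. Synthetic data stands in for the OASIS features, and the parameter grid is an assumption:

```python
# Baseline SVM with default hyper-parameters, then grid-search fine-tuning.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Run 1: no fine-tuning (library defaults).
default_acc = SVC().fit(X_tr, y_tr).score(X_te, y_te)

# Run 2: fine-tuning via cross-validated grid search over C and gamma.
grid = GridSearchCV(
    SVC(),
    {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
    cv=5,
)
grid.fit(X_tr, y_tr)
tuned_acc = grid.score(X_te, y_te)
print(f"default: {default_acc:.3f}, tuned: {tuned_acc:.3f} ({grid.best_params_})")
```

The same two-pass comparison would be repeated for logistic regression, decision tree, and random forest before picking the winner.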


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257069
Author(s):  
Jae-Geum Shim ◽  
Kyoung-Ho Ryu ◽  
Sung Hyun Lee ◽  
Eun-Ah Cho ◽  
Sungho Lee ◽  
...  

Objective To construct a prediction model for optimal tracheal tube depth in pediatric patients using machine learning. Methods Pediatric patients aged <7 years who received post-operative ventilation after undergoing surgery between January 2015 and December 2018 were investigated in this retrospective study. The optimal location of the tracheal tube was defined as the median of the distance between the upper margin of the first thoracic (T1) vertebral body and the lower margin of the third thoracic (T3) vertebral body. We applied four machine learning models: random forest, elastic net, support vector machine, and artificial neural network, and compared their prediction accuracy to three formula-based methods, which were based on age, height, and tracheal tube internal diameter (ID). Results For each method, the percentage with optimal tracheal tube depth predictions in the test set was calculated as follows: 79.0 (95% confidence interval [CI], 73.5 to 83.6) for random forest, 77.4 (95% CI, 71.8 to 82.2; P = 0.719) for elastic net, 77.0 (95% CI, 71.4 to 81.8; P = 0.486) for support vector machine, 76.6 (95% CI, 71.0 to 81.5; P = 1.0) for artificial neural network, 66.9 (95% CI, 60.9 to 72.5; P < 0.001) for the age-based formula, 58.5 (95% CI, 52.3 to 64.4; P < 0.001) for the tube ID-based formula, and 44.4 (95% CI, 38.3 to 50.6; P < 0.001) for the height-based formula. Conclusions In this study, the machine learning models predicted the optimal tracheal tube tip location for pediatric patients more accurately than the formula-based methods. Machine learning models using biometric variables may help clinicians make decisions regarding optimal tracheal tube depth in pediatric patients.
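The evaluation idea (regress tube depth from biometric variables, then score the percentage of predictions landing in an optimal band) can be sketched as below. The synthetic biometrics, the depth relationship, and the ±0.5 cm band are all assumptions for illustration, not the study's definitions:

```python
# Percentage-within-band scoring of depth regressors on synthetic biometrics.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
age = rng.uniform(0, 7, 400)                      # years
height = 50 + 12 * age + rng.normal(0, 3, 400)    # cm
depth = 9 + 0.06 * height + rng.normal(0, 0.4, 400)  # toy "optimal depth", cm

X = np.column_stack([age, height])
X_tr, X_te, y_tr, y_te = train_test_split(X, depth, random_state=0)

def pct_within(model, tol=0.5):
    """Fit, predict, and report % of test predictions within tol cm of truth."""
    pred = model.fit(X_tr, y_tr).predict(X_te)
    return 100 * np.mean(np.abs(pred - y_te) <= tol)

results = {}
for name, m in [("random forest", RandomForestRegressor(random_state=0)),
                ("elastic net", ElasticNet(alpha=0.01))]:
    results[name] = pct_within(m)
    print(f"{name}: {results[name]:.1f}% within +/-0.5 cm")
```

The study's formula-based comparators would plug into the same scorer as fixed functions of age, height, or tube ID.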


2017 ◽  
Author(s):  
Chin Lin ◽  
Chia-Jung Hsu ◽  
Yu-Sheng Lou ◽  
Shih-Jen Yeh ◽  
Chia-Cheng Lee ◽  
...  

BACKGROUND Automated disease code classification using free-text medical information is important for public health surveillance. However, traditional natural language processing (NLP) pipelines are limited, so we propose a method combining word embedding with a convolutional neural network (CNN). OBJECTIVE Our objective was to compare the performance of traditional pipelines (NLP plus supervised machine learning models) with that of word embedding combined with a CNN in conducting a classification task identifying International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes in discharge notes. METHODS We used 2 classification methods: (1) extracting from discharge notes some features (terms, n-gram phrases, and SNOMED CT categories) that we used to train a set of supervised machine learning models (support vector machine, random forests, and gradient boosting machine), and (2) building a feature matrix, by a pretrained word embedding model, that we used to train a CNN. We used these methods to identify the chapter-level ICD-10-CM diagnosis codes in a set of discharge notes. We conducted the evaluation using 103,390 discharge notes covering patients hospitalized from June 1, 2015 to January 31, 2017 in the Tri-Service General Hospital in Taipei, Taiwan. We used the receiver operating characteristic curve as an evaluation measure, and calculated the area under the curve (AUC) and F-measure as the global measure of effectiveness. RESULTS In 5-fold cross-validation tests, our method had a higher testing accuracy (mean AUC 0.9696; mean F-measure 0.9086) than traditional NLP-based approaches (mean AUC range 0.8183-0.9571; mean F-measure range 0.5050-0.8739). A real-world simulation that split the training sample and the testing sample by date verified this result (mean AUC 0.9645; mean F-measure 0.9003 using the proposed method). 
Further analysis showed that the convolutional layers of the CNN effectively identified a large number of keywords and automatically extracted enough concepts to predict the diagnosis codes. CONCLUSIONS Word embedding combined with a CNN showed outstanding performance compared with traditional methods, needing very little data preprocessing. This shows that future studies will not be limited by incomplete dictionaries. A large amount of unstructured information from free-text medical writing will be extracted by automated approaches in the future, and we believe that the health care field is about to enter the age of big data.
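The core computation of the proposed method (embedding lookup, 1-D convolution over the token sequence, then max-pooling over time) can be sketched in plain numpy. This is a toy forward pass only: the vocabulary, dimensions, and random weights are assumptions, and a real system would train such filters in a deep-learning framework:

```python
# Numpy-only sketch of a text-CNN feature extractor over word embeddings.
import numpy as np

rng = np.random.default_rng(0)
vocab = {"fever": 0, "cough": 1, "fracture": 2, "pain": 3}
emb_dim, n_filters, width = 5, 4, 2
embeddings = rng.normal(size=(len(vocab), emb_dim))   # pretrained lookup table
filters = rng.normal(size=(n_filters, width, emb_dim))  # conv filters (width 2)

def cnn_features(tokens):
    seq = embeddings[[vocab[t] for t in tokens]]        # (seq_len, emb_dim)
    n_windows = len(tokens) - width + 1
    conv = np.array([
        [np.sum(seq[i:i + width] * f) for i in range(n_windows)]
        for f in filters
    ])                                                  # (n_filters, n_windows)
    return np.maximum(conv, 0).max(axis=1)              # ReLU + max-pool over time

feats = cnn_features(["fever", "cough", "pain"])
print(feats.shape)  # (4,)
```

Each filter fires on a local phrase pattern wherever it occurs in the note, which is how the convolutional layers pick out diagnosis keywords without a hand-built dictionary; a final classification layer over these pooled features would emit the ICD-10-CM chapter probabilities.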


2021 ◽  
Author(s):  
Mohammed Alghazal ◽  
Dimitrios Krinis

Abstract The dielectric log is a specialized tool with proprietary procedures to predict oil saturation independent of water salinity. Conventional resistivity logging is more routinely used but depends on water salinity and Archie's parameters, leading to high measurement uncertainty in mixed-salinity environments. This paper presents a novel machine learning approach for propagating the coverage of dielectric-based oil saturation, driven by features extracted from commonly available reservoir information, petrophysical properties and conventional log data. More than 20 features were extracted from several sources. Based on sampling frequency, extracted features are divided into well-based discrete features and petrophysical-based continuous features. Examples of well-based features include well location with respect to flank (east or west), fluid viscosities and densities, total dissolved solids from surface water, distance to the nearest water injector, and injection volume. Petrophysical-based features include height above free water level (HAFWL), porosity, modelled permeability, initial water saturation, resistivity-based saturation, rock type, and caliper. In addition, we engineered two new depth-related continuous features, which we call Height-Below-Crest (HBC) and Height-Above-Top-Injector-Zone (HATIZ). Initial data exploration was performed using a Pearson correlation heat map. Fluid densities and viscosities show strong correlation (60-80%) with the engineered features (HBC and HATIZ), which helped capture the effects of viscous and gravity forces across the well's vertical depth. The heat map also shows weak correlation between the features and the target variable, the oil saturation from the dielectric log. The dataset, with 5000 samples, was randomly split into 80% training and 20% testing. A scaling technique robust to outliers was used to scale the features prior to modeling.
The preliminary performance of various supervised machine learning models, including decision trees, ensemble methods, neural networks and support vector machines, was benchmarked using K-fold cross-validation on the training data prior to testing. The ensemble-based methods, random forest and gradient boosting, produced the lowest mean absolute error and were therefore selected for further hyper-parameter tuning. An exhaustive grid search was performed on both models to find the best-fit parameters, achieving a correlation coefficient of 70% on the testing dataset. Feature analysis indicates that the engineered features, HBC and HATIZ, along with porosity, HAFWL and resistivity-based saturation, are the most important features for predicting oil saturation from the dielectric log. The dielectric log provides an edge over resistivity-based logging in mixed-salinity formations, but with more elaborate interpretation procedures. In this paper, we present a soft-computing, economical alternative: using ensemble machine learning models to predict oil saturation from the dielectric log given features extracted from common reservoir information, petrophysical properties and conventional log data.
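The modelling steps described here (outlier-robust scaling, K-fold cross-validation, then an exhaustive grid search on the ensemble model) can be sketched in scikit-learn. Synthetic regression data stands in for the well-based and petrophysical features, and the small grid is an assumption:

```python
# Robust scaling + gradient boosting, with CV benchmarking and a grid search.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

X, y = make_regression(n_samples=500, n_features=8, n_informative=5,
                       noise=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# RobustScaler centres on the median and scales by the IQR, limiting the
# influence of outliers before modelling.
pipe = make_pipeline(RobustScaler(), GradientBoostingRegressor(random_state=0))
cv_r2 = cross_val_score(pipe, X_tr, y_tr, cv=5, scoring="r2").mean()

grid = GridSearchCV(
    pipe,
    {"gradientboostingregressor__n_estimators": [100, 200],
     "gradientboostingregressor__max_depth": [2, 3]},
    cv=5, scoring="neg_mean_absolute_error",
)
grid.fit(X_tr, y_tr)
print(f"CV R2: {cv_r2:.3f}; test R2: {grid.best_estimator_.score(X_te, y_te):.3f}")
```

The random-forest arm of the study would follow the same pattern with `RandomForestRegressor` and its own parameter grid.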

