Comparison Decision Tree and Logistic Regression Machine Learning Classification Algorithms to determine Covid-19

Many people today are unsure whether they have COVID-19. Frequent fever, dry cough, and sore throat are all signs and symptoms of COVID-19. Anyone with signs or symptoms of coronavirus disease 2019 (COVID-19) should see a doctor or go to a clinic as soon as possible. As a result, it is vital to learn and comprehend the fundamental differences. COVID-19 can cause a wide range of symptoms. The experiments were carried out using two machine learning classification algorithms, namely Decision Tree (DT) and Logistic Regression (LR). Both algorithms were written and analyzed in Python using Jupyter Notebook 6.4.5. In the experiments on the COVID-19 symptoms dataset, the DT model obtained the best average cross-validation and testing performance, with an accuracy of 98.0% in cross-validation and 98.0% in testing. The LR model obtained the second-best results, with an accuracy of 96.0% in cross-validation and 97.0% in testing. Consequently, DT outperforms LR on the COVID-19 symptoms dataset in both cross-validation and testing.
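
The cross-validation average reported above can be illustrated with a minimal sketch. This is not the authors' code: the toy symptom rows, the depth-1 tree (`fit_stump`), and all function names are assumptions made for illustration, and the abstract does not state the number of folds, so `k=5` is arbitrary.

```python
from statistics import mean


def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def fit_stump(X, y):
    """Depth-1 decision tree: pick the binary feature that best predicts y."""
    best = None
    for j in range(len(X[0])):
        for positive_when in (0, 1):
            correct = sum(
                (row[j] == positive_when) == bool(label) for row, label in zip(X, y)
            )
            if best is None or correct > best[0]:
                best = (correct, j, positive_when)
    return best[1], best[2]


def predict_stump(model, X):
    feature, positive_when = model
    return [int(row[feature] == positive_when) for row in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_accuracy(X, y, k):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(X), k):
        model = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict_stump(model, [X[i] for i in test_idx])
        scores.append(accuracy([y[i] for i in test_idx], preds))
    return mean(scores)


# Made-up toy rows (fever, dry_cough, sore_throat); label 1 = positive case.
X = [(1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 1, 0), (1, 1, 1),
     (0, 0, 1), (1, 0, 0), (0, 1, 1), (1, 1, 0), (0, 0, 0)]
y = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]

print(f"mean 5-fold accuracy: {cross_val_accuracy(X, y, 5):.2f}")
```

The same loop applies to any classifier with a fit step and a predict step. Libraries such as scikit-learn provide equivalent helpers (for example `cross_val_score`).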

Student Academic Performance Prediction under Various Machine Learning Classification Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38786 ◽

2021 ◽

Vol 9 (11) ◽

pp. 221-237

Author(s):

M. Nirmala

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Academic Performance ◽

Random Forest ◽

Decision Tree ◽

Problem Definition ◽

Classification Algorithms ◽

Data Set ◽

Machine Learning Classification ◽

Student Academic Performance

Abstract: Data mining in educational systems has grown tremendously in the past and is still growing in the present era. This study focuses on the academic standpoint: student performance is evaluated using parameters such as scholastic, demographic, and emotional features. Various machine learning methodologies are adopted to extract hidden knowledge from the educational data set provided. This helps identify the features with the greatest impact on student academic performance, and knowing those features in turn helps to gain deeper insight into student performance in academics. The full machine learning workflow, from problem definition to model prediction, has been carried out in this study. A supervised learning methodology has been adopted, along with various feature engineering methods, to make the ML model suitable for training and evaluation. This is a prediction problem, and various classification algorithms, namely Logistic Regression, Random Forest, SVM, KNN, XGBOOST, and Decision Tree, have been used to fit the student data. Keywords: Scholastic, Demographic, Emotional, Logistic Regression, Random Forest, SVM, KNN, XGBOOST, Decision Tree.

An Ontology Driven System to Predict Diabetes with Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7586.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 4005-4011

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Decision Tree ◽

Early Stage ◽

Machine Learning Techniques ◽

Support Vector ◽

Classification Algorithms ◽

Machine Learning Classification ◽

Diagnostic Center ◽

Mental Trauma

Diabetes mellitus (DM) is a chronic disease that causes an increase in blood sugar. Many complications are reported if DM remains untreated and unidentified. Identifying the disease involves considerable physical and mental trauma and effort, including visiting a doctor and taking blood and urine tests at a diagnostic center, which is time-consuming. These difficulties can be overcome using machine learning. The idea of the model is to predict the occurrence of diabetes with high accuracy. Therefore, two machine learning classification algorithms, namely Fine Decision Tree and Support Vector Machine, are used in this experiment to detect diabetes at an early stage.

COVID-19 World Vaccination Progress Using Machine Learning Classification Algorithms

Qubahan Academic Journal ◽

10.48161/qaj.v1n2a53 ◽

2021 ◽

Vol 1 (2) ◽

pp. 100-105

Author(s):

Nasiba M. Abdulkareem ◽

Adnan Mohsin Abdulazeez ◽

Diyar Qader Zeebaree ◽

Dathar A. Hasan

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Vaccine Development ◽

Machine Learning Techniques ◽

Classification Algorithms ◽

Real World Data ◽

K Nearest Neighbors ◽

Machine Learning Classification ◽

Learning Techniques ◽

And Performance

Since December 2019, coronavirus disease (COVID-19), caused by SARS-CoV-2, has spread to all countries, infecting thousands of people and causing deaths. COVID-19 induces mild sickness in most cases, although it may render some people very ill. Vaccines are therefore in various phases of clinical progress, and some of them have been approved for national use. The current state reveals a critical need for a quick and timely solution to COVID-19 vaccine development. Non-clinical methods such as data mining and machine learning techniques may help with this. This study focuses on COVID-19 world vaccination progress using machine learning classification algorithms. The findings show which algorithm is better for a given dataset. Weka is used to run tests on real-world data, and four classification algorithms (Decision Tree, K-nearest neighbors, Random Tree, and Naive Bayes) are used to analyze the data and draw conclusions. The comparison is based on accuracy and execution time, and the Decision Tree was found to outperform the other algorithms in terms of both time and accuracy.
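
A comparison on both accuracy and run time, as described above, can be sketched with a small benchmarking harness. The study itself used Weka, so everything here is an assumption for illustration: the two stand-in classifiers (a majority-class baseline and 1-nearest-neighbor), the made-up data points, and the `benchmark` helper are hypothetical and are not the four algorithms or the dataset from the paper.

```python
import time
from collections import Counter


def majority_classifier(train_X, train_y):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda row: most_common


def one_nearest_neighbor(train_X, train_y):
    """Predict the label of the closest training point (squared Euclidean)."""
    def predict(row):
        distances = [
            sum((a - b) ** 2 for a, b in zip(row, other)) for other in train_X
        ]
        return train_y[distances.index(min(distances))]
    return predict


def benchmark(build, train_X, train_y, test_X, test_y):
    """Return (accuracy, elapsed_seconds) for training plus prediction."""
    start = time.perf_counter()
    predict = build(train_X, train_y)
    predictions = [predict(row) for row in test_X]
    elapsed = time.perf_counter() - start
    acc = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
    return acc, elapsed


# Made-up two-feature data, for illustration only.
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.8, 1.0)]
train_y = [0, 0, 0, 1, 1]
test_X = [(0.1, 0.1), (0.9, 0.9), (0.7, 0.8), (0.0, 0.2)]
test_y = [0, 1, 1, 0]

candidates = {"majority": majority_classifier, "1-NN": one_nearest_neighbor}
results = {
    name: benchmark(build, train_X, train_y, test_X, test_y)
    for name, build in candidates.items()
}
for name, (acc, elapsed) in results.items():
    print(f"{name}: accuracy={acc:.2f}, time={elapsed * 1e6:.0f} microseconds")
```

Timings on data this small are dominated by noise; a fair comparison would repeat each run several times and report an average.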

Classification of Categorical Outcome Variable Based on Logistic Regression and Tree Algorithm

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e6844.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 4685-4690

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Decision Tree ◽

Classification Tree ◽

Research Work ◽

Large Data ◽

Training Set ◽

Tree Algorithm ◽

Machine Learning Classification

Logistic regression is one of the most popular techniques in traditional statistics. It is usually applicable when the dependent variable is categorical and binary in nature. In statistics and machine learning, classification is the task of determining which group a new observation belongs to, on the basis of a training set of observations whose group membership is known. In this paper, we focus on the concepts of logistic regression and classification trees. A large data set taken from the UCI Machine Learning Repository is used for this research work. The aim of the study is to compare the results obtained from logistic regression and a decision tree. In the end, the decision tree gives better results than logistic regression.
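
How logistic regression handles a binary dependent variable can be shown in a few lines: a linear score is passed through the logistic (sigmoid) function to give a probability, which is then thresholded into a class. The intercept, coefficients, and 0.5 threshold below are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(intercept, coefficients, features):
    """Estimated probability that the binary outcome equals 1."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return sigmoid(z)


def predict_class(intercept, coefficients, features, threshold=0.5):
    """Assign class 1 when the probability reaches the threshold."""
    return int(predict_proba(intercept, coefficients, features) >= threshold)


# Hypothetical fitted parameters for a two-feature model.
intercept = -1.0
coefficients = [2.0, -0.5]

for features in ([1.0, 2.0], [0.0, 0.0], [2.0, 0.0]):
    p = predict_proba(intercept, coefficients, features)
    label = predict_class(intercept, coefficients, features)
    print(f"{features}: p={p:.3f}, class={label}")
```

A classification tree, by contrast, assigns a class by following a sequence of threshold tests on individual features instead of a single weighted sum.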

Intelligent Techniques Analysis for Glycosylation Site Prediction

Current Bioinformatics ◽

10.2174/1574893615666210108094847 ◽

2021 ◽

Vol 15 ◽

Author(s):

Alhassan Alkuhlani ◽

Walaa Gad ◽

Mohamed Roushdy ◽

Abdel-Badeeh M. Salem

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Cell Interaction ◽

Glycosylation Site ◽

Machine Learning Classification ◽

Site Prediction ◽

Glycosylation Sites ◽

Wide Range ◽

Feature Extraction And Selection ◽

Computational Intelligent

Background: Glycosylation is one of the most common post-translational modifications (PTMs) in the cells of organisms. It plays important roles in several biological processes, including cell-cell interaction, protein folding, antigen recognition, and immune response. In addition, glycosylation is associated with many human diseases, such as cancer, diabetes, and coronavirus diseases. Experimental techniques for identifying glycosylation sites are time-consuming and expensive and require extensive laboratory work. Therefore, computational intelligence techniques are becoming very important for glycosylation site prediction. Objective: This paper is a theoretical discussion of the technical aspects of applying biotechnological methods (e.g., artificial intelligence and machine learning) to digital bioinformatics research and intelligent biocomputing. Computational intelligence techniques have shown efficient results for predicting N-linked, O-linked, and C-linked glycosylation sites. In the last two decades, many studies have been conducted on glycosylation site prediction using these techniques. In this paper, we analyze and compare a wide range of intelligent techniques from these studies from multiple aspects. The current challenges and difficulties facing software developers and knowledge engineers in predicting glycosylation sites are also covered. Method: The studies are compared on many criteria, such as databases, feature extraction and selection, machine learning classification methods, evaluation measures, and performance results. Results and conclusions: Many challenges and problems are presented. Consequently, more effort is needed to obtain more accurate prediction models for the three basic types of glycosylation sites.

AN EFFICIENT MACHINE LEARNING MODEL FOR PREDICTION OF ACUTE MYOCARDIAL INFARCTION

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666200325104317 ◽

2020 ◽

Vol 13 ◽

Author(s):

Dhilsath Fathima.M ◽

S. Justin Samuel ◽

R. Hari Haran

Keyword(s):

Machine Learning ◽

Myocardial Infarction ◽

Acute Myocardial Infarction ◽

Logistic Regression ◽

Decision Tree ◽

Learning Model ◽

Training Dataset ◽

Data Set ◽

Machine Learning Model ◽

Proposed Model

Aim: This work aims to develop an improved and robust machine learning model for predicting myocardial infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill in missing values in the data set, then applies principal component analysis (PCA) to extract the optimal features and enhance the performance of the classifiers. After PCA, the reduced features are partitioned into a training dataset and a testing dataset: 70% of the data is given as input to four popular classifiers (support vector machine, k-nearest neighbor, logistic regression, and decision tree) for training, and the remaining 30% is used to evaluate the output of the machine learning model with performance metrics including the confusion matrix, classifier accuracy, precision, sensitivity, F1-score, and AUC-ROC curve. Results: The outputs of the classifiers were evaluated using these performance measures. We observed that logistic regression provides higher accuracy than the k-NN, SVM, and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers, analyzed in Figures 4 and 5, show that logistic regression exhibits a good AUC-ROC score of around 70% compared to the k-NN and decision tree algorithms. Conclusion: From the result analysis, we infer that the proposed machine learning model will act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning based prediction models. It is capable of predicting the presence of acute myocardial infarction in humans using heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent heart disease.
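
Two of the steps named in the Methods, mean imputation and the confusion-matrix metrics, are simple enough to sketch directly. The code below is an illustrative assumption, not the authors' pipeline: the function names and the small made-up inputs are hypothetical, and the PCA and classifier-training steps are omitted.

```python
def impute_column_means(rows):
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall), and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }


# Made-up inputs, for illustration only.
filled = impute_column_means([[1.0, None], [3.0, 4.0], [None, 8.0]])
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 0, 1, 0, 1, 0, 0, 1],
)
print(filled)
print(metrics)
```

In a real pipeline the column means should be computed on the training split only and then reused for the test split, so that no information from the test data leaks into training.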

Severity Prediction of COVID-19 Patients Using Machine Learning Classification Algorithms: A Case Study of Small City in Pakistan with Minimal Health Facility

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344984 ◽

2020 ◽

Author(s):

Hina Gull ◽

Gomathi Krishna ◽

May Issa Aldossary ◽

Sardar Zafar Iqbal

Keyword(s):

Machine Learning ◽

Health Facility ◽

Small City ◽

Classification Algorithms ◽

Machine Learning Classification ◽

Severity Prediction

Classification Model Simulator: A simulator for different Machine Learning Classification Algorithms

2021 2nd International Conference for Emerging Technology (INCET) ◽

10.1109/incet51464.2021.9456348 ◽

2021 ◽

Author(s):

Abhinandan Singla ◽

Unnati Chaturvedi ◽

Preet Kanwal

Keyword(s):

Machine Learning ◽

Classification Model ◽

Classification Algorithms ◽

Machine Learning Classification

A proof-of-concept study applying machine learning methods to putative risk factors for eating disorders: results from the multi-centre European project on healthy eating

Psychological Medicine ◽

10.1017/s003329172100489x ◽

2021 ◽

pp. 1-10

Author(s):

I. Krug ◽

J. Linardon ◽

C. Greenwood ◽

G. Youssef ◽

J. Treasure ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Logistic Regression ◽

Predictive Accuracy ◽

Area Under The Curve ◽

Prediction Rule ◽

Predictive Performance ◽

Individual Risk ◽

European Project ◽

Wide Range

Abstract. Background: Despite a wide range of proposed risk factors and theoretical models, prediction of eating disorder (ED) onset remains poor. This study undertook the first comparison of two machine learning (ML) approaches [penalised logistic regression (LASSO), and prediction rule ensembles (PREs)] to conventional logistic regression (LR) models to enhance prediction of ED onset and differential ED diagnoses from a range of putative risk factors. Method: Data were part of a European Project and comprised 1402 participants, 642 ED patients [52% with anorexia nervosa (AN) and 40% with bulimia nervosa (BN)] and 760 controls. The Cross-Cultural Risk Factor Questionnaire, which assesses retrospectively a range of sociocultural and psychological ED risk factors occurring before the age of 12 years (46 predictors in total), was used. Results: All three statistical approaches had satisfactory model accuracy, with an average area under the curve (AUC) of 86% for predicting ED onset and 70% for predicting AN v. BN. Predictive performance was greatest for the two regression methods (LR and LASSO), although the PRE technique relied on fewer predictors with comparable accuracy. The individual risk factors differed depending on the outcome classification (EDs v. non-EDs and AN v. BN). Conclusions: Even though the conventional LR performed comparably to the ML approaches in terms of predictive accuracy, the ML methods produced more parsimonious predictive models. ML approaches offer a viable way to modify screening practices for ED risk that balance accuracy against participant burden.
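
The area under the curve (AUC) used above as the accuracy measure has a direct interpretation: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by pairwise comparison; the function name, labels, and scores are made up for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = case, 0 = control) and predicted risk scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; library implementations such as scikit-learn's `roc_auc_score` scale better on large samples.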
