HYPER-PARAMETER OPTIMIZATION AND EVALUATION ON SELECTED MACHINE LEARNING ALGORITHM USING HEPATITIS DATASET

Despite the popularity and utility of most machine learning techniques, expert knowledge is required to guide the choice of a suitable technique and of settings that work well for a specific problem. Without such expertise, the procedures are vulnerable to poor parameter settings. Many machine learning techniques are offered with default configurations. However, since different classification problems require different machine learning techniques, selecting the appropriate technique and tuning its settings are vital tasks that can improve the reliability and accuracy of predictions. This study performs grid search parameter tuning on five selected machine learning techniques applied to hepatitis disease data and compares their performance side by side with the default settings. The experimental results for the five tuned techniques show that the configurations suggested in this work yield predictions of considerably higher quality than the default settings. Tuning the parameters of the Support Vector Machine via grid search yields the best accuracy, 90%, and competitive performance on the criteria of precision, recall, accuracy and Area Under the Curve. The study also presents combinations of parameter settings for each technique, identifying ranges of values for each setting that give good hepatitis disease outcomes.
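
The grid-search idea described above can be sketched in a few lines: every combination in a small parameter grid is scored by cross-validation, and the best one is compared against a fixed "default" setting. This is only an illustration under stated assumptions. The toy k-nearest-neighbour classifier, the synthetic data with two flipped labels, and all names (`knn_predict`, `cv_accuracy`, `grid_search`, the grid values) are hypothetical and are not the study's models, data, or settings.

```python
from itertools import product


def knn_predict(train_X, train_y, x, k, p):
    """Majority vote among the k nearest training points (Minkowski distance of order p)."""
    neighbours = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)) ** (1.0 / p), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in neighbours[:k]]
    return max(sorted(set(votes)), key=votes.count)


def cv_accuracy(X, y, k, p, folds=4):
    """Mean accuracy over `folds` interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k, p) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


def grid_search(X, y, param_grid):
    """Score every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(X, y, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic two-class data (hypothetical, for illustration only).
X = [(0.1 * i, 0.05 * (i % 7)) for i in range(20)]
X += [(3.0 + 0.1 * i, 1.0 + 0.05 * (i % 5)) for i in range(20)]
y = [0] * 20 + [1] * 20
y[5], y[30] = 1, 0  # two flipped labels to mimic noisy records

default_score = cv_accuracy(X, y, k=1, p=2)  # stand-in for "default settings"
best_params, best_score = grid_search(X, y, {"k": [1, 3, 5], "p": [1, 2]})
print("default:", default_score, "tuned:", best_score, "with", best_params)
```

Because the assumed default (`k=1`, `p=2`) is itself a point in the grid, the tuned score can never be lower than the default score on the same folds; how much it improves depends entirely on the data.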

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽ 10.23956/ijarcsse/v7i7/0177 ◽ 2017 ◽ Vol 7 (7) ◽ pp. 172

Author(s): Anantvir Singh Romana

Keyword(s): Machine Learning ◽ Subsequent Treatment ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Disease Prediction ◽ Classification Problems ◽ Learning Techniques ◽ Neural Network Classifiers ◽ Diagnostic Detection

Accurate diagnostic detection of a disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

Comparison of Machine Learning algorithm for COVID-19 Death Risk Prediction

10.21203/rs.3.rs-196077/v1 ◽ 2021

Author(s): Praveeen Anandhanathan ◽ Priyanka Gopalan

Keyword(s): Machine Learning ◽ Learning Algorithm ◽ Machine Learning Techniques ◽ Support Vector ◽ Nearest Neighbour ◽ Decision Tree Algorithm ◽ The Past ◽ Random Forest Method ◽ Learning Techniques ◽ The World

Coronavirus disease (COVID-19) is spreading across the world. Since it first appeared in Wuhan, China, in December 2019, it has become a serious issue across the globe. There are no accurate resources to predict and detect the disease, so past patients’ records could guide clinicians in fighting the pandemic. Machine learning techniques can therefore be applied to predict patient health from symptoms. We analyse only the symptoms that occur in every patient. These predictions can help clinicians treat patients more easily. Techniques such as SVM (Support Vector Machine), Fuzzy k-Means Clustering, Decision Tree algorithm, Random Forest Method, ANN (Artificial Neural Network), KNN (k-Nearest Neighbour), Naïve Bayes and the Linear Regression model are already used to predict many diseases. As this disease has not been faced before, it is not known which technique will give the maximum accuracy. We therefore compare all of these algorithms in RStudio to provide an efficient result.

Predicting in-Hospital Mortality of Patients with COVID-19 Using Machine Learning Techniques

Journal of Personalized Medicine ◽ 10.3390/jpm11050343 ◽ 2021 ◽ Vol 11 (5) ◽ pp. 343

Author(s): Fabiana Tezza ◽ Giulia Lorenzoni ◽ Danila Azzolina ◽ Sofia Barbar ◽ Lucia Anna Carmela Leone ◽ ...

Keyword(s): Machine Learning ◽ Random Forest ◽ Hospital Mortality ◽ Learning Algorithm ◽ Vital Signs ◽ Mortality Prediction ◽ Machine Learning Techniques ◽ Gradient Boosting ◽ Support Vector ◽ Learning Techniques

The present work aims to identify the predictors of COVID-19 in-hospital mortality by testing a set of Machine Learning Techniques (MLTs) and comparing their ability to predict the outcome of interest. The model with the best performance will be used to identify in-hospital mortality predictors and to build an in-hospital mortality prediction tool. The study involved patients with COVID-19, confirmed by PCR test, admitted to the “Ospedali Riuniti Padova Sud” COVID-19 referral center in the Veneto region, Italy. The algorithms considered were the Recursive Partition Tree (RPART), the Support Vector Machine (SVM), the Gradient Boosting Machine (GBM), and Random Forest. The resampled performances were reported for each MLT, considering the sensitivity, specificity, and the Receiver Operating Characteristic (ROC) curve measures. The study enrolled 341 patients. The median age was 74 years, and the male gender was the most prevalent. The Random Forest algorithm outperformed the other MLTs in predicting in-hospital mortality, with a ROC of 0.84 (95% C.I. 0.78–0.9). Age, together with vital signs (oxygen saturation and the quick SOFA) and lab parameters (creatinine, AST, lymphocytes, platelets, and hemoglobin), were found to be the strongest predictors of in-hospital mortality. The present work provides insights for the prediction of in-hospital mortality of COVID-19 patients using a machine-learning algorithm.
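
The sensitivity and specificity measures named above follow directly from the four cells of a binary confusion matrix. A minimal sketch, assuming hypothetical labels (1 = died in hospital, 0 = survived) that are invented for illustration and not taken from the study:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Share of actual positives that the model flags (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that the model clears (true negative rate)."""
    return tn / (tn + fp)


# Hypothetical outcomes and predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting both measures matters in a mortality setting because a model can reach high accuracy on an imbalanced cohort while still missing many of the patients who die.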

Prediction of Protein–ATP Binding Residues Based on Ensemble of Deep Convolutional Neural Networks and LightGBM Algorithm

International Journal of Molecular Sciences ◽ 10.3390/ijms22020939 ◽ 2021 ◽ Vol 22 (2) ◽ pp. 939

Author(s): Jiazhi Song ◽ Guixia Liu ◽ Jingqing Jiang ◽ Ping Zhang ◽ Yanchun Liang

Keyword(s): Machine Learning ◽ Protein Function ◽ Learning Algorithm ◽ Area Under The Curve ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Atp Binding ◽ Deep Convolutional Neural Networks ◽ Binding Residues

Accurately identifying protein–ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms such as support vector machine (SVM) and random forest to predict protein–ATP binding residues; however, as new machine-learning techniques are developed, the prediction performance can be further improved. In this paper, an ensemble predictor that combines deep convolutional neural networks and LightGBM with an ensemble learning algorithm is proposed. Three subclassifiers have been developed: a multi-incepResNet-based predictor, a multi-Xception-based predictor, and a LightGBM predictor. The final prediction result is the combination of the outputs from the three subclassifiers with an optimized weight distribution. We examined the performance of our proposed predictor using two datasets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. Our predictor achieved area under the curve (AUC) values of 0.925 and 0.902 and Matthews Correlation Coefficient (MCC) values of 0.639 and 0.642, respectively, both better than other state-of-the-art prediction methods.
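
One simple way to combine subclassifier outputs with a weight distribution, and to score the result with the Matthews Correlation Coefficient, is sketched below. The per-residue probabilities, the weights, the 0.5 threshold, and the function names are all hypothetical; the paper's optimized weights and its combination rule are not given in the abstract, so a weighted average is assumed here.

```python
from math import sqrt


def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-residue probabilities from several subclassifiers."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0.0 if undefined)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical binding probabilities for four residues from three subclassifiers.
cnn_a = [0.9, 0.2, 0.6, 0.1]
cnn_b = [0.8, 0.4, 0.3, 0.2]
gbm = [0.7, 0.1, 0.7, 0.3]
weights = [0.4, 0.3, 0.3]  # assumed values, not the paper's optimized weights

scores = weighted_ensemble([cnn_a, cnn_b, gbm], weights)
labels = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold
y_true = [1, 0, 1, 0]
print("ensemble scores:", scores, "MCC:", mcc(y_true, labels))
```

In this toy case the third residue is misjudged by one subclassifier on its own (0.3) but is recovered by the ensemble, which is the kind of complementarity that weighting is meant to exploit.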

A Comparitive Study of E-Mail Spam Detection using Various Machine Learning Techniques

10.21467/proceedings.114.56 ◽ 2021

Author(s): Simarjeet Kaur ◽ Meenakshi Bansal ◽ Ashok Kumar Bathla

Keyword(s): Machine Learning ◽ Prediction Accuracy ◽ Learning Algorithm ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Spam Detection ◽ Learning Techniques ◽ E Mail ◽ Email Spam

Due to the rise in the use of messaging and mailing services, spam detection is of much greater importance than before. Across such a volume of communications, efficient classification is a comparatively onerous job. Spam can be defined as redundant or junk email that the addressee does not want in their inbox. After pre-processing and feature extraction, various machine learning algorithms were applied to the Spambase dataset from the UCI Machine Learning Repository to classify incoming emails into two categories: spam and non-spam. The outcomes of the algorithms were compared. This paper used random forest, naive Bayes, support vector machine (SVM), logistic regression, and k-nearest neighbours (KNN) to classify email spam messages. The main goal of this study is to improve the prediction accuracy of spam email filters.

Discrimination of SARS-Cov 2 and arboviruses (DENV, ZIKV and CHIKV) clinical features using machine learning techniques: a fast and inexpensive clinical screening for countries simultaneously affected by both diseases

10.1101/2021.01.28.21250714 ◽ 2021

Author(s): João Daniel S. Castro

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Learning Algorithm ◽ Machine Learning Techniques ◽ Support Vector ◽ Machine Learning Algorithm ◽ Clinical Screening ◽ Learning Techniques ◽ The World ◽ Area Under Roc Curve

SARS-Cov-2 (Covid-19) has spread rapidly throughout the world, especially in tropical countries already affected by outbreaks of arboviruses such as Dengue, Zika and Chikungunya, and may lead the health systems of these locations to collapse. The present work therefore aims to develop a methodology using a machine learning algorithm (Support Vector Machine) for the prediction and discrimination of patients affected by Covid-19 and arboviruses (DENV, ZIKV and CHIKV). Clinical data from 204 patients, covering both Covid-19 and arbovirus cases and obtained from 23 scientific articles and 1 dataset, were used. The developed model correctly predicted 93.1% of Covid-19 cases and 82.1% of arbovirus cases, with an accuracy of 89.1% and an Area under the ROC Curve of 95.6%, proving effective for prediction and possible screening of these patients, especially those affected by Covid-19, allowing early isolation.
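
The Area under the ROC Curve reported above has a direct probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. A minimal sketch of that pairwise computation, using hypothetical scores and labels (1 = Covid-19, 0 = arbovirus) that are not data from the study:

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for five patients.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]

auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; larger datasets would normally use a rank-based or library implementation.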

Performance Evaluation of Several Machine Learning Techniques Used in the Diagnosis of Mammograms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.i7891.0881019 ◽ 2019 ◽ Vol 8 (10) ◽ pp. 228-232 ◽ Cited By ~ 1

Keyword(s): Machine Learning ◽ Learning Algorithm ◽ Machine Learning Techniques ◽ Theoretical Research ◽ Support Vector ◽ Common Disease ◽ Learning Approaches ◽ Learning Techniques ◽ Analysis Society ◽ Life Threatening

Throughout the world, breast cancer has become a common disease among women, and it is also a life-threatening one. Machine learning (ML) approaches have been widely used for the diagnosis of benign and malignant masses in mammograms. In this manuscript, I present the theoretical research and practical advances on various machine learning techniques for the diagnosis of benign and malignant masses in mammograms. The objective of this manuscript is to analyze the performance of distinct machine learning techniques used for diagnosis on the Digital Mammography Image Analysis Society (MIAS) database. In this work I compare the performance of four machine learning approaches: Support Vector Machine, Naive Bayes, K-Nearest Neighbours and Multilayer Perceptron. These four algorithms are used to categorize mammogram images. The performance of the four techniques was assessed to find the most suitable classifier. At the end of the study, the results indicate that the Support Vector Machine is a successful approach compared with the others.

Breast Cancer Detection with Revamped Dataset Using Machine Learning Techniques

Journal of Medical Imaging and Health Informatics ◽ 10.1166/jmihi.2021.3892 ◽ 2021 ◽ Vol 11 (12) ◽ pp. 2996-3009

Author(s): Sundarambal Balaraman ◽ Ramesh Ramamoorthy ◽ Raja Krishnamoorthi

Keyword(s): Breast Cancer ◽ Machine Learning ◽ Logistic Regression ◽ Learning Algorithm ◽ Machine Learning Techniques ◽ Support Vector ◽ Data Set ◽ Cancer Data ◽ Learning Techniques ◽ Incidence And Mortality

Machine learning is a current topic of interest in research and industry, with novel strategies being implemented all the time. The main purpose of this research is to determine the efficiency of machine learning techniques in breast cancer detection. The incidence and mortality of breast cancer in women are increasing day by day. Worldwide, researchers have worked hard to provide clinicians with the best model for detecting and diagnosing breast cancer. In this work, the Wisconsin breast cancer data set from the UCI Machine Learning Repository is modelled, and the performance is analyzed and compared with existing work on the same data set. The dataset is analyzed, and the revamped dataset is constructed by eliminating redundant features and appending new features essential for prediction. Eight machine learning algorithms, namely logistic regression, K nearest neighbors (KNN), support vector machine (SVM), decision trees, random forest, XGBoost, AdaBoost and artificial neural network (ANN), are applied to the revamped data set to build prediction models, which are assessed by their accuracy rate. In the experiment, these classifiers achieved >97% accuracy for breast cancer. Logistic regression, XGBoost and AdaBoost stand on top with 99.28 percent accuracy. The experiment also shows that removing outliers and balancing the data set have a significant impact on the model’s prediction performance.

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽ 10.32732/jmo.2020.12.2.84 ◽ 2020 ◽ Vol 12 (2) ◽ pp. 84-99

Author(s): Li-Pang Chen

Keyword(s): Machine Learning ◽ Stock Price ◽ Short Term Memory ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Short Term ◽ Learning Techniques ◽ Historical Database ◽ Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.

Predictive Modelling of Employee Turnover in Indian IT Industry Using Machine Learning Techniques

Vision The Journal of Business Perspective ◽ 10.1177/0972262918821221 ◽ 2019 ◽ Vol 23 (1) ◽ pp. 12-21 ◽ Cited By ~ 2

Author(s): Shikha N. Khera ◽ Divya

Keyword(s): Machine Learning ◽ Learning Algorithm ◽ Confusion Matrix ◽ Predictive Modelling ◽ Supervised Machine Learning ◽ Machine Learning Techniques ◽ Support Vector ◽ It Industry ◽ Knowledge Based ◽ Employee Attrition

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model to predict employee attrition and give organizations opportunities to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from Human Resource databases of three IT companies in India, including employment status (the response variable) at the time of collection. Results from the confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will not.
