Predicting Product Purchase using Linear Classification Algorithms

Customers buy a product based on many factors, and no adequate, well-defined logic governs the decision. Customers must be satisfied when they see the product itself. They have to trust its quality, price, lifetime, absence of side effects, name, packaging and, finally, cost. These factors may vary from time to time, day to day and even second to second. Competition among sellers is also increasing day by day. Customers face more choices, which makes choosing a product confusing and risky. Establishing a good relationship between seller and buyer increases the number of customers, but retaining customers is a challenging task. To address this problem, a model is developed using the machine learning algorithms SVM, Naïve Bayes, logistic regression and Fisher's linear discriminant analysis. The model predicts the buying habits of a user/customer. Classification is performed on a product purchase dataset, and the algorithms' performance is compared to find which one performs best on this particular dataset. The work is implemented in R.


A Comparative Study of Classification Techniques for P300 Speller

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g1020.0597s20 ◽

2020 ◽

Vol 9 (7S) ◽

pp. 102-106

Keyword(s):

Machine Learning ◽

Event Related Potential ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Computer Interface ◽

Support Vector ◽

Classification Algorithms ◽

Linear Discriminant ◽

P300 Speller ◽

Learning Techniques

The P300 speller in a Brain Computer Interface (BCI) allows locked-in or completely paralyzed patients to communicate with other people. Machine learning techniques are used to improve classification performance and accuracy. This study concerns the detection and classification of the P300 event related potential (ERP) signal using machine learning algorithms. Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) are used to classify P300 and non-P300 signals from the electroencephalography (EEG) signal. The performance of the system is evaluated based on F1-score using BCI Competition III dataset II. Both classifiers gave 91.0% classification accuracy.


Comparison of Classification Models for Breast Cancer Identification using Google Colab

10.20944/preprints202005.0328.v1 ◽

2020 ◽

Author(s):

SUNDARAMBAL BALARAMAN

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Logistic Regression ◽

Learning Algorithms ◽

Research Work ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Algorithms ◽

Classification Models ◽

K Nearest Neighbor

Classification algorithms are widely used to study various categories of data, located in multiple databases, that have real-world applications. The main purpose of this research work is to assess the efficiency of classification algorithms in breast cancer analysis. The mortality rate of women is increasing due to frequent cases of breast cancer. The conventional method of diagnosing breast cancer is time consuming, and hence research is being carried out in multiple dimensions to address this issue. In this research work, Google Colab, an excellent environment for Python coders, is used as a tool to implement machine learning algorithms for predicting the type of cancer. The performance of the machine learning algorithms is analyzed based on the accuracy obtained from various classification models: logistic regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Naïve Bayes, Decision Tree and Random Forest. Experiments show that these classifiers work well for the classification of breast cancers, with accuracy above 90%, and logistic regression ranked highest with an accuracy of 98.5%. Implementation in Google Colab also made the task much easier by avoiding the hours previously spent installing the environment and supporting libraries.


Predicting COVID-19 Disease Progression with Chest CT Images

10.21203/rs.3.rs-80956/v1 ◽

2020 ◽

Author(s):

Hongqin Liang ◽

Xiaoming Qiu ◽

Liqiang Zhu ◽

Lihua Chen ◽

Xiaofei Hu ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Youden Index ◽

Unconditional Logistic Regression ◽

Linear Discriminant ◽

Halo Sign ◽

Reversed Halo Sign ◽

Sensitivity Specificity ◽

CT Features ◽

Fisher’s Linear Discriminant

Abstract Background: Some patients with mild COVID-19 can deteriorate to moderate or severe disease within a week as the disease progresses naturally. It is crucial to identify those mild cases early and give timely treatment. Chest computed tomography (CT) has been shown to be useful in assisting the clinical diagnosis of COVID-19. In this study, machine learning was used to develop an early-warning CT feature model for predicting mild patients with potential malignant progression. Methods: A total of 140 mild COVID-19 patients were collected. All patients at admission were divided into two groups (alleviation group and exacerbation group) according to whether malignant progression occurred. The clinical and laboratory data at admission, the first CT, and the follow-up CT at the critical stage of the two groups were compared with the Chi-square test. The CT feature data (distribution, morphology, etc.) were used to establish the prediction model by Fisher's linear discriminant method and the unconditional logistic regression algorithm. The model was validated with 40 exception data, and the area under the ROC curve (AUC) was used to evaluate the models. Results: The model filtered out three CT feature variables: distal air bronchogram, fibrosis, and reversed halo sign. Notably, distal air bronchograms were less common in the alleviation group, while fibrosis and reversed halo sign were more common. The sensitivity, specificity and Youden index of unconditional logistic regression were 86.1%, 92.6% and 78.7%. For Fisher's linear discriminant analysis, the sensitivity, specificity and Youden index were 83.3%, 94.1% and 77.4%. The generalization ability of both models was consistent, with a sensitivity of 95.89%, specificity of 100%, and Youden index of 83.33%. Conclusions: The machine learning model based on CT imaging features has a high sensitivity for efficiently finding mild patients who are likely to deteriorate into severe/critical cases, so that those patients can receive timely treatment, while largely helping to relieve medical pressure.
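The Youden index quoted for the two models can be reproduced from the reported sensitivity and specificity using the standard definition J = sensitivity + specificity - 1. The following is a minimal sketch of that arithmetic only; the function name `youden_index` is an illustrative choice, not something taken from the paper.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic; both inputs are fractions in [0, 1]."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract, written as fractions.
reported = {
    "unconditional logistic regression": (0.861, 0.926),
    "Fisher's linear discriminant": (0.833, 0.941),
}

for name, (sens, spec) in reported.items():
    print(f"{name}: J = {youden_index(sens, spec):.3f}")
```

A value of J near 1 means the model is both sensitive and specific, while a value near 0 means it does no better than chance.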


Monitoring and Smart Decision Architecture for DRONE-FOG Integrated Environment

10.5753/sbcup.2021.16008 ◽

2021 ◽

Author(s):

Wendel Serra ◽

Warley Junior ◽

Isaac Barros ◽

Hugo Kuribayashi ◽

João Carmona

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Performance Metrics ◽

Weather Conditions ◽

Computational Effort ◽

Machine Learning Algorithms ◽

Computation Offloading ◽

Classification Algorithms ◽

Multi Layer Perceptron ◽

Integrated Environment

Due to the limited computing resources of drones, it is difficult to handle computation-intensive tasks locally, hence, fog-based computation offloading has been widely adopted. The effectiveness of an offloading operation, however, is determined by its ability to infer where the execution of code/data represents less computational effort for the drone, so that, by deciding where to offload correctly, the device benefits. Thus, this paper proposes MonDroneFog, a novel fog-based architecture that supports image offloading, as well as monitoring and storing the performance metrics related to the drone, wireless network, and cloudlet. It takes advantage of the main machine-learning algorithms to provide offloading decisions with high levels of accuracy, F1, and G-mean. We evaluate the main classification algorithms under our database and the results show that Multi-Layer Perceptron (MLP) and Logistic Regression classifiers achieve 99.64% and 99.20% accuracy, respectively. Under these conditions, MonDrone-Fog works well in dense forests when weather conditions are favorable and can be useful as a support system for SAR missions by providing a shorter runtime for image operations.
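Accuracy, F1 and G-mean can all be derived from the four cells of a binary confusion matrix. The sketch below assumes the common definition of G-mean as the geometric mean of sensitivity and specificity; the abstract does not state which definition the authors used. The function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, F1 and G-mean for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "f1": f1, "g_mean": g_mean}


# Hypothetical counts for an "offload" vs "run locally" decision.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, G-mean drops sharply when one class is classified poorly, which is why it is often reported alongside accuracy on imbalanced data.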


A survey on prediction of diabetes using classification algorithms

Journal of Achievements of Materials and Manufacturing Engineering ◽

10.5604/01.3001.0014.8490 ◽

2021 ◽

Vol 2 (104) ◽

pp. 77-84

Author(s):

A. Khanwalkar ◽

R. Soni

Keyword(s):

Machine Learning ◽

Data Collection ◽

Learning Algorithm ◽

Algorithm Design ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

Machine Learning Algorithm ◽

Collection Method ◽

Data Collection Method ◽

Diagnostic Center

Purpose: Diabetes is a chronic disease that accounts for a large proportion of the nation's healthcare expenses, because people with diabetes need medical care continuously. Several complications will occur if the polygenic disorder goes unrecognized and untreated. The condition leads the patient to a diagnostic center and a doctor's attention. One essential real-world problem is detecting the early stage of the polygenic disorder. This work is a survey that analyzes several parameters involved in the diagnosis of the polygenic disorder. It finds that classification algorithms play an important role in the data collection method and in the automation of polygenic disorder analysis, alongside other machine learning algorithms. Design/methodology/approach: This paper provides an extensive survey of different approaches that have been used for the analysis of medical data for the purpose of early detection of the polygenic disorder. It takes into consideration methods such as J48, CART, SVMs and KNN square, conducts a formal survey of all the studies, and provides a conclusion at the end. Findings: The survey analyzes several parameters involved in the diagnosis of the polygenic disorder. It finds that classification algorithms play an important role in the data collection method and in the automation of polygenic disorder analysis, alongside other machine learning algorithms. Practical implications: This paper will help future researchers in the field of healthcare, specifically in the domain of diabetes, to understand the differences between classification algorithms. Originality/value: This paper will help in comparing machine learning algorithms by going through results and selecting the appropriate approach based on requirements.


Implementation of Machine Learning Algorithms for Prediction of Fluidelastic Instability in Tube Arrays

Journal of Pressure Vessel Technology ◽

10.1115/1.4049876 ◽

2021 ◽

Vol 143 (2) ◽

Author(s):

Joaquin E. Moran ◽

Yasser Selima

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Two Phase ◽

Factors Affecting ◽

Logistic Regression Models ◽

Number Of Factors ◽

Tube Arrays ◽

Fluidelastic Instability

Abstract Fluidelastic instability (FEI) in tube arrays has been studied extensively experimentally and theoretically for the last 50 years, due to its potential to cause significant damage in short periods. Incidents similar to those observed at San Onofre Nuclear Generating Station indicate that the problem is not yet fully understood, probably due to the large number of factors affecting the phenomenon. In this study, a new approach for the analysis and interpretation of FEI data using machine learning (ML) algorithms is explored. FEI data for both single and two-phase flows have been collected from the literature and utilized for training a machine learning algorithm in order to either provide estimates of the reduced velocity (single and two-phase) or indicate if the bundle is stable or unstable under certain conditions (two-phase). The analysis included the use of logistic regression as a classification algorithm for two-phase flow problems to determine if specific conditions produce a stable or unstable response. The results of this study provide some insight into the capability and potential of logistic regression models to analyze FEI if appropriate quantities of experimental data are available.


Predicting hospitalization following psychiatric crisis care using machine learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01361-1 ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Prediction Models ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Ensemble Model ◽

K Nearest Neighbors ◽

Crisis Care

Abstract Background Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression) to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy and we explore individual predictors of hospitalization. Methods Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were included. Target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared and we also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis and the five best performing algorithms were combined in an ensemble model using stacking. Results All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the least performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. 
Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions Gradient Boosting led to the highest predictive accuracy and AUC while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was in most cases modest. The results show that a predictive accuracy similar to the best performing model can be achieved when combining multiple algorithms in an ensemble model.
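The AUC values used to rank the algorithms have a simple interpretation: the probability that a randomly chosen positive case (here, a hospitalized patient) receives a higher predicted score than a randomly chosen negative case. The sketch below computes AUC directly from that definition by comparing every positive/negative pair. It is an illustration of the metric, not the authors' evaluation code, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = hospitalized) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large datasets.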


Predicting Hospitalization following Psychiatric Crisis Care using Machine Learning

10.21203/rs.2.12338/v1 ◽

2019 ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Predictor Variables ◽

Gradient Boosting ◽

K Nearest Neighbors ◽

Psychiatric Crisis ◽

Crisis Care

Abstract Background: It is difficult to accurately predict whether a patient on the verge of a potential psychiatric crisis will need to be hospitalized. Machine learning may be helpful to improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate and compare the accuracy of ten machine learning algorithms including the commonly used generalized linear model (GLM/logistic regression) to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact, and explore the most important predictor variables of hospitalization. Methods: Data from 2,084 patients with at least one reported psychiatric crisis care contact included in the longitudinal Amsterdam Study of Acute Psychiatry were used. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared. We also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis. Target variable for the prediction models was whether or not the patient was hospitalized in the 12 months following inclusion in the study. The 39 predictor variables were related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts. Results: We found Gradient Boosting to perform the best (AUC=0.774) and K-Nearest Neighbors performing the least (AUC=0.702). The performance of GLM/logistic regression (AUC=0.76) was above average among the tested algorithms. Gradient Boosting outperformed GLM/logistic regression and K-Nearest Neighbors, and GLM outperformed K-Nearest Neighbors in a Net Reclassification Improvement analysis, although the differences between Gradient Boosting and GLM/logistic regression were small. Nine of the top-10 most important predictor variables were related to previous mental health care use. 
Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was modest. Future studies may consider combining multiple algorithms in an ensemble model for optimal performance and to mitigate the risk of choosing suboptimally performing algorithms.


Comparison of the Performance of Machine Learning Algorithms in Predicting Heart Disease

Frontiers in Health Informatics ◽

10.30699/fhi.v10i1.349 ◽

2021 ◽

Vol 10 (1) ◽

pp. 99

Author(s):

Sajad Yousefi

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Heart Disease ◽

Decision Tree ◽

ROC Curve ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Learning Models ◽

Algorithm Performance ◽

Machine Learning Models

Introduction: Heart disease is often associated with conditions such as clogged arteries due to sediment accumulation, which causes chest pain and heart attack. Many people die of heart disease annually. Most countries have a shortage of cardiovascular specialists, and thus a significant percentage of misdiagnoses occurs. Hence, predicting this disease is a serious issue. Using machine learning models applied to a multidimensional dataset, this article aims to find the most efficient and accurate machine learning models for disease prediction. Material and Methods: Several algorithms were used to predict heart disease, among which the supervised machine learning algorithms Decision Tree, Random Forest and KNN are frequently mentioned. The algorithms are applied to a dataset of 294 samples taken from the UCI repository. The dataset includes heart disease features. To enhance algorithm performance, these features are analyzed, and feature importance scores and cross validation are considered. Results: The algorithms' performance is compared, with each model evaluated on its ROC curve and on criteria such as accuracy, precision, sensitivity and F1 score. Accuracy and AUC ROC are 83% and 99%, respectively, for the Decision Tree algorithm. The Logistic Regression algorithm, with accuracy and AUC ROC of 88% and 91%, respectively, performs better than the other algorithms. Therefore, these techniques can be useful for physicians to predict heart disease in patients and prescribe for them correctly. Conclusion: Machine learning techniques can be used in medicine to analyze data collections related to a disease and to predict it. The area under the ROC curve and the evaluation criteria of a number of machine learning classification algorithms are compared to determine the most appropriate classifier for predicting heart disease. As a result of the evaluation, better performance was observed in both the Decision Tree and Logistic Regression models.
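The abstract mentions cross validation without giving the scheme. As an illustration only, the sketch below shows how k-fold cross validation would partition a dataset of 294 samples into train and test index sets. The choice of k = 5, the absence of shuffling and the function name `k_fold_indices` are all assumptions made for this example, not details from the paper.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    The first n_samples % k folds get one extra test sample, so every
    sample appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# 294 samples as in the abstract; k = 5 is an assumed value.
for fold, (train, test) in enumerate(k_fold_indices(294, 5), start=1):
    print(f"fold {fold}: {len(train)} train, {len(test)} test")
```

In practice the sample order is usually shuffled, or the folds are stratified by class, before splitting, so that each fold has a similar class balance.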


Data Aggregation and Terror Group Prediction using Machine Learning Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7590.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 1467-1469 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Detailed Analysis ◽

Data Aggregation ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

Terrorist Groups ◽

Different Sources

This paper introduces a proposed system that examines the growth or decay of terrorist groups over time, their active locations, the types of attack they carry out, their motives and targets, weapon mastery and availability, and many other parameters, in order to analyze the patterns and hidden structures in their activity and to predict the occasion and type of their future attacks. We performed a detailed analysis of data obtained from different sources and applied different classification algorithms to the available data to estimate the chances of a probable attack in different regions. Based on the results, we identify which of the algorithms achieves the highest accuracy.
