Malicious URL Detection using Logistic Regression

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and one dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. Using Decision Trees, the profile of the mobile banking adopter was identified. A comparison of the techniques found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Of all the techniques, Decision Tree is recommended for profiling studies because, apart from producing highly accurate results, it also yields ‘if–then’ classification rules. The classification rules provided here can be used to target potential customers and encourage them to adopt mobile banking by offering appropriate incentives.

Machine learning versus logistic regression methods for 2-year mortality prognostication in a small, heterogeneous glioma database

10.1101/472555 ◽

2018 ◽

Cited By ~ 2

Author(s):

Sandip S Panesar ◽

Rhett N D’Souza ◽

Fang-Cheng Yeh ◽

Juan C Fernandez-Miranda

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Machine Learning Techniques ◽

World Health ◽

Support Vector ◽

Molecular Characteristics ◽

Regression Methods ◽

Learning Techniques ◽

The World ◽

Health Organization

Abstract

Background: Machine learning (ML) is the application of specialized algorithms to datasets for trend delineation, categorization or prediction. ML techniques have been traditionally applied to large, highly-dimensional databases. Gliomas are a heterogeneous group of primary brain tumors, traditionally graded using histopathological features. Recently the World Health Organization proposed a novel grading system for gliomas incorporating molecular characteristics. We aimed to study whether ML could achieve accurate prognostication of 2-year mortality in a small, highly-dimensional database of glioma patients.

Methods: We applied three machine learning techniques: artificial neural networks (ANN), decision trees (DT), support vector machine (SVM), and classical logistic regression (LR) to a dataset consisting of 76 glioma patients of all grades. We compared the effect of applying the algorithms to the raw database, versus a database where only statistically significant features were included into the algorithmic inputs (feature selection).

Results: Raw input consisted of 21 variables, and achieved performance of (accuracy/AUC): 70.7%/0.70 for ANN, 68%/0.72 for SVM, 66.7%/0.64 for LR and 65%/0.70 for DT. Feature selected input consisted of 14 variables and achieved performance of 73.4%/0.75 for ANN, 73.3%/0.74 for SVM, 69.3%/0.73 for LR and 65.2%/0.63 for DT.

Conclusions: We demonstrate that these techniques can also be applied to small, yet highly-dimensional datasets. Our ML techniques achieved reasonable performance compared to similar studies in the literature. Though local databases may be small versus larger cancer repositories, we demonstrate that ML techniques can still be applied to their analysis, though traditional statistical methods are of similar benefit.
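The results in this abstract are reported as accuracy/AUC pairs. As a rough illustration of what those two numbers measure, here is a minimal sketch in plain Python. The labels and predicted probabilities are made up for the example and are not taken from the study, and the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the chance that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is fine for
    a small cohort but not how large-scale libraries compute it.
    """
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = died within 2 years, 0 = survived.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]

print(f"accuracy = {accuracy(y_true, y_prob):.3f}")
print(f"AUC      = {auc(y_true, y_prob):.3f}")
```

Accuracy depends on the chosen threshold, while AUC depends only on how the cases are ranked, which is one reason the two figures can order the models differently.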

A comparative study of logistic regression based machine learning techniques for prediction of early virological suppression in antiretroviral initiating HIV patients

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-018-0659-x ◽

2018 ◽

Vol 18 (1) ◽

Cited By ~ 5

Author(s):

Kuteesa R. Bisaso ◽

Susan A. Karungi ◽

Agnes Kiragga ◽

Jackson K. Mukonzo ◽

Barbara Castelnuovo

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Comparative Study ◽

Virological Suppression ◽

Machine Learning Techniques ◽

Hiv Patients ◽

Learning Techniques

Don’t Dismiss Logistic Regression: The Case for Sensible Extraction of Interactions in the Era of Machine Learning

10.1101/2019.12.15.877134 ◽

2019 ◽

Cited By ~ 1

Author(s):

Joshua J. Levy ◽

A. James O’Malley

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Model Building ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Statistical Machine Learning ◽

Forest Model ◽

Learning Techniques ◽

Modeling Techniques

Abstract

Background: Machine learning approaches have become increasingly popular modeling techniques, relying on data-driven heuristics to arrive at their solutions. Recent comparisons between these algorithms and traditional statistical modeling techniques have largely ignored the superiority gained by the former approaches due to involvement of model-building search algorithms. This has led to alignment of statistical and machine learning approaches with different types of problems and the under-development of procedures that combine their attributes. In this context, we hoped to understand the domains of applicability for each approach and to identify areas where a marriage between the two approaches is warranted. We then sought to develop a hybrid statistical-machine learning procedure with the best attributes of each.

Methods: We present three simple examples to illustrate when to use each modeling approach and posit a general framework for combining them into an enhanced logistic regression model building procedure that aids interpretation. We study 556 benchmark machine learning datasets to uncover when machine learning techniques outperformed rudimentary logistic regression models and so are potentially well-equipped to enhance them. We illustrate a software package, InteractionTransformer, which embeds logistic regression with advanced model building capacity by using machine learning algorithms to extract candidate interaction features from a random forest model for inclusion in the model. Finally, we apply our enhanced logistic regression analysis to two real-world biomedical examples, one where predictors vary linearly with the outcome and another with extensive second-order interactions.

Results: Preliminary statistical analysis demonstrated that across 556 benchmark datasets, the random forest approach significantly outperformed the logistic regression approach. We found a statistically significant increase in predictive performance when using hybrid procedures and greater clarity in the association with the outcome of terms acquired compared to directly interpreting the random forest output.

Conclusions: When a random forest model is closer to the true model, hybrid statistical-machine learning procedures can substantially enhance the performance of statistical procedures in an automated manner while preserving easy interpretation of the results. Such hybrid methods may help facilitate widespread adoption of machine learning techniques in the biomedical setting.
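To make the idea of adding interaction features to a logistic regression concrete, here is a minimal sketch in plain Python. It is not the InteractionTransformer package and does not use a random forest: the interaction term is supplied by hand, the XOR-style toy data are invented, and the learning rate and step count are arbitrary assumptions. It only shows why a product term can rescue a model that main effects alone cannot fit.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.3, steps=5000):
    """Batch gradient descent on the logistic loss; returns [bias, w1, w2, ...]."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            feats = [1.0] + list(x)  # prepend a constant 1 for the bias term
            err = sigmoid(sum(w * f for w, f in zip(weights, feats))) - y
            for j, f in enumerate(feats):
                grad[j] += err * f
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def accuracy(weights, rows, labels):
    hits = 0
    for x, y in zip(rows, labels):
        feats = [1.0] + list(x)
        p = sigmoid(sum(w * f for w, f in zip(weights, feats)))
        hits += int((p > 0.5) == (y == 1))
    return hits / len(labels)


# Toy outcome driven purely by an interaction: y = 1 when exactly one input is 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

# Same data with a hand-made second-order interaction column x1 * x2.
X_inter = [(a, b, a * b) for a, b in X]

plain_acc = accuracy(fit_logistic(X, y), X, y)
inter_acc = accuracy(fit_logistic(X_inter, y), X_inter, y)
print("main effects only:", plain_acc)
print("with x1*x2 term:  ", inter_acc)
```

In the hybrid procedure described in the abstract, the candidate interaction columns would come from a fitted random forest rather than being written by hand; the logistic regression step that consumes them is the part sketched here.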

Comparison of Statistical Logistic Regression and RandomForest Machine Learning Techniques in Predicting Diabetes

Journal of Advances in Information Technology ◽

10.12720/jait.11.2.78-83 ◽

2020 ◽

pp. 78-83

Author(s):

Tahani Daghistani ◽

Riyad Alshammari

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Machine Learning Techniques ◽

Learning Techniques

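Aprendizado de Máquina Aplicado à Predição de Doenças Cardiometabólicas com Utilização de Indicadores Metabólicos e Comportamentais de Risco à Saúde [Machine Learning Applied to the Prediction of Cardiometabolic Diseases Using Metabolic and Behavioral Health Risk Indicators]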

10.14210/cotb.v12.p301-308 ◽

2021 ◽

Author(s):

Alan Lopes de Sousa Freitas ◽

Ana Silvia Degasperi Ieker ◽

Josiane Melchiori Pinheiro ◽

Wilson Rinaldi ◽

Heloise Manica Paris Teixeira

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Logistic Regression ◽

Decision Tree ◽

Causes Of Death ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Cardiometabolic Diseases ◽

Learning Techniques ◽

Good Classification

Cardiometabolic diseases that develop over a worker’s life, such as hypertension, diabetes, dyslipidemia and obesity, are among the main causes of death and are associated with modifiable and controllable risk factors. The general objective of this study was to apply supervised Machine Learning techniques and to compare their performance in predicting the risk of developing cardiometabolic disease among employees of a teaching hospital in southern Brazil. We sought to map the characteristics of individuals who are more likely to develop cardiometabolic diseases. The machine learning models evaluated were Naive Bayes, Decision Tree, Random Forest, KNN, Logistic Regression and SVM. The results obtained in the experiments showed that some supervised machine learning models produce a good classification, depending on the attributes and hyperparameters used.

Predicting Job Satisfaction and Employee Turnover Using Machine Learning

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9024 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4092-4097

Author(s):

Inchara Yogesh ◽

K. R. Suresh Kumar ◽

Niveditha Candrashekaran ◽

Dhrithi Reddy ◽

Harshitha Sampath

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Succession Planning ◽

Employee Turnover ◽

Employee Retention ◽

Random Forest Classifier ◽

Machine Learning Techniques ◽

Positive Feedbacks ◽

Learning Techniques ◽

New Employee

Employee turnover imposes costs on a company: the departing employee must be replaced and the new employee trained. Departures may also cause significant and costly interruptions to the production process. This gives the firm a clear motivation to prevent departures or, at least, to anticipate when and where they are likely to occur. If employees are asked to evaluate their superiors and the responses are made available to those superiors, it is likely that only positive feedback will be provided. The aim is therefore to use Machine Learning techniques to predict employee turnover. Accurate predictions enable companies to take the necessary decisions on employee retention or succession planning. Algorithms: One-Sample T-Test (T-Test), Decision Tree (DT), AdaBoost (AB), Logistic Regression (LR), Random Forest Classifier (RFC).

Methodology for Analyzing the Traditional Algorithms Performance of User Reviews Using Machine Learning Techniques

Algorithms ◽

10.3390/a13080202 ◽

2020 ◽

Vol 13 (8) ◽

pp. 202

Author(s):

Abdul Karim ◽

Azhari Azhari ◽

Samir Brahim Belhaouri ◽

Ali Adil Qureshi ◽

Maqsood Ahmad

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Data Sets ◽

User Reviews ◽

Almost Everywhere ◽

Document Frequency ◽

Learning Techniques ◽

Google Play

Android-based applications are widely used around the globe. Because the Internet is available almost everywhere at no charge, almost half of the globe is engaged with social networking, social media surfing, messaging, browsing and plugins. In the Google Play Store, which is one of the most popular Internet application stores, users are encouraged to download thousands of applications and various types of software. In this research study, we scraped thousands of user reviews and ratings of different applications. We scraped reviews of 148 applications from 14 different categories. A total of 506,259 reviews were accumulated and assessed. Based on the semantics of the reviews, each review was classified as negative, positive or neutral. In this research, different machine-learning algorithms such as logistic regression, random forest and naïve Bayes were tuned and tested. We also evaluated the outcome of term frequency (TF) and inverse document frequency (IDF), measured different parameters such as accuracy, precision, recall and F1 score (F1), and presented the results in the form of a bar graph. In conclusion, we compared the outcome of each algorithm and found that logistic regression is one of the best algorithms for review analysis of the Google Play Store from an accuracy perspective. Furthermore, we were able to demonstrate that logistic regression is better in terms of speed, accuracy, recall and F1 score. This conclusion was reached after preprocessing a number of data values from these data sets.
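The abstract mentions term frequency (TF) and inverse document frequency (IDF) as the text features. A minimal sketch of one common TF-IDF variant follows, in plain Python. The three review strings are invented, whitespace tokenization is a simplification, and the exact weighting formula used by the authors is not stated in the abstract, so tf = count / document length and idf = ln(N / df) are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical app reviews, not from the scraped data set.
reviews = ["great app love it", "bad app crashes", "love this app"]
scores = tf_idf(reviews)

for review, weights in zip(reviews, scores):
    top = max(weights, key=weights.get)
    print(f"{review!r}: highest-weighted term = {top!r}")
```

A term such as "app" that occurs in every review gets an IDF of zero under this variant, so it contributes nothing to the classifier, while rarer terms such as "crashes" are weighted up.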

Sentiment Analysis of Movie Review using Machine Learning Techniques

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.7.10921 ◽

2018 ◽

Vol 7 (2.7) ◽

pp. 676 ◽

Cited By ~ 1

Author(s):

V Uma Ramya ◽

K Thirupathi Rao

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Sentiment Analysis ◽

Text Analysis ◽

Naive Bayes ◽

Multinomial Logistic Regression ◽

Machine Learning Techniques ◽

Analysis Algorithm ◽

Learning Techniques ◽

Almost All

Today's online world is filled with blogs, views, comments, and posts across various websites and social media. People habitually post about every incident in blogs and comments that mix text and emotions such as sadness, happiness, and worry. Analysing such data is called sentiment analysis. To analyse this unstructured data, we use newly emerged algorithms. Machine learning is an emerging technology applied in almost all fields, and its algorithms are powerful and give more accurate results. In this paper, we analyze tweets about movie reviews using the Multinomial Logistic Regression, Naïve Bayes, and SVM algorithms and compare their scores to identify the best text analysis algorithm.

Prediction and Classification of Low Birth Weight Data Using Machine Learning Techniques

Indonesian Journal of Science and Technology ◽

10.17509/ijost.v3i1.10799 ◽

2018 ◽

Vol 3 (1) ◽

pp. 18 ◽

Cited By ~ 4

Author(s):

Alfensi Faruk ◽

Endro Setyo Cahyono

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Birth Weight ◽

Low Birth Weight ◽

Binary Logistic Regression ◽

Machine Learning Techniques ◽

Binary Logistic Regression Model ◽

Data Set ◽

Learning Techniques

Machine learning (ML) is a subject that focuses on data analysis using various statistical tools and learning processes in order to gain more knowledge from the data. The objective of this research was to apply one of the ML techniques to low birth weight (LBW) data in Indonesia. This research conducts two ML tasks: prediction and classification. The binary logistic regression model was first applied to the training and test data. Then, the random forest approach was also applied to the data set. The results showed that binary logistic regression performed well for prediction but poorly for classification. On the other hand, the random forest approach performed very well for both prediction and classification of the LBW data set.

Machine Learning Techniques Applied to Profile Mobile Banking Users in India