Logistic Regression for Binary Classification

2018 ◽  
pp. 169-184
2018 ◽  
Vol 2 (334) ◽  
Author(s):  
Mirosław Krzyśko ◽  
Łukasz Smaga

In this paper, the binary classification problem of multi‑dimensional functional data is considered. To solve this problem, a regression technique based on the functional logistic regression model is used. This model is re‑expressed as a particular logistic regression model by using the basis expansions of functional coefficients and explanatory variables. Based on the re‑expressed model, a classification rule is proposed. To handle outlying observations, robust methods of estimating the unknown parameters are also considered. Numerical experiments suggest that the proposed methods may behave satisfactorily in practice.


Author(s):  
Michaela Staňková ◽  
David Hampel

This article focuses on the problem of binary classification of 902 small- and medium‑sized engineering companies active in the EU, together with an additional 51 companies that went bankrupt in 2014. For classification purposes, the basic statistical method of logistic regression was selected, together with representatives of machine learning (support vector machines and classification trees), to construct models for bankruptcy prediction. Different settings were tested for each method. Furthermore, the models were estimated both on the complete data and on identified artificial factors. To evaluate the quality of prediction, we observe not only the total accuracy and the type I and type II errors but also the area under the ROC curve. The results clearly show that increasing distance to bankruptcy decreases the predictive ability of all models. The classification tree method leads to rather simple models. The best classification results were achieved through logistic regression based on artificial factors. Moreover, this procedure provides good and stable results regardless of other settings. Artificial factors also seem to be suitable variables for support vector machine models, but classification trees achieved better results using the original data.
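The evaluation criteria named in this abstract (total accuracy, type I and type II errors, area under the ROC curve) can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the labels and scores are hypothetical, bankruptcy is assumed to be the positive class (1), and type I error is assumed to mean a false positive, although the article may use the opposite convention.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = bankrupt (assumed positive class), 0 = active.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
type_i_rate = fp / (fp + tn)   # active company flagged as bankrupt
type_ii_rate = fn / (fn + tp)  # bankrupt company missed
auc = roc_auc(y_true, scores)
```

Accuracy and the two error rates depend on the chosen cutoff, whereas the AUC summarizes the ranking over all cutoffs, which is one reason to report both kinds of criteria.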


Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 543 ◽  
Author(s):  
Konrad Furmańczyk ◽  
Wojciech Rejchel

In this paper, we consider prediction and variable selection in misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification that are computationally efficient but lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second is even more radical: we simply treat the class labels of objects as if they were numbers and apply penalized linear regression. We thoroughly investigate these two approaches and provide conditions that guarantee they are successful in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
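The two approaches in this abstract can be compared on a toy problem. The sketch below is not the authors' implementation: it assumes an L1 (Lasso) penalty, which the abstract does not specify, fits both models with a simple proximal gradient loop, and uses a hypothetical low-dimensional design (8 samples, 3 features) rather than a high-dimensional one. Labels are coded as -1/+1 so that no intercept is needed.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_ista(X, y, loss_slope, lam, step=1.0, iters=200):
    """Minimize mean loss + lam * ||w||_1 by proximal gradient (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi))
            slope = loss_slope(score, yi)  # d(loss)/d(score) for this sample
            for j in range(d):
                grad[j] += slope * xi[j] / n
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def logistic_slope(score, label):
    # Derivative of log(1 + exp(-label * score)) for labels in {-1, +1}.
    return -label / (1.0 + math.exp(label * score))


def squared_slope(score, label):
    # Derivative of 0.5 * (score - label)^2: the label is treated as a number.
    return score - label


# Hypothetical toy design: three mutually orthogonal +/-1 features.
X = [[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1],
     [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1]]
# Labels follow feature 0, except the fourth sample, which is flipped.
y = [1, 1, 1, -1, -1, -1, -1, -1]

w_logistic = lasso_ista(X, y, logistic_slope, lam=0.3)
w_linear = lasso_ista(X, y, squared_slope, lam=0.3)
selected_logistic = [j for j, wj in enumerate(w_logistic) if wj != 0.0]
selected_linear = [j for j, wj in enumerate(w_linear) if wj != 0.0]
```

On this toy design both fits keep only feature 0, although the fitted coefficients differ in size. This says nothing about the general conditions studied in the paper; it only shows the mechanics of the two estimators.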


2015 ◽  
Vol 713-715 ◽  
pp. 1757-1760
Author(s):  
Sheng Ma

In this paper, the survival time of patients with lung cancer is inferred using a logistic regression linear model with binary classification variables, and the data analysis is implemented in R.


Logistic regression is used to predict the probability of an event when there are many risk factors. Medical researchers increasingly use logistic analysis for binary and ordinal data. Several classification problems, such as spam detection, use logistic regression. Other examples include diabetes prediction, whether a customer will purchase a specific product or be drawn to a competitor, and whether a customer will click on a given advertisement link. Logistic regression is one of the simplest and most common machine learning algorithms for two-class classification, and it is easy to use as a baseline approach for any binary classification problem. It is also a fundamental concept in deep learning. Logistic regression measures and describes the relationship between a binary dependent variable and the independent variables.
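As a rough illustration of how logistic regression turns risk factors into an event probability, the following minimal Python sketch fits the model by batch gradient descent on the log-loss. The data, the learning rate, and the number of epochs are hypothetical choices made for this example; in practice a library implementation would normally be used.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    # Split by sign so that exp() never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Estimated probability that the event occurs for feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log-loss w.r.t. score
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: one risk factor, label 1 = event occurred.
X = [[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
p_low = predict_proba(w, b, [0.5])   # estimated probability at a low risk level
p_high = predict_proba(w, b, [4.5])  # estimated probability at a high risk level
```

The fitted weight is positive here, so the estimated probability rises with the risk factor; thresholding that probability (for example at 0.5) gives the two-class decision.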


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Yingying Wang ◽  
Jixiang Du ◽  
Hongbo Zhang ◽  
Xiuhong Yang

Because mushrooms are tasty, this edible fungus often appears in people’s daily meals. Nevertheless, there are still various mushroom species that have not been identified. Thus, the automatic identification of mushroom toxicity is of great value. A number of methods are commonly employed to recognize mushroom toxicity, such as folk experience, chemical testing, animal experiments, and fungal classification, but none of them can produce quick, accurate results, and all involve a complicated cycle. To solve these problems, in this paper, we propose an automatic toxicity identification method based on visual features. The proposed method regards toxicity identification as a binary classification problem. First, intuitive and easily accessible appearance data, such as the cap shape and color of mushrooms, were taken as features. Second, the missing data in any of the features were handled in two ways. Finally, three pattern-recognition methods, namely logistic regression, support vector machine, and multigrained cascade forest, were used to construct 3 different toxicity classifiers for mushrooms. Compared with the logistic regression and support vector machine classifiers, the multigrained cascade forest classifier had better performance, with an accuracy of approximately 98%, enhancing the possibility of preventing food poisoning. These classifiers can recognize the toxicity of mushrooms (even that of some unknown species) according to their appearance features, and they have important social and application value.


Biometrics ◽  
2011 ◽  
Vol 68 (1) ◽  
pp. 23-30 ◽  
Author(s):  
Tyler H. McCormick ◽  
Adrian E. Raftery ◽  
David Madigan ◽  
Randall S. Burd

2021 ◽  
Vol 9 ◽  
Author(s):  
Keiko Ogawa ◽  
Seikou Nakamura ◽  
Haruka Oguri ◽  
Kaori Ryu ◽  
Taichi Yoneda ◽  
...  

Natural products are an excellent source of skeletons for medicinal seeds. Triterpenes and saponins are representative natural products that exhibit anti-herpes simplex virus type 1 (HSV-1) activity. However, there has been a lack of comprehensive information on the anti-HSV-1 activity of triterpenes. Therefore, expanding information on the anti-HSV-1 activity of triterpenes and improving the efficiency of their exploration are urgently required. To improve the efficiency of the development of anti-HSV-1 active compounds, we constructed a predictive model for the anti-HSV-1 activity of triterpenes by applying machine learning methods to information obtained from previous studies. In this study, we constructed a binary classification model (i.e., active or inactive) using a logistic regression algorithm. In the evaluation of the predictive model, the accuracy on the test data was 0.79, and the area under the curve (AUC) was 0.86. Additionally, to enrich the information on the anti-HSV-1 activity of triterpenes, a plaque reduction assay was performed on 20 triterpenes. As a result, chikusetsusaponin IVa (11: IC50 = 13.06 μM) was found to have potent anti-HSV-1 activity, along with three potentially anti-HSV-1 active triterpenes. The assay result was further used for external validation of the predictive model. The prediction for the test compounds in the activity test showed a high accuracy (0.83) and AUC (0.81). We also found that this predictive model could successfully narrow down the active compounds. This study provides more information on the anti-HSV-1 activity of triterpenes. Moreover, the predictive model can improve the efficiency of the development of active triterpenes by integrating many previous studies to clarify potential relationships.


2018 ◽  
Vol 1 (3) ◽  
pp. e00065
Author(s):  
O.A. Raevsky ◽  
D.E. Polianczyk ◽  
O.E. Raevskaja

Stable predictive classification models for 83 drugs with different blood-brain barrier penetration capacity have been constructed by the logistic regression method using physicochemical descriptors characterizing steric and electrostatic interactions and hydrogen bond energy. The models are balanced, with a prediction level of 75-80%.

