Utilizing the Logistic Regression Model in Analyzing the Categorical Data of Economic Effects

Author(s):  
Mahdi Wahhab Neamah et al.

Categorical data play a significant role in representing binary statistical variables, and they are analyzed by grouping the response variable into ordered categories, so that the dependent variable becomes a binary qualitative variable. Data on the financial position of the world's countries fall within this class. This work studies the economic effects of different individual factors on determining the level of wealth or poverty in a selected population of countries, and builds a logistic regression model to estimate these levels. As a research sample, categorical data on the financial status of 20 Arab countries were drawn from the website of the World Bank (WB). For comparison, a similar set of categorical data was generated in MATLAB. The paper rests on two hypotheses: first, that well-known regression approaches, such as ordinary least squares or maximum likelihood, are not accurate for binary qualitative variables; second, that the logistic regression model can serve as an alternative for achieving the paper's goal. For both the WB data and the MATLAB data, the results demonstrate the ability of the logistic regression model to handle categorical data and to estimate the coefficients of the corresponding regression models.
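To make the idea of a logistic regression for a binary outcome concrete, here is a minimal sketch. It is not the authors' procedure: the data are hypothetical toy values (the abstract does not give the World Bank or MATLAB data), and the fit uses plain gradient ascent on the log-likelihood as one simple way to estimate the coefficients. Unlike a straight-line fit, whose predictions can fall outside [0, 1], the logistic function keeps every predicted probability strictly between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # residual on the probability scale
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical toy data (an assumption, not from the paper):
# x is a standardized economic indicator, y = 1 marks a "rich" country.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
probs = [sigmoid(b0 + b1 * x) for x in xs]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

The toy classes overlap on purpose: with perfectly separated data the maximum-likelihood estimates do not exist as finite values, and the slope would keep growing with more iterations.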

2015 ◽  
Vol 32 (1) ◽  
pp. 288 ◽  
Author(s):  
Daniel Lapresa ◽  
Javier Arana ◽  
M.Teresa Anguera ◽  
J.Ignacio Pérez-Castellanos ◽  
Mario Amatria

This study shows how simple and multiple logistic regression can be used in observational methodology and, more specifically, in the fields of physical activity and sport. We demonstrate this in a study designed to determine whether three-a-side futsal or five-a-side futsal is more suited to the needs and potential of children aged 6 to 8 years. We constructed a multiple logistic regression model to analyze the use of space (depth of play) and three simple logistic regression models to determine which game format is more likely to potentiate effective technical and tactical performance.


Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1517
Author(s):  
Hao Yang Teng ◽  
Zhengjun Zhang

Logistic regression is widely used in the analysis of medical data with binary outcomes to study treatment effects through (absolute) treatment effect parameters in the models. However, logistic regression models do not include parameters that indicate relative treatment effects, which can be a severe problem for modeling treatment effects efficiently and can lead to wrong conclusions about them. This paper introduces a new enhanced logistic regression model that offers a new way of studying treatment effects by measuring the relative changes in the treatment effects, while also incorporating the way in which logistic regression models the treatment effects. The new model, called the Absolute and Relative Treatment Effects (AbRelaTEs) model, is viewed as a generalization of logistic regression and as an enhanced model with greater flexibility, interpretability, and applicability in real data applications than logistic regression. The AbRelaTEs model is capable of modeling significant treatment effects in an absolute way, a relative way, or both. The new model can be easily implemented using statistical software, with the logistic regression model treated as a special case. As a result, classical logistic regression models can be replaced by the AbRelaTEs model to gain greater applicability and to provide a new benchmark model for studying treatment effects more efficiently in clinical trials, economic developments, and many applied areas. Moreover, the estimators of the coefficients are consistent and asymptotically normal under regularity conditions. In both simulation and real data applications, the model provides both significant and more meaningful results.


2020 ◽  
Author(s):  
Niema Ghanad Poor ◽  
Nicholas C West ◽  
Rama Syamala Sreepada ◽  
Srinivas Murthy ◽  
Matthias Görges

BACKGROUND In the pediatric intensive care unit (PICU), quantifying illness severity can be guided by risk models to enable timely identification and appropriate intervention. Logistic regression models, including the pediatric index of mortality 2 (PIM-2) and pediatric risk of mortality III (PRISM-III), produce a mortality risk score using data that are routinely available at PICU admission. Artificial neural networks (ANNs) outperform regression models in some medical fields. OBJECTIVE In light of this potential, we aim to examine ANN performance, compared to that of logistic regression, for mortality risk estimation in the PICU. METHODS The analyzed data set included patients from North American PICUs whose discharge diagnostic codes indicated evidence of infection and included the data used for the PIM-2 and PRISM-III calculations and their corresponding scores. We stratified the data set into training and test sets, with approximately equal mortality rates, in an effort to replicate real-world data. Data preprocessing included imputing missing data through simple substitution and normalizing data into binary variables using PRISM-III thresholds. A 2-layer ANN model was built to predict pediatric mortality, along with a simple logistic regression model for comparison. Both models used the same features required by PIM-2 and PRISM-III. Alternative ANN models using single-layer or unnormalized data were also evaluated. Model performance was compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPRC) and their empirical 95% CIs. RESULTS Data from 102,945 patients (including 4068 deaths) were included in the analysis. 
The highest performing ANN (AUROC 0.871, 95% CI 0.862-0.880; AUPRC 0.372, 95% CI 0.345-0.396) that used normalized data performed better than PIM-2 (AUROC 0.805, 95% CI 0.801-0.816; AUPRC 0.234, 95% CI 0.213-0.255) and PRISM-III (AUROC 0.844, 95% CI 0.841-0.855; AUPRC 0.348, 95% CI 0.322-0.367). The performance of this ANN was also significantly better than that of the logistic regression model (AUROC 0.862, 95% CI 0.852-0.872; AUPRC 0.329, 95% CI 0.304-0.351). The performance of the ANN that used unnormalized data (AUROC 0.865, 95% CI 0.856-0.874) was slightly inferior to our highest performing ANN; the single-layer ANN architecture performed poorly and was not investigated further. CONCLUSIONS A simple ANN model performed slightly better than the benchmark PIM-2 and PRISM-III scores and a traditional logistic regression model trained on the same data set. The small performance gains achieved by this two-layer ANN model may not offer clinically significant improvement; however, further research with other or more sophisticated model designs and better imputation of missing data may be warranted.
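The comparison above rests on AUROC values with empirical 95% CIs. The sketch below shows one common way such a figure could be computed: AUROC as the fraction of positive/negative pairs ranked correctly, with a percentile bootstrap interval. The scores and labels are hypothetical, and the abstract does not say which resampling scheme the authors used, so the percentile bootstrap here is an assumption.

```python
import random

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (resampling cases with replacement)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical risk scores and outcomes (1 = death), for illustration only.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

auc_val = auroc(scores, labels)
ci_lo, ci_hi = bootstrap_ci(scores, labels)
print(f"AUROC={auc_val:.3f}, 95% CI {ci_lo:.3f}-{ci_hi:.3f}")
```

The pairwise loop is quadratic in the sample size, which is fine for a sketch but would need a rank-based implementation for a data set of the size reported here.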


Spinal Cord ◽  
2020 ◽  
Author(s):  
Omar Khan ◽  
Jetan H. Badhiwala ◽  
Michael G. Fehlings

Study design: Retrospective analysis of prospectively collected data. Objectives: Recently, logistic regression models were developed to predict independence in bowel function 1 year after spinal cord injury (SCI) on a multicenter European SCI (EMSCI) dataset. Here, we evaluated the external validity of these models against a prospectively accrued North American SCI dataset. Setting: Twenty-five SCI centers in the United States and Canada. Methods: Two logistic regression models developed by the EMSCI group were applied to data for 277 patients derived from three prospective multicenter SCI studies based in North America. External validation was evaluated for both models by assessing their discrimination, calibration, and clinical utility. Discrimination and calibration were assessed using ROC curves and calibration curves, respectively, while clinical utility was assessed using decision curve analysis. Results: The simplified logistic regression model, which used baseline total motor score as the predictor, demonstrated the best performance, with an area under the ROC curve of 0.869 (95% confidence interval: 0.826–0.911), a sensitivity of 75.5%, and a specificity of 88.5%. Moreover, the model was well calibrated across the full range of observed probabilities and displayed superior clinical benefit on the decision curve. Conclusions: A logistic regression model using baseline total motor score as a predictor of independent bowel function 1 year after SCI was successfully validated against an external dataset. These findings provide evidence supporting the use of this model to enhance the care for individuals with SCI.
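Sensitivity, specificity, and the net benefit plotted in a decision curve all come from the same confusion counts at a chosen probability threshold. The following sketch illustrates the arithmetic on hypothetical predictions; it does not reproduce the EMSCI models or the validation data. Net benefit uses the standard definition TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.

```python
def confusion_counts(probs, labels, threshold):
    """Count true/false positives and negatives when predicting 1 for prob >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(probs, labels, threshold):
    tp, fp, tn, fn = confusion_counts(probs, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)

def net_benefit(probs, labels, threshold):
    """One point on a decision curve: benefit of true positives minus weighted harm of false positives."""
    tp, fp, _, _ = confusion_counts(probs, labels, threshold)
    n = len(labels)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

# Hypothetical predicted probabilities and observed outcomes, for illustration only.
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

sens, spec = sensitivity_specificity(probs, labels, 0.5)
nb = net_benefit(probs, labels, 0.5)
print(sens, spec, nb)  # 0.75 0.75 0.25
```

A full decision curve repeats the net benefit calculation over a range of thresholds and compares the model with the "treat all" and "treat none" strategies.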


2020 ◽  
Vol 35 (6) ◽  
pp. 933-933
Author(s):  
Rolin S ◽  
Kitchen Andren K ◽  
Mullen C ◽  
Kurniadi N ◽  
Davis J

Objective: Previous research in a Veterans Affairs sample proposed using single items on the Neurobehavioral Symptom Inventory (NSI) to screen for anxiety (item 19) and depression (item 20). This study examined the approach in an outpatient physical medicine and rehabilitation sample. Method: Participants (N = 84) underwent outpatient neuropsychological evaluation using the NSI, BDI-II, GAD-7, MMPI-2-RF, and Memory Complaints Inventory (MCI), among other measures. Anxiety and depression were psychometrically determined via cutoffs on the GAD-7 (>4) and MMPI-2-RF ANX (>64 T), and BDI-II (>13) and MMPI-2-RF RC2 (>64 T), respectively. Analyses included receiver operating characteristic (ROC) analysis and logistic regression. Logistic regression models used dichotomous anxiety and depression as outcomes and relevant NSI items and MCI average score as predictors. Results: ROC analysis using NSI items to classify cases showed area under the curve (AUC) values of .77 for anxiety and .85 for depression. The logistic regression model predicting anxiety correctly classified 80% of cases with an AUC of .86. The logistic regression model predicting depression correctly classified 79% of cases with an AUC of .88. Conclusion: Findings support the utility of NSI anxiety and depression items as screening measures in a rehabilitation population. Consideration of symptom validity via the MCI improved classification accuracy of the regression models. The approach may be useful in other clinical settings for quick assessment of psychological issues warranting further evaluation.


2019 ◽  
Vol 2019 ◽  
pp. 1-15
Author(s):  
Daiquan Xiao ◽  
Xuecai Xu ◽  
Li Duan

This study investigates the factors influencing injury severity by considering the heterogeneity of unobserved factors at different arterials and the spatial attributes in geographically weighted regression models. To achieve these objectives, a geographically weighted panel logistic regression model was developed, in which the geographically weighted logistic regression model addressed injury severity from the spatial perspective, while the panel data model accommodated the heterogeneity attributed to unobserved factors from the temporal perspective. Geo-crash data for the Las Vegas metropolitan area from 2014 to 2016 were collected, involving 27 arterials with 25,029 injury samples. In a comparison with the conventional logistic regression model and geographically weighted logistic regression models, the geographically weighted panel logistic regression model was preferred over the other models. Results revealed that four main groups of factors (human beings, i.e., drivers, pedestrians, and cyclists; vehicles; roadway; and environment) were potentially significant in increasing injury severity. The findings provide useful insights for practitioners and policy makers to improve safety along arterials.


Author(s):  
B. M. Fernandez-Felix ◽  
E. García-Esquinas ◽  
A. Muriel ◽  
A. Royuela ◽  
J. Zamora

Overfitting is a common problem in the development of predictive models. It leads to an optimistic estimate of apparent model performance. Internal validation using bootstrapping techniques allows one to quantify the optimism of a predictive model and provide a more realistic estimate of its performance measures. Our objective is to build an easy-to-use command, bsvalidation, that performs a bootstrap internal validation of a logistic regression model.
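The idea of quantifying optimism by bootstrapping can be sketched as follows. This is not the bsvalidation command and makes no claim about its internals; it is a generic illustration of the widely used optimism-correction scheme, with simulated data, a hand-written gradient-ascent fit, and the AUROC as the only performance measure, all of which are assumptions. For each bootstrap sample, a model is refit, scored on that sample and on the original data, and the average difference is subtracted from the apparent performance.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_score(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def fit_logistic(rows, ys, lr=0.5, n_iter=300):
    """Approximate maximum-likelihood fit: a fixed number of gradient-ascent steps."""
    beta = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for row, y in zip(rows, ys):
            err = y - sigmoid(linear_score(beta, row))
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def auroc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def model_auc(beta, rows, ys):
    return auroc([linear_score(beta, r) for r in rows], ys)

def optimism_corrected_auc(rows, ys, n_boot=50, seed=0):
    """Return (apparent AUC, mean optimism, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(rows)
    apparent = model_auc(fit_logistic(rows, ys), rows, ys)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_ys = [ys[i] for i in idx]
        if not 0 < sum(b_ys) < n:  # a resample needs both outcome classes
            continue
        beta = fit_logistic(b_rows, b_ys)
        # Optimism: performance on the bootstrap sample minus performance on the original data.
        diffs.append(model_auc(beta, b_rows, b_ys) - model_auc(beta, rows, ys))
    mean_optimism = sum(diffs) / n_boot
    return apparent, mean_optimism, apparent - mean_optimism

# Simulated data (an assumption): the first predictor is informative, the second is noise.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
ys = [1 if data_rng.random() < sigmoid(r[0]) else 0 for r in rows]

apparent, optimism, corrected = optimism_corrected_auc(rows, ys)
print(f"apparent={apparent:.3f}, optimism={optimism:.3f}, corrected={corrected:.3f}")
```

Every modeling step that used the outcome (variable selection, tuning) would have to be repeated inside the bootstrap loop for the correction to be honest; the sketch has only the fit itself.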


2018 ◽  
Vol 24 (109) ◽  
pp. 535
Author(s):  
اياد حبيب شمال

This paper discusses the problem of semi-multicollinearity in a nonlinear regression model (the multi-logistic regression model) when the dependent variable is qualitative, with a binary response equal to one for a response and zero for no response. The problem is addressed with iterative principal component estimators based on the normal weights and on conditional Bayes weights. The estimators were applied to data on the concentrations of two types of drugs, one being the concentration of ciprodar (variable X1), given to a number of patients with renal disease; the dependent variable records whether the person recovered from the disease or not. Judged by mean squared error (MSE), the results indicated that the iterative principal component estimator based on the conditional Bayes weights is preferable to the iterative principal component estimator based on the normal weights.


Author(s):  
Moza S. Al-Balushi ◽  
Mohammed S. Ahmed ◽  
M. Mazharul Islam

In this paper, multilevel logistic regression models are developed for examining the hierarchical effects of contraceptive use and its selected determinants in Oman using the 2008 Oman National Reproductive Health Survey (ONRHS). Single-level and multilevel logistic regression models are compared to examine the plausibility of multilevel effects of contraceptive use. The multilevel logistic regression analysis found that there is real multilevel variation among contraceptive users in Oman. The results indicate that a multilevel logistic regression model fits better than ordinary multiple logistic regression models. Generally, this study revealed that women's age, education, number of living children, and region of residence are important factors that affect contraceptive use in Oman. The regional variation in the effects of women's age, women's education, and number of living children further implies that there exist considerable differences in modern contraceptive use among regions, and that a model with a random coefficient or slope is more appropriate for explaining the regional variation than a model with fixed coefficients or without random effects. The study suggests that researchers should use multilevel models rather than traditional regression methods when their data structure is hierarchical.


2018 ◽  
Vol 2 (2) ◽  
pp. 28-35
Author(s):  
Gatri Eka Kusumawardhani ◽  
Vera Maya Santi ◽  
Suyono Suyono

Survival analysis is used to study the length of time an object survives. That time is sometimes influenced by several factors, called independent variables. One way to describe this relationship is through a regression model. The dependent variable in this regression model is a survival time that follows a log-logistic distribution. The data used in this study were right-censored survival data. Log-logistic regression models for survival data can be expressed through the transformation $Y_i = \ln T_i = \theta_0 + \theta_1 x_{i1} + \dots + \theta_j x_{ij} + \sigma\varepsilon_i$. The parameters of the log-logistic regression model for right-censored survival data are estimated by the maximum likelihood method. In this study, the log-logistic regression model is applied to data on lung cancer patients. Based on the analysis of these data, the best log-logistic regression model obtained is $y_i = 1.92458 + 0.0242393\,x_{i1} + 0.639037\,\varepsilon_i$.
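A fitted model of this form can be turned into survival quantities. The sketch below plugs in the three coefficients reported in the abstract and assumes that the error term follows a standard logistic distribution, which is what a log-logistic survival time implies for ln T. The covariate value and the time unit are hypothetical, since the abstract does not say what x_i1 measures or how time is recorded. Under these assumptions the survival function is S(t | x) = 1 / (1 + exp((ln t - mu) / sigma)) with mu = theta0 + theta1 * x, and the median survival time is exp(mu).

```python
import math

# Coefficients as reported in the abstract.
THETA0 = 1.92458
THETA1 = 0.0242393
SIGMA = 0.639037

def linear_predictor(x1):
    """Location of ln(T) for covariate value x1."""
    return THETA0 + THETA1 * x1

def median_survival(x1):
    """Median of T: the standard logistic error has median 0, so median ln(T) = mu."""
    return math.exp(linear_predictor(x1))

def survival(t, x1):
    """S(t | x1) = P(T > t) for a log-logistic survival time, t > 0."""
    w = (math.log(t) - linear_predictor(x1)) / SIGMA
    return 1.0 / (1.0 + math.exp(w))

# Hypothetical covariate value, for illustration only.
x1 = 60.0
t_median = median_survival(x1)
print(f"median survival time: {t_median:.2f} (time units as in the original data)")
print(f"S(median) = {survival(t_median, x1):.3f}")
```

Because the coefficient on x_i1 is positive, larger covariate values stretch the time scale and shift the whole survival curve to the right.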

