Analysis of categorical response data: Use logistic regression rather than endpoint-difference scores or discriminant analysis

2009 ◽  
Vol 126 (5) ◽  
pp. 2159-2162 ◽  
Author(s):  
Geoffrey Stewart Morrison ◽  
Maria V. Kondaurova
Author(s):  
Mahshid Namdari ◽  
S. Mahmoud Taheri ◽  
Alireza Abadi ◽  
Mansour Rezaei ◽  
Naser Kalantari

Author(s):  
Zoryna Yurynets ◽  
Rostyslav Yurynets ◽  
Nataliya Kunanets ◽  
Ivanna Myshchyshyn

Under current conditions of economic development, it is important to study the main types of banking risk and effective methods for evaluating, monitoring and analyzing them. One of the main approaches to quantitatively assessing the creditworthiness of borrowers is credit scoring. The objective of credit scoring is to optimize management decisions on whether to grant bank loans. The article proposes scientific and methodological provisions for building a regression model to assess bank risk in the process of granting loans to borrowers. The proposed model is based on logistic regression and discriminant analysis, combined with expert evaluation. In building the regression model, the relationship between risk factors and the probable magnitude of loan risk was established, and the coefficient of the individual's solvency was calculated. Data preparation, including calculation of the indicators selected through discriminant analysis, was carried out in Excel; the data were then imported into the STATISTICA package for analysis in the “Logistic regression” sub-module of the “Nonlinear evaluation” module. The adequacy of the constructed model was assessed using McFadden's likelihood ratio index, whose calculated value indicates that the model is adequate. The regression model was then used to evaluate whether loans could be issued to new clients. The calculations show that a loan can be granted only to the second and third clients.
The proposed method makes it possible to assess a client's solvency and prevent risk at different stages of lending, and supports independent, informed decisions on credit servicing of clients, loan portfolio management and the optimization of management decisions in banks. For a loan-based model to continue to perform its functions, it must be periodically adjusted.
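
As an illustration of the adequacy check mentioned in this abstract, McFadden's likelihood ratio index compares the log-likelihood of the fitted model with that of an intercept-only model: 1 - LL(model) / LL(null). The sketch below is a minimal example under stated assumptions, not the authors' STATISTICA workflow; the outcome vector `y`, the predicted probabilities `p` and the function names are all hypothetical.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of binary outcomes y under probabilities p.

    Assumes every probability lies strictly between 0 and 1.
    """
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def mcfadden_index(y, p_model):
    """McFadden's likelihood ratio index: 1 - LL(model) / LL(null)."""
    base_rate = sum(y) / len(y)  # an intercept-only model predicts the base rate
    ll_null = log_likelihood(y, [base_rate] * len(y))
    ll_model = log_likelihood(y, p_model)
    return 1.0 - ll_model / ll_null

# Hypothetical data (not from the article): 1 = loan repaid, 0 = default
y = [1, 1, 1, 0, 0, 1, 0, 1]
p = [0.9, 0.8, 0.7, 0.2, 0.3, 0.6, 0.4, 0.85]
index = mcfadden_index(y, p)
print(f"McFadden index: {index:.3f}")
```

An index of 0 means the model does no better than the base rate; values closer to 1 indicate a better fit relative to the null model.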


2014 ◽  
Vol 5 (3) ◽  
pp. 30-34 ◽  
Author(s):  
Balkishan Sharma ◽  
Ravikant Jain

Objective: Clinical diagnostic tests are generally used to identify the presence of a disease. The cutoff value of a diagnostic test should be chosen to maximize the benefit that accrues from testing a population, human or otherwise. When a diagnostic test is to be used in a clinical condition, there may be an opportunity to improve the test by changing the cutoff value. One way to enhance diagnostic accuracy is to develop new tests using a proper statistical technique with optimum sensitivity and specificity. Method: The Mean ± 2SD method, Logistic Regression Analysis, Receiver Operating Characteristic (ROC) curve analysis and Discriminant Analysis (DA) are discussed with their respective applications. Results: The study highlights some important methods for determining the cutoff points of a diagnostic test. The traditional method of identifying cutoff values is the Mean ± 2SD method. Logistic Regression Analysis, ROC curve analysis and DA have proved to be beneficial statistical tools for determining cutoff points. Conclusion: When a diagnostic test is to be used in a clinical condition, there may be an opportunity to improve it by changing the cutoff value with the help of a correctly identified statistical technique. The traditional method of identifying cutoff values is the Mean ± 2SD method. In certain conditions, logistic regression was found to be a good predictor, and its validity can be confirmed from the area under the ROC curve. Abbreviations: ROC, Receiver Operating Characteristic; DA, Discriminant Analysis. Asian Journal of Medical Science, Volume-5(3) 2014: 30-34 http://dx.doi.org/10.3126/ajms.v5i3.9296
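
Two of the cutoff approaches named in this abstract can be sketched briefly. The example below computes a Mean ± 2SD reference range from a healthy group and, as one common ROC-based rule, picks the cutoff that maximizes Youden's J (sensitivity + specificity - 1). The use of Youden's J, the measurement values and the function names are assumptions for illustration; the abstract does not say which ROC criterion the authors used.

```python
import statistics

def mean_2sd_cutoffs(healthy_values):
    """Reference range: mean - 2 SD to mean + 2 SD of the healthy group."""
    m = statistics.mean(healthy_values)
    sd = statistics.stdev(healthy_values)  # sample standard deviation
    return m - 2 * sd, m + 2 * sd

def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A case tests positive when its value is >= cutoff; labels are 1 for
    diseased and 0 for healthy.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(values)):
        tp = sum(1 for v, l in zip(values, labels) if l == 1 and v >= c)
        tn = sum(1 for v, l in zip(values, labels) if l == 0 and v < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical measurements (not from the article)
healthy = [4.8, 5.0, 5.1, 5.3, 4.9, 5.2]
diseased = [5.6, 6.1, 5.9, 6.4, 5.2, 6.0]
values = healthy + diseased
labels = [0] * len(healthy) + [1] * len(diseased)

low, high = mean_2sd_cutoffs(healthy)
cutoff, sens, spec = youden_cutoff(values, labels)
print(f"Mean ± 2SD range: {low:.2f} to {high:.2f}")
print(f"Youden cutoff: {cutoff}, sensitivity {sens:.2f}, specificity {spec:.2f}")
```

The Mean ± 2SD rule uses only the healthy group, whereas the ROC-based rule uses both groups and makes the trade-off between sensitivity and specificity explicit.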


2018 ◽  
Author(s):  
Παντελής Σταυρούλιας

Valid predictions of financial crises have always safeguarded the stability of the financial system as a whole and of the banking sector in particular. This dissertation predicts systemic banking crises for EU-14 countries several quarters before they become apparent, using the most widely used variables (macroeconomic, banking and market) through two approaches, the binary and the multilevel. Following the binary approach, classification models are derived by applying Discriminant Analysis, Linear Regression, Logistic Regression and Probit Regression, for the early prediction of crises 12 to 7 quarters before their onset. In addition, the performance of the above analysis is compared with that of the newer and most promising methods: the Classification Tree, Random Forest and C5. A new measure for threshold selection and goodness of fit (GoF) of the prediction models is also proposed, together with a new combined classification method. To examine the performance of the above analysis, out-of-sample testing is used with country-blocked cross validation. Under this method, the analysis is carried out and the prediction models are derived using thirteen of the fourteen countries in the sample (in-sample), the resulting models are applied to the fourteenth country, which had been excluded from the initial sample (out-of-sample), and the predictions are checked against that country's actual data. This procedure is repeated fourteen times, leaving a different country out of the sample each time, and the results of the repetitions are finally averaged.
In this dissertation, using out-of-sample testing, 82.4% correct classification (Accuracy), a 78.4% True Positive Rate (TPR) and an 80.6% Positive Predictive Value (PPV) are achieved. Under the multilevel approach, two levels (periods) of prediction of systemic banking crises are distinguished. The first level is called early warning and covers the period 12 to 7 quarters before the onset of the crisis, while the second level is called late warning and covers the period 6 to 1 quarters before the onset of the crisis. For this multilevel classification, Neural Networks, Multinomial Logistic Regression and Multinomial Linear Discriminant Analysis are used. Applying the same out-of-sample testing as in the first approach, 85.7% correct classification is achieved with the best method, which proves to be Multinomial Linear Discriminant Analysis. By applying the above analysis, interested policy makers can detect an impending crisis up to three years ahead with the proposed models, using only publicly available data, and can thereby exercise the appropriate macroprudential policy in each case.
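
The country-blocked cross validation described in this abstract can be sketched as a leave-one-country-out loop. This is a minimal illustration under stated assumptions: the midpoint-threshold classifier is only a placeholder for the dissertation's models (discriminant analysis, logistic regression and so on), and the three-country data set, the single indicator and all function names are hypothetical.

```python
def country_blocked_cv(rows, fit, predict):
    """Leave-one-country-out validation.

    rows: list of (country, x, y) observations.
    Returns the mean out-of-sample accuracy over the held-out countries.
    """
    countries = sorted({c for c, _, _ in rows})
    accuracies = []
    for held_out in countries:
        train = [(x, y) for c, x, y in rows if c != held_out]  # in-sample
        test = [(x, y) for c, x, y in rows if c == held_out]   # out-of-sample
        model = fit(train)
        correct = sum(1 for x, y in test if predict(model, x) == y)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

# Placeholder model (an assumption): threshold at the midpoint of class means
def fit_midpoint(train):
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_midpoint(threshold, x):
    return 1 if x >= threshold else 0

# Hypothetical data: (country, indicator value, 1 = pre-crisis period)
rows = [
    ("A", 0.2, 0), ("A", 0.9, 1),
    ("B", 0.1, 0), ("B", 0.8, 1),
    ("C", 0.3, 0), ("C", 0.7, 1),
]
mean_accuracy = country_blocked_cv(rows, fit_midpoint, predict_midpoint)
print(f"Mean out-of-sample accuracy: {mean_accuracy:.2f}")
```

Blocking by country keeps all observations from the held-out country out of the training data, so each fold tests how well the model transfers to a country it has never seen.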


2014 ◽  
Vol 129 (6_suppl4) ◽  
pp. 166-172 ◽  
Author(s):  
Russell G. Schuh ◽  
Michelle Basque ◽  
Margaret A. Potter

Indicators for Stress Adaptation Analytics (ISAAC) is a protocol to measure the emergency response behavior of organizations within local public health systems. We used ISAAC measurements to analyze how funding and structural changes may have affected the emergency response capacity of a local health agency. We developed ISAAC profiles for an agency's consecutive fiscal years 2013 and 2014, during which funding cuts and organizational restructuring had occurred. ISAAC uses descriptive and categorical response data to obtain a function stress score and a weighted contribution score to the agency's total response. In the absence of an emergency, we simulated one by assuming that each function was stressed at an equal rate for each of the two years and then we compared the differences between the two years. The simulations revealed that seemingly minor personnel or budget changes in health departments can mask considerable variation in change at the internal function level.


2020 ◽  
Vol 8 (A) ◽  
pp. 119-124
Author(s):  
Mohammad Chehrazi ◽  
Seyed Hassan Saadat ◽  
Mahmoud Hajiahmadi ◽  
Mirko Spiroski

BACKGROUND: An important issue in modeling categorical response data is the choice of link function. The commonly used complementary log-log link is prone to link misspecification because of its positive and fixed skewness parameter. AIM: The objective of this paper is to introduce a flexible skewed link function for modeling ordinal data with covariates. METHODS: We introduce a flexible skewed link model for the cumulative ordinal regression model based on the Chen model. RESULTS: The main advantage of the proposed link is that it is much more identifiable than existing skewed links. The propriety of posterior distributions under proper and improper priors is explored in detail. An efficient Markov chain Monte Carlo algorithm is developed for sampling from the posterior distribution. CONCLUSION: The proposed methodology is motivated and illustrated by ovarian hyperstimulation syndrome data.
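
To make the fixed skewness of the complementary log-log link concrete, the sketch below compares its inverse with the symmetric inverse logit and uses either one in a cumulative ordinal model. It does not implement the flexible Chen-based link proposed in the paper. The parameterization P(Y <= j) = F(cutpoint_j - eta) is one common convention and is assumed here, and the cutpoints and linear predictor are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse logit link (symmetric about eta = 0)."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_cloglog(eta):
    """Inverse complementary log-log link (asymmetric, fixed skewness)."""
    return 1.0 - math.exp(-math.exp(eta))

def category_probs(cutpoints, eta, inv_link):
    """Category probabilities in a cumulative ordinal model.

    Assumes P(Y <= j) = inv_link(cutpoint_j - eta) with increasing cutpoints.
    """
    cum = [inv_link(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# The logit link is symmetric: F(eta) + F(-eta) = 1. The cloglog link is not.
logit_sum = inv_logit(1.0) + inv_logit(-1.0)
cloglog_sum = inv_cloglog(1.0) + inv_cloglog(-1.0)

# Hypothetical three-category outcome with two cutpoints
cutpoints = [-0.5, 1.0]
eta = 0.3
probs_logit = category_probs(cutpoints, eta, inv_logit)
probs_cloglog = category_probs(cutpoints, eta, inv_cloglog)
print(probs_logit, probs_cloglog)
```

Because the shape of the complementary log-log curve cannot adapt to the data, a link with a free skewness parameter, as proposed in the paper, is one way to reduce the risk of misspecification.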


Worldwide, breast cancer is the leading type of cancer in women, accounting for 25% of all cases. Survival rates are higher in developed countries than in developing countries. This has increased the importance of computer-aided diagnostic methods for early detection of breast cancer, which in turn reduces the death rate. This paper examines the scope of biomarkers that can be used to predict breast cancer from anthropometric data. This experimental study computes and compares various classification models (Binary Logistic Regression, Ball Vector Machine (BVM), C4.5, Partial Least Squares (PLS) for Classification, Classification Tree, Cost-sensitive Classification Tree, Cost-sensitive Decision Tree, Support Vector Machine for Classification, Core Vector Machine, ID3, K-Nearest Neighbor, Linear Discriminant Analysis (LDA), Log-Reg TRIRLS, Multi-Layer Perceptron (MLP), Multinomial Logistic Regression (MLR), Naïve Bayes (NB), PLS for Discriminant Analysis, PLS for LDA, Random Tree (RT), Support Vector Machine (SVM)) on the UCI Coimbra breast cancer dataset. Feature selection algorithms (Backward Logit, Fisher Filtering, Forward Logit, ReliefF, Step disc) are applied to find the minimum set of attributes that achieves better accuracy. To verify the accuracy results, jackknife cross validation of the algorithms is conducted. The Core Vector Machine classification algorithm outperforms the other nineteen algorithms, with an accuracy of 82.76%, sensitivity of 76.92% and specificity of 87.50% for the three selected attributes (Age, Glucose and Resistin), obtained with the ReliefF feature selection algorithm.
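
The three reported performance measures follow directly from the counts in a confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical, chosen only because they reproduce the reported percentages, and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts (an assumption) that match the reported figures
accuracy, sensitivity, specificity = classification_metrics(tp=10, fn=3, tn=14, fp=2)
print(f"accuracy {accuracy:.2%}, sensitivity {sensitivity:.2%}, "
      f"specificity {specificity:.2%}")
```

Reporting sensitivity and specificity alongside accuracy matters in diagnostic settings, because a missed cancer and a false alarm carry very different costs.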

