Variable Importance Analysis in Default Prediction using Machine Learning Techniques

Author(s):  
Başak Gültekin ◽  
Betül Erdoğdu Şakar
Author(s):  
Maria Elena Laino ◽  
Elena Generali ◽  
Tobia Tommasini ◽  
Giovanni Angelotti ◽  
Alessio Aghemo ◽  
...  

Introduction: Identifying SARS-CoV-2 patients at higher risk of mortality is crucial in the management of a pandemic. Artificial intelligence techniques make it possible to analyze large amounts of data to find hidden patterns. We aimed to develop and validate a mortality score at admission for COVID-19 based on high-level machine learning. Material and methods: We conducted a retrospective cohort study of adult COVID-19 patients hospitalized between March and December 2020. The primary outcome was in-hospital mortality. A machine learning approach on vital parameters, laboratory values, and demographic features was applied to develop different models. A feature importance analysis was then performed to reduce the number of variables included in the model and to develop a risk score with good overall performance, which was finally evaluated in terms of discrimination and calibration. All results underwent cross-validation. Results: 1,135 consecutive patients (median age 70 years, 64% male) were enrolled; 48 patients were excluded, and the cohort was randomly divided into training (760) and test (327) sets. During hospitalization, 251 (22%) patients died. After feature selection, the best-performing classifier was random forest (AUC 0.88±0.03). Based on the relative importance of each variable, a pragmatic score was developed that showed good performance (AUC 0.85±0.025), and three levels were defined that correlated well with in-hospital mortality. Conclusions: Machine learning techniques were applied to develop an accurate in-hospital mortality risk score for COVID-19 based on ten variables. The proposed score has utility in clinical settings to guide the management and prognostication of COVID-19 patients.
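
The abstract reports discrimination as AUC. As a rough illustration of what that number measures, the sketch below computes AUC for a risk score with the rank-based (Mann-Whitney) formulation. The function name `auc` and the toy scores and outcomes are assumptions made up for this example; they are not the study's data, score, or code.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and outcomes (1 = died in hospital); not study data.
toy_scores = [2, 5, 7, 3, 8, 1, 6, 4]
toy_labels = [0, 1, 1, 0, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of this size; a sort-based version would be preferable for much larger data sets.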


Risks ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 126
Author(s):  
Shengkun Xie

In insurance rate-making, the use of statistical machine learning techniques such as artificial neural networks (ANN) is an emerging approach, and many insurance companies have been using them for pricing. However, because these models are complex to specify and implement, model explainability may be essential to meet the pricing transparency required for rate regulation. This requirement may imply the need to estimate or evaluate variable importance when complicated models are used. Furthermore, from both rate-making and rate-regulation perspectives, it is critical to investigate the impact of major risk factors on response variables such as claim frequency or claim severity. In this work, we consider the modelling problem of how claim counts, claim amounts and average loss per claim are related to major risk factors. ANN models are applied to meet this goal and, because the models are complex, variable importance is measured to improve their explainability. The results obtained from different variable importance measurements are compared, and dominant risk factors are identified. The contribution of this work is in making advanced mathematical models possible for applications in auto insurance rate regulation. This study focuses on analyzing major risks only, but the proposed method can be applied to more general insurance pricing problems when additional risk factors are considered. In addition, the proposed methodology is useful for other business applications where statistical machine learning techniques are used.
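
The abstract does not say which variable importance measures were compared. As one common model-agnostic option, the sketch below shows permutation importance: shuffle one input column at a time and record how much the prediction error grows. Everything here is an assumption for illustration: `toy_model` is a hypothetical stand-in for a fitted network, and the data are synthetic.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on rows X against targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when one column is shuffled, per column."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]  # noiseless targets, for illustration only
scores = permutation_importance(toy_model, X, y)
print([round(s, 3) for s in scores])
```

Because the procedure only calls the model as a black box, the same function could be applied to a neural network, a GLM, or any other predictor, which is what makes this kind of measure convenient when several importance measures are being compared.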


Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2150
Author(s):  
Davide Stefano Sardina ◽  
Giuseppe Valenti ◽  
Francesco Papia ◽  
Carina Gabriela Uasuf

Background: Omalizumab is the best treatment for patients with chronic spontaneous urticaria (CSU). Machine learning (ML) approaches can be used to predict response to therapy and the effectiveness of a treatment. No studies are available on the use of ML techniques to predict the response to Omalizumab in CSU. Methods: Data from 132 CSU outpatients were analyzed. Urticaria Activity Score over 7 days (UAS7) and treatment efficacy were assessed. Clinical and demographic characteristics were used for training and validating ML models to predict the response to treatment. Two methodologies were used to label the data based on the response to treatment (UAS7 ≥ 6): (A) at 1, 3 and 5 months; (B) classifying the patients as early responders (ER), late responders (LR) or non-responders (NR) (ER: UAS7 ≥ 6 at first month, LR: UAS7 ≥ 6 at third month, NR: if none of the previous conditions occurred). Results: ER were predominantly characterized by hypertension, while LR mainly suffered from asthma and hypothyroidism. A slight positive correlation (R² = 0.21) was found between total IgE levels and UAS7 at 1 month. Variable Importance Analysis (VIA) reported D-dimer and C-reactive proteins as the key blood tests for the performance of learning techniques. Using methodology (A), SVM (specificity of 0.81) and k-NN (sensitivity of 0.8) are the best models to predict LR at the third month. Conclusion: k-NN plus the SVM model could be used to identify the response to treatment. D-dimer and C-reactive proteins have greater predictive power in training ML models.
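
The models above are compared by sensitivity and specificity. For reference, a minimal sketch of how both follow from a classifier's binary predictions is given below; the function name and the toy labels are assumptions for illustration and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Sensitivity = TP / (TP + FN): share of true responders that were detected.
    Specificity = TN / (TN + FP): share of non-responders correctly ruled out.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (1 = responder) and predictions; not study data.
toy_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(toy_true, toy_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Reporting the two together matters because a model can reach a high value on one simply by predicting a single class most of the time.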

