Personal Credit Default Prediction Model Based on Convolutional Neural Network

2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Xiang Zhou ◽  
Wenyu Zhang ◽  
Yefeng Jiang

Controlling credit default risk with information technology is of great significance for the healthy development of the credit industry. Traditional research on credit default prediction models tends to focus on model accuracy, while the business characteristics of credit risk prevention are easily overlooked. Moreover, to reduce model complexity, data features are often extracted manually, which weakens the high-dimensional correlations among the data and lowers the prediction performance of the model. In this paper, a convolutional neural network (CNN) is therefore used to establish a personal credit default prediction model, with accuracy (ACC) and the area under the ROC curve (AUC) as the performance evaluation indices. Experimental results show that the model's ACC is above 95% and its AUC is above 99%, and that its performance is much better than that of classical algorithms including the support vector machine (SVM), Bayes, and random forest (RF).
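
As a rough illustration of the evaluation setup described above (not the authors' code), the following sketch trains a small one-dimensional CNN on synthetic tabular credit features and reports ACC and AUC; the feature count, architecture, and data are assumptions.

```python
# Hedged sketch: a minimal 1-D CNN for tabular credit-default features,
# evaluated with ACC and AUC as in the abstract. Data and architecture
# are illustrative assumptions, not the authors' setup.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
n_samples, n_features = 2000, 24          # assumed dataset size and feature count
X = rng.normal(size=(n_samples, n_features))
y = (X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Conv1D expects a (samples, steps, channels) shape, so add a channel axis.
X_train_c = X_train[..., np.newaxis]
X_test_c = X_test[..., np.newaxis]

model = models.Sequential([
    layers.Input(shape=(n_features, 1)),
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train_c, y_train, epochs=20, batch_size=64, verbose=0)

proba = model.predict(X_test_c, verbose=0).ravel()
print("ACC:", accuracy_score(y_test, proba > 0.5))
print("AUC:", roc_auc_score(y_test, proba))
```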

2021 ◽  
pp. 097226292110362
Author(s):  
Shilpa Shetty H. ◽  
Theresa Nithila Vincent

The unprecedented COVID-19 pandemic has impacted businesses across the globe, and a significant jump in credit default risk is expected. Credit default is an indicator of financial distress experienced by a business and often leads to a bankruptcy filing against the defaulting company. In India, the Insolvency and Bankruptcy Code (IBC) is the law that governs insolvency and bankruptcy. As reported by the Insolvency and Bankruptcy Board of India (IBBI), the number of companies filing for bankruptcy under the IBC is on the rise, and the industrial sector has witnessed the maximum number of bankruptcy filings. The present article attempts to develop a credit default prediction model for the Indian industrial sector based on a sample of 164 companies comprising an equal number of defaulting and non-defaulting companies. A total of 120 companies are used as training samples and 44 companies as testing samples. Binary logistic regression analysis is employed to develop the model. The diagnostic ability of the model is tested using the receiver operating characteristic curve, the area under the curve and annual accuracy. According to the study, return on assets, current ratio, debt to total assets ratio, sales to working capital ratio and cash flow to total assets ratio are statistically significant in predicting default. The findings of the study have significant implications for lending and investment decisions.
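
A minimal sketch of the binary logistic regression workflow the abstract describes, assuming five illustrative financial ratios as predictors and a 120/44 train/test split; the data are synthetic and the ratio names are placeholders, not the study's dataset.

```python
# Hedged sketch: binary logistic regression for default prediction with
# AUC- and accuracy-based evaluation. Ratios and data are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(1)
n = 164  # 82 defaulting + 82 non-defaulting companies, as in the abstract
ratios = ["roa", "current_ratio", "debt_to_assets", "sales_to_wc", "cf_to_assets"]
X = rng.normal(size=(n, len(ratios)))
y = (X[:, 0] - X[:, 2] + 0.8 * rng.normal(size=n) < 0).astype(int)  # 1 = default

# 120 training / 44 testing companies, matching the abstract's split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=120, test_size=44, stratify=y, random_state=1
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, proba))
print("Accuracy:", accuracy_score(y_test, proba > 0.5))
print("Coefficients:", dict(zip(ratios, clf.coef_[0].round(3))))
```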


2020 ◽  
Vol 12 (16) ◽  
pp. 6325 ◽  
Author(s):  
Hyeongjun Kim ◽  
Hoon Cho ◽  
Doojin Ryu

Corporate default predictions play an essential role in each sector of the economy, as highlighted by the global financial crisis and the increase in credit risk. This study reviews the corporate default prediction literature from the perspectives of financial engineering and machine learning. We define three generations of statistical models: discriminant analyses, binary response models, and hazard models. In addition, we introduce three representative machine learning methodologies: support vector machines, decision trees, and artificial neural network algorithms. For both the statistical models and machine learning methodologies, we identify the key studies used in corporate default prediction. By comparing these methods with findings from the interdisciplinary literature, our review suggests some new tasks in the field of machine learning for predicting corporate defaults. First, a corporate default prediction model should be a multi-period model in which future outcomes are affected by past decisions. Second, the stock price and the corporate value determined by the stock market are important factors to use in default predictions. Finally, a corporate default prediction model should be able to suggest the cause of default.
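
To make the methodological contrast concrete, the sketch below fits one representative from several of the families the review discusses (discriminant analysis, a binary response model, an SVM, a decision tree and a small neural network) on synthetic default data; hazard models are omitted because they need survival-analysis tooling, and the features and data are assumptions for illustration only.

```python
# Hedged sketch: comparing representative statistical and machine learning
# classifiers from the review on synthetic corporate-default data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(1500, 8))                       # assumed firm-level features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1500) > 1.0).astype(int)

models = {
    "Discriminant analysis": LinearDiscriminantAnalysis(),
    "Binary response (logit)": LogisticRegression(max_iter=1000),
    "Support vector machine": SVC(),
    "Decision tree": DecisionTreeClassifier(max_depth=4),
    "Neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean CV AUC = {auc:.3f}")
```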


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Fujun Ma ◽  
Fanghao Song ◽  
Yan Liu ◽  
Jiahui Niu

The fatigue energy consumption of independent gestures can be obtained by calculating the power spectrum of surface electromyography (sEMG) signals. Existing research focuses on the fatigue of independent gestures, while studies on integrated gestures are few. However, actual gesture operation is usually composed of multiple independent gestures, so the fatigue degree of integrated gestures can be predicted by training a neural network on independent gestures. In this paper, three natural gestures, namely browsing information, playing games, and typing, are divided into nine independent gestures, and a prediction model is established and trained by calculating the energy consumption of the independent gestures. Artificial neural networks (ANNs), including a backpropagation (BP) neural network, a recurrent neural network (RNN), and long short-term memory (LSTM), are used to predict gesture fatigue, with a support vector machine (SVM) used to assist verification. Mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) are utilized to select the optimal prediction model. Furthermore, separate datasets built from the processed sEMG signal and from its wavelet decomposition coefficients are trained on, respectively, and the resulting errors are compared. The experimental results show that the LSTM model is more suitable for gesture fatigue prediction; the processed sEMG signals are appropriate as a training set for the fatigue degree of one-handed gestures, while wavelet decomposition coefficients are better suited as a dataset for the high-dimensional sEMG signals of two-handed gestures. These results can be applied to predict the fatigue degree of complex human-machine interactive gestures, help to avoid unreasonable gestures, and improve the user's interactive experience.
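
A rough sketch of the kind of LSTM regression pipeline the abstract describes, assuming synthetic single-channel sEMG-like windows and a scalar fatigue score per window; the window length, network size, and data are assumptions and not the authors' configuration.

```python
# Hedged sketch: LSTM regression of a fatigue score from sEMG-like windows,
# evaluated with MSE, RMSE, and MAE as in the abstract. All data are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from tensorflow.keras import layers, models

rng = np.random.default_rng(3)
n_windows, window_len = 1200, 128          # assumed number/length of sEMG windows
X = rng.normal(size=(n_windows, window_len, 1))
# Toy "fatigue" target: mean rectified amplitude of each window plus noise.
y = np.abs(X).mean(axis=(1, 2)) + 0.05 * rng.normal(size=n_windows)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)

model = models.Sequential([
    layers.Input(shape=(window_len, 1)),
    layers.LSTM(32),
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)

pred = model.predict(X_test, verbose=0).ravel()
mse = mean_squared_error(y_test, pred)
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("MAE :", mean_absolute_error(y_test, pred))
```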


EP Europace ◽  
2019 ◽  
Vol 21 (9) ◽  
pp. 1307-1312 ◽  
Author(s):  
Wei-Syun Hu ◽  
Meng-Hsuen Hsieh ◽  
Cheng-Li Lin

Abstract Aims We aimed to construct a random forest model to predict atrial fibrillation (AF) in a Chinese population. Methods and results This study comprised 682 237 subjects with or without AF. Each subject had 19 features, including age, gender, underlying diseases, CHA2DS2-VASc score, and follow-up period. The data were split into train and test sets at an approximate 9:1 ratio: 614 013 data points were placed into the train set and 68 224 data points were placed into the test set. Weighted average F1, precision, and recall values were used to measure prediction model performance, and these values were calculated across the train set, the test set, and all data. The area under the receiver operating characteristic (ROC) curve was also used to evaluate the performance of the prediction model. The prediction model achieved a k-fold cross-validation accuracy of 0.979 (k = 10). In the test set, the prediction model achieved an F1 value of 0.968, a precision value of 0.958, and a recall value of 0.979. The area under the ROC curve of the model was 0.948 (95% confidence interval 0.947–0.949). The model was also validated with a separate dataset. Conclusions This study showed a novel AF risk prediction scheme for Chinese individuals using random forest methodology.
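
A minimal sketch of the evaluation protocol described above, assuming a random forest on synthetic data with 19 placeholder features; the clinical variables and subject counts are not reproduced here.

```python
# Hedged sketch: a random forest AF-risk classifier evaluated with weighted
# precision/recall/F1, 10-fold CV accuracy, and ROC AUC. Data are synthetic
# and the 19 features are placeholders, not the study's clinical variables.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score

rng = np.random.default_rng(4)
n, n_features = 20000, 19                  # far smaller than the 682 237 subjects
X = rng.normal(size=(n, n_features))
y = (0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 1.2).astype(int)

# Approximate 9:1 train/test split, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=4)

clf = RandomForestClassifier(n_estimators=200, random_state=4)
print("10-fold CV accuracy:", cross_val_score(clf, X_train, y_train, cv=10).mean())

clf.fit(X_train, y_train)
pred = clf.predict(X_test)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, pred, average="weighted"
)
print("Weighted precision/recall/F1:", precision, recall, f1)
print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```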


2017 ◽  
Vol 25 (3) ◽  
pp. 321-330 ◽  
Author(s):  
Shang Gao ◽  
Michael T Young ◽  
John X Qiu ◽  
Hong-Jun Yoon ◽  
James B Christian ◽  
...  

Abstract Objective We explored how a deep learning (DL) approach based on hierarchical attention networks (HANs) can improve model performance for multiple information extraction tasks from unstructured cancer pathology reports compared to conventional methods that do not sufficiently capture syntactic and semantic contexts from free-text documents. Materials and Methods Data for our analyses were obtained from 942 deidentified pathology reports collected by the National Cancer Institute Surveillance, Epidemiology, and End Results program. The HAN was implemented for 2 information extraction tasks: (1) primary site, matched to 12 International Classification of Diseases for Oncology topography codes (7 breast, 5 lung primary sites), and (2) histological grade classification, matched to G1–G4. Model performance metrics were compared to conventional machine learning (ML) approaches including naive Bayes, logistic regression, support vector machine, random forest, and extreme gradient boosting, and to other DL models, including a recurrent neural network (RNN), a recurrent neural network with attention (RNN w/A), and a convolutional neural network. Results Our results demonstrate that for both information extraction tasks, the HAN performed significantly better than the conventional ML and DL techniques. In particular, across the 2 tasks, the mean micro and macro F-scores for the HAN with pretraining were (0.852, 0.708), compared to naive Bayes (0.518, 0.213), logistic regression (0.682, 0.453), support vector machine (0.634, 0.434), random forest (0.698, 0.508), extreme gradient boosting (0.696, 0.522), RNN (0.505, 0.301), RNN w/A (0.637, 0.471), and convolutional neural network (0.714, 0.460). Conclusions HAN-based DL models show promise in information extraction tasks from unstructured clinical pathology reports.
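
As a loose illustration of how the conventional baselines in such a comparison are typically evaluated (not the HAN itself), the sketch below trains a TF-IDF plus logistic regression classifier on toy report snippets and reports micro and macro F-scores; the texts and labels are invented for illustration.

```python
# Hedged sketch: a conventional bag-of-words baseline (TF-IDF + logistic
# regression) evaluated with micro and macro F-scores, as used to compare
# against the HAN. Texts and labels here are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Toy stand-ins for deidentified pathology report snippets and site labels.
texts = [
    "infiltrating ductal carcinoma of the left breast upper outer quadrant",
    "adenocarcinoma identified in the right upper lobe of the lung",
    "lobular carcinoma involving the lower inner quadrant of the breast",
    "squamous cell carcinoma of the left lower lobe of the lung",
] * 50
labels = ["breast", "lung", "breast", "lung"] * 50

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=5
)

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
pred = baseline.predict(X_test)
print("micro F-score:", f1_score(y_test, pred, average="micro"))
print("macro F-score:", f1_score(y_test, pred, average="macro"))
```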


2015 ◽  
Vol 733 ◽  
pp. 893-897
Author(s):  
Peng Yu Zhang

The accuracy of short-term wind power forecasting is important for power system operation. Based on real-time wind power data, a wind power prediction model using a wavelet neural network (WNN) is proposed. To overcome the tendency of the WNN to fall into local minima, this paper proposes using the Particle Swarm Optimization (PSO) algorithm to optimize the weights and thresholds of the WNN. A Support Vector Machine (SVM) is used to produce a parallel prediction, and the two outcomes are fed as an input vector to a Generalized Regression Neural Network (GRNN) for nonlinear combination forecasting. Simulation shows that the combination prediction model can improve the accuracy of short-term wind power prediction.
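
A coarse sketch of the combination-forecasting idea under stated assumptions: a neural network and an SVM each produce a forecast, and a second-stage nonlinear regressor combines the two. Since scikit-learn has no GRNN, an RBF kernel ridge regressor is used as a stand-in for the combining model, the wavelet and PSO details are omitted, and the data are synthetic.

```python
# Hedged sketch: two base forecasters (a small neural network standing in for
# the WNN, and an SVM) whose outputs are combined by a second-stage nonlinear
# regressor (RBF kernel ridge as a stand-in for the GRNN). Synthetic data only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(6)
t = np.arange(3000)
power = np.sin(2 * np.pi * t / 144) + 0.3 * rng.normal(size=t.size)  # toy wind power series

# Use the previous 6 values to predict the next one.
lags = 6
X = np.column_stack([power[i:-(lags - i)] for i in range(lags)])
y = power[lags:]
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False, test_size=0.2)

nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=6).fit(X_train, y_train)
svm = SVR().fit(X_train, y_train)

# Second stage: combine the two base forecasts nonlinearly.
Z_train = np.column_stack([nn.predict(X_train), svm.predict(X_train)])
Z_test = np.column_stack([nn.predict(X_test), svm.predict(X_test)])
combiner = KernelRidge(kernel="rbf", alpha=0.1).fit(Z_train, y_train)

print("NN MAE   :", mean_absolute_error(y_test, nn.predict(X_test)))
print("SVM MAE  :", mean_absolute_error(y_test, svm.predict(X_test)))
print("Combo MAE:", mean_absolute_error(y_test, combiner.predict(Z_test)))
```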


2021 ◽  
Vol 15 (58) ◽  
pp. 308-318
Author(s):  
Tran-Hieu Nguyen ◽  
Anh-Tuan Vu

In this paper, a machine learning-based framework is developed to quickly evaluate the structural safety of trusses. Three numerical examples, a 10-bar truss, a 25-bar truss, and a 47-bar truss, are used to illustrate the proposed framework. Firstly, several truss cases with different cross-sectional areas are generated using the Latin Hypercube Sampling method. Stresses inside truss members and displacements of nodes are determined through finite element analyses, and the obtained values are compared with the design constraints. Based on this constraint verification, each case is labelled as safe or unsafe. The members' sectional areas and the safety state are stored as the inputs and outputs of the training dataset, respectively. Three popular machine learning classifiers, Support Vector Machine, Deep Neural Network, and Adaptive Boosting, are used for evaluating the safety of the structures. The comparison is conducted based on two metrics: accuracy and the area under the ROC curve. For the first two examples, all three classifiers achieve more than 90% accuracy. For the 47-bar truss, the accuracies of the Support Vector Machine and Deep Neural Network models are lower than 70%, but the Adaptive Boosting model still retains a high accuracy of approximately 98%. The comparative results are similar in terms of the area under the ROC curve. Overall, the Adaptive Boosting model outperforms the remaining models. In addition, an investigation is carried out to show the influence of its parameters on the performance of the Adaptive Boosting model.
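
A simplified sketch of the classification stage of such a framework, assuming Latin Hypercube samples of member areas and a placeholder constraint check in place of the finite element analysis; the three classifiers and the accuracy/ROC-AUC comparison follow the abstract, everything else is an assumption.

```python
# Hedged sketch: Latin Hypercube sampling of member areas, a toy "unsafe"
# label standing in for the finite element constraint check, and a comparison
# of SVM, neural network, and AdaBoost classifiers by accuracy and ROC AUC.
# Not the authors' implementation.
import numpy as np
from scipy.stats import qmc
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

n_members, n_samples = 10, 2000            # e.g. the 10-bar truss example
sampler = qmc.LatinHypercube(d=n_members, seed=7)
areas = qmc.scale(sampler.random(n=n_samples),
                  [1.0] * n_members, [30.0] * n_members)   # assumed area bounds

# Placeholder for the FE constraint check: "unsafe" when total area is too
# small or any member is very slender. A real pipeline would run FE analysis.
y = ((areas.sum(axis=1) < 140.0) | (areas.min(axis=1) < 2.0)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(areas, y, test_size=0.2, random_state=7)

classifiers = {
    "SVM": SVC(probability=True),
    "Deep Neural Network": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000),
    "Adaptive Boosting": AdaBoostClassifier(n_estimators=200),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)[:, 1]
    print(f"{name}: accuracy={accuracy_score(y_test, proba > 0.5):.3f}, "
          f"AUC={roc_auc_score(y_test, proba):.3f}")
```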


Flooding is a major problem globally, and especially in Surat Thani province, Thailand, where the population density along the lower Tapee River is high. Implementing an early warning system can benefit people living along the banks here. In this study, our aim was to build a flood prediction model using an artificial neural network (ANN) that utilizes rainfall and stream levels along the lower Tapee River to predict floods. The model was trained on a dataset of rainfall and stream levels measured at local stations and consisted of four input variables, namely the rainfall amounts and stream levels at stations located in the Phra Saeng district (X.37A), the Khian Sa district (X.217), and the Phunphin district (X.5C). Model performance was evaluated using input data spanning a period of eight years (2011–2018) and compared with a support vector machine (SVM). The ANN had better accuracy: 97.91% for the ANN model versus 97.54% for the SVM. Furthermore, the recall (42.78%) and F-measure (52.24%) were better for our model, although the precision was lower. The designed flood prediction model can therefore estimate the likelihood of floods around the lower Tapee River region.
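
A minimal sketch of the comparison the study reports, assuming four placeholder input variables (rainfall and stream levels) and a binary flood label; the data are synthetic and the station measurements are not reproduced here.

```python
# Hedged sketch: ANN (MLP) vs SVM flood classification with accuracy,
# precision, recall, and F-measure. Inputs are four synthetic variables
# standing in for the station rainfall and stream levels.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

rng = np.random.default_rng(8)
n = 5000
X = rng.normal(size=(n, 4))                 # rainfall / stream-level placeholders
flood = (X[:, 2] + X[:, 3] + 0.5 * rng.normal(size=n) > 2.2).astype(int)  # rare event

X_train, X_test, y_train, y_test = train_test_split(
    X, flood, test_size=0.2, stratify=flood, random_state=8
)

for name, clf in [("ANN", MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=8)),
                  ("SVM", SVC())]:
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    p, r, f, _ = precision_recall_fscore_support(y_test, pred, average="binary")
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.4f}, "
          f"precision={p:.4f}, recall={r:.4f}, F-measure={f:.4f}")
```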

