Weight-Selected Attribute Bagging for Credit Scoring

Mathematical Problems in Engineering ◽

10.1155/2013/379690 ◽

2013 ◽

Vol 2013 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Jianwu Li ◽

Haizhou Wei ◽

Wangli Hao

Keyword(s):

Credit Risk ◽

Prediction Accuracy ◽

Financial Risk ◽

Evaluation Method ◽

Credit Scoring ◽

Principal Component ◽

Support Vector ◽

Training Samples ◽

Attribute Evaluation ◽

Bagging Method

Assessment of credit risk is of great importance in financial risk management. In this paper, we propose an improved attribute bagging method, weight-selected attribute bagging (WSAB), to evaluate credit risk. Weights of attributes are first computed using an attribute evaluation method such as linear support vector machine (LSVM) or principal component analysis (PCA). Subsets of attributes are then constructed according to these weights: for each attribute subset, the larger an attribute's weight, the higher the probability that it is selected into the subset. Next, training samples and test samples are projected onto each attribute subset. A scoring model is then constructed from each set of projected training samples. Finally, all scoring models vote on the test instances. An individual model that uses only selected attributes should be more accurate because some redundant and uninformative attributes are eliminated. In addition, selecting attributes by probability preserves the diversity of the scoring models. Experimental results on two credit benchmark databases show that the proposed method, WSAB, compares favorably with analogous methods in both prediction accuracy and stability.
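
The steps described in this abstract (weighted attribute sampling, projection, one model per subset, voting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the attribute weights are passed in directly rather than derived from LSVM or PCA, the nearest-centroid base learner is a stand-in because the abstract does not name the scoring model, and all function names (`sample_subset`, `wsab_predict`, and so on) are hypothetical.

```python
import random
from collections import Counter

def sample_subset(weights, k, rng):
    """Draw k distinct attribute indices; a larger weight means a higher selection probability."""
    pool = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pick = rng.choices(pool, weights=[weights[i] for i in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)  # sampling without replacement
    return sorted(chosen)

def project(rows, subset):
    """Keep only the selected attributes of every sample."""
    return [[row[i] for i in subset] for row in rows]

def fit_centroids(X, y):
    """Stand-in base learner (assumption): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(centroids, row):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: dist(centroids[label]))

def wsab_predict(X_train, y_train, X_test, weights, n_models=9, subset_size=3, seed=0):
    """Train one model per weighted attribute subset and take a majority vote."""
    rng = random.Random(seed)
    votes = [Counter() for _ in X_test]
    for _ in range(n_models):
        subset = sample_subset(weights, subset_size, rng)
        model = fit_centroids(project(X_train, subset), y_train)
        for ballot, row in zip(votes, project(X_test, subset)):
            ballot[predict_centroid(model, row)] += 1
    return [ballot.most_common(1)[0][0] for ballot in votes]
```

Sampling without replacement inside each subset, while re-drawing a fresh subset for every model, is one plausible way to get both effects the abstract mentions: high-weight attributes appear in most models, yet the models still differ from one another.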

Nondestructive Testing and Visualization of Catechin Content in Black Tea Fermentation Using Hyperspectral Imaging

Sensors ◽

10.3390/s21238051 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8051

Author(s):

Chunwang Dong ◽

Chongshan Yang ◽

Zhongyuan Liu ◽

Rentian Zhang ◽

Peng Yan ◽

...

Keyword(s):

Spectral Data ◽

Prediction Accuracy ◽

Visual Analysis ◽

Population Analysis ◽

Scatter Correction ◽

Principal Component ◽

Black Tea ◽

Support Vector ◽

Epicatechin Gallate ◽

Variable Combination

Catechin is a major reactive substance involved in black tea fermentation. It has a determinant effect on the final quality and taste of the finished tea. In this study, we applied hyperspectral technology with chemometric methods and used different pretreatment and variable filtering algorithms to reduce noise interference. After reduction of the spectral data dimensions by principal component analysis (PCA), an optimal prediction model for catechin content was constructed, followed by visual analysis of catechin content in leaves fermented for different periods of time. The results showed that zero mean normalization (Z-score), multiplicative scatter correction (MSC), and standard normal variate (SNV) can effectively improve model accuracy, while the shuffled frog leaping algorithm (SFLA), the variable combination population analysis genetic algorithm (VCPA-GA), and variable combination population analysis iteratively retaining informative variables (VCPA-IRIV) can significantly reduce spectral data and enhance the calculation speed of the model. We found that nonlinear models performed better than linear ones. Using the optimal variables, the extreme learning machine (ELM) models reached prediction accuracies of 0.989 for the total amount of catechins and 0.994 for epicatechin gallate (ECG), and the support vector regression (SVR) models reached 0.972, 0.993, 0.990, and 0.994 for EGC, C, EC, and EGCG, respectively. The optimal model offers accurate prediction, and visual analysis can determine the distribution of the catechin content in leaves across different fermentation periods. The findings provide significant reference material for intelligent digital assessment of black tea during processing.
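
Two of the pretreatments named in this abstract, SNV and MSC, have compact textbook definitions, sketched below in plain Python. This follows the commonly used formulations, not necessarily the exact variant used in the paper: SNV here uses the sample standard deviation, MSC regresses each spectrum on the mean spectrum of the set, and the function names are hypothetical.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and std."""
    mu, sd = mean(spectrum), stdev(spectrum)  # sample standard deviation (assumption)
    return [(v - mu) / sd for v in spectrum]

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    ref = [mean(col) for col in zip(*spectra)]  # reference = mean spectrum
    ref_mu = mean(ref)
    sxx = sum((r - ref_mu) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mu = mean(s)
        # Least-squares fit s ~ intercept + slope * ref
        slope = sum((r - ref_mu) * (v - s_mu) for r, v in zip(ref, s)) / sxx
        intercept = s_mu - slope * ref_mu
        corrected.append([(v - intercept) / slope for v in s])
    return corrected
```

If one spectrum is an exact scaled and shifted copy of another, MSC maps both onto the same corrected curve, which is the additive and multiplicative scatter removal the method is meant to provide.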

Supply chain finance credit risk assessment using support vector machine–based ensemble improved with noise elimination

International Journal of Distributed Sensor Networks ◽

10.1177/1550147720903631 ◽

2020 ◽

Vol 16 (1) ◽

pp. 155014772090363 ◽

Cited By ~ 2

Author(s):

Ying Liu ◽

Lihua Huang

Keyword(s):

Risk Assessment ◽

Support Vector Machine ◽

Supply Chain ◽

Credit Risk ◽

Financial Analysis ◽

Learning Algorithm ◽

Principal Component ◽

Support Vector ◽

Machine Model ◽

Supply Chain Finance

Recently, support vector machines, a class of supervised learning algorithms, have been widely used in credit risk management. However, noise may increase the complexity of model building and degrade classifier performance. In this work, we propose an ensemble support vector machine model, combined with a noise reduction method, for risk assessment in supply chain finance. The main characteristics of this approach are that (1) a novel noise filtering scheme based on fuzzy clustering and principal component analysis is proposed to remove both attribute noise and class noise and obtain an optimal clean set, and (2) support vector machine classifiers, based on an improved particle swarm optimization algorithm, serve as component classifiers. The final classification results are then obtained by combining the individual predictions through the AdaBoost algorithm on the new sample set. Experiments apply the approach to supply chain financial analysis of China’s listed companies. Results indicate that this approach can increase credit assessment accuracy.
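
The combination step can be illustrated with the standard discrete (binary) AdaBoost update. The abstract does not say which AdaBoost variant the authors used, so the form below is an assumption; labels are taken to be in {-1, +1}, the component predictions are supplied by the caller rather than by PSO-tuned SVMs, and the function names are hypothetical.

```python
import math

def adaboost_round(sample_weights, y_true, y_pred):
    """One discrete AdaBoost round: return (alpha, renormalized sample weights)."""
    err = sum(w for w, t, p in zip(sample_weights, y_true, y_pred) if t != p)
    err /= sum(sample_weights)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples gain weight, correctly classified ones lose weight.
    raised = [w * math.exp(alpha if t != p else -alpha)
              for w, t, p in zip(sample_weights, y_true, y_pred)]
    total = sum(raised)
    return alpha, [w / total for w in raised]

def weighted_vote(alphas, predictions):
    """Combine component predictions in {-1, +1} by their alpha-weighted sum."""
    score = sum(a * p for a, p in zip(alphas, predictions))
    return 1 if score >= 0 else -1
```

After each round the misclassified samples carry half of the total weight, so the next component classifier is pushed toward the cases the previous one got wrong.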

A New Fuzzy Support Vector Machine for Credit Scoring

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.109.636 ◽

2011 ◽

Vol 109 ◽

pp. 636-640

Author(s):

Bo Tang ◽

Min Xia

Keyword(s):

Support Vector Machine ◽

Economic Development ◽

Prediction Accuracy ◽

Credit Scoring ◽

Fuzzy Membership ◽

Support Vector ◽

Good Prediction ◽

Support Vector Machine Algorithm ◽

Fuzzy Support Vector Machine ◽

Good Prediction Accuracy

With China's rapid economic development, credit scoring has become very important. This paper presents a new fuzzy support vector machine algorithm for credit scoring. The empirical results show that the proposed fuzzy membership model is valid and that the algorithm has good prediction accuracy and noise resistance.

An Ensemble Learning Model Based on SOM-SVM Model for Personal Credit Risk

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.271-273.1286 ◽

2011 ◽

Vol 271-273 ◽

pp. 1286-1290

Author(s):

Yan Feng Guo ◽

Na Sun ◽

Yuan Yao

Keyword(s):

Credit Risk ◽

Financial Management ◽

Financial Risk ◽

Credit Scoring ◽

Experimental Result ◽

Ensemble Model ◽

Single Model ◽

Personal Credit ◽

Svm Model ◽

Management Area

Credit risk is an essential problem in financial management. Personal credit scoring is commonly used to avoid financial risk. Although many methods have been proposed for personal credit scoring and have performed well, most are single-model approaches, which can be disturbed by model parameters, data noise, and other external factors. To overcome this weakness of single models, we believe one of the best ways is to construct an ensemble model. In this paper, we propose a new style of ensemble model and use two public credit datasets to verify its validity. The experimental results show that the ensemble SOM-SVM model can overcome the weakness of single models and improve classification accuracy, which helps in constructing a better credit scoring system in the future.

An Object Detection Method Based on Independent Local Features

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2006.p0744 ◽

2006 ◽

Vol 18 (6) ◽

pp. 744-750

Author(s):

Ryouta Nakano ◽

Kazuhiro Hotta ◽

Haruhisa Takahashi

Keyword(s):

Object Detection ◽

Detection Method ◽

Principal Component ◽

Component Analysis ◽

Local Features ◽

Local Feature ◽

Superior Performance ◽

Support Vector ◽

Car Detection ◽

Training Samples

This paper presents an object detection method using independent local feature extractors. Since objects are composed of a combination of characteristic parts, a good object detector could be developed if local parts specialized for a detection target are derived automatically from training samples. To do this, we use Independent Component Analysis (ICA), which decomposes a signal into independent elementary signals. We then use the basis vectors derived by ICA as independent local feature extractors specialized for a detection target. These feature extractors are applied to a candidate area, and their outputs are used in classification. However, the dimensionality of the extracted independent local features is very high. To reduce the extracted independent local features efficiently, we use Higher-order Local AutoCorrelation (HLAC) features to extract the information that relates neighboring features. This may be more effective for object detection than simple independent local features. To classify detection targets and non-targets, we use a Support Vector Machine (SVM). The proposed method is applied to a car detection problem. Superior performance is obtained in comparison with Principal Component Analysis (PCA).

Well-Logging Prediction Based on Hybrid Neural Network Model

Energies ◽

10.3390/en14248583 ◽

2021 ◽

Vol 14 (24) ◽

pp. 8583

Author(s):

Lei Wu ◽

Zhenzhen Dong ◽

Weirong Li ◽

Cheng Jing ◽

Bochao Qu

Keyword(s):

Neural Network ◽

Prediction Accuracy ◽

Evaluation Method ◽

Oil And Gas ◽

Short Term Memory ◽

Pso Algorithm ◽

Well Logging ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector

Well-logging is an important formation characterization and resource evaluation method in oil and gas exploration and development. However, well-logging data are often in short supply because well logs can only be acquired through expensive and time-consuming field tests. In this study, we aimed to find effective machine learning techniques for well-logging data prediction, considering the temporal and spatial characteristics of well-logging data. To achieve this goal, a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network were combined to extract the spatial and temporal features of well-logging data, and the particle swarm optimization (PSO) algorithm was used to determine the hyperparameters of the optimal CNN-LSTM architecture for predicting logging curves. We applied the proposed CNN-LSTM-PSO model, along with support vector regression, gradient-boosting regression, CNN-PSO, and LSTM-PSO models, to forecast photoelectric effect (PE) logs from other logs of the target well, and from logs of adjacent wells. Among the applied algorithms, the proposed CNN-LSTM-PSO model generated the best prediction of PE logs because it fully considers the spatio-temporal information of other well-logging curves. The prediction accuracy of the PE log using logs of the adjacent wells was not as good as that using the other well-logging data of the target well itself, due to geological uncertainties between the target well and adjacent wells. The results also show that the prediction accuracy of the models can be significantly improved with the PSO algorithm. The proposed CNN-LSTM-PSO model was found to enable reliable and efficient well-logging prediction for existing and newly drilled wells; further, as reservoir complexity increases, the proxy model should be able to reduce the optimization time dramatically.
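
The role PSO plays here, searching a bounded hyperparameter space for the setting with the lowest loss, can be sketched with a basic global-best particle swarm. This is an illustration under stated assumptions, not the paper's configuration: the inertia and acceleration coefficients are common textbook defaults, the objective would in practice be the validation loss of a trained CNN-LSTM (here any callable), and `pso_minimize` is a hypothetical name.

```python
import random

def pso_minimize(objective, bounds, n_particles=12, n_iters=40, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Global-best PSO over box bounds; returns (best position, best value, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # personal best positions
    best_val = [objective(p) for p in pos]     # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    history = [g_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp the particle to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
        history.append(g_val)
    return g_pos, g_val, history
```

A caller might pass bounds such as a log10 learning-rate range and a hidden-unit range; both are hypothetical examples, since the abstract does not list which hyperparameters were tuned.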

Multi-Classification Combination Algorithm Based on Logit Model and Support Vector Machine

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.734-737.2978 ◽

2013 ◽

Vol 734-737 ◽

pp. 2978-2982 ◽

Cited By ~ 1

Author(s):

Xin Lei Zhang ◽

Meng Gang Li ◽

Zuo Quan Zhang

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Regression Analysis ◽

Logit Model ◽

Prediction Accuracy ◽

Principal Component ◽

Composite Indicator ◽

Support Vector ◽

Logit Regression ◽

Multi Classification

Based on the basic theories of Logit regression analysis and support vector machines, this article presents an improved multi-classification combination algorithm. The model includes several innovations. First, an optimized composite indicator is chosen as a variable through principal component analysis to capture more information. Second, a Logit parameter model is introduced to the quadratic to increase prediction accuracy. Third, a multi-classification combination model of the improved Logit model with SVM is put forward to increase prediction accuracy.

PREDICTING Ms TEMPERATURE APPLYING PRINCIPAL COMPONENT ANALYSIS-ARTIFICIAL NEURAL NETWORKS

International Journal of Modern Physics B ◽

10.1142/s021797920906052x ◽

2009 ◽

Vol 23 (06n07) ◽

pp. 1099-1104 ◽

Cited By ~ 3

Author(s):

XUEXIA XU ◽

BINGZHE BAI ◽

WEI YOU

Keyword(s):

Principal Component Analysis ◽

Martensite Transformation ◽

Prediction Accuracy ◽

Principal Component ◽

Component Analysis ◽

Scatter Diagram ◽

Ann Model ◽

Training Samples ◽

Artificial Neural ◽

Input Variables

The principal component analysis-artificial neural network (PCA-ANN) model was developed to predict the martensite transformation start temperature (Ms) of steels. Training samples were processed by principal component analysis, which reduced the number of input variables from 6 to 4; the scores of the principal components were then used to establish a new sample database to train the ANN model. The Ms of steels was predicted by the PCA-ANN model. The predicted and measured Ms values are distributed along the 0-45° diagonal in the scatter diagram, and the statistical errors are MSE 16.0256, MSRE 4.49%, and VOF 1.97790, respectively. A comparison of the prediction results of different models shows that the accuracy of the PCA-ANN model was the highest, which indicates that principal component analysis helped improve the prediction accuracy of the ANN model.

Predicting Ash Fusibility of Coal from Coal Properties

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.354-355.216 ◽

2011 ◽

Vol 354-355 ◽

pp. 216-221 ◽

Cited By ~ 1

Author(s):

Jian Guo Yang ◽

Xiao Long Zhang ◽

Hong Zhao

Keyword(s):

Support Vector Machine ◽

Prediction Accuracy ◽

Coal Ash ◽

Softening Temperature ◽

Support Vector ◽

Safe Operation ◽

Training Samples ◽

Ash Fusibility ◽

Coal Properties ◽

Input Variables

Knowing the ash fusibility of coal in advance is important for safe operation and energy saving. Ash fusibility was divided into three levels according to softening temperature. The fusibility level was correlated with coal properties by a nonlinear classification model built using a support vector machine. The model takes coal properties as input variables and outputs a fusibility level. Validation of the model on the 62 training samples yielded 100% accuracy, and the prediction accuracy on 15 testing samples was 86.7%. Results indicate that the level of ash fusibility can be accurately predicted from coal properties with the nonlinear classification model.
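
The reported percentages can be turned back into sample counts with a small arithmetic check. The helper below is a hypothetical convenience, not something from the paper; it lists every number of correct predictions that rounds to the reported figure.

```python
def correct_counts(n_samples, reported_pct, decimals=1):
    """Return every count of correct predictions that rounds to the reported percentage."""
    return [k for k in range(n_samples + 1)
            if abs(round(100.0 * k / n_samples, decimals) - reported_pct) < 1e-9]
```

For 15 test samples, 86.7% corresponds only to 13 correct predictions (13/15 ≈ 0.8667), so two test samples were misclassified; 100% on 62 training samples means all 62 were fitted correctly.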

Prediction of tRNA Based on LS-SVM Algorithm with Principal Component Analysis

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.461.753 ◽

2012 ◽

Vol 461 ◽

pp. 753-756

Author(s):

Chong Xing ◽

Yao Wang ◽

You Zhou ◽

Yan Chun Liang

Keyword(s):

Principal Component Analysis ◽

Prediction Accuracy ◽

Principal Component ◽

Component Analysis ◽

Support Vector ◽

Single Nucleotide ◽

Non Coding Rna ◽

Svm Algorithm ◽

The One ◽

Prediction Strategy

Recently, non-coding RNA prediction has become one of the most important research topics in bioinformatics. In this paper, on the basis of principal component analysis, we present a tRNA prediction strategy using a least squares support vector machine (LS-SVM). Appearance frequencies of single nucleotides and 2-nucleotides, together with (G-C) % and (A-T) %, were chosen as characteristic inputs. Test results showed that the prediction accuracy was 90.51% on a prokaryotic tRNA dataset. Experimental results indicate that the method is effective for prokaryotic ncRNA prediction.
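
The input features listed in this abstract can be computed along the following lines. This is a sketch under assumptions: "(G-C) %" and "(A-T) %" are read as the percentage of G+C and A+T bases, sequences are taken to use the DNA alphabet ACGT, dinucleotides are counted with overlapping windows, and `sequence_features` is a hypothetical name. The PCA and LS-SVM stages are not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # DNA alphabet (assumption; the abstract mentions A-T, not A-U)

def sequence_features(seq):
    """Single-nucleotide and dinucleotide frequencies plus GC% and AT% (needs len(seq) >= 2)."""
    seq = seq.upper()
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))  # overlapping dinucleotides
    features = {base: mono[base] / n for base in ALPHABET}
    for a, b in product(ALPHABET, repeat=2):
        features[a + b] = di[a + b] / (n - 1)
    features["GC%"] = 100.0 * (mono["G"] + mono["C"]) / n
    features["AT%"] = 100.0 * (mono["A"] + mono["T"]) / n
    return features
```

Under these assumptions each sequence yields 22 values (4 single-nucleotide frequencies, 16 dinucleotide frequencies, and 2 composition percentages), a vector small enough to pass through PCA before classification.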