Research on Default Prediction for Credit Card Users Based on XGBoost-LSTM Model

2021 · Vol 2021 · pp. 1-13
Author(s): Jing Gao, Wenjun Sun, Xin Sui

The credit card business has become an indispensable financial service for commercial banks. With the development of the credit card business, commercial banks have achieved outstanding results in maintaining existing customers, tapping potential customers, and expanding market share. During credit card operations, massive amounts of data in multiple dimensions are generated, including basic customer information; billing, installment, and repayment information; transaction flows; and overdue records. Compared with the preloan and postloan stages, default prediction during the on-loan stage involves data at a much larger scale, which makes it difficult to identify signs of risk. With the recent growing maturity and practicality of technologies such as big data analysis and artificial intelligence, it has become possible to further mine and analyze massive amounts of transaction data. This study mined and analyzed the transaction flow data that best reflected customer behavior. XGBoost, which is widely used in financial classification models, and long short-term memory (LSTM), which is widely used for time-series data, were selected for comparative research. The accuracy of the XGBoost model depends on the degree of expertise in feature extraction, while the LSTM algorithm can achieve higher accuracy without feature extraction. The resulting XGBoost-LSTM model showed good classification performance in default prediction. The results of this study can provide a reference for the application of deep learning algorithms in the field of finance.
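As a rough illustration of how the two components could be combined, the sketch below pairs an XGBoost classifier on engineered tabular features with an LSTM on raw transaction-flow sequences and blends their scores. All shapes, hyperparameters, and the score-level blend are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: XGBoost on tabular features, LSTM on sequences.
import numpy as np
from xgboost import XGBClassifier
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
n, seq_len, n_seq_feat, n_tab_feat = 1000, 30, 8, 20

X_tab = rng.normal(size=(n, n_tab_feat))           # engineered features (bills, limits, ...)
X_seq = rng.normal(size=(n, seq_len, n_seq_feat))  # monthly transaction-flow sequences
y = rng.integers(0, 2, size=n)                     # 1 = default, 0 = non-default

# XGBoost on the tabular view
xgb = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
xgb.fit(X_tab, y)

# LSTM on the sequential view
lstm = models.Sequential([
    layers.Input(shape=(seq_len, n_seq_feat)),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),
])
lstm.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
lstm.fit(X_seq, y, epochs=5, batch_size=64, verbose=0)

# One plausible way to combine the two models: a simple score-level blend
p = 0.5 * xgb.predict_proba(X_tab)[:, 1] + 0.5 * lstm.predict(X_seq, verbose=0).ravel()
```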

Complexity · 2021 · Vol 2021 · pp. 1-13
Author(s): Ying Chen, Ruirui Zhang

To address the problem that a financial institution's credit card default data are imbalanced, which leads to unsatisfactory prediction results, this paper proposes a prediction model based on k-means SMOTE and a BP neural network. In this model, the k-means SMOTE algorithm is used to change the data distribution, the importance of the data features is then calculated with a random forest, and these importances are substituted into the initial weights of the BP neural network for prediction. The model effectively solves the problem of sample data imbalance. At the same time, this paper constructs five common machine learning models (KNN, logistic regression, SVM, random forest, and decision tree) and compares the classification performance of these six prediction models. The experimental results show that the proposed algorithm can greatly improve the prediction performance of the model, raising its AUC value from 0.765 to 0.929. Moreover, when the feature importances are taken as the initial weights of the BP neural network, the accuracy of the model's predictions also improves slightly. In addition, compared with the other five prediction models, the overall prediction performance of the BP neural network is better.
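A rough sketch of this pipeline on synthetic imbalanced data is shown below: k-means SMOTE rebalancing (via imbalanced-learn), random-forest feature importances, and a small fully connected (BP) network whose first-layer weights are scaled by those importances. The data shapes, network size, and the exact way the importances enter the initial weights are assumptions, not the authors' code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import KMeansSMOTE
from tensorflow.keras import layers, models

# Synthetic stand-in for an imbalanced credit-card default dataset
X, y = make_classification(n_samples=3000, n_features=23, n_informative=10,
                           weights=[0.78, 0.22], random_state=0)

# Rebalance the minority (default) class with k-means SMOTE
X_res, y_res = KMeansSMOTE(cluster_balance_threshold=0.1,
                           random_state=0).fit_resample(X, y)

# Feature importances from a random forest
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)
importance = rf.feature_importances_

# BP (fully connected) network; scale the first layer's random initial weights
# row-wise by the normalised importances -- one reading of the idea above.
first_dense = layers.Dense(32, activation="relu")
bp = models.Sequential([layers.Input(shape=(X.shape[1],)),
                        first_dense,
                        layers.Dense(1, activation="sigmoid")])
W, b = first_dense.get_weights()
first_dense.set_weights([W * (importance / importance.mean())[:, None], b])

bp.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
bp.fit(X_res, y_res, epochs=10, batch_size=64, verbose=0)
```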


2020 · Vol 2020 · pp. 1-11
Author(s): Changfeng Chen, Qiang Li

To address the shortcomings of single-network classification models, this paper applies a combined CNN-LSTM (convolutional neural network plus long short-term memory) network to music emotion classification and proposes a multifeature combined network classifier based on CNN-LSTM, which combines 2D (two-dimensional) feature input through the CNN-LSTM and 1D (one-dimensional) feature input through a DNN (deep neural network) to make up for the deficiencies of the original single-feature models. The model uses multiple convolution kernels in the CNN for 2D feature extraction and a BiLSTM (bidirectional LSTM) for sequence processing, and is applied, respectively, to single-modal emotion classification of audio and lyrics. In the audio feature extraction, the music audio is finely segmented and the human voice is separated to obtain pure background sound clips, from which the spectrogram and LLDs (low-level descriptors) are extracted. In the lyrics feature extraction, the chi-squared test vector and the word embeddings produced by Word2vec are used, respectively, as feature representations of the lyrics. Combining the two types of heterogeneous features selected from audio and lyrics through the classification model improves classification performance. To fuse the emotional information of the two modalities, music audio and lyrics, this paper proposes a multimodal ensemble learning method based on stacking. Unlike existing feature-level and decision-level fusion methods, this method avoids the information loss caused by direct dimensionality reduction: the original features are converted into label-level results before fusion, effectively solving the problem of feature heterogeneity. Experiments on the Million Song Dataset show that the audio classification accuracy of the proposed multifeature combined network classifier reaches 68% and the lyrics classification accuracy reaches 74%. The average multimodal classification accuracy reaches 78%, a significant improvement over the single-modal models.
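The stacking-based fusion step can be pictured with the condensed sketch below, which uses random stand-in features and simple scikit-learn classifiers instead of the paper's CNN-LSTM and DNN models; only the idea of fusing label-level (probability) outputs with a meta-learner is retained, and all shapes are assumed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n, n_classes = 2000, 4
X_audio = rng.normal(size=(n, 128))   # stand-in for spectrogram/LLD features
X_lyric = rng.normal(size=(n, 300))   # stand-in for chi-square / Word2vec features
y = rng.integers(0, n_classes, size=n)

audio_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
lyric_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)

# Out-of-fold probabilities avoid leaking training labels into the meta-learner.
P_audio = cross_val_predict(audio_clf, X_audio, y, cv=5, method="predict_proba")
P_lyric = cross_val_predict(lyric_clf, X_lyric, y, cv=5, method="predict_proba")

meta_X = np.hstack([P_audio, P_lyric])          # fuse label-level outputs, not raw features
meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, y)
```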


2019 · Vol 9 (9) · pp. 1879
Author(s): Kai Feng, Xitian Pi, Hongying Liu, Kai Sun

Myocardial infarction is one of the most threatening cardiovascular diseases for human beings. With the rapid development of wearable devices and portable electrocardiogram (ECG) medical devices, it has become feasible to detect and monitor myocardial infarction ECG signals in time. This paper proposes a multi-channel automatic classification algorithm combining a 16-layer convolutional neural network (CNN) and a long short-term memory (LSTM) network for lead-I myocardial infarction ECG. The algorithm first preprocesses the raw data to extract heartbeat segments; the multi-channel CNN and LSTM are then trained to automatically learn the acquired features and complete the myocardial infarction ECG classification. We used the Physikalisch-Technische Bundesanstalt (PTB) database for algorithm verification and obtained an accuracy of 95.4%, a sensitivity of 98.2%, a specificity of 86.5%, and an F1 score of 96.8%, indicating that the model can achieve good classification performance without complex handcrafted features.
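For orientation, a much shallower stand-in for the CNN-plus-LSTM arrangement is sketched below; the segment length, layer sizes, and the binary MI-versus-healthy output are assumptions rather than the paper's exact 16-layer architecture.

```python
from tensorflow.keras import layers, models

segment_len = 651  # assumed number of samples per extracted heartbeat segment

model = models.Sequential([
    layers.Input(shape=(segment_len, 1)),          # one ECG lead per channel
    layers.Conv1D(32, 5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(64, 5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.LSTM(64),                               # temporal modelling of CNN features
    layers.Dense(1, activation="sigmoid"),         # MI vs. healthy control
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```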


Author(s): Yuejun Liu, Yifei Xu, Xiangzheng Meng, Xuguang Wang, Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multi-dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multi-dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images, and the performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal, and hypothyroidism. In the preprocessing, the thyroid region of interest is segmented and the data sample is augmented. Four deep learning models, namely a standard CNN, Inception, VGG16, and an RNN, are used to evaluate the deep learning methods. Results: The deep learning based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model has the best performance, with an accuracy of 96.2% and an AUC of 99.6%. In particular, the VGG16 model with a changing learning rate works best. Conclusion: The four deep learning models, standard CNN, Inception, VGG16, and RNN, are effective for the classification of thyroid diseases from SPECT images. The accuracy of the deep learning based assisted diagnostic method is higher than that of other methods reported in the literature.
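The VGG16 variant could be set up along the following lines: a transfer-learning sketch with a three-class head and a learning rate that is reduced on plateau, as one possible reading of the "changing learning rate". The input size, frozen base, and schedule are assumptions, not the authors' setup.

```python
from tensorflow.keras import layers, models, optimizers, callbacks
from tensorflow.keras.applications import VGG16

base = VGG16(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False                      # start from frozen ImageNet features

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="softmax"),  # hyperthyroidism / normal / hypothyroidism
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# "Changing learning rate": halve the rate whenever the validation loss plateaus.
lr_schedule = callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3)
# model.fit(train_ds, validation_data=val_ds, epochs=50, callbacks=[lr_schedule])
```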


Author(s): Farrikh Alzami, Erika Devi Udayanti, Dwi Puji Prabowo, Rama Aria Megantara

Sentiment analysis, in terms of polarity classification, is very important in everyday life: knowing the polarity of a document lets people find out whether it carries positive or negative sentiment, which helps in choosing and making decisions. Sentiment analysis is usually done manually, so an automatic sentiment classification process is needed. However, it is rare to find studies that discuss which feature extraction methods and learning models are suitable for unstructured sentiment analysis, such as the Amazon food review case. This research explores several feature extraction methods, such as bag-of-words, TF-IDF, Word2Vec, and a combination of TF-IDF and Word2Vec, with several machine learning models, such as random forest, SVM, KNN, and Naïve Bayes, to find a combination of feature extraction and learning model that can add variety to polarity sentiment analysis. After document preparation, such as removing HTML tags, punctuation, and special characters, and applying Snowball stemming, the results show that TF-IDF with SVM is suitable for polarity classification in unstructured sentiment analysis for the Amazon food review case, with a performance of 87.3 percent.
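The best-performing combination reported (TF-IDF features with an SVM after cleaning and Snowball stemming) can be reproduced in outline as follows; the toy reviews and the exact cleaning rules are illustrative assumptions.

```python
import re
from nltk.stem.snowball import SnowballStemmer
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

stemmer = SnowballStemmer("english")

def clean(text):
    text = re.sub(r"<[^>]+>", " ", text)               # drop HTML tags
    text = re.sub(r"[^a-zA-Z\s]", " ", text).lower()   # drop punctuation/special chars
    return " ".join(stemmer.stem(tok) for tok in text.split())

docs = ["Great taste, will buy again!", "Arrived stale and <b>awful</b>."]
labels = [1, 0]                                        # 1 = positive, 0 = negative

model = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=clean, ngram_range=(1, 2))),
    ("svm", LinearSVC()),
])
model.fit(docs, labels)
print(model.predict(["not fresh at all"]))
```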


2020 · Vol 10 (11) · pp. 3788
Author(s): Qi Ouyang, Yongbo Lv, Jihui Ma, Jing Li

With the development of big data and deep learning, bus passenger flow prediction that takes real-time data into account becomes possible. Real-time traffic flow prediction helps to grasp real-time passenger flow dynamics, provides early warning of sudden passenger flows and data support for real-time bus plan changes, and improves the stability of urban transportation systems. To solve the problem of passenger flow prediction with real-time data, this paper proposes a novel passenger flow prediction network model based on long short-term memory (LSTM) networks. The model includes four parts: feature extraction based on an XGBoost model, information coding based on historical data, information coding based on real-time data, and decoding based on a multi-layer neural network. In the feature extraction part, the data dimension is increased by fusing bus data and points of interest to increase the number of parameters and improve model accuracy. In the historical information coding part, the date is used as the index in the LSTM structure to encode historical data and provide relevant information for prediction; in the real-time data coding part, the daily half-hour time interval is used as the index to encode real-time data and provide real-time prediction information; in the decoding part, the passenger flow for the next two 30 min intervals is output by decoding all of the information. To the best of our knowledge, this is the first time real-time information has been taken into consideration in passenger flow prediction based on LSTM. The proposed model achieves better accuracy than the LSTM and other baseline methods.
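One way to picture the two-branch encoder/decoder described above is the sketch below: an LSTM over historical daily features, an LSTM over the current day's half-hour intervals, and a multi-layer decoder that outputs the next two 30 min flows. The sequence lengths, feature dimension, and layer sizes are assumptions, not the authors' exact architecture.

```python
from tensorflow.keras import layers, models

hist_in = layers.Input(shape=(14, 16), name="history")    # 14 past days, 16 fused features
real_in = layers.Input(shape=(12, 16), name="real_time")  # today's 30-min intervals so far

h = layers.LSTM(64)(hist_in)   # encode historical data (indexed by date)
r = layers.LSTM(64)(real_in)   # encode today's real-time data (indexed by interval)

x = layers.Concatenate()([h, r])
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(2, name="next_two_intervals")(x)       # next two 30-min flows

model = models.Model([hist_in, real_in], out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```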


Sensors · 2020 · Vol 20 (18) · pp. 5037
Author(s): Hisham ElMoaqet, Mohammad Eid, Martin Glos, Mutaz Ryalat, Thomas Penzel

Sleep apnea is a common sleep disorder that causes repeated breathing interruptions during sleep. The performance of automated apnea detection methods based on respiratory signals depends on the signals considered and the feature extraction methods. Moreover, feature engineering techniques are highly dependent on the experts' experience and their prior knowledge about different physiological signals and the conditions of the subjects. To overcome these problems, a novel deep recurrent neural network (RNN) framework is developed for automated feature extraction and detection of apnea events from single respiratory channel inputs. Long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) networks are investigated to develop the proposed deep RNN model. The proposed framework is evaluated over three respiration signals: oronasal thermal airflow (FlowTh), nasal pressure (NPRE), and abdominal respiratory inductance plethysmography (ABD). To demonstrate our results, we use polysomnography (PSG) data of 17 patients with obstructive, central, and mixed apnea events. Our results indicate the effectiveness of the proposed framework in automatic extraction of temporal features and automated detection of apneic events over the different respiratory signals considered in this study. Using a deep BiLSTM-based detection model, the NPRE signal achieved the highest overall detection results, with a true positive rate (sensitivity) of 90.3%, a true negative rate (specificity) of 83.7%, and an area under the receiver operating characteristic curve of 92.4%. The present results contribute a new deep learning approach for automated detection of sleep apnea events from single-channel respiration signals that can potentially serve as a helpful alternative to the traditional PSG method.
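A minimal BiLSTM detector over a single respiration channel might look like the sketch below; the window length, sampling rate, and layer sizes are assumptions.

```python
from tensorflow.keras import layers, models

window = 30 * 32  # e.g. 30 s of one respiration signal sampled at 32 Hz (assumed)

model = models.Sequential([
    layers.Input(shape=(window, 1)),                       # single channel (e.g. NPRE)
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    layers.Bidirectional(layers.LSTM(32)),
    layers.Dense(1, activation="sigmoid"),                 # apnea vs. normal breathing
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.summary()
```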


2021 · Vol 13 (10) · pp. 1950
Author(s): Cuiping Shi, Xin Zhao, Liguo Wang

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded the width of the network to extract more deep features, thereby increasing the complexity of the model. To solve this problem, in this paper we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. Firstly, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted by the cooperation of multiple convolutions. Then, the feature weights are calculated and the extracted deep features are passed to the attention mechanism for further feature extraction. Next, all of the extracted features are fused by multiple branches. Finally, depthwise separable convolution and asymmetric convolution are used to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method still has a great advantage in classification accuracy with very few parameters.
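The parameter-saving ingredients named here, depthwise separable and asymmetric convolutions combined in a small multi-branch block with simple channel attention, are illustrated in the sketch below; it is not the full AMB-CNN architecture, and all sizes are assumptions.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(64, 64, 32))

# Branch 1: depthwise-separable convolution
b1 = layers.SeparableConv2D(32, 3, padding="same", activation="relu")(inputs)

# Branch 2: asymmetric convolutions approximating a 3x3 kernel with fewer parameters
b2 = layers.Conv2D(32, (1, 3), padding="same", activation="relu")(inputs)
b2 = layers.Conv2D(32, (3, 1), padding="same", activation="relu")(b2)

# Simple channel attention over the fused branches (squeeze-and-excitation style)
x = layers.Add()([b1, b2])
w = layers.GlobalAveragePooling2D()(x)
w = layers.Dense(8, activation="relu")(w)
w = layers.Dense(32, activation="sigmoid")(w)
x = layers.Multiply()([x, layers.Reshape((1, 1, 32))(w)])

block = models.Model(inputs, x)
block.summary()
```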


2021
Author(s): Guilherme Zanini Moreira, Marcelo Romero, Manassés Ribeiro

After the advent of the Web, the number of people who abandoned traditional media channels and started receiving news only through social media has increased. However, this has also increased the spread of fake news due to the ease of sharing information. The consequences are varied, one of the main ones being possible attempts to manipulate public opinion in elections or to promote movements that can damage the rule of law or the institutions that represent it. The objective of this work is to perform fake news detection using distributed representations and recurrent neural networks (RNNs). Although fake news detection using RNNs has already been explored in the literature, there is little research on the processing of texts in the Portuguese language, which is the focus of this work. For this purpose, distributed representations of the texts are generated with three different algorithms (fastText, GloVe, and word2vec) and used as input features for a long short-term memory (LSTM) network. The approach is evaluated using a publicly available labelled news dataset. The proposed approach shows promising results for all three distributed representation methods, with the combination word2vec+LSTM providing the best results. The proposed approach shows better classification performance than simple architectures, while similar results are obtained when it is compared to deeper architectures or more complex methods.
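A simplified version of the word2vec-plus-LSTM pipeline is sketched below on a toy corpus: word vectors are trained with gensim, loaded into a frozen embedding layer, and a small LSTM performs the binary classification. The corpus, labels, vector size, and sequence length are assumptions, not the authors' Portuguese news setup.

```python
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import layers, models

# Toy corpus and labels (illustrative only)
texts = [["governo", "anuncia", "nova", "medida", "economica"],
         ["celebridade", "morre", "em", "acidente", "segundo", "boato"]]
labels = np.array([0, 1])          # 0 = legitimate news, 1 = fake news

# Train word vectors on the corpus (the paper also evaluates fastText and GloVe)
w2v = Word2Vec(texts, vector_size=50, window=3, min_count=1, seed=0)
vocab = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}  # 0 kept for padding

emb = np.zeros((len(vocab) + 1, 50))
for w, i in vocab.items():
    emb[i] = w2v.wv[w]

# Left-pad the index sequences to a fixed length
maxlen = 20
X = np.zeros((len(texts), maxlen), dtype="int32")
for r, t in enumerate(texts):
    ids = [vocab[w] for w in t][:maxlen]
    X[r, maxlen - len(ids):] = ids

emb_layer = layers.Embedding(len(vocab) + 1, 50, trainable=False)
model = models.Sequential([layers.Input(shape=(maxlen,)),
                           emb_layer,
                           layers.LSTM(32),
                           layers.Dense(1, activation="sigmoid")])
emb_layer.set_weights([emb])       # load the pretrained word2vec vectors

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, labels, epochs=3, verbose=0)
```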


Author(s): Siu-Yeung Cho, Teik-Toe Teoh, Yok-Yen Nguwi

Facial expression recognition is a challenging task. A facial expression is formed by contracting or relaxing different facial muscles on the human face, which results in temporarily deformed facial features such as a wide-open mouth or raised eyebrows. Such systems have to address several issues. For instance, lighting conditions are very difficult to constrain and regulate. On the other hand, real-time processing is also challenging, since many facial features have to be extracted and processed, and conventional classifiers are sometimes not effective in handling those features or producing good classification performance. This chapter discusses how advanced feature selection techniques, together with good classifiers, can play a vital role in real-time facial expression recognition. Several feature selection methods and classifiers are discussed, and their evaluations for real-time facial expression recognition are presented in this chapter. The content of this chapter opens up a discussion about building a real-time system to read and respond to the emotions of people from facial expressions.

