Handling imbalanced data in customer churn prediction using combined sampling and weighted random forest

Predicting customer churn is a major and demanding problem in the telecommunications industry. Churn refers to customers who leave one telecom service provider for a competitor in search of better service offers. Telecom companies typically have customer relationship management offices whose main objective is to win back defecting clients, because retaining long-term customers can be far more beneficial to a company than acquiring new ones. Researchers and practitioners are paying great attention to, and investing more in, developing robust customer churn prediction models, especially in the telecommunications business, and have proposed numerous machine learning approaches. Many classification approaches have been established, but tree-based methods have been the most effective in recent times. The main contribution of this research is to predict churners and non-churners in the telecom sector using a projection pursuit Random Forest (PPForest), which uses discriminant feature analysis as a novel extension of the conventional Random Forest approach for learning oblique projection pursuit trees (PPtree). The proposed methodology leverages two discriminant analysis methods to calculate the projection index used in constructing the PPtree. The first method uses Support Vector Machines (SVM) as a classifier in the construction of PPForest to differentiate between churners and non-churners. The second method uses Linear Discriminant Analysis (LDA) to achieve linear splitting of variables at each node during oblique PPtree construction, producing individual classifiers that are more robust and more diverse than those of the classical Random Forest. The proposed methods were found to achieve the best performance on measures such as accuracy, hit rate, ROC curve, Gini coefficient, Kolmogorov-Smirnov statistic, lift coefficient, H-measure, and AUC. Moreover, PPForest based on applying LDA directly to the raw data delivers an effective evaluator for the customer churn prediction model.
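
Several of the measures named in this abstract (ROC curve, AUC, Gini coefficient, Kolmogorov-Smirnov statistic) can be derived from the same ranked list of churn scores. The following is a minimal sketch in plain Python, not the authors' evaluation code; the labels and scores are made-up toy values, and the function names are hypothetical. It assumes the usual definitions: Gini = 2 * AUC - 1, and KS as the largest gap between the true positive rate and the false positive rate over all thresholds.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low score."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume every customer tied at the current score before emitting a point.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc_from_points(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def gini_from_auc(auc_value):
    """Gini coefficient as commonly defined in scoring: 2 * AUC - 1."""
    return 2 * auc_value - 1


def ks_statistic(points):
    """Kolmogorov-Smirnov statistic: largest gap between TPR and FPR."""
    return max(abs(tpr - fpr) for fpr, tpr in points)


# Toy data (assumption): 1 = churner, 0 = non-churner, with hypothetical model scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

points = roc_points(labels, scores)
auc_value = auc_from_points(points)
gini_value = gini_from_auc(auc_value)
ks_value = ks_statistic(points)

print(f"AUC={auc_value:.3f} Gini={gini_value:.3f} KS={ks_value:.3f}")
```

On this toy ranking the AUC is 5/6, so the Gini coefficient is 2/3. Because all three numbers come from one ranking, they tend to move together when models are compared.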

Customer Churn Prediction and Upselling using MRF (Modified Random Forest) technique

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8392.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 475-482

Keyword(s):

Data Mining ◽

Comparative Analysis ◽

Random Forest ◽

Prediction Model ◽

Classification Accuracy ◽

Churn Prediction ◽

Customer Churn ◽

Customer Churn Prediction ◽

Telecom Industry ◽

Fierce Competition

Customer churn prediction has become one of the prominent topics in the telecom industry and has gained a lot of attention in the research community due to fierce competition. As a result, companies have focused on larger volumes of data for churn and upselling prediction. A customer churn prediction model detects and identifies customers who are likely to terminate their subscription; both churn prediction and upselling can be carried out through data mining. In this paper we introduce a model named MRF (Modified Random Forest), which helps enhance accuracy and also helps avoid the regression issue. Our methodology is applied to the provided Orange datasets. To evaluate our algorithm, a comparative analysis between the existing and proposed methodologies is carried out for two scenarios, churn and upselling. Our model is then compared with various existing churn prediction models; the results indicate that it outperforms the existing methods, including the standard random forest, in terms of AUC and classification accuracy.

Application of Feature Extraction Method in Customer Churn Prediction Based on Random Forest and Transduction

Journal of Convergence Information Technology ◽

10.4156/jcit.vol5.issue3.11 ◽

2010 ◽

Vol 5 (3) ◽

pp. 73-78 ◽

Cited By ~ 1

Author(s):

Qiu Yihui ◽

Mi Hong

Keyword(s):

Feature Extraction ◽

Random Forest ◽

Extraction Method ◽

Churn Prediction ◽

Feature Extraction Method ◽

Customer Churn ◽

Customer Churn Prediction

Research on Ctrip Customer Churn Prediction Model Based on Random Forest

Business Intelligence and Information Technology - Lecture Notes on Data Engineering and Communications Technologies ◽

10.1007/978-3-030-92632-8_48 ◽

2021 ◽

pp. 511-523

Author(s):

Zhijie Zhao ◽

Wanting Zhou ◽

Zeguo Qiu ◽

Ang Li ◽

Jiaying Wang

Keyword(s):

Random Forest ◽

Prediction Model ◽

Churn Prediction ◽

Customer Churn ◽

Model Based ◽

Customer Churn Prediction

Customer churn prediction based on LASSO and Random Forest models

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/631/5/052008 ◽

2019 ◽

Vol 631 ◽

pp. 052008

Author(s):

Qiannan Zhu ◽

Xinyi Yu ◽

Yuankang Zhao ◽

Deyi Li

Keyword(s):

Random Forest ◽

Churn Prediction ◽

Customer Churn ◽

Forest Models ◽

Random Forest Models ◽

Customer Churn Prediction

Customer Churn Prediction for Imbalanced Class Distribution of Data in Business Sector

Journal of Advanced College of Engineering and Management ◽

10.3126/jacem.v5i0.26693 ◽

2019 ◽

Vol 5 ◽

pp. 101-110

Author(s):

Aayush Bhattarai ◽

Elisha Shrestha ◽

Ram Prasad Sapkota

Keyword(s):

Machine Learning ◽

Imbalanced Data ◽

Telecommunications Industry ◽

Business Case ◽

Machine Learning Algorithms ◽

Business Sector ◽

Churn Prediction ◽

Telecommunication Sector ◽

Customer Churn ◽

Customer Churn Prediction

Churners are people who are about to transfer their business to a competitor or who simply cancel a subscription to a service. This paper focuses on one specific business sector, telecommunications. With a churn rate of 30%, the telecommunications sector takes first place on the list. In this paper, we present some advanced data mining methodologies that predict customer churn in the pre-paid mobile telecommunications industry using a call detail records dataset. To implement the predictive models, we first propose and then apply four machine learning algorithms: Random Forest, Naïve Bayes, Logistic Regression, and XGBoost. To evaluate the models, we use various evaluation metrics and identify the best model, one that is suitable both for class-imbalanced data in general and for our business case. This paper can also be viewed as a comparative study of the most popular machine learning methods applied to the challenging problem of customer churn prediction.
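
The abstract does not say which evaluation metrics were used, but a small worked example shows why accuracy alone can mislead on class-imbalanced churn data. The sketch below is plain Python and is not taken from the paper; the 100-customer sample, the two sets of predictions, and the function name are hypothetical, with only the 30% churn rate borrowed from the text above.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive (churn = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a model never predicts churn.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical sample: 100 customers, 30 churners (the 30% rate quoted above).
y_true = [1] * 30 + [0] * 70

# Baseline (assumption): always predict "no churn".
baseline_pred = [0] * 100

# Hypothetical model: finds 21 of the 30 churners and raises 9 false alarms.
model_pred = [1] * 21 + [0] * 9 + [1] * 9 + [0] * 61

baseline = classification_metrics(y_true, baseline_pred)
model = classification_metrics(y_true, model_pred)

for name, result in (("baseline", baseline), ("model", model)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

The always-negative baseline reaches 70% accuracy while identifying no churners at all, which is why recall, precision, and F1 on the churn class are usually reported alongside accuracy for data like this.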
