Misclassified Reduced Instance and Stochastic Gradient Descent with Logistic Regression Model for Customer Churn Prediction

Customer Churn Prediction (CCP) is a difficult problem whose solution supports decision making, and it has grown in importance with the rapid increase in the number of telecom providers. Deep learning models are now widely used because of the significant improvements they have brought to different areas. In this paper, a deep learning based CCP approach is introduced that uses Stochastic Gradient Descent (SGD) with a Logistic Regression (LR) classifier model. Integrating SGD and LR yields effective classification. To further improve classifier efficiency, misclassified instances are removed from the dataset, and the processed data is then provided again as input to the classification model. The presented SGD-LR model is validated on a benchmark dataset, and the results are examined with respect to different measures. The experimental outcome shows that the proposed model is superior to available CCP models on the identical dataset.
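The two-pass idea in this abstract (train, drop the misclassified instances, train again) can be illustrated with a minimal sketch. The abstract does not say how misclassification is determined or which hyperparameters are used, so the choices here are assumptions: the first-pass model's own predictions on the training data decide which instances to drop, and the learning rate, epoch count, helper names and toy data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sgd_lr(X, y, epochs=200, lr=0.05, seed=0):
    """Fit logistic regression with plain stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the instances in a fresh random order
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def drop_misclassified(X, y, model):
    # Assumption: an instance is "misclassified" when the first-pass model
    # disagrees with its label on the training data itself.
    keep = [i for i in range(len(X)) if predict(model, X[i]) == y[i]]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical one-feature toy data; the last label is deliberately noisy.
X = [[-3.0], [-2.8], [-2.5], [-2.0], [-1.8], [-1.5],
     [1.5], [1.8], [2.0], [2.5], [2.8], [3.0],
     [-2.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

first_model = train_sgd_lr(X, y)
X_red, y_red = drop_misclassified(X, y, first_model)
second_model = train_sgd_lr(X_red, y_red)  # retrain on the reduced data
```

Filtering on training-set predictions is only one possible reading; a cross-validated filter would be a stricter alternative.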

Customer churn is one of the most difficult and important issues for large companies because of its direct impact on revenue, especially in the telecom domain, where companies are searching for advanced strategies to predict churn and non-churn customers. This research focuses on constructing a predictive model that identifies each customer as a churner or not and yields additional insights about service consumers. The main contribution is to overcome the limitation of relying solely on data mining strategies by deriving network metrics, such as centrality and connectivity between customers, so that network mining is incorporated into traditional data mining. Social network measurements, e.g. Leverage, Flow Bet, PageRank, Cluster Coefficients and Eccentricity, are joined with the other attributes in the original network dataset to enhance the performance of the proposed methodology. The risk of churn is predicted in several steps: extensive cleaning of the raw data for churn modeling; dividing customers into clusters based on Gower distance and the k-medoids algorithm to help understand and predict churners; building a classification model using Extreme Gradient Boosting (XGBoost); and assessing model performance after computing the centrality metrics as new attributes appended to the original network dataset. Experiments conducted on a telecom dataset show an average accuracy, across all statistics, of not lower than 98.27%, while the average accuracy for the original dataset with its clusters does not exceed 0.97%.
The proposed churner detection method, which combines social impacts and network contents based on clustering, significantly improved prediction accuracy on the telecom dataset compared with prediction using call log details and network information without clustering. This validates the hypothesis that combining social network attributes with the Call/SMS information of users could yield substantially improved customer churn prediction.
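The clustering step named in this abstract, Gower distance followed by k-medoids, can be sketched as follows. This covers only the clustering stage, not the centrality features or the XGBoost classifier. The feature layout, the toy customer records, the deterministic choice of starting medoids and the simple alternating update are all assumptions made for illustration; the abstract does not specify them.

```python
def feature_ranges(rows, categorical):
    """Return max - min per numeric column, or None for categorical columns."""
    ranges = []
    for j in range(len(rows[0])):
        if j in categorical:
            ranges.append(None)
        else:
            col = [row[j] for row in rows]
            ranges.append(max(col) - min(col))
    return ranges


def gower_distance(a, b, ranges):
    """Gower distance between two mixed-type records."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0  # categorical: simple mismatch
        elif r > 0:
            total += abs(x - y) / r          # numeric: range-normalised gap
        # a constant numeric column (r == 0) contributes nothing
    return total / len(ranges)


def assign(dist, medoids):
    # Label each record with the index of its nearest medoid.
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix."""
    medoids = list(range(k))  # assumption: start from the first k records
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(dist[m][i] for i in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Hypothetical customers: (monthly call minutes, plan type, SMS count).
rows = [
    (100, "prepaid", 10), (120, "prepaid", 12), (110, "prepaid", 8),
    (900, "postpaid", 200), (950, "postpaid", 220), (1000, "postpaid", 210),
]
ranges = feature_ranges(rows, categorical={1})
dist = [[gower_distance(a, b, ranges) for b in rows] for a in rows]
medoids, labels = k_medoids(dist, k=2)
```

Gower distance is useful here because it handles numeric and categorical attributes in one measure, and k-medoids only needs the resulting distance matrix rather than coordinate means.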


Customer Relationship Management (CRM) is a challenging issue in marketing: it aims to better understand customers and to maintain long-term relationships with them in order to increase profitability. It plays a vital role in customer-centered marketing, providing better service, satisfying customer requirements based on the characteristics of their consumption patterns, and smoothing the relationship in which various representatives communicate and collaborate. Customer churn prediction is one area of CRM that explores transaction and communication processes and analyzes customer loyalty. Data mining eases this process with classification techniques that extract patterns from large datasets, and it provides good technical support for analyzing large amounts of complex customer data. This research paper applies a data mining classification technique to predict churn customers in three different sectors: Banking, Ecommerce and Telecom. For classification, enhanced logistic regression with regularization and an optimization technique is applied. The work is implemented in the RapidMiner tool, and the performance of the prediction algorithm is assessed for the three sectors with suitable evaluation metrics.
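To make "logistic regression with regularization" concrete, here is a minimal sketch of an L2-penalised fit. The abstract names neither the penalty nor the optimizer, and the work itself was done in RapidMiner, so everything here is an assumption for illustration: the L2 penalty, batch gradient descent, the hypothetical function `fit_l2_logistic`, and the toy data.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l2_logistic(X, y, lam, lr=0.1, steps=2000):
    """Batch gradient descent on mean log loss + (lam / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The L2 penalty adds lam * w to the gradient; the bias is not penalised.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def norm(w):
    return math.sqrt(sum(v * v for v in w))


# Hypothetical, linearly separable toy data with two features.
X = [[0.2, 1.0], [0.4, 0.8], [0.3, 1.2], [1.5, 0.1], [1.7, 0.3], [1.6, 0.2]]
y = [0, 0, 0, 1, 1, 1]

w_plain, _ = fit_l2_logistic(X, y, lam=0.0)  # no regularization
w_ridge, _ = fit_l2_logistic(X, y, lam=1.0)  # L2-regularized
```

On separable data like this, the unpenalised weights keep growing with more steps, while the penalty holds the regularized weights to a smaller norm. That shrinkage is the effect regularization is meant to provide.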

