Factors Affecting Employee Attrition and Predictive Modelling Using IBM HR Data
Attrition can be defined as the gradual reduction in the number of members or employees of an organization due to retirement, resignation, or death. The loss is measured as the number of employees leaving the organization, including both voluntary and involuntary departures. This study identifies the factors that affect attrition and establishes a predictive model for employee attrition. It first presents the problem statement and an account of what attrition does to the organization. This is followed by a detailed conceptual breakdown of attrition, which is then discussed in the light of predictive modeling and prior supporting research. The research involves data preprocessing, a comparison of chi-square and logistic regression for feature selection, and the training of machine learning models, which are compared using the confusion matrix, precision, recall, and F1-score. The best result was obtained by the logistic regression model with feature selection, which achieved an accuracy of 86% and a recall of 98% for class 1 of attrition. The researcher aims to change how the attrition problem is approached: rather than knowing whom to retain, the organization should know whom to hire. This research sets a ground rule for that approach and tries to change the perspective on tackling the attrition problem.
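As an illustration of the evaluation metrics named above, the following minimal sketch shows how accuracy, precision, recall, and F1-score can be derived from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for the example; they are not the study's data or results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class.

    tp, fp, fn, tn are the four cells of a binary confusion matrix.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Recall for a class answers how many of the true members of that class the model found, so a high recall can coexist with a lower overall accuracy when the model also produces false positives.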