This project conducts a comprehensive analysis of employee data from the company 'ABC' and employs various machine learning models to predict potential employee resignations. Subsequently, it evaluates and compares the performance of these models based on different evaluation metrics.

CD-AC/DataScience-Resignation_Prediction

EMPLOYEE RESIGNATION PREDICTION

Introduction

This project presents a predictive analysis of potential employee resignations. The main objective is to develop three prediction models, using machine learning techniques, that identify employees who may be considering leaving the company in the near future. The analysis aims to help companies take proactive measures to retain key talent and improve job satisfaction.

Research

Cost of Hiring an Employee in 2024

This section summarizes the average cost per hire and the factors that influence it. The most recent figure listed below is for 2023.

Average Cost per Hire

  • 2019: $4,129
  • 2023: $4,700 (14% increase over 2019)
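
The percentage can be checked directly from the two averages. A minimal sketch of the arithmetic follows; the variable names are illustrative and not part of the project.

```python
# Average cost per hire, as quoted in the list above.
cost_2019 = 4129
cost_2023 = 4700

# Relative increase from 2019 to 2023, expressed in percent.
pct_increase = (cost_2023 - cost_2019) / cost_2019 * 100
print(f"Increase: {pct_increase:.1f}%")  # Increase: 13.8%
```

The result, 13.8%, rounds to the 14% quoted above.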

Factors Influencing Cost

  1. Company Size: Larger companies can spread operational costs among a greater number of hires.
  2. Industry: Certain industries, such as cybersecurity, engineering, or nursing, experience talent shortages, increasing hiring costs.
  3. Location: Hiring costs vary by geographic location.

Types of Costs

Direct Costs

  • Recruitment: Job postings, recruitment agency fees, candidate tracking software.
  • Selection: Tests, interviews, candidate travel.
  • Onboarding: Training, materials, existing employee time.

Indirect Costs

  • Employee Time: Time employees spend on the hiring process, such as reviewing resumes and conducting interviews.
  • Productivity Loss: Productivity loss incurred when a position is vacant.
  • Turnover Costs: Costs associated with hiring a new employee to replace a departing one.

Understanding these costs and the factors behind them supports strategic decision-making in human resource management and financial planning.

Source: https://toggl.com/blog/cost-of-hiring-an-employee

Data and Methodology

Dataset

The data used in this analysis come from the IBM HR Analytics attrition dataset on Kaggle (https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset). The dataset contains employee information such as demographic characteristics, employment history, and performance evaluations.
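
As a rough illustration of a first look at such data, the sketch below reads CSV rows and computes the share of employees labelled as having left. It is not the project's loading code. The inline sample rows are made up, and the column names (`Age`, `Attrition`, `Department`, `MonthlyIncome`, `OverTime`) are assumed to match the Kaggle file.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for the Kaggle CSV; only the column names
# are assumed to match the real dataset.
SAMPLE_CSV = """Age,Attrition,Department,MonthlyIncome,OverTime
29,Yes,Sales,2400,Yes
45,No,Research & Development,7100,No
38,No,Sales,5200,No
31,Yes,Research & Development,3100,Yes
52,No,Human Resources,9800,No
"""

def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))

def attrition_rate(rows):
    """Fraction of rows whose Attrition label is 'Yes'."""
    counts = Counter(row["Attrition"] for row in rows)
    return counts["Yes"] / len(rows)

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(f"{len(rows)} rows, attrition rate {attrition_rate(rows):.0%}")
```

To run this against the downloaded file, pass an open file handle to `load_rows` instead of the `io.StringIO` wrapper. Checking the class balance early is useful because accuracy alone can look good on imbalanced data.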

Data Preprocessing

Before the predictive models were trained, the data were preprocessed: cleaned, checked for missing values, encoded (categorical variables), and scaled (numeric features).
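
The README does not say which techniques were used for each step. The sketch below assumes median imputation, 0/1 encoding of a Yes/No column, and min-max scaling, purely for illustration. The records are made up and the helper names are hypothetical.

```python
from statistics import median

# Made-up records; None marks a missing value.
records = [
    {"MonthlyIncome": 2000, "OverTime": "Yes"},
    {"MonthlyIncome": None, "OverTime": "No"},
    {"MonthlyIncome": 6000, "OverTime": "No"},
    {"MonthlyIncome": 4000, "OverTime": "Yes"},
]

def impute_median(values):
    """Replace None with the median of the observed values (assumed strategy)."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range (assumed strategy)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def encode_yes_no(values):
    """Map the categorical labels 'Yes'/'No' to 1/0."""
    return [1 if v == "Yes" else 0 for v in values]

income = min_max_scale(impute_median([r["MonthlyIncome"] for r in records]))
overtime = encode_yes_no([r["OverTime"] for r in records])
features = list(zip(income, overtime))
print(features)
```

In a real pipeline the median and the min/max would be computed on the training split only and then reused on the test split, so that no information leaks from the evaluation data.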

Development of Predictive Model

Feature Selection

The categorical variables most relevant to predicting employee resignations were selected. Three machine learning approaches were then explored and evaluated: Random Forest, Logistic Regression, and Deep Learning. Each was applied to capture the complexity and diversity of the data, resulting in three predictive models.

Model Training

The three models (Logistic Regression, Random Forest, and Deep Learning) were trained on the training dataset.
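
Of the three model families, logistic regression is the simplest to show end to end. The following is a minimal hand-rolled sketch, not the project's training code, since the README does not state which library or hyperparameters were used. The two features, the toy data, the learning rate, and the epoch count are all assumptions.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, xi):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias)

def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log-loss."""
    n = len(X)
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j, x in enumerate(xi):
                grad_w[j] += err * x
            grad_b += err
        # Average the gradients over the batch and take one descent step.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Toy data (hypothetical): (scaled income, overtime flag); label 1 = resigned.
X = [(0.1, 1), (0.2, 1), (0.3, 1), (0.7, 0), (0.8, 0), (0.9, 0)]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(X, y)
predictions = [int(predict_proba(weights, bias, xi) >= 0.5) for xi in X]
print(predictions)
```

The 0.5 cut-off used to turn probabilities into labels is also an assumption; lowering it would trade precision for recall.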

Model Evaluation

The models were evaluated using precision, recall, F1-score, and accuracy. The confusion matrix of each model was also analyzed in detail to assess its performance in predicting employee resignations.
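
All of these metrics derive from the four cells of the confusion matrix. The sketch below shows the standard definitions on made-up labels; the function names are illustrative and the labels are not project results.

```python
def confusion_counts(y_true, y_pred, positive="Yes"):
    """Return (tp, fp, fn, tn) for the given positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Made-up labels: 4 employees actually left, 6 stayed.
y_true = ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"]
y_pred = ["Yes", "Yes", "No", "No", "Yes", "No", "No", "No", "No", "No"]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts, metrics)
```

In this toy case accuracy is 0.7 even though half of the leavers are missed, which is why recall and F1 for the "Yes" class are reported alongside accuracy.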

Conclusion

The three predictive models were compared on Precision, Recall, F1-Score, and Accuracy. The results for predicting employees who are likely to resign ("Yes" class) are summarized below; Overall Accuracy covers both classes:

| Metric | Logistic Regression | Random Forest | Deep Learning |
|---|---|---|---|
| Precision | 0.81 | 1.00 | 0.82 |
| Recall | 0.44 | 0.28 | 0.61 |
| F1-Score | 0.57 | 0.44 | 0.70 |
| Overall Accuracy | 0.89 | 0.87 | 0.89 |
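
As a consistency check, the F1-Score row can be recomputed from the precision and recall rows using the harmonic-mean definition. This is a small sketch; the dictionary simply copies the reported values and the names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) for the "Yes" class, copied from the table above.
reported = {
    "Logistic Regression": (0.81, 0.44),
    "Random Forest": (1.00, 0.28),
    "Deep Learning": (0.82, 0.61),
}

f1_by_model = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
print(f1_by_model)
```

The recomputed values (0.57, 0.44, and 0.70) match the reported F1-Scores.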

Analysis of the Models:

  • Random Forest: Despite its perfect precision, its very low recall (0.28) indicates that it fails to identify the majority of employees who are about to leave, making it less practical for retention purposes.
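  • Logistic Regression: Matches the Deep Learning model on overall accuracy (0.89) and nearly matches its precision (0.81 vs. 0.82), but its lower recall (0.44) and F1-Score (0.57) mean it is outperformed by the Deep Learning model.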
  • Deep Learning (Neural Network): This model is the best performer. It achieves the highest F1-Score (0.70) and the highest Recall (0.61), meaning it correctly flags 61% of the employees who actually resigned. This balance makes it the most effective tool for HR to take proactive retention measures.

Final Recommendation:

The Deep Learning model is the recommended choice for predicting employee resignation in this context. It has the highest recall of the three models (0.61) while keeping precision at 0.82, so it identifies the most at-risk employees without a large share of false alarms and provides the most actionable insights for retaining valuable talent.
