Predicting Employee Attrition Using Machine Learning

The document outlines an internship presentation on predicting employee attrition using machine learning at Varcons Technologies, focusing on practical data science skills. The project utilized the IBM HR Analytics dataset and achieved a 93% accuracy with the Extra Trees Classifier model. Key objectives included applying classroom learning to real-world projects and mastering Python and machine learning techniques.

VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“JnanaSangama”, Belagavi-590014, Karnataka, India

IMPACT COLLEGE OF ENGINEERING AND APPLIED SCIENCES

Department of Computer Science and Engineering- Data Science

INTERNSHIP PRESENTATION On

“Predicting Employee Attrition Using Machine Learning”


AT
Varcons Technologies

Submitted By :
SANJAY GR [1IC22CD400]

Under the Guidance of :
Mr. Krishna Mehar
Professor
Dept. of AI & ML
Internship Overview
Duration
Feb 1, 2025 – May 15, 2025

Mode
Offline experience

Focus
Practical Data Science with Python

Training
Preprocessing, ML, Model Evaluation
Company Overview: Varcons Technologies
About Varcons
• Leading SaaS provider
• Innovative solutions
• Corporate seminars
• Industrial training

Our Goal
• Deliver smart tech
• Scalable services
• Clients of all sizes
Internship Objectives
Practical Data Science
Hands-on experience

Apply Classroom Learning
Real-world projects

Master Python & ML
New techniques

Understand Data Lifecycle
Collection to evaluation

Improve Skills
Problem-solving, critical thinking
Project Focus: Predicting Employee Attrition
Project Title
Predicting Employee Attrition

Domain
Human Resources Analytics

Primary Goal
Analyze and predict employee turnover likelihood

Methodology
Leveraging ML models
Project Abstract
Attrition Impact
Employee departures incur significant costs.
• Average cost: $4129 per new hire.
• US attrition rate (2021): 57.3%.

Our Solution
Machine learning to predict turnover.
• Extra Trees Classifier (ETC) performed best.
• Achieved 93% accuracy.
• Key factors: Age, Monthly Income, Hourly Rate, Job Level.
Introduction to Attrition
Attrition Defined
Employees leaving the organization

Types of Attrition
• Voluntary
• Involuntary
• External
• Internal

Attrition Rate
Employees Left / Avg. Employees

ML Role
Data-driven HR decisions
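
The attrition rate formula above (employees who left divided by average headcount) can be sketched as a small helper. Expressing the result as a percentage and the example figures are assumptions for illustration, not values from the project.

```python
def attrition_rate(employees_left: int, avg_employees: float) -> float:
    """Return attrition as a percentage of the average headcount."""
    if avg_employees <= 0:
        raise ValueError("avg_employees must be positive")
    return 100.0 * employees_left / avg_employees


# Hypothetical example: 12 departures against an average headcount of 150.
rate = attrition_rate(12, 150)
print(f"{rate:.1f}%")  # 8.0%
```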
Related Work in Attrition Prediction

80%-88%
Typical Accuracy
Common ML models

93%
Our ETC Model
Higher accuracy with tuning and balancing

Previous studies used various ML techniques for attrition prediction. Common methods included dataset balancing (e.g.,
SMOTE) and models like Random Forest, Gradient Boosting, and Neural Networks. Most achieved accuracy rates between
80% and 88%. Our Extra Trees Classifier (ETC) model, through meticulous tuning and balancing, surpassed these
benchmarks, achieving 93% accuracy.
Methodology Overview
Step 1: Data Loading
IBM HR Employee Attrition Dataset

Step 2: Exploratory Data Analysis
Perform EEDA

Step 3: Feature Engineering
Remove low-correlation features; encode categorical data

Step 4: Data Balancing
Utilize SMOTE

Step 5: Split Data
85% Train, 15% Test

Step 6: Model Training
Apply ML algorithms

Step 7: Evaluate & Compare
Assess model performance


Dataset Information
Dataset Source
IBM HR Analytics

Records
1470 Employees

Features
35 (Categorical & Numerical)

Key Attributes
Age, Income, Department, Satisfaction, Experience, etc.
Predictive HR Analytics: A Data Science Approach to Employee Attrition
Uncovering insights and building models to predict employee attrition, enhancing HR strategies.
Employee Exploratory Data Analysis (EEDA)
• Young employees (20-25) with low income: high attrition.
• Attrition drops after 4+ years experience.
• Higher earners tend to stay longer.
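
Findings like these typically come from grouping the data and comparing attrition rates per group. The following is a minimal sketch with pandas, assuming column names that follow the IBM HR Analytics dataset; the rows and the age bands are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical records; column names follow the IBM HR Analytics dataset.
df = pd.DataFrame({
    "Age": [22, 24, 23, 31, 38, 45, 52, 29],
    "MonthlyIncome": [2100, 2300, 2500, 4800, 6200, 9100, 12000, 3900],
    "Attrition": ["Yes", "Yes", "No", "No", "No", "No", "No", "Yes"],
})

# Convert the Yes/No label into 1/0 so that a mean gives the attrition rate.
df["Left"] = (df["Attrition"] == "Yes").astype(int)

# Bucket ages into bands (illustrative cut points) and compute the share of leavers.
df["AgeBand"] = pd.cut(
    df["Age"], bins=[19, 25, 35, 60], labels=["20-25", "26-35", "36-60"]
)
rate_by_age = df.groupby("AgeBand", observed=True)["Left"].mean()
print(rate_by_age)
```

The same groupby pattern can be applied to income bands or years of experience to check the other two observations.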
Feature Engineering: Optimizing Model Inputs
Features Removed
DailyRate, EmployeeCount, StandardHours, and others.

Reason for Removal
Low correlation with the attrition target variable.

Technique Used
One-Hot Encoding for categorical data transformation.

Primary Goal
Enhance model input quality and overall accuracy.
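
A minimal sketch of this step with pandas, dropping the three named columns and one-hot encoding the categorical ones. The sample rows and the choice of Department and OverTime as the categorical columns are assumptions for illustration; the slides do not list which columns were encoded.

```python
import pandas as pd

# Hypothetical rows; column names follow the IBM HR Analytics dataset.
df = pd.DataFrame({
    "Age": [41, 49, 37],
    "DailyRate": [1102, 279, 1373],
    "EmployeeCount": [1, 1, 1],
    "StandardHours": [80, 80, 80],
    "Department": ["Sales", "Research & Development", "Sales"],
    "OverTime": ["Yes", "No", "Yes"],
})

# Drop the columns the slide names as removed.
df = df.drop(columns=["DailyRate", "EmployeeCount", "StandardHours"])

# One-hot encode the categorical columns; numeric columns pass through unchanged.
encoded = pd.get_dummies(df, columns=["Department", "OverTime"])
print(list(encoded.columns))
```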
Machine Learning Models Utilized
Logistic Regression
Baseline for binary classification.

Decision Tree Classifier
Rule-based, interpretable model.

Support Vector Machine
Effective for high-dimensional data.

Extra Trees Classifier
Ensemble method with highest performance.
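
The four models can be trained and compared in a single loop with scikit-learn. This is a sketch under stated assumptions: it uses synthetic data from make_classification as a stand-in for the preprocessed attrition data, and the hyperparameters shown are illustrative, not the tuned values from the project, so the scores will not match the reported 93%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed attrition data (not the real dataset).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

# Illustrative settings only; the project's tuned hyperparameters are not given.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "Extra Trees": ExtraTreesClassifier(n_estimators=200, random_state=0),
}

# Fit each model on the training split and record test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```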
Data Preprocessing: Preparing Data for ML
Missing Values
Handled using appropriate imputation strategies.

Dataset Balancing
SMOTE applied to address class imbalance.

Normalization/Scaling
Applied to standardize feature ranges.

Data Split
Divided into 85% training, 15% testing sets.
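
The balancing and splitting steps can be sketched as follows. The helper smote_like is a hypothetical, simplified version of the SMOTE idea (interpolating between a minority row and one of its nearest minority neighbours) written in NumPy; it is not the implementation used in the project. The slides list balancing before splitting; this sketch instead oversamples only the training portion so that no synthetic rows end up in the test set. That ordering is an assumption, not something the slides state.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def smote_like(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Sort minority rows by distance to row i; the first entry is row i itself.
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]
        j = rng.choice(neighbours)
        # Pick a random point on the segment between row i and neighbour j.
        gap = rng.random()
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)


# Synthetic imbalanced data: 30 leavers (1) and 170 stayers (0).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(200, 4))
y = np.array([1] * 30 + [0] * 170)

# 85% train, 15% test, keeping the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

# Oversample the minority class in the training portion until classes match.
X_min = X_train[y_train == 1]
n_new = int((y_train == 0).sum() - (y_train == 1).sum())
X_new = smote_like(X_min, n_new)
X_bal = np.vstack([X_train, X_new])
y_bal = np.concatenate([y_train, np.ones(n_new, dtype=int)])

print(len(X_train), len(X_test), np.bincount(y_bal))
```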
Model Evaluation Metrics: A Comprehensive View
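
The slide does not list the individual metrics, so the following is a sketch of the ones commonly reported for binary classification (accuracy, precision, recall, F1), computed from confusion-matrix counts. The counts are hypothetical. On imbalanced data such as attrition records, precision and recall for the "left" class are worth reading alongside accuracy.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 30 leavers caught, 6 missed, 10 false alarms, 174 correct stays.
metrics = classification_metrics(tp=30, fp=10, fn=6, tn=174)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```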
Evaluation Results: Model Performance Comparison
ETC Outperformance
Extra Trees Classifier demonstrated superior results.

LR Performance
Logistic Regression performed well but scored slightly lower on the metrics.

SVM Limitations
Support Vector Machine was less effective with high-dimensional data.

DTC Overfitting
Decision Tree Classifier showed signs of overfitting.
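
One common way to spot overfitting is to compare training and test accuracy: a large gap suggests the model has memorized the training data. The sketch below assumes synthetic data and an illustrative depth limit of 3; neither comes from the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, slightly noisy stand-in data (not the real attrition dataset).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)


def train_test_accuracy(max_depth):
    """Fit a tree with the given depth limit; return (train, test) accuracy."""
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X_train, y_train)
    return tree.score(X_train, y_train), tree.score(X_test, y_test)


full_train, full_test = train_test_accuracy(None)  # unconstrained tree
shallow_train, shallow_test = train_test_accuracy(3)  # depth-limited tree

print(f"full tree:    train={full_train:.3f} test={full_test:.3f}")
print(f"depth 3 tree: train={shallow_train:.3f} test={shallow_test:.3f}")
```

An unconstrained tree usually fits the training split almost perfectly, so the train/test gap is the number to watch; limiting depth is one of several ways to narrow it.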
Tools & Technologies Used

Python
Primary programming language.

Pandas
Data manipulation and analysis.

scikit-learn
Machine learning algorithms and utilities.

Jupyter Notebook
Interactive development environment.
Skills Acquired During Internship
Data Sourcing
Effective data acquisition and integration.

Data Preprocessing
Cleaning, handling missing values, transformation.

Feature Engineering
Creating relevant features for models.

Model Training
Hyperparameter tuning for optimal performance.

Version Control
Proficient use of GitHub for collaboration.

Reporting
Clear, concise report writing and presentation.
Internship Deliverables & Conclusion
Key Deliverables

• Structured .CSV dataset.
• ML-ready preprocessed dataset.
• EEDA and missing value treatment report.
• Trained attrition prediction model.

Developed an end-to-end predictive system for employee attrition. Gained a comprehensive understanding of the
Data Science lifecycle. Applied ML to a real-world HR problem, now ready for industry projects.
