Lead Scoring Case Study
PRESENTED BY:
DHRUV SINHA
SUDEEP MENON
PROBLEM STATEMENT:
✓ An education company named X Education sells online courses to
industry professionals.
✓ While X Education gets a lot of leads, its lead conversion rate is a
mere 30%.
✓ To make this process more efficient, the company wishes to identify
the most promising leads, also known as ‘Hot Leads’.
✓ If they successfully identify this set of leads, the lead conversion rate
should go up as the sales team will now be focusing more on
communicating with the potential leads rather than making calls to
everyone.
BUSINESS OBJECTIVE:
✓ Build a logistic regression model to identify hot leads, with
a ballpark target lead conversion rate of ~80%.
✓ The model should be flexible and able to incorporate
the company’s future requirements.
OVERALL APPROACH:
LAST ACTIVITY:
- Most leads come from people whose last activity was Email Opened or SMS Sent.
- The conversion rate is highest for SMS Sent (~63%), whereas it is only ~38% for Email Opened.
- Olark Chat Conversation and Page Visited on Website generate a significant number of leads, but their conversion rates are
extremely low at 8% and 38% respectively.
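The per-activity figures above are conversion rates grouped by a categorical column. As a minimal sketch of that calculation, assuming each lead is reduced to a (last activity, converted flag) pair, the following uses a small hypothetical sample rather than the actual lead data; the function name `conversion_by_category` is also illustrative.

```python
from collections import defaultdict

# Hypothetical toy records: (last_activity, converted) pairs.
# These are NOT the real leads; they only illustrate the calculation.
leads = [
    ("SMS Sent", 1), ("SMS Sent", 1), ("SMS Sent", 0),
    ("Email Opened", 1), ("Email Opened", 0), ("Email Opened", 0),
    ("Olark Chat Conversation", 0), ("Olark Chat Conversation", 0),
]


def conversion_by_category(records):
    """Return {category: (lead_count, conversion_rate)}."""
    totals = defaultdict(int)
    converted = defaultdict(int)
    for category, flag in records:
        totals[category] += 1
        converted[category] += flag
    return {c: (totals[c], converted[c] / totals[c]) for c in totals}


rates = conversion_by_category(leads)

# List categories from highest to lowest conversion rate.
for category, (count, rate) in sorted(rates.items(), key=lambda kv: -kv[1][1]):
    print(f"{category}: {count} leads, {rate:.0%} converted")
```

Reporting the lead count next to the rate matters here, because a category can have a high rate on very few leads or a low rate on many.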
EXPLORATORY DATA ANALYSIS:
- Total Visits:
  - The median is the same for converted and non-converted leads.
  - People who visit the platform have an equal (50-50) chance of applying or not applying for the course.
- Total Time Spent on Website:
  - People who spend more time on the website are more likely to opt for a course.
  - People who spend less time on the website did not opt for any courses.
- Page Views Per Visit:
  - The median is the same for converted and non-converted leads.
FINDING THE OPTIMAL CUT-OFF POINT:
The final model uses a probability cut-off of 0.34, chosen based on model performance on the train dataset.
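The slide does not state how the 0.34 value was selected. One common approach, shown here as a minimal sketch and not necessarily the method used in this study, is to scan candidate cut-offs on the training predictions and pick the one where sensitivity and specificity are closest. The labels, probabilities, and function names below are hypothetical.

```python
def metrics_at(y_true, y_prob, cutoff):
    """Return (accuracy, sensitivity, specificity) at a probability cut-off."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= cutoff else 0
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


def balanced_cutoff(y_true, y_prob, candidates):
    """Pick the candidate where sensitivity and specificity are closest."""
    def gap(cutoff):
        _, sensitivity, specificity = metrics_at(y_true, y_prob, cutoff)
        return abs(sensitivity - specificity)
    return min(candidates, key=gap)


# Hypothetical training labels and predicted conversion probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.35, 0.2, 0.1, 0.6, 0.3, 0.05, 0.45]
candidates = [i / 10 for i in range(1, 10)]

best = balanced_cutoff(y_true, y_prob, candidates)
print(best, metrics_at(y_true, y_prob, best))
```

On this toy data the scan selects 0.4. In general, lowering the cut-off raises sensitivity (more leads are flagged for a call), while raising it raises specificity (fewer calls go to leads that are unlikely to convert).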
X Education has a period of 2 months every year during which they hire some interns. The sales team, in particular, has around 10 interns
allotted to them. So during this phase, they wish to make the lead conversion more aggressive. So they want almost all of the potential
leads (i.e. the customers who have been predicted as 1 by the model) to be converted and hence want to make phone calls to as many of
these people as possible. Suggest a good strategy they should employ at this stage.
Solution:
At times, the company reaches its target for a quarter before the deadline. During this time, the company wants the sales team to focus
on some new work as well. So during this time, the company’s aim is to not make phone calls unless it’s extremely necessary, i.e. they
want to minimize the rate of useless phone calls. Suggest a strategy they should employ at this stage.
Solution:
It was found that the variables that mattered the most in identifying potential customers are:
✓ Lead Origin: a. Lead Add Form b. Landing Page Submission
✓ Current occupation: a. Working Professional
✓ Lead source: a. Welingak website b. Olark Chat
✓ The Total Time Spent on the Website.
✓ Last activity was: a. SMS sent b. Email Opened c. Email Bounced
INSIGHTS AND RECOMMENDATIONS:
✓ Based on our model, 7 of the 10 predictors belong to the following three variables:
➢ Lead Origin: Lead Add Form & Landing Page Submission
➢ Lead Source: Welingak Website & Olark Chat.
➢ Last Activity: SMS Sent, Email Opened, Email Bounced
✓ Based on the coefficient values in our model, the following are the top three
categorical/dummy variables that should be focused on the most in order to increase the
probability of lead conversion:
➢ Lead Origin_Lead Add Form
➢ What is your current occupation_Working Professional
➢ Lead Source_Welingak Website
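Ranking dummy variables by coefficient works because, in a logistic regression, a coefficient β multiplies the odds of conversion by exp(β) when that dummy switches from 0 to 1, holding the other predictors fixed. The sketch below illustrates this under loud assumptions: the coefficient values are hypothetical placeholders (the fitted values are not given in the text), and the helper names are illustrative.

```python
import math

# HYPOTHETICAL placeholder coefficients, for illustration only.
# The fitted values from the actual model are not given in the text.
coefficients = {
    "Lead Origin_Lead Add Form": 3.0,
    "What is your current occupation_Working Professional": 2.0,
    "Lead Source_Welingak Website": 1.0,
}


def odds_ratios(coefs):
    """Map each coefficient beta to its odds multiplier exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


def rank_by_effect(coefs):
    """Return variable names ordered from largest to smallest coefficient."""
    return sorted(coefs, key=lambda name: coefs[name], reverse=True)


ratios = odds_ratios(coefficients)
for name in rank_by_effect(coefficients):
    print(f"{name}: odds multiplied by {ratios[name]:.1f}")
```

A coefficient of 0 corresponds to an odds multiplier of 1 (no effect), and a negative coefficient to a multiplier below 1.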
THANK YOU