Rahul Agarwal
rahul.374@gmail.com
09711261345
ACADEMIC QUALIFICATIONS
Institution/University                   Degree/Certificate                  Year
Indian Institute of Technology, Delhi    B.Tech. (Mechanical Engineering)    2010
Dewan Public School, Meerut              Class XII (C.B.S.E.)                2006
Dewan Public School, Meerut              Class X (C.B.S.E.)                  2004
SCHOLASTIC ACHIEVEMENTS
Ranked 579 among more than 400,000 students in the IIT Joint Entrance Examination (JEE) 2006.
Placed in the top 0.002 percentile among 250,000 students in the All India Engineering Entrance
Examination (AIEEE) 2006.
ONLINE CERTIFICATIONS
Machine Learning by Andrew Ng, Stanford University, 2014
Introduction to Data Science by Bill Howe, University of Washington, 2013
Verified Data Scientist Track by Johns Hopkins University, 2013-2014
Data Analysis and Statistical Inference, Duke University, 2014
Scalable Machine Learning and Introduction to Spark, Berkeley, 2015
EXPERIENCE
Data Scientist and Data Engineer, Citigroup, Mumbai, India Jan, 2015-Present
Entity Linkage Framework: Created a unified customer view using PII on credit card accounts.
Implemented a connected-components algorithm, based on a research paper, that cut the process
running time from 24 hours to 18 minutes.
Built a load-balanced MapReduce algorithm for fuzzy address matching (on the order of a
quadrillion comparisons), which reduced the processing time from 4 hours to 45 minutes.
Created a holistic visualization in R showing a connected view of all the users in the network.
Presented data findings to senior management on a regular basis.
Used logistic regression to build a predictive model for detecting wilful defaulters based on
customer-level risk and network variables.
Tools: Python, SQL, R, HQL (Hive), Hadoop, Bash
Data modelling Packages: Scikit-Learn
Data Scientist, MyCityWay India, Noida, India Oct, 2012-Dec, 2014
Implemented a Bayesian bandits approach for dynamic A/B testing of campaign variants for an in-house
product.
Created a reporting layer using Pig, Hive, and Impala for an in-house product.
Implemented description-based classification of personas, using TF-IDF vectorization and feature selection
techniques on app store apps, to derive an aggregated user persona.
Worked with the caret (R) and scikit-learn (Python) packages for regression and classification modelling
tasks.
Hands-on experience with Hive, Impala, and HDFS for managing and extracting data.
Tools: Python, SQL, R, MATLAB/Octave, Impala, SAP HANA, HQL (Hive), Bash
Data modelling Packages: Caret, Scikit-Learn, Orange, etc.
Co-Founder, LMT Learning Solutions, Mumbai, India Feb, 2012-Oct, 2012
Co-founded LMT Learning Solutions to give school-going children a career head start.
Worked on client procurement, vendor management, and marketing for the business.
Provided inputs for website development and strategic online marketing of the website.
Marketed and presented the idea to parents and principals.
Business Analyst, Fractal Analytics, Mumbai, India Jun, 2011-Feb, 2012
Built BI visualization reports in Spotfire to help upper management at P&G make data-driven
decisions.
Interacted daily with 5 different clients: understood their business needs, gathered data from
different databases, analyzed the data using different software, and proposed the best solution.
Kept track of all the business aspects of P&G Russia by generating periodic reports based on the latest data.
Devised a new MySQL-based process to automate repetitive work at Fractal, with the potential
to save 160 man-hours per month. The work was highly appreciated by senior management
at Fractal, and I led a team of 5 members to roll out this initiative
across different markets.
Tools: Advanced Excel, MySQL, Spotfire.
Research Analyst, Cartesian Consulting, Mumbai, India Jan, 2011-Jun, 2011
Key responsibilities:
Designed direct marketing campaigns, working hand in hand with the marketing heads of
companies such as Levi's and LG.
Managed the customer database and identified actionable parameters for segmenting
customers using RFM modelling to achieve the best response rate.
SOME SELF LEARNING PROJECTS
Website: I currently maintain an active blog at http://mlwhiz.com, where I write about data science
and new developments in the field.
MCMC Projects: Solved the well-known knapsack problem using MCMC. Also implemented a
code-breaking solution using MCMC. Code at https://github.com/MLWhiz/MCMC_Project
Regression, Classification and Recommendations With Spark: Completed as part of Spark machine learning
coursework. Code at https://github.com/MLWhiz/Spark_Projects
RedditDS: Created a simple but powerful graph visualization to discover interesting reddits in the
DataScience subreddit using Flask, D3, and Heroku. Hosted at http://apps.mlwhiz.com/redditapp
Simple BlackJack: A simple BlackJack Simulator. Hosted at http://bit.ly/1XdlaY1
OTHER INTERESTS
Reading, Guitar, Playing Snooker