
Unit 1 - 1 Data Analytics Introduction

Uploaded by

Nitin bhainsora

Data Visualization & Analytics
Introduction
Dr. Neetu Narwal
Data Analytics
❑The danger in using quantitative methods does not generally lie in the inability to perform the
calculation.

❑The real threat is a lack of fundamental understanding of:

❑Why to use a particular technique or procedure
❑How to use it correctly, and
❑How to correctly interpret the result

Reference : NPTEL Course on Data Analytics with Python


Data and its importance
❑Variable, Measurement and Data
❑What is generating so much data?
❑How does data add value to the business?
❑Why is data important?
Variable, Measurement and Data
❑Variable – a characteristic of any entity being studied that is capable of
taking on different values.
❑Measurement – the use of a standard process to assign numbers to
particular attributes or characteristics of a variable.
❑Data – recorded measurements.
What is generating so much data?
❑Data can be generated by
❑Humans
❑Machines
❑Human-machine combinations
How does data add value to business?
Data Products
Why is data important?
❑Data helps in making better decisions
❑Data helps in solving problems by finding reasons for underperformance
❑Data helps one to evaluate the performance
❑Data helps one improve processes
❑Data helps one understand consumers and the market
Define data analytics and its types
Define data analytics
Why is analytics important?
Data analysis
Data analytics Vs Data analysis
Types of Data analytics
Data Analytics
❑Analytics is defined as “the scientific process of transforming data into
insights for making better decisions”.
❑Analytics is the use of data, information technology, statistical analysis,
quantitative methods, and mathematical or computer-based models to help
managers gain improved insights about their business operations and make
better, fact-based decisions. (James Evans)
❑Analysis = Analytics?
Data Analysis
❑Data analysis is the process of examining, transforming and arranging raw
data in a specific way to generate useful information from it.
❑Data analysis allows for the evaluation of data through analytical and logical
reasoning to lead to some sort of outcome or conclusion in some context.
❑Data analysis is a multifaceted process that involves a number of steps,
approaches, and diverse techniques.
Analysis
Analytics
Analysis ≠ Analytics
Data Analysis ≠ Data Analytics
Business Analysis ≠ Business Analytics
Why is analytics important?
Opportunities abound for the use of analytics and big data, such as:
1. Determining credit risk
2. Developing new medicines
3. Finding more efficient ways to deliver products and services
4. Preventing fraud
5. Uncovering cyber threats
6. Retaining the most valuable customers
Customer acquisition & retention
❑Analytics allows businesses to keep track of customer-related patterns and trends
and delivers useful insights, such as:
❑What's the best target audience for a particular product? - based on
location, age group, financial classification, etc.
❑Which channels are the best for advertising the product to their target
customer base?
❑Exactly when do leads make the purchasing decision? What's the accepted
ad frequency?
How Walmart uses analytics to retain customers
Walmart, considered the largest retail company in the world, launched
‘Walmart Labs’ to strengthen its big data analytics and fuel its growth.
The loyalty program was one of the first initiatives designed by Walmart to
improve customer retention.
It uses predictive analysis to understand different types of customers and
hone loyalty programs for different segments.
Risk detection and management
Risk detection in finance was one of the earliest applications of big data
analytics. Banks and other organizations were in dire need of a mechanism that
could rescue them from losses incurred through fraudulent customers and bad debts.
With big data in the frame, they could segment customers based on their past
expenditure, credit status, and other variables to predict the probability of
risk and default.
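As a minimal sketch of the segmentation idea above, the snippet below estimates a default rate per customer segment from past loans. The segment names and loan records are hypothetical, and a real risk model would use many more variables than a single credit band.

```python
from collections import defaultdict

# Hypothetical loan history: (credit_band, defaulted). Not taken from any real lender.
history = [
    ("low", True), ("low", False), ("low", True), ("low", False),
    ("high", False), ("high", False), ("high", True), ("high", False),
]

# Count defaults and total loans for each segment.
counts = defaultdict(lambda: [0, 0])  # band -> [defaults, total]
for band, defaulted in history:
    counts[band][0] += int(defaulted)
    counts[band][1] += 1

# The observed default rate serves as a simple estimate of default probability.
default_rate = {band: d / n for band, (d, n) in counts.items()}
print(default_rate)
```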
Risk analysis examples
Kreditech, a German company, analyzes factors such as location data,
social media activity, online purchasing power, etc. before lending money to
its customers.
Using big data, it screens out individuals who are more likely to
default by assigning them credit scores.
Similarly, the American MNC Morgan Stanley uses a big data program to
identify potential risks using pattern recognition.
Increasing the quality of medical care
❑There is data related to the procedures of the business side of health care
such as machine usage, medicine data, etc. as well as data related to the
health of a patient or a collective population.
❑This data is collected through various technological tools used in the
healthcare industry and government organizations, some of which are:
❑Electronic Health Records (EHRs)
❑Electronic Prescription Services (E-prescribing)
❑Personal Health Records (PHRs)
❑Patient Portals
❑Health-Related Smart Phone Apps
There are other noticeable applications of big data in healthcare, such as:
Health Tracking - Combined with the Internet of Things (IoT), big data
analytics has transformed the tracking of healthcare statistics and vitals. People
wear fitness bands and watches that track everything from heart rate,
distance walked, and sleeping patterns to blood pressure, glucose levels, and
more.
IoT analytics
Another major application of data analytics is the Internet of Things (IoT).
❑IoT refers to the network of physical objects that use sensors, software,
and other technologies to connect and exchange data with other devices over
the internet.
❑IoT ranges from everyday devices such as thermostats, smart lights,
refrigerators, and smart watches to large-scale industrial devices such as
cars, pipelines, weather stations, and delivery trucks.
❑All these devices have sensors that collect different types of data, which are
then analyzed according to their usage.
Connected vehicles, for example, process vision, sound, and radar data using IoT devices, and
when the cars pass each other, they share data, which is then analyzed to
decide the correct path, assess traffic, and so on.
Classification of Data Analytics
Descriptive Analytics
❑Descriptive analytics is the conventional form of Business Intelligence and
data analysis.
❑It seeks to provide a depiction or “summary view” of facts and figures in an
understandable format.
❑It either informs or prepares data for further analysis.
❑Descriptive analysis or statistics can summarize raw data and convert it into
a form that can be easily understood by humans.
❑It can describe in detail an event that has occurred in the past.
Example
Common examples of descriptive analytics are company reports that simply
provide a historical review. Typical tools include:
❑Data Queries
❑Reports
❑Descriptive Statistics
❑Data Visualization
❑Data dashboard
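A small sketch of descriptive statistics, using only the Python standard library. The monthly sales figures are hypothetical; the point is only to show raw data being condensed into a summary view.

```python
import statistics

# Hypothetical monthly sales figures (units sold); not from the source.
monthly_sales = [120, 135, 128, 150, 142, 160]

# Summarize what happened in the past with a few standard measures.
summary = {
    "count": len(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
    "stdev": round(statistics.stdev(monthly_sales), 2),
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```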
Diagnostic Analytics
❑Diagnostic Analytics is a form of advanced analytics which examines data or content to
answer the question “Why did it happen?”

❑Diagnostic analytics tools help an analyst dig deeper into an issue so that they can arrive
at the source of a problem.

❑In a structured business environment, tools for both descriptive and diagnostic analytics go
hand in hand.
Example
It uses techniques such as :

❑Data Discovery

❑Data Mining

❑Correlations
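To illustrate the correlation technique listed above, here is a sketch that computes the Pearson correlation coefficient by hand. The delivery-delay and product-return numbers are hypothetical, and a strong correlation only points to a possible cause; it does not prove one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly data: average delivery delay (days) and number of returns.
delay_days = [1, 2, 3, 4, 5]
returns = [2, 4, 5, 4, 5]

r = pearson(delay_days, returns)
print(f"correlation between delay and returns: {r:.2f}")
```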
Predictive Analytics
❑Predictive analytics helps to forecast trends based on current events.
❑The probability of an event happening in the future, or an estimate of when
it will happen, can be determined with the help of predictive
analytical models.
❑Many different but interdependent variables are analysed to predict a trend in
this type of analysis.
Example
A set of techniques that use models constructed from past data to predict the future or
ascertain the impact of one variable on another:

1. Linear Regression

2. Time Series analysis and forecasting

3. Data Mining
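A minimal sketch of simple linear regression (ordinary least squares with one predictor), written without external libraries. The month and sales values are hypothetical and deliberately lie on a straight line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past data: month number and units sold.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predict month 6 from the fitted model
print(f"sales = {slope:.1f} * month + {intercept:.1f}; month 6 forecast: {forecast:.1f}")
```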
Prescriptive Analytics
❑A set of techniques to indicate the best course of action
❑It tells what decision to make to optimize the outcome
❑The goal of prescriptive analytics is to enable:
1. Quality improvements
2. Service enhancements
3. Cost reductions
4. Productivity increases
Prescriptive Analytics: Example
❑Optimization Model
❑Simulation
❑Decision Analysis
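A toy optimization model, as a sketch: choose how many units of two products to make so that profit is highest while staying within labour and material limits. All numbers are hypothetical, and the exhaustive search stands in for a proper solver, which larger problems would need.

```python
# Hypothetical product-mix problem:
#   profit per unit: product A = 30, product B = 50
#   labour hours:    2 per A, 4 per B, at most 40 available
#   material units:  1 per A, 1 per B, at most 15 available

def profit(plan):
    a, b = plan
    return 30 * a + 50 * b

def feasible(plan):
    a, b = plan
    return 2 * a + 4 * b <= 40 and a + b <= 15

# Enumerate every whole-number plan and keep the most profitable feasible one.
candidates = [(a, b) for a in range(16) for b in range(11)]
best_plan = max(filter(feasible, candidates), key=profit)
best_profit = profit(best_plan)
print(f"make {best_plan[0]} of A and {best_plan[1]} of B for a profit of {best_profit}")
```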
Explain why analytics is important
Elements of Data Analytics
Data Analyst and Data Scientist
❑The requisite skill set
❑Difference between Data Analyst and Data Scientist
The requisite skill set
Difference between Data Analyst and Data Scientist
Data Analytics Lifecycle
The data analytics lifecycle is designed for Big Data problems and data
science projects.
The cycle is iterative, to reflect how real projects unfold.
To address the distinct requirements for performing analysis on Big Data, a
step-by-step methodology is needed to organize the activities and tasks
involved with acquiring, processing, analyzing, and repurposing data.
Phase 1—Discovery: In Phase 1, the team learns the business domain,
including relevant history such as whether the organization or business unit
has attempted similar projects in the past from which they can learn.

The team assesses the resources available to support the project in terms of
people, technology, time, and data.
Important activities in this phase include framing the business problem as an
analytics challenge that can be addressed in subsequent phases and
formulating initial hypotheses (IHs) to test and begin learning the data.
Phase 2—Data preparation: Phase 2 requires the presence of an analytic
sandbox, in which the team can work with data and perform analytics for the
duration of the project.

The team needs to execute extract, load, and transform (ELT) or extract,
transform and load (ETL) to get data into the sandbox. The ELT and ETL are
sometimes abbreviated as ETLT.
Data should be transformed in the ETLT process so the team can work with it
and analyze it.
In this phase, the team also needs to familiarize itself with the data
thoroughly and take steps to condition the data.
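The extract, transform, and load steps can be sketched as follows. This is an illustration only: the raw rows are hypothetical, and an in-memory SQLite database stands in for the analytic sandbox.

```python
import sqlite3

# Extract: raw rows as they might arrive from a source system (hypothetical data).
raw_rows = [
    ("  Alice ", "120.50"),
    ("BOB", "n/a"),
    ("carol", "75"),
]

def transform(rows):
    """Trim and normalize names, convert amounts to float, skip unparseable rows."""
    clean = []
    for name, amount in rows:
        try:
            clean.append((name.strip().title(), float(amount)))
        except ValueError:
            continue  # data conditioning: drop rows with invalid amounts
    return clean

# Load: write the cleaned rows into the sandbox database.
sandbox = sqlite3.connect(":memory:")
sandbox.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
sandbox.executemany("INSERT INTO orders VALUES (?, ?)", transform(raw_rows))
sandbox.commit()

total = sandbox.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(f"total order amount in sandbox: {total}")
```

In an ELT variant, the raw rows would be loaded first and the cleaning would run inside the sandbox instead.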
Phase 3—Model planning: Phase 3 is model planning, where the team
determines the methods, techniques, and workflow it intends to follow for the
subsequent model building phase.

The team explores the data to learn about the relationships between
variables and subsequently selects key variables and the most suitable
models.
Phase 4—Model building: In Phase 4, the team develops datasets for testing,
training, and production purposes.
In addition, in this phase the team builds and executes models based on the
work done in the model planning phase.
The team also considers whether its existing tools will suffice for running the
models, or if it will need a more robust environment for executing models and
workflows (for example, fast hardware and parallel processing, if applicable).
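One common way to develop separate training and testing datasets is a random split, sketched below. The 80/20 ratio, the seed, and the stand-in records are assumptions for illustration, not something the lifecycle prescribes.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: 100 record IDs standing in for real rows.
records = list(range(100))
train, test = split_dataset(records)
print(f"{len(train)} training rows, {len(test)} testing rows")
```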
Phase 5—Communicate results: In Phase 5, the team, in collaboration with
major stakeholders, determines if the results of the project are a success or a
failure based on the criteria developed in Phase 1.
The team should identify key findings, quantify the business value, and
develop a narrative to summarize and convey findings to stakeholders.

Phase 6—Operationalize: In Phase 6, the team delivers final reports,
briefings, code, and technical documents. In addition, the team may run a
pilot project to implement the models in a production environment.
Key roles for a successful analytics project
Business User: Someone who understands the domain area and usually benefits
from the results. This person can consult and advise the project team on the context
of the project, the value of the results, and how the outputs will be operationalized.
Usually a business analyst, line manager, or deep subject matter expert in the
project domain fulfills this role.
Project Sponsor: Responsible for the genesis of the project. Provides the impetus
and requirements for the project and defines the core business problem. Generally
provides the funding and gauges the degree of value from the final outputs of the
working team. This person sets the priorities for the project and clarifies the desired
outputs.
Project Manager: Ensures that key milestones and objectives are met on time and at
the expected quality.
Business Intelligence Analyst: Provides business domain expertise based on a deep
understanding of the data, key performance indicators (KPIs), key metrics, and
business intelligence from a reporting perspective. Business Intelligence Analysts
generally create dashboards and reports and have knowledge of the data feeds and
sources.
Database Administrator (DBA): Provisions and configures the database environment
to support the analytics needs of the working team. These responsibilities may
include providing access to key databases or tables and ensuring the appropriate
security levels are in place related to the data repositories.
Data Engineer: Leverages deep technical skills to assist with tuning SQL queries for
data management and data extraction, and provides support for data ingestion into
the analytic sandbox.
Whereas the DBA sets up and configures the databases to be used, the data
engineer executes the actual data extractions and performs substantial data
manipulation to facilitate the analytics. The data engineer works closely with the
data scientist to help shape data in the right ways for analyses.
Data Scientist: Provides subject matter expertise for analytical techniques, data
modeling, and applying valid analytical techniques to given business problems.
Ensures overall analytics objectives are met. Designs and executes analytical
methods and approaches with the data available to the project.
Analytics in different job profiles
Many job profiles carry responsibilities related to data analysis, including interpreting trends,
identifying patterns, and drawing insights from data to inform decision-making.
These responsibilities are often categorized as descriptive, diagnostic, predictive, and
prescriptive analytics.
How much of a role is analytical depends on the level of analysis involved; data analysts,
market research analysts, financial analysts, and business analysts
commonly have significant analytical components in their work.
Marketing Analyst:
Customer segmentation analysis: Identifying different customer groups based
on demographics and behaviors to tailor marketing strategies.
Campaign performance analysis: Evaluating the effectiveness of marketing
campaigns based on metrics like click-through rates and conversion rates.
Sales forecasting: Predicting future sales volumes using historical sales data
and market trends.
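The campaign metrics mentioned above can be computed as in this sketch. The campaign numbers are hypothetical, and conversion rate is defined here per click, which is one common convention among several.

```python
def click_through_rate(clicks, impressions):
    """Share of ad impressions that led to a click."""
    return clicks / impressions

def conversion_rate(conversions, clicks):
    """Share of clicks that led to a purchase (assumed per-click definition)."""
    return conversions / clicks

# Hypothetical campaign numbers.
campaign = {"impressions": 20_000, "clicks": 500, "conversions": 25}

ctr = click_through_rate(campaign["clicks"], campaign["impressions"])
cvr = conversion_rate(campaign["conversions"], campaign["clicks"])
print(f"CTR: {ctr:.1%}, conversion rate: {cvr:.1%}")
```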
Case Study: Marketing Analyst Leveraging Data Analytics to Improve Customer
Acquisition

Company: ABC E-commerce (an online retail company)

Background:

ABC E-commerce was struggling with high customer acquisition costs (CAC) and low
conversion rates on its marketing campaigns. The company wanted to optimize its
digital marketing strategy using data analytics to improve ROI (Return on Investment).

Role of the Marketing Analyst:

A marketing analyst was responsible for analyzing customer behavior, optimizing ad
campaigns, and improving targeting using data analytics. She used tools like Google
Analytics, SQL, Python, and Tableau to analyze and interpret marketing data.
Steps Taken by the Marketing Analyst Using Data Analytics
1. Data Collection & Cleaning

• Collected data from Google Analytics, Facebook Ads, Google Ads, and CRM
systems to track website traffic, ad performance, and customer interactions.
• Used SQL and Python (Pandas, NumPy) to clean and merge datasets for accurate
analysis.
2. Customer Segmentation & Behavior Analysis

• Performed clustering analysis (K-Means) in Python to segment customers based
on purchasing behavior, demographics, and engagement levels.
• Identified high-value customers (repeat buyers, high spenders) and low-engagement
customers to tailor marketing strategies.
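A bare-bones sketch of K-Means (Lloyd's algorithm) on two features per customer. The customer values and the starting centroids are hypothetical; the case study does not say which features or library were used, and in practice a library such as scikit-learn with scaled features would be the usual choice.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (orders per year, average order value).
customers = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 210)]

centroids, clusters = kmeans(customers, [customers[0], customers[3]])
for center, members in zip(centroids, clusters):
    print(f"segment around {center}: {members}")
```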
3. Ad Campaign Optimization
• Used A/B testing to analyze which ad creatives and messaging performed best
in converting visitors to buyers.
• Applied predictive modeling (Logistic Regression, Decision Trees) to identify
which customer segments had the highest probability of conversion.
• Optimized ad spend by reallocating budgets toward high-performing channels
and audiences.
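One standard way to judge an A/B test on conversion rates is a two-proportion z-test, sketched here. The visitor and conversion counts are hypothetical, and the case study does not state which test was actually used.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert at the same rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value

# Hypothetical results: ad creative A vs. ad creative B, 2,000 visitors each.
z, p_value = two_proportion_z_test(conv_a=100, n_a=2000, conv_b=140, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value (for example below 0.05) suggests the difference between the two creatives is unlikely to be due to chance alone.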
4. Personalization & Recommendation Engine
• Built a product recommendation model using collaborative filtering to
suggest products based on past purchases and browsing behavior.
• Personalized email marketing campaigns using dynamic content tailored to
customer preferences.
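A minimal sketch of item-based collaborative filtering: products are scored by how similar their buyers are to the buyers of what a customer already owns. The customers, products, and the cosine similarity on purchase sets are illustrative assumptions, not details from the case study.

```python
import math

# Hypothetical purchase history: customer -> set of products bought.
purchases = {
    "ann": {"laptop", "mouse", "keyboard"},
    "ben": {"laptop", "mouse"},
    "cat": {"mouse", "keyboard"},
    "dan": {"laptop", "monitor"},
}

def item_similarity(item_a, item_b):
    """Cosine similarity between two items' binary buyer vectors."""
    buyers_a = {c for c, items in purchases.items() if item_a in items}
    buyers_b = {c for c, items in purchases.items() if item_b in items}
    if not buyers_a or not buyers_b:
        return 0.0
    return len(buyers_a & buyers_b) / math.sqrt(len(buyers_a) * len(buyers_b))

def recommend(customer, top_n=2):
    """Rank products the customer does not own by similarity to the ones they do."""
    owned = purchases[customer]
    catalog = set().union(*purchases.values())
    scores = {
        candidate: sum(item_similarity(candidate, item) for item in owned)
        for candidate in catalog - owned
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

print(recommend("ben"))
```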
5. Business Impact & Results

• Reduced Customer Acquisition Cost (CAC) by 25% by focusing on high-performing
ad segments.
• Increased conversion rates by 30% through better audience targeting and
ad optimizations.
• Boosted email marketing ROI by 40% by sending personalized
recommendations instead of generic promotions.
• Improved customer retention by 15% by implementing loyalty programs
based on data-driven insights.
Financial Analyst:
Financial statement analysis: Interpreting company financial statements to
assess financial health and performance.
Valuation analysis: Estimating the intrinsic value of a company using financial
models.
Risk assessment: Identifying potential financial risks based on market data
and company performance.
Case Study: Financial Analyst Leveraging Data Analytics for Investment Decision-
Making
Company: XYZ Asset Management (a mid-sized investment firm)
Background:
XYZ Asset Management wanted to improve its stock selection process by integrating
data analytics. The company’s financial analysts previously relied on traditional
methods, such as fundamental analysis and financial statement reviews. However, they
needed a more data-driven approach to enhance decision-making and identify high-
potential investments faster.
Role of the Financial Analyst:
A financial analyst at XYZ Asset Management took the initiative to incorporate data
analytics into his investment research.
He used tools like Python, SQL, Power BI, and machine learning models to analyze
large financial datasets.
Steps Taken by the Financial Analyst Using Data Analytics

1. Data Collection & Cleaning

• Gathered historical stock prices, earnings reports, macroeconomic indicators, and
alternative data sources (such as social media sentiment and news feeds).

• Used Python (Pandas, NumPy) and SQL to clean and process large datasets from
multiple sources.

2. Exploratory Data Analysis (EDA) & Trend Identification

• Used Power BI and Python (Matplotlib, Seaborn) to visualize trends in stock price
movements.

• Identified patterns, such as seasonal trends, correlations between stock prices and
earnings reports, and macroeconomic impact.
3. Predictive Modeling for Stock Selection

• Developed a machine learning model (Random Forest & XGBoost) to predict stock
price movements based on historical data.

• Factored in sentiment analysis from social media using NLP (Natural Language
Processing).

• Conducted backtesting to validate the model’s accuracy.

4. Risk Assessment & Portfolio Optimization

• Used Monte Carlo simulations to assess potential risks under different economic
scenarios.

• Applied Modern Portfolio Theory (MPT) to construct an optimal stock portfolio
balancing risk and return.
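Both ideas in this step can be sketched for the simplest case of two assets: the closed-form minimum-variance weight from portfolio theory, followed by a Monte Carlo simulation of the resulting portfolio. The expected returns, volatilities, correlation, and the assumption of normally distributed returns are all hypothetical choices for illustration.

```python
import random
import statistics

def min_variance_weight(sigma_a, sigma_b, rho):
    """Weight of asset A in the two-asset minimum-variance portfolio."""
    cov = rho * sigma_a * sigma_b
    return (sigma_b**2 - cov) / (sigma_a**2 + sigma_b**2 - 2 * cov)

def simulate_portfolio(weight_a, mu_a, mu_b, sigma_a, sigma_b, rho, trials=10_000, seed=0):
    """Monte Carlo draws of one-period portfolio returns from correlated normal asset returns."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        ret_a = mu_a + sigma_a * z1
        # Mix z1 into asset B's shock so the two returns have correlation rho.
        ret_b = mu_b + sigma_b * (rho * z1 + (1 - rho**2) ** 0.5 * z2)
        outcomes.append(weight_a * ret_a + (1 - weight_a) * ret_b)
    return outcomes

# Hypothetical inputs: annual expected returns, volatilities, and correlation.
mu_a, sigma_a = 0.08, 0.20
mu_b, sigma_b = 0.04, 0.10
rho = 0.2

w_a = min_variance_weight(sigma_a, sigma_b, rho)
returns = simulate_portfolio(w_a, mu_a, mu_b, sigma_a, sigma_b, rho)
worst_5_percent = sorted(returns)[len(returns) // 20]  # simulated 5th-percentile return
print(f"weight in A: {w_a:.3f}, mean return: {statistics.mean(returns):.3f}, "
      f"5th percentile: {worst_5_percent:.3f}")
```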
5. Business Impact & Decision-Making

• Identified undervalued stocks with high growth potential that traditional
analysis had overlooked.

• Helped the investment team improve returns by 15% over six months
compared to the previous manual selection approach.

• Reduced the research time required for stock analysis by 40% using
automated data processing and visualization.
Business Analyst:
Process analysis: Examining business processes to identify areas for
improvement and optimization.
Competitive analysis: Assessing competitor strategies and market positioning.
Cost-benefit analysis: Evaluating the potential financial impact of different
business decisions.
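A cost-benefit evaluation is often expressed as a net present value (NPV), as in this sketch. The project cash flows and discount rates are hypothetical.

```python
def net_present_value(rate, cash_flows):
    """NPV where cash_flows[0] occurs today and each later entry one period further out."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical project: 1,000 upfront cost, then 400 of benefit per year for three years.
flows = [-1000, 400, 400, 400]

npv_at_10 = net_present_value(0.10, flows)
npv_at_5 = net_present_value(0.05, flows)
print(f"NPV at 10%: {npv_at_10:.2f}, NPV at 5%: {npv_at_5:.2f}")
```

The same project looks unattractive at a 10% discount rate and worthwhile at 5%, which shows how sensitive the decision is to the rate chosen.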
Case Study: Business Analyst Leveraging Data Analytics to Improve Operational
Efficiency
Company: XYZ Logistics (a mid-sized supply chain and logistics company)
Background:
XYZ Logistics was facing high operational costs and delivery delays, which were
negatively impacting customer satisfaction. The company wanted to leverage data
analytics to improve efficiency, optimize routes, and reduce costs.
Role of the Business Analyst:
A business analyst was assigned to identify inefficiencies in the logistics process and
propose data-driven solutions.
He used tools like SQL, Excel, Power BI, and Python to analyze operational data and
provide actionable insights.
Steps Taken by the Business Analyst Using Data Analytics

1. Data Collection & Cleaning

• Extracted historical shipment data, delivery times, fuel costs, and route details from
the company’s database using SQL.

• Cleaned and standardized the data using Excel and Python (Pandas, NumPy) to
ensure accuracy in analysis.

2. Identifying Bottlenecks & Inefficiencies

• Used Power BI and Tableau to visualize delivery performance, highlighting delayed
shipments and high-cost routes.

• Discovered that 30% of deliveries were delayed due to inefficient route planning
and vehicle underutilization.
3. Route Optimization Using Data Analytics

• Conducted geospatial analysis using Python (Geopandas, Folium) to optimize
delivery routes.

• Recommended shorter routes that avoid heavy traffic, which reduced delivery
times by 20%.
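As a rough sketch of route planning, the code below orders delivery stops with a greedy nearest-neighbour heuristic over great-circle distances. The coordinates are hypothetical, the heuristic is not guaranteed to find the shortest route, and it uses only the standard library rather than the Geopandas and Folium tools named in the case study.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_neighbor_route(depot, stops):
    """Greedy heuristic: always drive to the closest unvisited stop."""
    route, remaining, current = [depot], list(stops), depot
    while remaining:
        current = min(remaining, key=lambda s: haversine_km(current, s))
        remaining.remove(current)
        route.append(current)
    return route

# Hypothetical depot and delivery stops as (latitude, longitude).
depot = (0.0, 0.0)
stops = [(0.0, 3.0), (0.0, 1.0), (0.0, 2.0)]

route = nearest_neighbor_route(depot, stops)
total_km = sum(haversine_km(a, b) for a, b in zip(route, route[1:]))
print(route, f"{total_km:.0f} km")
```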

4. Cost Reduction & Resource Allocation

• Analyzed fuel consumption and vehicle utilization using statistical models to
identify cost-saving opportunities.

• Suggested dynamic vehicle allocation based on demand forecasts, reducing
empty truck trips by 25%.
5. Business Impact & Results

• Reduced delivery delays by 30% through optimized routes and better
scheduling.
• Cut fuel costs by 18% by minimizing unnecessary mileage.
• Improved vehicle utilization by 25%, reducing operational waste.
• Enhanced customer satisfaction by 15%, leading to better retention and
new business opportunities.
Data Analyst:
Descriptive analysis: Summarizing key metrics and trends from historical data
using charts and graphs.
Diagnostic analysis: Investigating anomalies and root causes of specific data
points to understand "why" something happened.
Predictive analysis: Building models to forecast future outcomes based on
historical patterns.
Prescriptive analysis: Suggesting optimal actions based on data insights to
achieve desired results.
THANK YOU
