
CIA 01-Data Visualization 2228328

This document analyzes a dataset of 30,000 women's fashion products. Key findings include: 1) The top 2 categories, Indian and Western wear, contribute 74% of total sales. 2) Garment categories see more transactions with higher discounts, while watches and fragrance have the highest sell price per transaction. 3) The top 10 brands contribute 31% of total sales and 34% of total transactions. 4) 69% of transactions have a sell price less than 1500. Visualizing the data provides business insights into strengths, weaknesses, and growth opportunities.

DATA ANALYSIS FOR MANAGERS (MBA134)

CIA-1

By

LIKHITHA S
2228328

Under the Guidance of

PROF. Dr. Fezeena Khadir

MBA PROGRAMME
SCHOOL OF BUSINESS AND MANAGEMENT
CHRIST (DEEMED TO BE UNIVERSITY), BANGALORE
Table of Contents

1 Introduction
2 About the Data
3 Levels of Measurement for Dataset
4 Data Size
5 Changes in the Dataset pre-analysis
6 Analysis
7 Deep Dive
8 Variable Relationships
9 Top Brands Overall
10 Conclusion
11 Reference

Introduction:
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. All
the data collected in a particular study are referred to as the data set for the study.

In this report, I use a dataset of 30000 women's fashion products from Kaggle. I identify the variables,
classify each by its level of measurement, and then analyze, visualize, and interpret the data to gather
relevant business insights, which are presented here in a systematic format.

I have used histograms, pie charts, bar graphs, scatterplots, and tables to support the analysis and
insights. The exercise also underscores the importance of data analysis in today's data-driven world.

About the Data


This dataset is a collection of 30000 women’s fashion products. Categories covered in this dataset are western
wear, Indian wear, perfumes and fragrances, watches, and nightwear.

The columns are described below:

1. Brand Name: The brand of the product
2. Details: Details about the product
3. Size: Sizes available
4. MRP: The maximum retail price
5. SellPrice: The price after discount
6. Category: Category of the product

Note: A NaN value denotes a null (missing) value.

Levels of Measurement for Dataset


There are four scales of measurement:

• Nominal: A nominal scale deals with non-numeric variables, or with numbers that serve only as
labels and carry no quantitative value. Data on a nominal scale can be classified but cannot be
added, subtracted, multiplied, or divided.
• Ordinal: An ordinal scale indicates the order and ranking of data without specifying the degree
of variation between values. Ordinal data are categorical; grouping, naming, and ordering are
possible.
• Interval: The interval scale covers values measured in fixed, equal intervals, for example
temperature in degrees Celsius or calendar time. It shows the order of values with a meaningful
difference between them, but it has no true zero.
• Ratio: The ratio scale is the most comprehensive of the four. It has all the properties of the
three scales above and, in addition, a true origin or zero point.

Below is the classification of variables into their scales of measurement.

1. Brand Name is Categorical, Nominal.
2. Details is Categorical, Nominal, as the entries are not numerical but descriptive.
3. Size is generally Categorical, Ordinal, as the sizes could be Small, Medium, Large, or X,
XX, XXX. Sometimes, however, a product is listed in a single size only, such as XX-Large, in
which case the variable behaves as nominal.
4. MRP is Numerical, Continuous, Ratio.
5. Sell Price is Numerical, Continuous, Ratio.
6. Discount prices are Numerical, Continuous, Interval.
7. Category is Categorical, Nominal.
8. Discount is Categorical, Ordinal in its raw form, but needs transformation to be continuous.

The 'Discount' variable contains text in its entries and is therefore categorical in its raw form. To make
it continuous, we keep only the number from each entry.

Ex: '50% off' becomes 50, '20% off' becomes 20, etc.
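
The transformation described above can be sketched in a few lines of Python. This is only an illustration: the helper name parse_discount and the sample strings are assumptions, and the report itself does not state which tool was used for the conversion.

```python
import re

def parse_discount(text):
    """Return the numeric percentage from a string such as '50% off', or None."""
    if text is None:
        return None
    # Look for a number (optionally with decimals) followed by a percent sign.
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", str(text))
    return float(match.group(1)) if match else None

# Hypothetical sample entries, including a missing value and an unparseable one.
samples = ["50% off", "20% off", None, "no discount"]
parsed = [parse_discount(s) for s in samples]
print(parsed)
```

Entries that carry no percentage are returned as None, so they can be counted or dropped in a later cleaning step rather than silently turned into zero.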

Data Size
Total records                        30758
NaN records                           1183
Records with no MRP or Discount       7025
Records retained for analysis        22550

Changes in the Data set pre-analysis


As the data contained missing values and numbers stored as text, the following changes were made before
analysis:

• MRP converted to a number (transformation of the level of measurement to continuous, ratio)
• Discount percentages converted to a number (transformation of the level of measurement to
continuous)
• Removed blank records: 1183 records
• Removed records with missing MRP, as they are not useful for analysis: 7025 records
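
The cleaning steps above can be sketched as follows. The four sample records are hypothetical, None stands in for a NaN value, and treating "no MRP or Discount" as "either field missing" is an assumption; the last line simply cross-checks the counts reported in the Data Size section.

```python
# Hypothetical mini-sample; None marks a missing (NaN) value.
records = [
    {"BrandName": "A", "MRP": "1999", "SellPrice": "999", "Discount": "50% off"},
    {"BrandName": None, "MRP": None, "SellPrice": None, "Discount": None},  # blank record
    {"BrandName": "B", "MRP": None, "SellPrice": "450", "Discount": None},  # no MRP or discount
    {"BrandName": "C", "MRP": "1200", "SellPrice": "960", "Discount": "20% off"},
]

def is_blank(record):
    """A record is blank when every field is missing."""
    return all(value is None for value in record.values())

def clean(records):
    """Drop blank records, then records lacking MRP or Discount, and report the counts."""
    non_blank = [r for r in records if not is_blank(r)]
    usable = [r for r in non_blank
              if r["MRP"] is not None and r["Discount"] is not None]
    counts = {
        "total": len(records),
        "blank": len(records) - len(non_blank),
        "no_mrp_or_discount": len(non_blank) - len(usable),
        "retained": len(usable),
    }
    return usable, counts

usable, counts = clean(records)
for r in usable:
    r["MRP"] = float(r["MRP"])  # MRP converted from text to a number

# Cross-check of the reported figures: 30758 total, 1183 blank, 7025 without MRP or Discount.
reported_retained = 30758 - 1183 - 7025
print(counts, reported_retained)
```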

Analysis
Additional calculated fields:

• Amount of discount on MRP

• Average ticket size per category and brand
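
As a rough sketch of how these two fields can be derived, assuming the discount amount is MRP minus SellPrice and the average ticket size is total SellPrice divided by the number of transactions in a group. The rows, brand names, and the helper average_ticket_size are hypothetical.

```python
from collections import defaultdict

# Hypothetical cleaned rows; column names follow the report's column list.
rows = [
    {"Category": "Westernwear", "BrandName": "X", "MRP": 2000.0, "SellPrice": 1000.0},
    {"Category": "Westernwear", "BrandName": "Y", "MRP": 1000.0, "SellPrice": 800.0},
    {"Category": "Watches", "BrandName": "Z", "MRP": 5000.0, "SellPrice": 4500.0},
]

# Calculated field 1: amount of discount on MRP.
for row in rows:
    row["DiscountAmount"] = row["MRP"] - row["SellPrice"]

def average_ticket_size(rows, key):
    """Average SellPrice per transaction, grouped by the given column."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["SellPrice"]
        counts[row[key]] += 1
    return {group: totals[group] / counts[group] for group in totals}

by_category = average_ticket_size(rows, "Category")
by_brand = average_ticket_size(rows, "BrandName")
print(by_category, by_brand)
```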

The dataset can be cut and sliced in many ways; here we focus on categories and brands. Total sales in
the data amount to 3.32 crores through 22.5k transactions across more than 170 brands.

Deep Dive
Brands can be grouped into seven categories, of which Indian wear and Western wear are the major
contributors, together accounting for 74% of total sales, whereas Watches and Fragrance have the highest
Sell Price per transaction.

By sales values, 69% of the transactions have a Sell Price of less than 1500.
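
A share of this kind can be computed as below. The sketch assumes the percentage is counted over transactions (rows) and uses a strict "less than" comparison; the ten prices are made-up values, not figures from the dataset.

```python
def share_below(prices, threshold):
    """Percentage of transactions whose sell price is strictly below the threshold."""
    if not prices:
        return 0.0
    below = sum(1 for price in prices if price < threshold)
    return 100.0 * below / len(prices)

# Hypothetical sell prices for ten transactions.
prices = [499.0, 899.0, 1299.0, 1499.0, 1500.0, 2500.0, 4500.0, 700.0, 1200.0, 3000.0]
share = share_below(prices, 1500)
print(share)
```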

The top brands in each category can be seen in the following chart. 'Casio' has the highest average Sell
Price per transaction, whereas 'Vastranand' contributes the highest sales value.

In the next section, I use the original variables to show whether there is any relationship between
them.

Variable Relationships:

Left: Original data – to check if there is any trend seen (Correlation = 0.955)

Right: After removing the outlier data point – to check if the trend still holds (Correlation = 0.950)

The correlation between the number of transactions and the discount shows that, for Western wear, the
number of transactions increases as the discount increases.

The Western wear category shows high dependence on the discounts given. The number of transactions is
seen to increase with higher discount values.
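
The before-and-after comparison in the charts can be reproduced with a plain Pearson correlation, computed once on all points and once with the outlier dropped. The data points below are hypothetical, and which point counts as the outlier is chosen by hand here because the report does not say how it was identified.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical discount levels (%) and number of transactions at each level.
discount = [10, 20, 30, 40, 50, 60]
transactions = [120, 260, 390, 480, 650, 2000]

r_all = pearson(discount, transactions)

# Drop the last point, treated here as the outlier (an assumption for illustration).
outlier_index = 5
x_trim = [x for i, x in enumerate(discount) if i != outlier_index]
y_trim = [y for i, y in enumerate(transactions) if i != outlier_index]
r_trimmed = pearson(x_trim, y_trim)

print(round(r_all, 3), round(r_trimmed, 3))
```

If the coefficient stays high after the outlier is removed, as in the charts above, the trend is not driven by that single point.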

Left: Original data – to check if there is any trend seen (Correlation = 0.931)

Right: After removing the outlier data point – to check if the trend still holds (Correlation = 0.892)

Original data – to check if there is any trend seen (Correlation = 0.998)

The Indian wear and Lingerie & Nightwear categories also show dependence on the discounts given. The
number of transactions is seen to increase with increasing discount value.

The other categories (Watches, Jewelry, and Fragrances) show very little dependence on the discount
value given.

Top Brands Overall


The top 10 brands by sales contribute 31% of total sales, whereas the top 10 brands by number of
transactions contribute 34% of total transactions.
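
A concentration figure like this is a top-N share: sum the N largest brand totals and divide by the grand total. The sketch below uses a hypothetical helper top_n_share and invented brand totals; the same function applies to sales values or to transaction counts.

```python
def top_n_share(totals_by_brand, n):
    """Percentage of the grand total contributed by the n largest brands."""
    grand_total = sum(totals_by_brand.values())
    if grand_total == 0:
        return 0.0
    top = sorted(totals_by_brand.values(), reverse=True)[:n]
    return 100.0 * sum(top) / grand_total

# Hypothetical sales totals per brand.
sales = {"A": 500.0, "B": 300.0, "C": 100.0, "D": 100.0}
top2_share = top_n_share(sales, 2)
print(top2_share)
```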

Conclusion
1. The top 10 brands contribute 31% of total sales and 34% of total transactions, while the
top 2 categories (Indian and Western wear) contribute 74% of total sales.
2. Garments (Indian wear, Western wear, Lingerie & Nightwear) are seen to depend more on
discounts to increase the number of transactions.
3. Vastranand dominates the Indian wear category and is also the highest in terms of sales.
4. 69% of the transactions have a Sell Price of less than 1500.
5. Watches and Fragrance have the highest Sell Price per transaction.
Data Visualization offers an extraordinary insight into key business aspects. It can enable businesses to
unearth their strengths and weaknesses and in turn, help them grow stronger in today's data-driven
landscape.

Reference and Bibliography


1. https://www.kaggle.com/datasets/mukuldeshantri/ecommerce-fashion-dataset
2. Data Visualization With Excel | Edureka
3. Introduction to Pivot Tables, Charts, and Dashboards in Excel (Part 1)
4. Anderson, Sweeney, Williams, Camm, Cochran, Statistics for Business & Economics, 13th ed., Cengage Learning
5. https://byjus.com/maths/scales-of-measurement/
6. https://www.indeed.com/career-advice/career-development/levels-of-measurement
7. Introduction to Data Visualization with Excel
