Unit IV: Best Practices of Data Visualization
Gestalt Principles in Data Visualization:
Gestalt principles are a set of psychological theories that explain how
humans naturally perceive visual elements as unified wholes rather than just
a collection of parts. These principles are critical in data visualization
because they help guide how viewers interpret charts, graphs, and
dashboards. Applying these principles leads to more effective, intuitive, and
aesthetically pleasing visualizations.
1. Proximity
Definition:
The principle of proximity states that elements placed close together are
perceived as a group.
In Data Visualization:
When data points, labels, or objects are placed near each other, the human
brain assumes they are related.
Example:
In a scatter plot, if points are clustered in one area, viewers interpret those
points as a group.
In a dashboard, placing a chart title close to the chart tells the user they are
related.
Fig. Scatter plot
2. Similarity
Definition:
Elements that look similar (in shape, size, color, or font) are perceived as
part of the same group or pattern.
In Data Visualization:
Use consistent colors, shapes, or line styles to help users identify similar
data types.
Example:
In a line chart showing different products, using the same line style or color
for similar categories helps users group them mentally.
Bar charts with the same color bars imply the bars are part of the same
dataset.
Fig. line chart
3. Continuity (or Good Continuation)
Definition:
The eye is drawn to continuous lines and patterns. People tend to follow the
smoothest path when viewing lines.
In Data Visualization:
Align elements in a natural flow to make it easier for users to follow the
story.
Example:
In a line chart, if a line smoothly increases, users follow it naturally to
understand trends.
Aligning icons or data panels in a dashboard in a consistent direction aids
scanning.
Fig. line chart
4. Closure
Definition:
Our brains tend to "fill in the blanks" and perceive a complete image even
when parts are missing.
In Data Visualization:
You can imply shapes or trends even with minimal data points or incomplete
boundaries.
Example:
In a donut chart, even if the circle isn’t fully closed, users still perceive a
whole.
In scatter plots with few points, viewers may still perceive a pattern or
trend.
5. Figure/Ground
Definition:
The brain differentiates an object (the "figure") from its surrounding area
(the "ground").
In Data Visualization:
Make sure the most important data stands out as the figure, and the
background recedes.
Example:
In a bar chart, using a light background and dark bars ensures the bars stand
out.
In dashboards, highlight key metrics with bold text or distinct backgrounds.
Gestalt Principle of Proximity:
What is Proximity?
The Gestalt Principle of Proximity states that objects or elements that are
close to each other are perceived to be more related than those that are
spaced farther apart, even if they differ in shape, size, or color.
Our brain tends to group items together based on spatial closeness. This
psychological shortcut helps us make sense of complex scenes and data
faster.
How Does Proximity Work in Data Visualization?
In data visualization, proximity helps viewers associate related data elements
and distinguish between different groups or sections. It plays a vital role in layout,
chart design, dashboard composition, and overall readability.
Key Uses of Proximity in Visualization
1. Grouping Related Data
Elements placed closer together are seen as belonging to the same group.
Example: In a scatter plot, a cluster of points near each other suggests a
group or pattern.
2. Associating Labels with Data
Putting a label close to a data point or chart helps users immediately see what it
refers to.
Example: A value placed near a bar in a bar chart clearly identifies that
bar’s quantity.
3. Creating Visual Hierarchy
Proximity separates content into visual sections, making it easier to scan and
understand.
Example: In a dashboard, placing KPI metrics close to their labels and icons
improves comprehension.
Real-World Examples:
✅Good Use of Proximity (Example: Dashboard Layout)
Sales figures are listed just below product names.
A set of charts is grouped in one corner showing regional data, while another
corner shows time-based trends.
Users can easily tell which chart belongs to which metric group.
❌ Poor Use of Proximity
A pie chart placed far away from its legend.
Labels overlapping with unrelated data points.
A dashboard with uneven spacing, making it hard to tell which text refers to
which chart.
Accessible Visualizations:
Accessible visualizations are charts, graphs, and dashboards that everyone
can understand and use, including people with disabilities like:
Vision problems (like color blindness or low vision)
Difficulty reading small text
Using screen readers (for blind users)
Cognitive or learning disabilities
The goal is to make sure no one is left out when trying to read and
understand data.
Why Is Accessibility Important?
Imagine you create a beautiful chart — but some people can’t read it
because:
The colors are too similar (e.g., red and green for color-blind users)
The font is too small
It doesn’t work with a screen reader
Then your message is lost. Making visuals accessible ensures everyone gets
the message, not just a few people.
How to Make Visualizations More Accessible?
1. Use Color Carefully
Don’t rely on color alone to show differences.
Use patterns, labels, or icons along with colors.
Choose colorblind-friendly palettes (like blue and orange instead of red and
green).
Example: Instead of using just red and green for "Fail" and "Pass", also add
symbols like ✔️ and ❌.
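A minimal sketch of pairing each status with both a color and a symbol, so the meaning survives even when the colors cannot be told apart. The `STATUS_STYLE` mapping and the `format_status` helper are hypothetical names chosen for illustration; the blue and orange values follow the colorblind-friendly suggestion above.

```python
# Hypothetical mapping: each status gets a color AND a symbol,
# so the meaning does not depend on color alone.
STATUS_STYLE = {
    "Pass": {"color": "blue", "symbol": "✔"},
    "Fail": {"color": "orange", "symbol": "✘"},
}


def format_status(status):
    """Return a text label that carries the symbol alongside the status name."""
    style = STATUS_STYLE[status]
    return f"{style['symbol']} {status}"


for name in ("Pass", "Fail"):
    print(format_status(name), "-> drawn in", STATUS_STYLE[name]["color"])
```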
2. Use Clear and Big Text
Use easy-to-read fonts (like Arial or Roboto).
Keep text size large enough (usually 12–14 pt minimum).
Avoid writing too much — keep labels short and clear.
Example: Instead of writing “Q1 revenue comparison of all regions”, just say “Q1
Revenue”.
3. Add Descriptions and Alt Text
Add alt text or descriptive labels for charts.
Help screen reader users understand what the chart shows.
Example: “This bar chart shows sales increasing from January to March.”
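One way to keep alt text in step with the data is to generate it from the same values the chart uses. The sketch below is only an illustration: `describe_bar_chart` is a hypothetical helper, the numbers are made up, and it only distinguishes steadily increasing, steadily decreasing, and mixed series.

```python
def describe_bar_chart(metric, labels, values):
    """Build a one-sentence alt text for a bar chart from its data."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        trend = "increasing"
    elif all(a > b for a, b in pairs):
        trend = "decreasing"
    else:
        trend = "varying"
    return (
        f"This bar chart shows {metric} {trend} "
        f"from {labels[0]} to {labels[-1]}."
    )


# Illustrative numbers, not real data.
alt_text = describe_bar_chart(
    "sales", ["January", "February", "March"], [120, 150, 180]
)
print(alt_text)
```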
4. Support Keyboard and Screen Reader Users
Make sure charts can be navigated without a mouse.
Use tools or platforms that support screen readers (like Tableau, Power BI,
or accessible HTML/SVG).
5. Avoid Visual Overload
Don’t cram too much info into one chart.
Use white space and group related info together.
Keep layout simple and consistent.
Example: Use separate small charts (small multiples) instead of one huge
confusing one.
Aesthetic in Data Visualization:
What Does "Aesthetic" Mean?
In data visualization, aesthetic refers to how visually appealing, clean, and
well-designed a chart or dashboard looks.
It’s not just about making things pretty — it’s about using visual design to make
the data easier to understand and more enjoyable to explore.
Good aesthetics = clear, simple, and attractive design that improves
understanding.
Why Do Aesthetics Matter?
People are more likely to trust and engage with visuals that look good.
A clean design helps the viewer focus on the important data, not get
distracted.
Aesthetics can help show hierarchy, relationships, or trends more clearly.
Key Elements of Aesthetic Design in Visualizations:
1. Color Choice
Use a few clear colors (not too many).
Choose colors that match the mood or message (e.g., blue for calm, red for
warning).
Use consistent color schemes.
Example: Use the same blue for "Revenue" across all charts.
2. Font and Text
Use clean, readable fonts (like Arial, Roboto, or Helvetica).
Avoid too many different font sizes or styles.
Keep text short and clear.
Example: Use one font for the whole dashboard, and bold only the most important
labels.
3. Spacing and Alignment
Use white space to separate parts of the visualization.
Align charts, legends, and titles neatly.
Avoid clutter.
Example: Leave space between charts and group related ones together.
4. Simplicity
Remove unnecessary elements like chart borders, 3D effects, or too many
grid lines.
Use minimal labeling, only what’s needed to understand the chart.
Example: Use simple bar charts instead of 3D pie charts.
5. Consistency
Use the same visual style throughout a dashboard or report.
Consistent font sizes, colors, shapes, and chart styles help users focus on the
data, not the design.
Poor Aesthetics Look Like:
Overuse of colors
Hard-to-read fonts
Charts packed with too much data
Inconsistent styles and messy layout
Good Aesthetics Look Like:
Clean layout with space
Easy-to-read text
Clear and limited use of color
Charts that highlight the message, not distract from it
Design in Data Visualization:
What is “Design” in Data Visualization?
Design in data visualization means how the whole visual is planned, structured,
and presented to clearly communicate data.
It includes everything from:
Choosing the right type of chart
Deciding how the data is arranged
Picking colors, fonts, and layout
Making the visual clear, useful, and attractive
Good design helps people understand data quickly and correctly.
Key Principles of Good Design:
1. Clarity
Keep the message simple and focused
Avoid clutter (too many colors, labels, or decorations)
Example: A line chart with only two lines and a clear title is better than a messy
chart with 10 colors.
2. Purpose
Know your goal: Are you trying to compare, show change, find a trend, or
highlight a problem?
Choose the right chart for the job.
Example: Use a bar chart to compare categories, or a line chart to show trends over
time.
3. Hierarchy
Guide the viewer’s eyes using size, color, and layout to show what is most
important.
Example: Make the title bold and larger than labels; highlight key data with color.
4. Consistency
Use the same fonts, colors, and styles across all visuals.
Helps the viewer feel comfortable and focused.
5. Accessibility
Make sure everyone can understand your visual — including people with
visual disabilities or using screen readers.
Example: Don’t use only color to show differences; add patterns or labels too.
6. Balance and Layout
Arrange your charts and elements in a logical flow
Leave enough white space so visuals don’t feel crowded
Example: Place related charts next to each other and keep plenty of space around
them.
Exploratory Data Analysis (EDA):
What is Exploratory Data Analysis?
Exploratory Data Analysis (EDA) is the process of examining, summarizing, and
visualizing data to:
Understand what the data is saying
Find patterns, trends, or outliers
Prepare the data for further analysis or modeling
Think of EDA as a detective step where you explore your dataset to ask questions
and find interesting insights before deciding what to do next.
"Before you tell a story with your data, you need to understand what it’s about."
Goals of Exploratory Analysis
Understand the structure and quality of the data
Discover relationships between variables
Spot missing values or errors
Identify trends, outliers, or unexpected patterns
Guide your next steps: modeling, reporting, or deeper analysis
Steps in Exploratory Data Analysis:
Step 1: Data Collection
Objective: Gather the data you want to analyze.
The data may come from databases, spreadsheets, online sources, or surveys.
Ensure the data is in a format that can be read by your analysis tools (CSV,
Excel, SQL, etc.).
Example: Sales data from a company’s database
Step 2: Data Understanding
Objective: Understand the structure, format, and meaning of the dataset.
Identify the columns (variables) and their data types:
o Categorical (e.g., gender, product name)
o Numerical (e.g., age, sales amount)
o Date/time (e.g., order date)
Understand the context: What does each column represent?
Example: Column “Total_Sales” = total value of purchases by a customer.
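The first two steps can be sketched with the standard library: read a CSV source, then make a rough guess at each column's type. The CSV content, the column names other than "Total_Sales", and the `guess_type` helper are all assumptions made for illustration; real data usually needs more careful type checks.

```python
import csv
import io
from datetime import datetime

# Illustrative CSV content standing in for a real file.
raw = io.StringIO(
    "customer,region,Total_Sales,order_date\n"
    "Asha,North,1200.50,2024-01-15\n"
    "Ravi,South,830.00,2024-02-03\n"
)
rows = list(csv.DictReader(raw))


def guess_type(text):
    """Classify one cell as 'numerical', 'date/time', or 'categorical'."""
    try:
        float(text)
        return "numerical"
    except ValueError:
        pass
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return "date/time"
    except ValueError:
        return "categorical"


# Use the first row as a quick guess for each column's type.
column_types = {name: guess_type(value) for name, value in rows[0].items()}
print(column_types)
```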
Step 3: Data Cleaning
Objective: Ensure the data is accurate, consistent, and usable.
Tasks include:
Handling missing values (e.g., filling, removing, or flagging them)
Fixing data entry errors (e.g., age = 500)
Removing duplicates
Correcting data types (e.g., converting string "100" to number)
Example: Replace missing “age” values with the average age.
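The imputation example above can be sketched as follows. The ages are made up, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Illustrative records; None marks a missing age.
ages = [25, 31, None, 42, None, 38]

# Average of the values that are present.
known_ages = [a for a in ages if a is not None]
average_age = mean(known_ages)

# Fill each gap with the average and leave real values untouched.
cleaned_ages = [average_age if a is None else a for a in ages]

print(average_age)
print(cleaned_ages)
```

Filling with the mean keeps the overall average unchanged but reduces the spread, so it is worth flagging which values were imputed.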
Step 4: Descriptive Statistics
Objective: Get a basic summary of the dataset using statistical measures.
Mean, median, mode
Minimum, maximum, range
Standard deviation and variance
Frequency counts for categorical variables
Example: Average monthly sales = ₹12,000; max sales = ₹60,000.
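A short sketch of these measures using Python's `statistics` module. The sales figures and region labels are invented for illustration, and the variance and standard deviation shown are the sample versions.

```python
from collections import Counter
from statistics import mean, median, mode, stdev, variance

# Illustrative monthly sales (in rupees) and a categorical column.
monthly_sales = [8000, 12000, 9000, 12000, 60000, 7000]
regions = ["North", "South", "North", "East", "North", "South"]

summary = {
    "mean": mean(monthly_sales),
    "median": median(monthly_sales),
    "mode": mode(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
    "range": max(monthly_sales) - min(monthly_sales),
    "variance": variance(monthly_sales),  # sample variance
    "std_dev": stdev(monthly_sales),      # sample standard deviation
}

# Frequency counts for the categorical variable.
region_counts = Counter(regions)

for name, value in summary.items():
    print(f"{name}: {value}")
print(region_counts)
```

Here the single large value pulls the mean well above the median, which is a useful early hint that the data contains an outlier.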
Step 5: Data Visualization
Objective: Use charts and graphs to explore the data visually.
Common charts used:
Histogram – distribution of a single numeric variable
Bar chart – comparison of categorical data
Box plot – spread and outliers
Scatter plot – relationship between two numeric variables
Line chart – trends over time
Example: A line chart showing increasing sales from January to June.
Step 6: Detecting Outliers and Anomalies
Objective: Identify values that are unusually high or low.
Use box plots or scatter plots to visually detect outliers.
Use statistical rules like the IQR rule to flag outliers.
Example: A customer who made 100 orders in one day — a potential outlier.
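The IQR rule can be sketched as below. The order counts are made up, and the 1.5 multiplier is the common convention for the fences rather than something fixed by the text above.

```python
from statistics import quantiles

# Illustrative orders-per-day counts; 100 is the suspicious value.
orders_per_day = [2, 3, 3, 4, 4, 5, 5, 6, 7, 100]

# First and third quartiles (the middle value, the median, is not needed here).
q1, _, q3 = quantiles(orders_per_day, n=4, method="inclusive")
iqr = q3 - q1

# Values beyond these fences are flagged as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in orders_per_day if x < lower_fence or x > upper_fence]
print(outliers)
```

A flagged value is only a candidate: whether it is an error or a genuine extreme still has to be judged from context.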
Step 7: Identifying Relationships Between Variables
Objective: Explore how variables are related.
Use:
o Scatter plots to explore relationships
o Correlation coefficients to measure strength of relationships
o Group comparisons (e.g., sales by region)
Example: A positive correlation between advertising spend and sales.
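To make "strength of relationship" concrete, here is a minimal sketch of the Pearson correlation coefficient written out by hand. The advertising and sales figures are illustrative, not real data, and `pearson_r` is a name chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Illustrative figures (not real data).
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

r = pearson_r(ad_spend, sales)
print(round(r, 3))
```

Values near +1 indicate a strong positive relationship, values near -1 a strong negative one, and values near 0 little or no linear relationship.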
Step 8: Feature Selection / Reduction (Optional)
Objective: Decide which variables are important for further analysis.
Remove irrelevant or redundant features
Combine or transform variables for better insights
Example: Create a new column “Profit = Revenue - Cost”.
Step 9: Drawing Preliminary Insights
Objective: Form early conclusions based on findings.
Identify trends, patterns, or key insights.
Note down questions or hypotheses for further analysis.
Example: Sales are higher in urban regions and during holiday months.
Step 10: Prepare for Next Steps
Objective: Use the insights from EDA to plan what comes next.
Next steps may include:
Building predictive models
Creating reports or dashboards
Sharing insights with decision-makers
Example: Based on EDA, create a dashboard showing regional sales trends.
Explanatory Data Analysis:
What is Explanatory Data Analysis?
Explanatory Data Analysis is the process of communicating insights and findings
from data clearly and effectively, often to support decision-making.
While Exploratory Analysis is about discovering patterns, Explanatory Analysis is
about telling a story with data: presenting the most important results in a way that
is easy to understand for your audience.
Think of exploratory analysis as "What can I find?", and explanatory analysis as
"How do I explain what I found?"
Purpose of Explanatory Analysis
Show what matters in the data
Answer specific business or research questions
Present insights to non-technical stakeholders
Support decisions with clear evidence
Tell a data-driven story
Steps in Explanatory Data Analysis:
Step 1: Define the Purpose or Question
Objective: Know exactly what you are trying to explain.
Ask: What question are we answering with the data?
Know your audience: Are they business managers, clients, or the public?
Example: “Why did product sales increase in December?”
Step 2: Identify Key Findings from Exploratory Analysis
Objective: Choose the most important patterns, trends, or relationships discovered
earlier.
Select findings that directly answer the question.
Avoid including all the data — focus on insights.
Example: "Sales were 2x higher in December due to promotions."
Step 3: Choose the Right Visualizations
Objective: Select charts that clearly show the insight.
Use simple visuals: bar charts, line charts, pie charts, etc.
Avoid complex visuals that might confuse your audience.
Example: A bar chart comparing monthly sales, highlighting December.
Step 4: Simplify and Clean the Visuals
Objective: Make the chart easy to understand at a glance.
Remove clutter: extra gridlines, unnecessary colors, 3D effects.
Use clear titles, labels, and legends.
Highlight key data (e.g., use a different color for December).
Example: Label the exact sales numbers on top of each bar.
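A minimal sketch of the highlighting step, assuming made-up monthly figures. The color names and the `bar_colors` and `bar_labels` lists are illustrative; a plotting tool would typically accept lists like these as per-bar colors and value labels.

```python
# Illustrative monthly sales (hypothetical numbers).
monthly_sales = {"Oct": 100, "Nov": 110, "Dec": 220}

HIGHLIGHT = "orange"   # accent color for the key bar
NEUTRAL = "lightgray"  # muted color for the context bars

# One color per bar: only December gets the accent color.
bar_colors = [HIGHLIGHT if month == "Dec" else NEUTRAL for month in monthly_sales]

# Exact values to print on top of each bar.
bar_labels = [str(value) for value in monthly_sales.values()]

for month, color, label in zip(monthly_sales, bar_colors, bar_labels):
    print(month, color, label)
```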
Step 5: Create a Clear Narrative (Data Storytelling)
Objective: Present the insight in a logical, engaging flow.
Structure:
1. Introduction: What question are we answering?
2. Insight: What does the data show?
3. Conclusion: What does it mean? What should we do?
Example:
"December sales increased."
"Promotions ran that month."
"We recommend repeating this strategy."
Step 6: Tailor to the Audience
Objective: Match the language, visuals, and explanation style to the audience's
needs.
Use simple language for non-technical audiences.
Focus on actionable insights for business users.
Use visual storytelling for general presentations.
Example: Use icons or summary bullet points for executives, and deeper data
tables for analysts.
Step 7: Review and Refine
Objective: Make sure the final message is clear and impactful.
Ask: Can someone who doesn’t know the data understand this?
Get feedback, revise if needed.
Check for typos, formatting errors, or unclear visuals.
Example: Share your report or slide with a teammate before presenting.
Difference Between Exploratory and Explanatory Data Analysis
Feature | Exploratory Data Analysis (EDA) | Explanatory Data Analysis
Purpose | To explore the data and discover patterns | To explain findings and tell a data story
Focus | Open-ended analysis, finding what’s interesting | Focused on answering a specific question or goal
Audience | Analysts, data scientists | Decision-makers, stakeholders, general audience
Scope | Broad, detailed, experimental | Narrow, simplified, highlights key points
Charts Used | Many types: scatter plots, histograms, box plots, etc. | Few simple charts: bar, line, pie, etc.
Approach | Data-driven exploration without a clear question | Question-driven storytelling using selected data
Detail Level | High – includes a lot of raw or processed data | Low – includes only what’s necessary to support the point
Outcome | Generates insights, forms hypotheses | Presents conclusions and supports decisions
Visual Style | More complex, experimental visuals | Clean, polished visuals meant for clarity
Tools Commonly Used | Python (Pandas, Seaborn), R, Excel | PowerPoint, Tableau, dashboards, reports
Data
What is Data?
Data refers to facts, numbers, measurements, or observations collected for
reference or analysis.
In data visualization or analysis, we use data to:
Understand what’s happening
Find trends and patterns
Make informed decisions
Types of Data:
Type | Description | Example
Numerical | Quantitative values (can be measured) | Age, salary, temperature
Categorical | Labels or names (cannot be measured) | Gender, product name, region
Ordinal | Ordered categories | Ratings (bad, average, good)
Time-series | Data collected over time | Daily sales, monthly expenses
Relationships:
In data visualization, a relationship refers to how two or more variables
interact or are associated with each other. The goal is to reveal patterns,
correlations, or trends in the data that help in understanding and
decision-making.
Types of Relationships in Visualization:
1. One-to-One Relationship
Definition: Each value in one variable is linked to one value in another
variable.
Example: Employee ID and their assigned desk number.
Visualization: Not commonly visualized because it's often straightforward;
shown in tables or simple labeled visuals.
2. One-to-Many Relationship
Definition: A single value in one variable is linked to multiple values in
another.
Example: One teacher has many students.
Visualization: Treemaps, Hierarchical diagrams.
3. Many-to-Many Relationship
Definition: Multiple values in one variable relate to multiple values in
another.
Example: Students enrolled in multiple courses.
Visualization: Network graphs, Matrix charts.
4. Correlation (Statistical Relationship)
Definition: Measures how strongly two numeric variables move together.
Types:
o Positive correlation: As X increases, Y increases.
o Negative correlation: As X increases, Y decreases.
o No correlation: No clear pattern.
Visualization:
o Scatter Plot: Best for showing the direction and strength of
relationships.
o Bubble Chart: Adds a third variable using size.
Example: Relationship between hours studied and test scores.
5. Causal Relationship (Cause and Effect)
Definition: One variable directly affects another.
Note: Visualization can suggest causation, but cannot confirm it without
controlled study.
Visualization: Line charts (over time), Scatter plots (with regression line).
Example: Ad spend appearing to increase sales (the chart shows a correlation,
which may or may not reflect causation).
6. Trends Over Time (Temporal Relationship)
Definition: Observing how data changes over time (time is the relationship
axis).
Visualization:
o Line charts
o Area charts
o Time-series plots
Example: Website traffic over the last 12 months.
7. Part-to-Whole Relationships
Definition: A value’s share or portion of a larger total.
Visualization:
o Pie chart
o Stacked bar/column chart
o Treemaps
Example: Market share of each smartphone brand.
8. Hierarchical Relationship
Definition: Data organized in levels, often parent-child relationships.
Visualization:
o Treemaps
o Sunburst charts
o Organizational charts
Example: Company → Department → Team → Employee.
9. Geospatial Relationship
Definition: Data associated with specific geographical locations.
Visualization:
o Choropleth maps (color-coded regions)
o Symbol maps
o Heat maps
Example: Population density across cities or countries
Why Understanding Relationships Matters?
Helps identify patterns, anomalies, or opportunities in the data.
Guides which chart to use.
Prevents misinterpretation of data by showing context.
Supports exploratory and explanatory analysis.
Static vs. Interactive Visualizations:
In data visualization, how a user interacts with the data determines the type of
visual approach: static or interactive. Understanding both helps in choosing the
right method based on audience, purpose, and context.
1. Static Visualizations:
Static visualizations are non-interactive graphics. The data is displayed in a
fixed format, and the viewer cannot manipulate or explore the content.
Key Characteristics:
Fixed layout: Once designed, it doesn't change.
No user interaction: Viewers can only read and interpret.
Used in print or simple digital formats (e.g., PDF, slides, reports).
Designed for clarity and storytelling.
Examples of Static Visuals:
Infographics
Bar/line/pie charts in reports
Dashboard screenshots
Data visuals in newspapers and academic journals
Fig. Dashboard screenshots
Advantages:
Easy to create and share.
Works well in printed or offline formats.
Forces designers to simplify and highlight key insights.
Limitations:
Cannot explore different views or ask questions from the data.
Limited in handling complex or large datasets.
May not suit dynamic or real-time data analysis needs.
2. Interactive Visualizations:
Interactive visualizations allow users to engage with the data — such as
filtering, zooming, clicking, or hovering — to explore information
dynamically.
Key Characteristics:
User-driven exploration: The viewer can dig deeper into data.
Responsive design: Content updates based on user input.
Often built for dashboards, web apps, or data platforms.
Requires tools like Tableau, Power BI, D3.js, or Plotly.
Examples of Interactive Visuals:
Web dashboards with filters and drill-downs
Maps where users can click on regions to get more data
Time sliders that show how data changes over time
Hover tooltips showing detailed metrics
Advantages:
Enables deep exploration of large or complex datasets.
Enhances user engagement and personalization.
Great for dashboards, business intelligence, and analytics tools.
Supports real-time data updates.
Limitations:
More complex and time-consuming to build.
May confuse non-technical users without guidance.
Not suitable for print or static presentations.
Performance issues with very large data or poorly optimized visuals.
Difference Between Static and Interactive Visualizations:
Aspect | Static Visualization | Interactive Visualization
User Interaction | No interaction; view-only | Allows user to explore (click, filter, hover, zoom)
Purpose | Communicate key insights clearly and quickly | Allow in-depth exploration and discovery of patterns
Usage Context | Print reports, presentations, publications | Web dashboards, data analysis tools, apps
Complexity | Simple to moderate | Moderate to high
Tools | Excel, PowerPoint, Illustrator, Google Charts | Tableau, Power BI, D3.js, Plotly, Looker
Real-time Data Support | Not typically supported | Often supports real-time or dynamic data
Data Exploration | Not possible | Fully enabled
Performance | Fast, lightweight | Can be slower with large datasets
Learning Curve | Minimal; easy to understand | Steeper; may need training for users
Suitability for Print | Ideal | Not suitable for printing
Use Cases | Infographics, academic reports, newspapers | Business dashboards, data analytics, interactive web apps
Customization by User | Fixed view – determined by designer | User can customize view based on filters or parameters
Bringing Everything Together in a Dashboard:
A dashboard is a visual display of key data points and insights, often from
multiple sources, consolidated into a single screen or interface. It helps users
monitor, analyze, and make decisions based on data in real-time or through
periodic snapshots.
What Is a Dashboard?
A dashboard is like the control panel of a car — it shows the most important
indicators at a glance, so you can monitor performance and take action quickly.
Dashboards bring together various visualizations (bar charts, line graphs, KPIs,
filters, maps, etc.) to give a comprehensive view of a particular domain, like
business performance, marketing campaigns, or project tracking.
Key Components of an Effective Dashboard
Component | Description
KPIs (Key Metrics) | The most important numbers to track (e.g., sales, revenue, conversion rate)
Charts/Graphs | Visuals like bar charts, pie charts, line graphs to show trends and patterns
Filters/Slicers | Interactive tools to drill down (by region, time, product category, etc.)
Navigation/Menu | Tabs or buttons to switch between views or pages
Legends/Labels | Help users interpret visualizations clearly
Tooltips | Pop-up details when users hover over elements
Steps to Build a Dashboard (Conceptually)
1. Define the Purpose
What is the goal of this dashboard?
Who is the audience (executives, analysts, public)?
2. Gather and Prepare Data
Clean and organize the dataset.
Choose relevant KPIs and dimensions.
3. Select the Right Visuals
Use a mix of static and interactive visualizations based on the user's needs:
o Bar charts for comparisons
o Line charts for trends
o Pie/treemaps for composition
o Histograms/box plots for distributions
4. Design the Layout
Follow best practices in visual hierarchy:
o Top-left: Most important KPIs
o Center: Key charts
o Bottom or side: Detailed breakdowns
5. Add Interactivity
Include dropdowns, date selectors, region filters.
Use hover tooltips for more information without cluttering.
6. Test and Refine
Ensure readability, responsiveness, and performance.
Get feedback from actual users.
Dashboard Design Best Practices:
Best Practice | Why It Matters
Keep it simple and focused | Prevents overwhelming the user with too much data
Use consistent color schemes | Helps users navigate and compare information easily
Limit number of visuals | 5–7 charts per screen is ideal
Highlight key insights | Use bold fonts, colors, or icons to emphasize important metrics
Design for your audience | Executives need high-level summaries; analysts need more detail
Ensure data accuracy | Misleading or outdated data erodes trust
Foundational to Advanced Visualizations:
Data visualization evolves from foundational (basic) charts to more advanced
(complex or specialized) types, depending on the type of data, analysis goals, and
user needs.
1. Bar Charts (Foundational Visualization)
Definition:
A bar chart uses rectangular bars to show the magnitude of values across different
categories.
When to Use:
To compare values across different categories.
Ideal when categories are discrete and non-continuous (like countries,
product types, or departments).
Key Features:
Bars can be vertical (column chart) or horizontal.
Bar length is proportional to the value it represents.
Good for ranking or sorting categories.
Example:
Comparing sales of five products:
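As a rough text-only sketch of this example, the snippet below ranks five hypothetical products and draws each bar with a length proportional to its value. The product names, sales figures, and the `bar` helper are assumptions for illustration.

```python
# Illustrative sales for five hypothetical products.
sales = {
    "Product A": 120,
    "Product B": 80,
    "Product C": 150,
    "Product D": 60,
    "Product E": 100,
}

# Sort from highest to lowest so the chart doubles as a ranking.
ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)


def bar(value, units_per_char=10):
    """Return a bar whose length is proportional to the value."""
    return "#" * round(value / units_per_char)


for name, value in ranked:
    print(f"{name:<10} {bar(value):<15} {value}")
```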
2. Gantt Charts (Foundational Visualization)
Definition:
A Gantt chart is a type of bar chart used for project management, where
horizontal bars represent tasks or events over time. It shows the start and end dates
of each task, allowing you to visualize the timeline of a project.
When to Use:
To manage and visualize project timelines.
Ideal for tracking project schedules, tasks, and milestones.
Commonly used in project management to display deadlines and
dependencies.
Key Features:
Tasks or Activities are listed on the y-axis.
Time is represented on the x-axis, usually segmented into days, weeks, or
months.
Bars represent the duration of each task, starting from the task’s start date to
its end date.
Dependencies can be shown by linking tasks with arrows to show the order
in which they need to be completed.
Example:
Managing a website development process.
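The key features above (tasks on the y-axis, time on the x-axis, bar length equal to duration) can be sketched in plain text. The task names and dates are hypothetical, and one character is assumed to stand for one week.

```python
from datetime import date

# Hypothetical tasks for a website project: (name, start date, end date).
tasks = [
    ("Planning", date(2024, 1, 1), date(2024, 1, 7)),
    ("Design", date(2024, 1, 8), date(2024, 1, 21)),
    ("Development", date(2024, 1, 15), date(2024, 2, 11)),
    ("Testing", date(2024, 2, 5), date(2024, 2, 18)),
]

project_start = min(start for _, start, _ in tasks)


def gantt_row(name, start, end, days_per_char=7):
    """Render one task as a text bar: offset = start, length = duration."""
    offset = (start - project_start).days // days_per_char
    duration_days = (end - start).days + 1  # inclusive of the end date
    length = max(1, round(duration_days / days_per_char))
    return f"{name:<12}|{' ' * offset}{'=' * length}"


for task in tasks:
    print(gantt_row(*task))
```

Overlapping bars (here, Design and Development) show at a glance which tasks run in parallel.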
3. Stacked Bar Charts (Foundational Visualization)
Definition:
A stacked bar chart shows how different sub-categories contribute to the total value
for each main category. Each bar is divided into segments that represent different
parts of the whole.
When to Use:
To compare parts of a whole within multiple categories.
Useful when you want to show how categories are broken down into
different sub-categories.
Great for showing part-to-whole relationships over categories.
Key Features:
Main categories are represented on the x-axis.
Each bar is divided into multiple segments, each representing a sub-
category.
The length of each segment is proportional to the value it represents.
The total height/length of the bar represents the total value of that category.
Example:
Quarterly sales by product category
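Behind a stacked bar is simple arithmetic: each segment starts where the previous one ended, and the last segment ends at the category total. A minimal sketch, with hypothetical categories and numbers and a helper named `stack_segments` for illustration:

```python
# Illustrative quarterly sales by product category (hypothetical numbers).
quarterly_sales = {
    "Q1": {"Electronics": 40, "Clothing": 25, "Groceries": 35},
    "Q2": {"Electronics": 50, "Clothing": 30, "Groceries": 40},
}


def stack_segments(segments):
    """Return (category, bottom, top) for each segment of one stacked bar."""
    stacked = []
    bottom = 0
    for category, value in segments.items():
        stacked.append((category, bottom, bottom + value))
        bottom += value  # the next segment starts where this one ends
    return stacked


for quarter, segments in quarterly_sales.items():
    total = sum(segments.values())  # total height of the bar
    print(f"{quarter} (total {total})")
    for category, bottom, top in stack_segments(segments):
        print(f"  {category:<12} from {bottom:>3} to {top:>3}")
```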
4. Tree Maps (Advanced Visualization)
Definition:
A tree map is a hierarchical chart that uses nested rectangles to represent data. The
size of each rectangle is proportional to the value it represents, and the rectangles
are arranged to show the relationship between parts and wholes.
When to Use:
To visualize hierarchical data.
Great for showing parts-to-whole relationships within a nested structure,
such as organizational charts, financial data, or product categories.
Key Features:
Rectangles represent individual data points or categories.
The size of each rectangle is proportional to the value it represents.
Color can be used to indicate an additional variable (e.g., performance,
status).
Nested rectangles show the hierarchical relationship between data points.
Example:
5. Area Charts (Intermediate Visualization)
Definition:
An area chart is like a line chart, but the area beneath the line is filled with color
or shading. It shows quantitative data over time and helps visualize cumulative
values and trends.
When to Use:
To show how values change over time (trends).
When you want to emphasize the magnitude of change.
Good for showing total values and how parts contribute to a whole over
time.
Key Features:
X-axis: Time or continuous data (e.g., months, years).
Y-axis: Quantitative values (e.g., sales, traffic).
The area under the line is filled to emphasize volume.
Can display multiple data series stacked on top of each other.
Example: Website Traffic Over a Year
6. Pie Charts (Basic Visualization)
Definition:
A pie chart is a circular graph divided into slices, where each slice
represents a portion of a whole. It’s used to show percentages or
proportions of a single variable.
When to Use:
To show simple part-to-whole relationships.
Best for comparing a few categories (ideally fewer than 5).
When exact comparison between parts is not critical.
Key Features:
A full circle equals 100% of the data.
Each slice size is proportional to its value.
Often includes labels or percentages for clarity.
Colors help differentiate categories.
Example: Market Share by Product Type
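The proportionality rule can be written out directly: each slice's share of the total becomes both its percentage label and its share of the 360 degrees. The product types and unit counts below are hypothetical.

```python
# Illustrative market share by product type (hypothetical units sold).
units_sold = {"Laptops": 50, "Tablets": 30, "Phones": 20}

total = sum(units_sold.values())

slices = {}
for product, units in units_sold.items():
    slices[product] = {
        "percent": units * 100 / total,  # label shown on the slice
        "angle": units * 360 / total,    # degrees of the full circle
    }

for product, info in slices.items():
    print(f"{product}: {info['percent']:.0f}% ({info['angle']:.0f} degrees)")
```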
Visualizing Distributions:
When analyzing data, understanding how values are spread (i.e., their
distribution) is crucial. Distribution visualizations help identify patterns like
central tendency, variability, skewness, and outliers.
1. Circle Charts (Distribution Visualization)
Definition:
Circle charts are circular representations used to show distribution or proportion.
Most commonly, these include bubble charts, where each circle's size represents a
variable, and its position may show two additional dimensions.
When to Use:
To visualize multiple variables simultaneously (size, position, and
sometimes color).
When displaying proportional values using area instead of angle (as in pie
charts).
For comparing relative magnitudes in a visual and intuitive way.
Key Features:
Each circle represents a data point or category.
Size of circle shows quantity or importance.
Position (in bubble charts) can encode X and Y variable values.
Color can be used to distinguish categories or add another data dimension.
2. Jittering (Distribution Technique)
Definition:
Jittering is a technique, not a chart type, used to add a small amount of random
variation to data points in a plot. This prevents them from overlapping and
improves readability, especially in categorical or discrete data.
When to Use:
When multiple data points have the same values and overlap.
In scatter plots or dot plots with high density at specific values.
To reveal the density and spread of points in discrete datasets.
Key Features:
Jitter is added only visually (it doesn’t change the actual data).
Enhances visibility in dense clusters.
Often used with categorical x-axis data.
Example:
Life expectancy of people across various continents.
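A minimal sketch of jittering, assuming made-up values and a categorical x-axis where each category has an integer slot. The `jittered_x` helper and the 0.2 width are illustrative choices; the point is that only the plotting position changes, never the recorded value.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Illustrative data: (continent, life expectancy); values are made up.
records = [
    ("Asia", 72), ("Asia", 72), ("Asia", 74),
    ("Europe", 80), ("Europe", 80),
]

# Each category gets an integer slot on the x-axis.
x_slot = {"Asia": 0, "Europe": 1}


def jittered_x(category, width=0.2):
    """Return the category's x position plus a small random offset."""
    return x_slot[category] + random.uniform(-width, width)


# Only the x position is jittered; the y values stay exactly as recorded.
plot_points = [(jittered_x(continent), value) for continent, value in records]

for x, y in plot_points:
    print(f"x={x:.3f}, y={y}")
```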
3. Box and Whisker Plots (Box Plots)
Definition:
A box and whisker plot is a graphical summary of a data set’s distribution using
the five-number summary: minimum, first quartile (Q1), median, third quartile
(Q3), and maximum.
When to Use:
To show spread, skewness, and outliers in numerical data.
When comparing distributions across multiple categories.
Useful in exploratory analysis for identifying variability and anomalies.
Key Features:
The box spans from Q1 to Q3.
A line inside the box indicates the median.
Whiskers extend to minimum and maximum (excluding outliers).
Outliers are plotted separately as points or stars.
Example:
Comparing exam scores for two classes. Each box plot shows median performance,
spread, and which class had outliers.
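The five-number summary behind each box can be computed as sketched below. The exam scores are hypothetical, and this version reports the raw minimum and maximum; a plotting tool would usually stop the whiskers short of any outliers and draw those separately.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


# Illustrative exam scores for two hypothetical classes.
class_a = [55, 60, 65, 70, 75, 80, 85]
class_b = [40, 62, 64, 66, 68, 70, 95]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    print(name, five_number_summary(scores))
```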
4. Histograms (Distribution Chart)
Definition:
A histogram is a type of bar chart that represents the frequency of data values
within equal intervals (bins). It is used to show the shape of a continuous data
distribution.
When to Use:
To analyze frequency distribution of numerical data.
To detect skewness, modality (peaks), and spread.
Best for large datasets with continuous variables.
Key Features:
The X-axis shows value intervals (bins).
The Y-axis shows the count or frequency of values in each bin.
Bars touch each other to represent continuous data.
Example:
A histogram showing the lifetimes of lamps, in hours.
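The counting step behind a histogram can be sketched as follows. The lamp lifetimes, the bin start, and the bin width are assumptions chosen for illustration, and `histogram_counts` is a hypothetical helper.

```python
def histogram_counts(values, bin_start, bin_width, num_bins):
    """Count how many values fall into each equal-width bin.

    Bin i covers [bin_start + i*bin_width, bin_start + (i+1)*bin_width);
    values outside the overall range are ignored.
    """
    counts = [0] * num_bins
    for v in values:
        index = int((v - bin_start) // bin_width)
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts


# Illustrative lamp lifetimes in hours (made-up values).
lifetimes = [820, 950, 1010, 1040, 1100, 1150, 1180, 1220, 1300, 1450]

counts = histogram_counts(lifetimes, bin_start=800, bin_width=200, num_bins=4)

for i, count in enumerate(counts):
    low = 800 + i * 200
    print(f"{low}-{low + 200} h: {'#' * count}")
```

Changing the bin width changes the apparent shape of the distribution, so it is worth trying a few widths before drawing conclusions.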