703 (A) Data Visualization Unit-2 Notes

Chameli Devi Group of Institutions

Department of Artificial Intelligence & Data Science

Subject Name: Data Visualization Subject Code: 703 (A)


Subject Notes
Syllabus: Data Visualization Techniques

Data Visualization Techniques: Pixel-Oriented Visualization Techniques, Geometric Projection Visualization
Techniques, Icon-Based Visualization Techniques, Hierarchical Visualization Techniques, Visualizing Complex
Data and Relations. Visualization Techniques: Scalar and Point Techniques, Color Maps, Contouring, Height
Plots. Vector Visualization Techniques: Vector Properties, Vector Glyphs, Vector Color Coding, Stream Objects.
Exploratory Data Analysis (EDA) Techniques.

__________________________________________________________________________________________

Course Outcome (CO2): Design and use various methodologies present in data visualization.
Unit-2

Data Visualization Techniques

Data visualization is the graphical representation of information and data, which helps in understanding
complex data patterns, trends, and insights more effectively. There are several techniques used in data
visualization, each suitable for different types of data and purposes. Here are some of the most common data
visualization techniques:

Pixel-Oriented Visualization Techniques

Pixel-oriented visualization techniques are a form of data visualization where each data value is represented
by a single pixel (or a small group of pixels) on the screen. These techniques are particularly useful for
visualizing large-scale multidimensional data, as they can display thousands or even millions of data points in a
single view. The key idea is to map each pixel's color or intensity to represent the data's value, enabling users
to detect patterns, trends, and anomalies in massive datasets.

Key Characteristics of Pixel-Oriented Visualization Techniques

 Scalability: Capable of visualizing large datasets, sometimes up to a million data points, in a single
view.
 Density: Because each data point is represented by a single pixel, the technique is dense, allowing for a
compact display of large amounts of data.
 Color Mapping: Data values are mapped to colors or intensities, making it possible to distinguish
between different data ranges or clusters.
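
The color-mapping idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming
NumPy and Matplotlib are installed; the synthetic data, the 1000 x 1000 grid, and the output file name are
assumptions made for the example and do not come from the notes.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# One million synthetic values (assumed data): a periodic signal plus noise.
n_values = 1_000_000
values = np.sin(np.linspace(0, 20 * np.pi, n_values)) + rng.normal(0, 0.3, n_values)

# Arrange the values row by row in a 1000 x 1000 grid: one grid cell per data value.
grid = values.reshape(1000, 1000)

fig, ax = plt.subplots(figsize=(5, 5))
image = ax.imshow(grid, cmap="viridis", interpolation="nearest")  # value -> color
fig.colorbar(image, ax=ax, label="data value")
ax.set_title("Pixel-oriented display of 1,000,000 values")
fig.savefig("pixel_display.png", dpi=200)
```

The periodic signal shows up as horizontal color bands, which is the kind of pattern such a dense display is
meant to expose.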

Types of Pixel-Oriented Visualization Techniques

1. Pixel Bar Charts


o Description: Combines the pixel-oriented approach with traditional bar charts. Each bar in the
chart is made up of a stack of pixels where each pixel represents a data value.
o Use Case: Effective for comparing multiple datasets or variables across different categories.
2. Recursive Pattern Technique
o Description: Arranges pixels in a recursive back-and-forth pattern. Values are placed line by line
within small blocks, and the blocks are arranged the same way at the next level, so the layout
reflects the natural ordering of the data (for example, days within weeks within months).
o Use Case: Useful for data with a natural order, such as time series, where the recursive
arrangement helps in understanding both local and global patterns.
3. Dimensional Stacking
o Description: This technique uses multiple dimensions to stack pixels within a 2D space. Each
dimension is mapped to a different section of the 2D display, with the innermost dimensions
stacked within the outermost ones.
o Use Case: Suitable for data with many dimensions, allowing visualization of complex
interrelationships between data points.
4. Circle Segments Technique
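o Description: The display is a circle divided into segments, one per attribute (dimension). Within
each segment, the values of that attribute are drawn as pixels ordered from the center outward,
so the same record sits at a corresponding position in every segment.
o Use Case: Effective for comparing many attributes of the same records side by side, such as
sales figures of several products over the same calendar years.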
5. Query-Dependent Pixel Displays
o Description: The arrangement of pixels is based on the results of specific data queries. Only
relevant data points (as defined by the query) are shown, enhancing the focus on important
data.
o Use Case: Useful for highlighting data points that meet certain criteria or for conducting data
exploration through queries.
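
Several of these techniques differ mainly in how ordered values are assigned to pixel positions. The sketch
below shows one simple building block, a back-and-forth (snake) line order that keeps consecutive values
adjacent on screen. It is a single-level simplification written for illustration; the function name
snake_arrangement and the sample data are hypothetical.

```python
import numpy as np

def snake_arrangement(values, width):
    """Place ordered values in rows of `width`, reversing every second row."""
    values = np.asarray(values, dtype=float)
    n_rows = -(-len(values) // width)          # ceiling division
    padded = np.full(n_rows * width, np.nan)   # NaN marks unused cells
    padded[: len(values)] = values
    grid = padded.reshape(n_rows, width)
    # Reverse rows 1, 3, 5, ... so the path runs left-to-right, then right-to-left.
    grid[1::2] = grid[1::2, ::-1].copy()
    return grid

grid = snake_arrangement(np.arange(12), width=4)
print(grid)
```

With this order, value 3 (end of the first row) sits directly above value 4 (end of the second row), so
neighbors in the data stay neighbors in the image.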

Geometric Projection Visualization Techniques

Geometric projection visualization techniques are essential for representing three-dimensional objects on
two-dimensional surfaces, which is crucial in fields such as computer graphics, engineering, architecture, and
mathematics. These techniques vary based on the type of projection used and the intended outcome. Here’s a
breakdown of some common geometric projection visualization techniques:

1. Orthographic Projection

 Description: Orthographic projection is a parallel projection in which the projection lines are
perpendicular to the projection plane, so the object is drawn without perspective distortion and
dimensions parallel to the plane keep their true size.
 Types:
o Plan View: Top-down view of the object.
o Elevation View: Side or front view.
o Axonometric Projections: Includes isometric, dimetric, and trimetric projections.
 Applications: Technical drawings, CAD designs, architectural blueprints.

2. Isometric Projection

 Description: A type of axonometric projection where the object is rotated along its axes to show three
sides simultaneously, with equal scaling along each axis.
 Visualization Technique: Lines parallel to the axes of the object are drawn at equal angles (120°
between axes), giving a clear view of the object's overall structure without perspective distortion.
 Applications: Engineering drawings, video games, and pixel art.

3. Perspective Projection

 Description: Perspective projection mimics human vision by depicting objects smaller as they get
further away from the viewer, with parallel lines converging towards vanishing points.
 Types:
o One-Point Perspective: Single vanishing point, often used in road and railway illustrations.
o Two-Point Perspective: Two vanishing points, common in architectural drawings.
o Three-Point Perspective: Three vanishing points, used for complex scenes like tall buildings.
 Applications: Art, 3D rendering, architectural visualization.
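
The difference between orthographic and perspective projection can be checked numerically. The following is
a minimal sketch assuming a simple pinhole model with the viewer at the origin looking along the z axis; the
cube coordinates, the focal length, and the helper names are assumptions made for the example.

```python
import numpy as np

# Corners of a unit cube placed in front of the viewer (depth z from 2 to 3).
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (2, 3)], dtype=float)

def orthographic(points):
    """Drop the depth coordinate: projection lines are perpendicular to the plane."""
    return points[:, :2]

def perspective(points, focal_length=1.0):
    """Pinhole model: scale x and y by focal_length / z, so distant points shrink."""
    return focal_length * points[:, :2] / points[:, 2:3]

def width(points_2d):
    """Horizontal extent of a set of projected points."""
    return points_2d[:, 0].max() - points_2d[:, 0].min()

near_face = cube[cube[:, 2] == 2]
far_face = cube[cube[:, 2] == 3]

print("orthographic widths:", width(orthographic(near_face)), width(orthographic(far_face)))
print("perspective widths: ", width(perspective(near_face)), width(perspective(far_face)))
```

Under the orthographic projection both faces keep the same width, while under the perspective projection the
far face is drawn narrower than the near face.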

4. Oblique Projection

 Description: A type of projection where the object is projected along lines that are not perpendicular
to the viewing plane. Common forms include cavalier and cabinet projections.
 Visualization Technique: One face of the object is parallel to the projection plane, while the other
faces are projected at an angle, often at 45°.
 Applications: Engineering, simple mechanical drawings.
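
As a rough illustration of the two common forms, the sketch below projects a point with the front face kept
parallel to the projection plane and the depth axis drawn at a chosen angle. In a cavalier projection the
receding axis keeps full scale; in a cabinet projection it is halved. The function name and parameters are
hypothetical.

```python
import math

def oblique_project(point, angle_deg=45.0, depth_scale=1.0):
    """Project (x, y, z) to 2D; depth is drawn along a line at angle_deg to the x axis."""
    x, y, z = point
    angle = math.radians(angle_deg)
    return (x + depth_scale * z * math.cos(angle),
            y + depth_scale * z * math.sin(angle))

front_corner = (1.0, 2.0, 0.0)   # lies in the front face (z = 0)
back_corner = (0.0, 0.0, 1.0)    # one unit behind the origin

front = oblique_project(front_corner)
cavalier = oblique_project(back_corner, depth_scale=1.0)   # full-length depth
cabinet = oblique_project(back_corner, depth_scale=0.5)    # half-length depth

print(front, cavalier, cabinet)
```

Points in the front face are unchanged, which is why that face appears in true shape and size.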

5. Stereographic Projection

 Description: A projection where a sphere is projected onto a plane. This projection is conformal,
meaning it preserves angles, but not areas.
 Visualization Technique: Often used to project the Earth's surface onto a flat map or in visualizing
complex mathematical functions.
 Applications: Cartography, complex mathematical visualizations.
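
A minimal sketch of the mapping, assuming a unit sphere projected from its north pole (0, 0, 1) onto the
plane z = 0. Other conventions for the projection point and plane exist; the function name is hypothetical.

```python
import math

def stereographic(x, y, z):
    """Project a unit-sphere point from the north pole onto the plane z = 0."""
    if math.isclose(z, 1.0):
        raise ValueError("the north pole has no image under this projection")
    return x / (1.0 - z), y / (1.0 - z)

equator_point = stereographic(1.0, 0.0, 0.0)    # equator points stay where they are
south_pole = stereographic(0.0, 0.0, -1.0)      # the south pole maps to the origin

lat = math.radians(45)                          # a point at 45 degrees north
northern_point = stereographic(math.cos(lat), 0.0, math.sin(lat))

print(equator_point, south_pole, northern_point)
```

Points near the projection pole land far from the origin, which illustrates why areas are not preserved even
though angles are.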

6. Exploded View Projection

 Description: A technique where the components of an object are separated along its axes, but still
aligned to show their relationship.
 Visualization Technique: Components are spaced apart in such a way that their assembly or
configuration is clear.
 Applications: Manuals, instructional diagrams, and CAD.

7. Shadow Projection (Silhouette)

 Description: A technique where only the outline (or shadow) of a 3D object is projected onto a plane.
 Visualization Technique: By casting a “light” from a certain direction, the silhouette is projected onto a
2D plane, giving insight into the shape and form of the object.
 Applications: Art, character design, and pattern recognition.

8. Parallel Projection

 Description: In parallel projection, all projection lines are parallel to each other, but they are not
necessarily perpendicular to the projection plane. Unlike perspective projection, objects do not appear
smaller with distance.
 Types: Includes orthographic, oblique, and axonometric projections.
 Applications: Technical illustrations where accurate measurements are required.

9. Cross-Sectional (Section) View

 Description: A projection technique where a plane cuts through the object to reveal its internal
structure.
 Visualization Technique: The object is sliced, and the cross-sectional surface is exposed to show
internal components or structures.
 Applications: Engineering, medical imaging, architectural designs.

Icon-Based Visualization Techniques

Icon-based visualization techniques use graphical symbols (icons) to represent data, concepts, or actions in a
visually intuitive way. These techniques are effective in conveying complex information quickly and are
commonly used in user interfaces, dashboards, and data visualization. Here are some popular icon-based
visualization techniques:

1. Pictographs (Pictograms)

 Description: Pictographs use icons that visually represent the actual objects or categories they
symbolize, often used to represent data in a more tangible way.
 Visualization Technique: A repeated icon (e.g., a person to represent population) is used to depict
quantities. Each icon can represent a specific count or percentage.
 Applications: Infographics, educational materials, health reports, social statistics.

2. Glyphs

 Description: Glyphs are small, graphical units that encode multiple variables within their design. A
glyph can vary in shape, size, color, orientation, or pattern to convey different dimensions of data.
 Visualization Technique: For example, a glyph representing weather data might use different shapes
for temperature, colors for humidity, and orientations for wind direction.
 Applications: Multivariate data analysis, scientific visualization, geographic information systems (GIS).

3. Chernoff Faces

 Description: A specialized type of glyph where human facial features are varied to represent different
data variables. Each feature (e.g., eyes, mouth, nose) corresponds to a specific data dimension.
 Visualization Technique: Changes in the shape, size, and orientation of facial features represent
different data values, allowing for a quick visual comparison of datasets.
 Applications: Multivariate statistics, psychological studies, market research.

4. Star Glyphs (Star Plots)

 Description: Star glyphs use a central point from which several axes radiate outwards, with each axis
representing a different variable. The length of each axis is proportional to the value of the variable.
 Visualization Technique: By connecting the end points of each axis, a star-like shape is formed,
allowing for easy comparison of multiple variables.
 Applications: Multidimensional data analysis, performance metrics, financial analysis.
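
A star glyph can be drawn with Matplotlib's polar axes. The sketch below is one possible way to do it; the
variable names and scores are invented for the example, and the values are assumed to be scaled to [0, 1].

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

labels = ["speed", "cost", "quality", "support", "usability"]  # assumed variables
scores = np.array([0.8, 0.4, 0.9, 0.6, 0.7])                   # assumed values in [0, 1]

# One axis per variable, spread evenly around the circle.
angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False)

# Repeat the first point at the end so the outline closes into a polygon.
closed_angles = np.append(angles, angles[0])
closed_scores = np.append(scores, scores[0])

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(closed_angles, closed_scores, marker="o")
ax.fill(closed_angles, closed_scores, alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(labels)
ax.set_ylim(0, 1)
fig.savefig("star_glyph.png")
```

Drawing one such glyph per record, all with the same axis order and scale, makes the shapes comparable.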

5. Icon Arrays (Dot Plots)

 Description: Icon arrays (also known as dot plots) use grids of icons to represent frequencies or
proportions. Each icon represents a specific data point or a percentage of a whole.
 Visualization Technique: The icons are arranged in a grid or linear sequence to show quantities or
distributions, making comparisons intuitive.
 Applications: Medical decision aids, risk communication, educational materials.

6. Tag Clouds (Word Clouds)

 Description: Tag clouds visually represent the frequency or importance of words within a text dataset.
The size and sometimes color of each word reflect its frequency or relevance.
 Visualization Technique: Words are arranged randomly or in a clustered format, with the most
frequently occurring words appearing larger and more prominently.
 Applications: Text analysis, social media analysis, website content summaries.
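
The sizing rule behind a tag cloud can be sketched with the standard library alone. The sample text, the
point-size range, and the linear scaling are assumptions for illustration; real tools often also remove stop
words and may use other scaling rules.

```python
from collections import Counter
import re

text = "data visualization turns data into insight and insight into action with data"

# Tokenize into lowercase words and count how often each one occurs.
words = re.findall(r"[a-z]+", text.lower())
counts = Counter(words)

min_size, max_size = 10, 40  # assumed font sizes in points
lowest, highest = min(counts.values()), max(counts.values())

def font_size(count):
    """Linearly map a word count to a font size between min_size and max_size."""
    if highest == lowest:
        return float(max_size)
    return min_size + (max_size - min_size) * (count - lowest) / (highest - lowest)

sizes = {word: font_size(count) for word, count in counts.items()}

for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word:15s} {size:.0f} pt")
```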

7. Tile Maps (Tile Grid Maps)

 Description: Tile maps replace geographical regions (such as states or countries) with equal-sized icons
or tiles, simplifying spatial data visualization.
 Visualization Technique: Each tile represents a specific region and is color-coded or symbolized based
on the data value associated with that region.
 Applications: Political maps, demographic data visualization, election results.

8. Icons on Geographical Maps

 Description: Icons are placed on geographical maps to represent specific data points or events (e.g.,
weather conditions, landmarks, or incidents).
 Visualization Technique: Different icons or symbols are used for various categories, and their size or
color can represent additional data dimensions such as frequency or intensity.
 Applications: Weather maps, traffic reports, event mapping.

9. Bubble Charts with Icons

 Description: Similar to standard bubble charts, but instead of plain circles, icons representing the data
type or category are used. The size of the icon can represent a quantitative variable.
 Visualization Technique: The position of each icon on the chart is based on two variables (x and y
coordinates), while the size represents a third variable.
 Applications: Market analysis, financial dashboards, and comparative studies.

10. Icon-Based Infographics

 Description: Infographics use a combination of icons and text to convey information in a visually
appealing and easily digestible format.
 Visualization Technique: Icons are used to highlight key points or data, often accompanied by brief
text explanations, to simplify complex information.
 Applications: Marketing, public awareness campaigns, educational content.

11. Heatmaps with Icons

 Description: Icons are used instead of colors to create heatmap-style visualizations, where different
icons or sizes can represent various data intensities.
 Visualization Technique: Icons are placed on a grid or geographical map, and their density, size, or
type represents data frequency or intensity.
 Applications: Geographic data analysis, customer behavior mapping, and spatial analytics.

12. Hierarchical Trees with Icons

 Description: Hierarchical tree diagrams can use icons to represent different nodes or categories,
making the hierarchy visually clearer and more engaging.
 Visualization Technique: Icons are placed at each node of the tree, with lines connecting parent-child
relationships to depict organizational structures.
 Applications: Organizational charts, website sitemaps, file directories.

13. Interactive Icon-Based Visualizations

 Description: These visualizations use interactive icons that can be clicked or hovered over to reveal
more information, allowing for deeper data exploration.
 Visualization Technique: Icons are placed strategically on dashboards or data plots, and interactions
like clicking or hovering trigger additional details or data to be displayed.
 Applications: Business intelligence dashboards, e-commerce analytics, interactive reports.

Hierarchical Visualization Techniques

Hierarchical visualization techniques are methods used to visually represent data that has a natural
hierarchical structure, such as organizational charts, family trees, or file systems. These techniques help in
understanding relationships and structures within the data, especially when dealing with multiple levels of
information. Here’s an overview of some popular hierarchical visualization techniques:

1. Tree Diagrams

 Description: Tree diagrams are one of the most straightforward hierarchical visualization techniques.
They represent data as a tree-like structure, starting from a single root node and branching out to child
nodes.
 Visualization Technique: Nodes represent data points, and lines (edges) connect nodes to illustrate
parent-child relationships. Common variations include binary trees, decision trees, and phylogenetic
trees.
 Applications: Organizational charts, decision-making processes, biological taxonomy, software
structures (e.g., file systems).

2. Dendrograms

 Description: A dendrogram is a tree diagram used to illustrate the arrangement of clusters produced
by hierarchical clustering. It shows the hierarchical relationship between different data points.
 Visualization Technique: The data points are merged step-by-step into clusters, and the dendrogram
displays these merges as tree branches, with the height of each branch representing the distance or
dissimilarity between clusters.
 Applications: Hierarchical clustering in machine learning, genomics, taxonomies, and customer
segmentation.
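
The merge steps that a dendrogram displays can be computed with SciPy's hierarchical clustering functions.
This is a minimal sketch assuming SciPy is installed; the six sample points and the choice of average linkage
are assumptions for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Six points forming two well-separated groups (assumed data).
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)

# Each row of Z records one merge: the two clusters joined, their distance, the new size.
Z = linkage(points, method="average")

# Cut the tree into two clusters.
labels = fcluster(Z, t=2, criterion="maxclust")

# Compute the dendrogram layout without drawing it; pass an Axes via `ax=` to plot.
tree = dendrogram(Z, no_plot=True)

print("merge distances:", np.round(Z[:, 2], 3))
print("cluster labels: ", labels)
print("leaf order:     ", tree["ivl"])
```

The last merge distance is much larger than the others, which is the tall branch that separates the two groups
in the drawn dendrogram.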

3. Radial Tree (Sunburst) Diagrams

 Description: Radial tree diagrams (also known as sunburst diagrams) display hierarchical data in a
circular layout. The root node is placed at the center, and the child nodes spread outwards in
concentric circles.
 Visualization Technique: Nodes are arranged radially around a central point, and each level of the
hierarchy is represented by a ring around the center. The size of the segment or the arc can represent
additional attributes, such as quantity or value.
 Applications: Visualizing directory structures, organizational charts, hierarchical data sets, and
genealogies.

Vector Visualization Techniques

Vector visualization is a powerful way to represent data that involves both magnitude and direction. It's widely
used in fields like physics, engineering, computer graphics, and data science. Here are several techniques and
methods for visualizing vectors:

1. Arrow Plots

 Description: Arrows are the most common way to represent vectors in 2D or 3D space. The tail of the
arrow represents the starting point, the length represents the magnitude, and the direction of the
arrow represents the direction of the vector.
 Use Cases: Physics (to show forces), weather maps (wind direction and speed), fluid dynamics (velocity
fields).

2. Quiver Plots

 Description: A type of plot that shows vectors as arrows with components (u, v) at points (x, y) in 2D
space. In Python's Matplotlib library, this is done using the quiver function.
 Use Cases: Ideal for visualizing vector fields, such as electric or magnetic fields or flow fields in fluid
dynamics.
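
A short quiver example, assuming NumPy and Matplotlib are installed. The rotational field (u, v) = (-y, x)
and the grid size are chosen only for illustration; passing the magnitude as a fifth argument colors each
arrow by its length.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Sample points (x, y) on a regular grid.
x, y = np.meshgrid(np.linspace(-2, 2, 15), np.linspace(-2, 2, 15))

# Vector components (u, v) of a rotational field around the origin.
u, v = -y, x
speed = np.hypot(u, v)  # vector magnitude at each grid point

fig, ax = plt.subplots()
arrows = ax.quiver(x, y, u, v, speed, cmap="viridis")  # color encodes magnitude
fig.colorbar(arrows, ax=ax, label="magnitude")
ax.set_aspect("equal")
ax.set_title("Quiver plot of (u, v) = (-y, x)")
fig.savefig("quiver.png")
```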

3. Streamlines

 Description: Streamlines are curves that are tangent to the vector field at every point. In a steady
flow they coincide with the trajectories that particles would follow, which makes them particularly
useful for visualizing fluid flows.
 Use Cases: Fluid dynamics, aerodynamics, visualizing the flow of air over a wing or water around a hull.
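
A streamline can be traced by numerically integrating the field from a seed point. The sketch below uses a
midpoint (second-order Runge-Kutta) step on the rotational field (-y, x); the field, step size, and function
names are assumptions for the example. Matplotlib also offers a ready-made streamplot function for gridded
data.

```python
import numpy as np

def velocity(p):
    """Rotational vector field (u, v) = (-y, x) evaluated at point p."""
    x, y = p
    return np.array([-y, x])

def trace_streamline(start, step=0.01, n_steps=700):
    """Follow the field from `start` using midpoint integration."""
    path = [np.asarray(start, dtype=float)]
    for _ in range(n_steps):
        p = path[-1]
        midpoint = p + 0.5 * step * velocity(p)      # half step to estimate the slope
        path.append(p + step * velocity(midpoint))   # full step using the midpoint slope
    return np.array(path)

path = trace_streamline([1.0, 0.0])
radii = np.hypot(path[:, 0], path[:, 1])

print("points on the streamline:", len(path))
print("radius range:", radii.min(), radii.max())
```

For this field the exact streamlines are circles around the origin, so a nearly constant radius indicates
that the traced curve stays on one.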

4. Heatmaps with Overlaid Vectors

 Description: Heatmaps represent scalar fields (like temperature or pressure) with colors, and vectors
can be overlaid on these to show direction and magnitude.
 Use Cases: Climate science, oceanography, meteorology, where temperature or pressure fields are
shown alongside wind or ocean currents.

5. Glyph Plots

 Description: Glyph plots use geometric shapes (glyphs) to represent vectors. The shape, size,
orientation, and color of each glyph can encode different vector properties.
 Use Cases: Dense 3D vector fields and tensor fields, such as diffusion tensor imaging (DTI) in medical
imaging.

Exploratory Data Analysis (EDA) Techniques

Exploratory Data Analysis (EDA) is a crucial step in the data science workflow that involves summarizing the
main characteristics of a dataset, often using statistical graphics and other data visualization methods. EDA
helps in understanding the underlying patterns, spotting anomalies, testing hypotheses, and checking
assumptions through the use of descriptive statistics and graphical representations.

Here are some common EDA techniques and tools:

1. Data Cleaning and Preprocessing

 Description: Before any analysis, the data should be cleaned to remove noise and errors. This involves
handling missing values, removing duplicates, and correcting inconsistencies.
 Techniques:
o Imputation: Filling missing values using mean, median, mode, or using more sophisticated
methods like regression or K-Nearest Neighbors.
o Outlier detection: Identifying outliers using statistical methods (e.g., Z-score, IQR) and deciding
whether to remove or transform them.
o Data normalization and scaling: Transforming data to a standard scale for easier comparison
(e.g., Min-Max scaling, Z-score standardization).
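
The three steps can be sketched on a small array with NumPy. The sample values, the use of the median for
imputation, and the 1.5 x IQR rule are assumptions chosen for illustration; the right choices depend on the
dataset.

```python
import numpy as np

# Assumed sample with two missing values (NaN) and one suspicious reading (95).
raw = np.array([12.0, 15.0, np.nan, 14.0, 13.0, 95.0, 16.0, np.nan, 11.0])

# Imputation: replace missing entries with the median of the observed values.
median = np.nanmedian(raw)
filled = np.where(np.isnan(raw), median, raw)

# Outlier detection with the IQR rule: flag values beyond 1.5 * IQR from the quartiles.
q1, q3 = np.percentile(filled, [25, 75])
iqr = q3 - q1
is_outlier = (filled < q1 - 1.5 * iqr) | (filled > q3 + 1.5 * iqr)

# Min-Max scaling of the remaining values to the range [0, 1].
kept = filled[~is_outlier]
scaled = (kept - kept.min()) / (kept.max() - kept.min())

print("median used for imputation:", median)
print("flagged outliers:", filled[is_outlier])
print("scaled values:", np.round(scaled, 2))
```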

2. Descriptive Statistics

 Description: Descriptive statistics provide a summary of the dataset through numerical measures.
 Techniques:
o Measures of Central Tendency: Mean, median, and mode to understand the typical value in
the dataset.
o Measures of Dispersion: Range, variance, standard deviation, and interquartile range (IQR) to
understand the spread of the data.
o Distribution Analysis: Skewness and kurtosis to understand the shape of the data distribution.
o Frequency Tables and Cross Tabulations: To analyze categorical data and its distribution across
different categories.
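
These measures can be computed directly. The sketch below uses NumPy and the standard statistics module on
an invented sample; skewness and excess kurtosis are computed from standardized values using the population
formulas, which is one of several conventions in use.

```python
import statistics
import numpy as np

data = np.array([2, 3, 3, 4, 5, 5, 5, 6, 7, 20], dtype=float)  # assumed sample

# Central tendency
mean = data.mean()
median = np.median(data)
mode = statistics.mode(data.tolist())

# Dispersion
value_range = data.max() - data.min()
std = data.std(ddof=1)                      # sample standard deviation
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Shape of the distribution (population formulas)
z = (data - mean) / data.std()
skewness = (z ** 3).mean()
excess_kurtosis = (z ** 4).mean() - 3

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, std={std:.2f}, IQR={iqr}")
print(f"skewness={skewness:.2f}, excess kurtosis={excess_kurtosis:.2f}")
```

The single large value pulls the mean above the median and gives a positive skewness, which is the usual
signature of a right-skewed distribution.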

3. Data Visualization

 Description: Visualization techniques are used to visually explore data and identify patterns, trends,
and outliers.
 Techniques:
o Histograms: To understand the distribution of a continuous variable.
o Bar Charts: To visualize categorical data and compare different categories.
o Box Plots: To visualize the spread and skewness of data along with potential outliers.
o Scatter Plots: To explore the relationship between two continuous variables.
o Pair Plots: To visualize pairwise relationships in a dataset, commonly used for multivariate data.
o Heatmaps: To visualize the correlation matrix, showing relationships between variables.
o Violin Plots: Combines aspects of box plots and kernel density plots to show data distribution
and density.

4. Correlation Analysis

 Description: Correlation analysis is used to identify and measure the strength of relationships between
variables.
 Techniques:
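o Correlation Coefficient (Pearson, Spearman, Kendall): Measures the strength and direction of
the relationship between two variables. Pearson captures linear relationships, while Spearman
and Kendall are rank-based and capture monotonic relationships.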
o Correlation Matrix: A table showing correlation coefficients between multiple variables.
 Visualization Tools:
o Heatmaps or color-coded matrices to visualize correlations.
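
The choice of coefficient matters when a relationship is monotonic but not linear. The sketch below compares
Pearson's coefficient with a rank-based (Spearman) coefficient on y = x^3. The helper functions are
hypothetical and the ranking helper assumes there are no ties.

```python
import numpy as np

x = np.arange(1, 11, dtype=float)
y = x ** 3  # strictly increasing, but far from a straight line

def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

def ranks(a):
    """Rank values from 1 to n (assumes no ties, which holds for this data)."""
    order = np.argsort(a)
    result = np.empty(len(a))
    result[order] = np.arange(1, len(a) + 1)
    return result

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

pearson_r = pearson(x, y)
spearman_r = spearman(x, y)

print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")

# Correlation matrix of several variables at once (rows are variables).
matrix = np.corrcoef(np.vstack([x, y, -x]))
```

The rank-based coefficient reaches 1 because the ordering of x and y agrees perfectly, while Pearson's
coefficient stays below 1 because the points do not lie on a line.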

5. Dimensionality Reduction

 Description: Reducing the number of variables under consideration to simplify models and reduce
overfitting.
 Techniques:
o Principal Component Analysis (PCA): Identifies the principal components that explain the most
variance in the data.
o t-Distributed Stochastic Neighbor Embedding (t-SNE): A non-linear technique for reducing
dimensions and visualizing high-dimensional data.
o Linear Discriminant Analysis (LDA): Used when there is labeled data to find a linear
combination of features that characterizes or separates two or more classes.
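
PCA can be sketched with NumPy's singular value decomposition. The synthetic three-feature dataset below is
driven by a single hidden factor plus noise and is an assumption made for the example; libraries such as
scikit-learn provide a packaged PCA.

```python
import numpy as np

rng = np.random.default_rng(42)

# 200 samples, 3 features that all depend on one hidden factor plus small noise.
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + rng.normal(scale=0.1, size=(200, 3))

# Step 1: center each feature.
X_centered = X - X.mean(axis=0)

# Step 2: SVD; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Step 3: share of total variance explained by each component.
explained_variance_ratio = S ** 2 / np.sum(S ** 2)

# Step 4: project onto the first two components for plotting.
X_2d = X_centered @ Vt[:2].T

print("explained variance ratio:", np.round(explained_variance_ratio, 4))
print("reduced shape:", X_2d.shape)
```

Because the three features are nearly redundant, the first component alone explains almost all of the
variance.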

6. Feature Engineering and Selection

 Description: Creating new features from existing ones and selecting the most relevant features for
model building.
 Techniques:
o Feature Creation: Combining existing features, creating ratios, or applying mathematical
transformations.
o Feature Selection: Identifying important features using methods like Recursive Feature
Elimination (RFE), SelectKBest, or Lasso regression.

7. Univariate Analysis

 Description: Examining each variable in the dataset individually to understand its distribution, central
tendency, and variability.
 Techniques:
o Histograms: To visualize the distribution of a single variable.
o Box Plots: To visualize the distribution, median, and outliers.
o Density Plots: To estimate the probability density function of a continuous variable.

8. Multivariate Analysis

 Description: Examining relationships between two or more variables simultaneously.


 Techniques:
o Scatter Plot Matrix (Pair Plots): To visualize relationships between pairs of variables.
o Heatmaps: To visualize correlation matrices.
o 3D Scatter Plots: To visualize relationships between three variables.
o Factor Analysis: To identify underlying variables (factors) that explain the pattern of
correlations within a set of observed variables.

9. Time Series Analysis

 Description: Techniques specifically for data collected over time.


 Techniques:
o Line Plots: To visualize trends over time.
o Seasonal Decomposition: To decompose the time series into trend, seasonal, and residual
components.
o Autocorrelation Plots: To identify any autocorrelation in time series data.
o Lag Plots: To detect if there is a relationship between observations separated by a certain lag.
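
The quantity shown in an autocorrelation plot can be computed by hand. The sketch below uses an invented
series with a 12-step seasonal cycle; the function name is hypothetical, and libraries such as statsmodels
offer ready-made versions.

```python
import numpy as np

def autocorrelation(series, lag):
    """Sample autocorrelation of a 1-D series at a non-negative integer lag."""
    series = np.asarray(series, dtype=float)
    centered = series - series.mean()
    if lag == 0:
        return 1.0
    return float(np.sum(centered[lag:] * centered[:-lag]) / np.sum(centered ** 2))

# Assumed data: ten full cycles of a pattern that repeats every 12 steps.
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)

acf_half_cycle = autocorrelation(series, 6)    # half a cycle apart: opposite phase
acf_full_cycle = autocorrelation(series, 12)   # one full cycle apart: same phase

print(f"lag 6:  {acf_half_cycle:.2f}")
print(f"lag 12: {acf_full_cycle:.2f}")
```

A strong positive value at lag 12 and a strong negative value at lag 6 are what a seasonal pattern with
period 12 looks like in an autocorrelation plot.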

10. Missing Value Analysis

 Description: Identifying and handling missing data is critical in EDA.


 Techniques:
o Missing Value Heatmaps: To visualize the distribution of missing values across features.
o Imputation: Filling missing values with statistical measures or using algorithms.

11. Anomaly Detection

 Description: Identifying outliers or anomalies that can affect data analysis.


 Techniques:
o Z-Score and IQR Method: To detect outliers in numerical data.
o Isolation Forests and DBSCAN: Advanced methods for identifying anomalies in datasets.

12. Interactive Visualization and Tools

 Description: Tools that allow for interactive exploration of data.


 Tools:
o Jupyter Notebooks: Interactive Python environments with visualization capabilities.
o Plotly and Bokeh: Libraries for creating interactive plots.
o Tableau and Power BI: Tools for interactive data visualization and dashboard creation.

13. Hypothesis Testing

 Description: Statistical methods to test assumptions or claims about the data.


 Techniques:
o T-tests: To compare the means of two groups.
o Chi-square tests: To examine the association between categorical variables.
o ANOVA (Analysis of Variance): To compare means across multiple groups.


o Non-parametric tests: For data that doesn’t fit normal distribution assumptions.
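
As an illustration of the first technique, the sketch below compares the means of two small groups with
Welch's t-test, assuming SciPy is installed. The measurements are invented, and the manual calculation is
included only to show what the statistic is.

```python
import numpy as np
from scipy import stats

# Assumed measurements for two independent groups.
group_a = np.array([5.1, 4.9, 5.0, 5.2, 4.8, 5.1])
group_b = np.array([6.0, 6.2, 5.9, 6.1, 6.3, 6.0])

# Welch's t-test: does not assume equal variances in the two groups.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

# The same statistic by hand: difference of means divided by its standard error.
standard_error = np.sqrt(group_a.var(ddof=1) / len(group_a)
                         + group_b.var(ddof=1) / len(group_b))
manual_t = (group_a.mean() - group_b.mean()) / standard_error

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.2g}")
print(f"manual t = {manual_t:.2f}")
```

A small p-value suggests that the difference in means is unlikely to be due to chance alone, under the
assumptions of the test.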
