Interactive Data Visualization
05
Visualization Techniques for Multivariate Data
IDV 2019/2020
Notice
! Author
" João Moura Pires (jmp@fct.unl.pt)
! This material can be freely used for personal or academic purposes without
any previous authorization from the author, provided that this notice is kept
with it.
! For commercial purposes, the use of any part of this material requires
previous authorization from the author.
Visualization Techniques for Multivariate Data - 2
Table of Contents
! Introduction
! Point-Based Techniques
! Line-Based Techniques
! Region-Based Techniques
! Combinations of Techniques
Interactive Data Visualization
Introduction
IDV 2019/2020
Dataset Types
[Figure: the four basic dataset types (Tables, Networks, Fields (Continuous), Geometry), together with
Multidimensional Table and Trees. Source: Tamara Munzner, Figure 2.4, "The detailed structure of the
four basic dataset types."]
Figure 2.5 (Munzner). In a simple table of orders, a row represents an item, a column represents an
attribute, and their intersection is the cell containing the value for that pairwise combination.
Multivariate Data
! Data that does not generally have an explicit spatial attribute
! Point-Based Techniques
! Project records from an n-dimensional data space to an arbitrary k-dimensional
display space, such that data records map to k-dimensional points. (e.g. Scatterplots)
! Line-Based Techniques
" Points corresponding to a particular record or dimension are linked together with
straight or curved lines. (e.g. Line Graphs, Parallel Coordinates)
! Region-Based Techniques
" Filled polygons are used to convey values, based on their size, shape, color, or other
attributes. (e.g. Bar Charts/Histograms)
Interactive Data Visualization
Point-Based Techniques
Multivariate Data: Point-Based Techniques
! Scatterplots and Scatterplot Matrices
! Their success stems from our innate abilities to judge relative position within a
bounded space
! As the dimensionality of the data increases, the choices for visual analysis consist of:
! dimension subsetting (user selection or algorithm-based suggestion);
! dimension embedding (mapping dimensions to other graphical attributes besides position,
such as color, size, and shape);
! multiple displays (either superimposed or juxtaposed, e.g. a scatterplot matrix);
! dimension reduction (transforming the high-dimensional data to data of lower dimension).
Multivariate Data: Point-Based Techniques
! Scatterplots
x-coordinate: number of atoms;
y-coordinate: heat information;
y = mx + b; m = -12.5 and b = 50
Color of each point: Gibbs energy
Scatter Matrix (in Python)
...
from pandas.plotting import scatter_matrix
# data is the data frame with all variables
# snc is the subset of numerical variables of interest
# Let's check how these variables relate to each other
scatter_matrix(data[snc], figsize=(12, 12))
Scatter Matrix (in Python)
...
# data is the data frame with all variables
# snc is the subset of numerical variables of interest
# Let's check how these variables relate to each other
# diagonal='kde' draws a kernel density estimate on the diagonal instead of a histogram
scatter_matrix(data[snc], figsize=(12, 12), diagonal='kde')
Scatter Matrix (in Tableau)
Multivariate Data: Point-Based Techniques
! In situations where the dimensionality of the data exceeds the capabilities of
the visualization technique, it is necessary to investigate ways to reduce the
data dimensionality while preserving, as much as possible, the information
contained within.
! Principal Component Analysis (PCA)
! Multidimensional Scaling (MDS)
! Non-linear dimension reduction techniques:
" Self-organizing Maps (SOMs)
" Locally Linear Embedding (LLE)
Principal Component Analysis (PCA)
https://en.wikipedia.org/wiki/Principal_component_analysis
http://www.nlpca.org/pca_principal_component_analysis.html
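As a rough illustration of the idea behind PCA, the sketch below centers the data, builds the covariance matrix, and estimates only the first principal direction. It is not taken from the linked pages: the function names, the use of power iteration, and the toy data are assumptions chosen to keep the example dependency-free.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the mean and the first principal direction of `rows`.

    Assumes at least two rows and non-zero variance. Power iteration is
    used here only to avoid external libraries.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Assumed start vector; it must not be orthogonal to the answer.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project_to_1d(rows, mean, direction):
    """Coordinate of each centered row along the given unit direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


# Toy data lying exactly on the line y = 2x.
rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, direction = first_principal_component(rows)
scores = project_to_1d(rows, mean, direction)
```

For the toy data the estimated direction is proportional to (1, 2), so one coordinate per record is enough to preserve the ordering of the points.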
Multidimensional scaling (MDS)
! Projecting M points in N dimensions into an L-dimensional display space (L = 2 or 3).
! The key goal is to maintain the N-dimensional features and characteristics of the
data through the projection process, e.g., relationships that exist in the original data must also
exist after projection.
" The projection may also unintentionally introduce artifacts that appear in the
visualization but are not present in the data.
" Create an M x M similarity matrix (D) from the original data (could be a distance matrix)
" Create an M x L coordinates matrix and fill it randomly or by another method (e.g. PCA)
" Repeat
" Compute an M x M matrix (L) of similarities between the projected points, and compute S, the
difference between D and L (the stress)
" Shift the positions of the projected points in a direction that will reduce their individual stress levels
" Until S is small or has not changed significantly
Multidimensional scaling (MDS)
[Diagram: matrices D (M x M), Coordinates (M x L), L (M x M), and S (M x 1), with a "Repeat / Change" loop.]
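The MDS loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the slide author's implementation: Euclidean distance is used as the dissimilarity, stress is the sum of squared differences between the two distance matrices, and points move by plain gradient descent with a fixed step. All function and parameter names are hypothetical.

```python
import math
import random


def pairwise_distances(points):
    """M x M matrix of Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]


def stress(d_high, d_low):
    """Sum of squared differences over all pairs i < j (assumed measure)."""
    m = len(d_high)
    return sum((d_high[i][j] - d_low[i][j]) ** 2
               for i in range(m) for j in range(i + 1, m))


def mds(points, k=2, step=0.05, max_iter=500, tol=1e-9, seed=0):
    rng = random.Random(seed)
    m = len(points)
    d_high = pairwise_distances(points)            # matrix D
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(k)]
              for _ in range(m)]                   # random initial layout
    history = []                                   # stress before each move
    for _ in range(max_iter):
        d_low = pairwise_distances(coords)         # projected distances
        history.append(stress(d_high, d_low))
        converged = (len(history) > 1
                     and abs(history[-2] - history[-1]) < tol)
        if history[-1] < tol or converged:
            break
        moved = []
        for i in range(m):
            grad = [0.0] * k
            for j in range(m):
                if i != j and d_low[i][j] > 0.0:
                    factor = (d_low[i][j] - d_high[i][j]) / d_low[i][j]
                    for a in range(k):
                        grad[a] += factor * (coords[i][a] - coords[j][a])
            # Move point i against its own stress gradient.
            moved.append([coords[i][a] - step * grad[a] for a in range(k)])
        coords = moved
    return coords, history


# Corners of a unit square embedded in 3D, projected to 2D.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
layout, stress_history = mds(square)
```

Changing `seed` changes the starting layout, which is one way to observe that results are not unique and that the process may settle in a local minimum.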
Multidimensional scaling (MDS)
! There are many possible variants on this algorithm, including:
" Different similarity and stress measures;
" Different initial and termination conditions;
" Different position update strategies.
" As in any optimization process, there is the potential to fall into a local minimum configuration that
still has a high level of stress.
" Common strategies to alleviate this include occasionally adding a random jump in the
position of a point to see if it will converge to a different location.
" The results are not unique: minor changes in the starting conditions can lead to
dramatically different results.
Multivariate Data: Point-Based Techniques
! Iris flower data set
[Figures: the three species in the data set: Iris setosa, Iris versicolor, Iris virginica]
Multivariate Data: Point-Based Techniques
! Iris data set projected using MDS
Multivariate Data: Point-Based Techniques
! RadViz is a force-driven point layout technique based on Hooke’s Law for equilibrium.
! For an N-dimensional data set, N anchor points are placed on the circumference of the circle to
represent the fixed ends of the N springs attached to each data point.
! Different placements and orderings of the anchors give different results, and points
that are quite distinct in N dimensions may map to the same location in 2D.
DIMENSIONAL ANCHORS: A GRAPHIC PRIMITIVE FOR
MULTIDIMENSIONAL MULTIVARIATE INFORMATION
VISUALIZATIONS, Patrick Hoffman, Georges G. Grinstein
Visualizing Multivariate Data with Radviz
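The spring equilibrium behind RadViz can be sketched in a few lines. The sketch assumes that each record's values are already normalized to non-negative numbers, that anchors are evenly spaced on the unit circle in dimension order, and that each value acts as the stiffness of the spring to its anchor. The function name is hypothetical.

```python
import math


def radviz_position(values):
    """2D equilibrium point for one record.

    With spring constants k_i and anchors a_i, the forces balance where
    sum(k_i * (a_i - p)) = 0, i.e. p = sum(k_i * a_i) / sum(k_i).
    """
    n = len(values)
    total = sum(values)
    if total == 0:
        return (0.0, 0.0)  # assumed convention: no pull, stay at the center
    x = y = 0.0
    for i, v in enumerate(values):
        angle = 2.0 * math.pi * i / n  # anchor i on the unit circle
        x += v * math.cos(angle)
        y += v * math.sin(angle)
    return (x / total, y / total)


a = radviz_position([1.0, 0.0, 0.0, 0.0])  # pulled fully to the first anchor
b = radviz_position([1.0, 1.0, 1.0, 1.0])  # balanced pull: the center
c = radviz_position([0.2, 0.2, 0.2, 0.2])  # a different record, same place
```

Records `b` and `c` are distinct in four dimensions but land on the same 2D location, which is the ambiguity noted in the bullets above.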
Multivariate Data: Point-Based Techniques
! RadViz: different views of the same data set in RadViz, using manual reordering of dimensions.
Interactive Data Visualization
Line-Based Techniques
IDV 2019/2020
Multivariate Data: Line-Based Techniques
Line Graphs
Multivariate Data: Line-Based Techniques
! Parallel Coordinates
Parallel Coordinates (||-coords or PCP)
! Introduced by Inselberg in 1985
State of the Art of Parallel Coordinates
J. Heinrich and D. Weiskopf
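As a small illustration of the basic construction, the sketch below turns each record into a polyline with one vertex per parallel axis. It assumes min-max scaling per dimension and evenly spaced vertical axes at x = 0, 1, 2, and so on; the function name is hypothetical and no drawing library is involved.

```python
def parallel_coordinates_polylines(records):
    """Return one polyline per record as a list of (axis, height) vertices.

    Each dimension is scaled independently to [0, 1]; a constant
    dimension is placed at mid-height (an assumed convention).
    """
    dims = len(records[0])
    lows = [min(r[j] for r in records) for j in range(dims)]
    highs = [max(r[j] for r in records) for j in range(dims)]

    def scale(value, lo, hi):
        return 0.5 if hi == lo else (value - lo) / (hi - lo)

    return [[(j, scale(r[j], lows[j], highs[j])) for j in range(dims)]
            for r in records]


records = [[1, 10, 100], [2, 20, 300], [3, 30, 200]]
polylines = parallel_coordinates_polylines(records)
```

Drawing each list of vertices as a connected line over the vertical axes gives the parallel coordinates plot.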
http://bl.ocks.org/syntagmatic/raw/3150059/
Parallel Coordinates (||-coords or PCP)
! Check https://eagereyes.org/techniques/parallel-coordinates
! Check https://syntagmatic.github.io/parallel-coordinates/
! See the video: https://youtu.be/ypc7Ul9LkxA
! http://www.xdat.org/
! Check http://www.parallelcoordinates.de/paco/#
Parallel Coordinates (||-coords or PCP)
! Very special videos !
! Tutorial by Alfred Inselberg at iV 2016 (at Lisbon) (FB and Twitter)
! Part1
! Part2
! Part3
Multivariate Data: Line-Based Techniques
! Radial Axis Techniques
! circular line graph;
! polar graphs: point plots using polar coordinates;
! circular bar charts: like circular line graphs, but plotting bars on the base line;
! circular area graphs: like a line graph, but with the area under the line filled in with a color
or texture;
! circular bar graphs: with bars that are circular arcs with a common center point and
base line.
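The radial variants listed above share one mapping: position along the axis becomes an angle, and the value becomes a distance from a circular base line. A minimal sketch for the circular line graph follows, assuming evenly spaced angles and a base circle of radius `base_radius`. Both parameter names and the function name are hypothetical.

```python
import math


def circular_line_graph(values, base_radius=1.0, scale=1.0):
    """Map a sequence of values to (x, y) points around a circle.

    The i-th value is placed at angle 2*pi*i/n and at radius
    base_radius + scale * value.
    """
    n = len(values)
    points = []
    for i, v in enumerate(values):
        theta = 2.0 * math.pi * i / n
        r = base_radius + scale * v
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


outline = circular_line_graph([0.0, 1.0, 0.0, 1.0])
```

Connecting consecutive points (and closing the loop) gives the circular line graph; drawing a bar from the base circle to each point gives the circular bar chart variant.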
Multivariate Data: Line-Based Techniques
polar graphs - point plots using polar coordinates
https://brilliant.org/wiki/polar-curves/
Multivariate Data: Line-Based Techniques
circular bar charts: like circular line graphs, but plotting bars on the base line
https://datavizcatalogue.com/methods/radial_bar_chart.html
Multivariate Data: Line-Based Techniques
circular bar graphs: with bars that are circular arcs with a common center point and base line.
https://www.r-graph-gallery.com/circular-barplot/
Interactive Data Visualization
Region-Based Techniques
Multivariate Data: Region-Based Techniques
! Bar Charts/Histograms
Multivariate Data: Region-Based Techniques
! Bar Charts
Multivariate Data: Region-Based Techniques
! Tabular Displays
" Heatmaps are created by displaying the table of record values using color rather than
text. All data values are mapped to the same normalized color space, and each is
rendered as a colored square or rectangle.
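The value-to-color mapping of a heatmap can be sketched as below. The sketch assumes a single min-max normalization over the whole table and a linear ramp between two RGB colors; normalizing each column separately is a common alternative when attributes have different units. The function name and default colors are hypothetical.

```python
def heatmap_colors(table, low_color=(255, 255, 255), high_color=(0, 0, 128)):
    """Return a table of RGB tuples, one per cell of `table`."""
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo

    def color(value):
        # Normalized position of the value in [0, 1]; 0 if all cells are equal.
        t = 0.0 if span == 0 else (value - lo) / span
        return tuple(round(a + t * (b - a))
                     for a, b in zip(low_color, high_color))

    return [[color(v) for v in row] for row in table]


cells = heatmap_colors([[0, 5], [10, 5]],
                       low_color=(0, 0, 0), high_color=(200, 100, 0))
```

Each returned tuple is the fill color of the square or rectangle drawn for the corresponding table cell.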
Multivariate Data: Region-Based Techniques
! The table lens combines all these ideas and includes a level-of-detail mechanism for providing
panning and zooming capabilities to display whole table views, while still providing some detail
through local table lenses.
Multivariate Data: Region-Based Techniques
! Dimensional Stacking
" Begin with data of dimension 2N + 1 (for an even number of dimensions there would be
an additional implicit dimension of cardinality one).
" Select a finite cardinality/discretization for each dimension.
" Choose one of the dimensions to be the dependent variable. The rest will be
considered independent
" Create ordered pairs of the independent dimensions (N pairs) and assign to each pair a
unique value (speed) from 1 to N.
" The pair corresponding to speed 1 will create a virtual image whose size coincides with
the cardinality of the dimensions (the first dimension in the pair is oriented horizontally,
the second vertically).
... in the discrete high-dimensional space has a unique location in the two-dimensional image
resulting from the mapping. The concept of the speed of a dimension can best be likened to the digits
on an odometer, where digits cycle through their values at different rates.
The value of the dependent variable at the location in the high-dimensional space is then mapped to a
color/intensity value at that location in the two-dimensional image. This embedding process is
illustrated in Figure 8.18 with a six-dimensional data set, where dimensions d1, ..., d6 have
cardinalities 4, 5, 2, 3, 3, and 6, respectively. For clarity, we have not displayed the values
associated with a dependent variable, which would be the seventh dimension and would dictate the
colors in the smallest grid locations. Figure 8.19 is an example of a dimensional stacking visualization.
Figure 8.18. Conceptualization of dimensional stacking; collapsing six dimensions into two
dimensions. d1, ..., d6 have cardinalities 4, 5, 2, 3, 3, and 6, respectively.
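The odometer-like embedding of dimensional stacking can be sketched as an index computation. The nesting order is an assumption here: the pair with speed 1 is taken as the outermost grid, and each later pair subdivides every cell of the previous one, so the last pair addresses the smallest grid locations. The function name and argument layout are hypothetical.

```python
from itertools import product


def stacked_position(indices, cardinalities):
    """Map discretized independent dimensions to an (x, y) image cell.

    Both arguments are ordered [h1, v1, h2, v2, ...]: for each pair, the
    first dimension is horizontal and the second vertical. Pair 1 is
    assumed to be the outermost (slowest changing) grid.
    """
    if len(indices) != len(cardinalities) or len(indices) % 2 != 0:
        raise ValueError("expected matching, even-length sequences")
    x = y = 0
    for p in range(0, len(indices), 2):
        # Step into the sub-grid of the current cell, like odometer digits.
        x = x * cardinalities[p] + indices[p]
        y = y * cardinalities[p + 1] + indices[p + 1]
    return x, y


cards = [4, 5, 2, 3, 3, 6]  # cardinalities of d1..d6 from Figure 8.18
far_corner = stacked_position([3, 4, 1, 2, 2, 5], cards)
all_cells = {stacked_position(idx, cards)
             for idx in product(*(range(c) for c in cards))}
```

With these cardinalities the image is 4 x 2 x 3 = 24 cells wide and 5 x 3 x 6 = 90 cells high, and every one of the 2160 index combinations gets its own cell, whose color would encode the dependent variable.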
Figure 8.19. An example of 4D data visualized using dimensional stacking. The data consists
of drill-hole data, with three spatial dimensions, and the ore grade as the fourth
dimension.
Dimensional stacking is basically a 2D extension of a technique developed
by Mihalisin et al. [291], which involves graphing scalar fields in multiple
Interactive Data Visualization
Combinations of Techniques
Multivariate Data: Combinations of Techniques
! Glyphs and Icons
! Dense Pixel Displays
! Many others
Interactive Data Visualization
Further Reading and Summary
Further Reading
! Recommended readings:
" Interactive Data Visualization: Foundations, Techniques, and Applications, Matthew O. Ward
et al., 2015, pages 285-314.
! Supplemental readings:
" Visualization Analysis & Design, Tamara Munzner, Chapter 7
What you should know
! Point-based techniques
" Classical point-based techniques (scatterplot-based) support a limited dimensionality
" Dimension reduction or selection for data visualization
! Line-based techniques
" Classical line-based techniques
" Radial axis techniques
" Parallel coordinates and related techniques
! Region-based techniques
" Reordering the data in graphical tables
! Combinations of techniques
" Dense pixel displays
" Glyphs