From Data To Image
Pat Hanrahan
Topics
The properties of the data or information
The properties of the image
The rules mapping data to images
Bertin 101
The Data
Taxonomy by Data Type
1D (sets and sequences)
Temporal
2D (maps)
3D (shapes)
nD (relational)
Trees (hierarchies)
Networks (graphs)
Text and documents [mine]
B. Shneiderman, The eyes have it: A task by data type taxonomy for information visualization, 1996
Data Models vs. Conceptual Models
Data models are mathematical abstractions
Sets with operations on them
For example, integers with + and × operators
Conceptual models are mental constructions
Include semantics and support reasoning
For example, navigating through a city using landmarks
Examples (data vs. conceptual): 1D vs. Time; nD vs. Space
Types of Data Models
From discrete to continuous:
Relations
Topology
Fields*
Manifolds
* Treinish, A function-based data model for visualization
Relational Data Model
Records are fixed-length tuples
Each column of a tuple has a domain (type)
Relation is a schema plus a table of tuples
Database is a collection of relations
Example: Digital cameras
Relational Algebra [Codd]
Data transformations (SQL)
Selection (WHERE)
Projection (SELECT column list)
Sorting (ORDER BY)
Aggregation (GROUP BY, SUM, MIN, ...)
Set operations (UNION, ...)
Join (INNER JOIN)
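The relational operations can be tried out with Python's built-in sqlite3 module. This is only a sketch: the cameras and makers tables, their columns, and every value in them are made up for illustration and do not come from the digital camera example on the slide. In relational algebra, selection keeps a subset of rows (the WHERE clause) and projection keeps a subset of columns (the column list after SELECT).

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cameras (model TEXT, maker TEXT, megapixels REAL, price REAL)"
)
conn.execute("CREATE TABLE makers (maker TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO cameras VALUES (?, ?, ?, ?)",
    [
        ("A100", "Acme", 6.0, 300.0),
        ("A200", "Acme", 8.0, 450.0),
        ("Z10", "Zenith", 5.0, 250.0),
    ],
)
conn.executemany(
    "INSERT INTO makers VALUES (?, ?)", [("Acme", "US"), ("Zenith", "JP")]
)

# Selection: keep only the rows that satisfy a predicate.
selected = conn.execute(
    "SELECT * FROM cameras WHERE megapixels >= 6 ORDER BY model"
).fetchall()

# Projection plus sorting: keep only some columns, ordered by price.
projected = conn.execute(
    "SELECT model, price FROM cameras ORDER BY price"
).fetchall()

# Aggregation: one summary row per group.
aggregated = conn.execute(
    "SELECT maker, COUNT(*), MIN(price) FROM cameras "
    "GROUP BY maker ORDER BY maker"
).fetchall()

# Join: combine tuples from two relations that agree on a column.
joined = conn.execute(
    "SELECT c.model, m.country FROM cameras AS c "
    "INNER JOIN makers AS m ON c.maker = m.maker ORDER BY c.model"
).fetchall()
```

Each query returns a list of tuples, i.e. another relation, which is what lets the operations compose.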
Statistical Data Model
Variables or Measurements
Categories or Factors
Observations or Cases
Blood Pressure Study (4 treatments, 6 months)

Treatment  March  April  May  June  July  August
Control      165    162  164   162   166     163
Placebo      163    159  158   161   158     158
300 mg       166    161  161   158   160     157
450 mg       168    163  153   160   148     150
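One way to read the blood pressure table in terms of the statistical data model is to unfold it into one observation (case) per cell, with two factors and one measured variable. The sketch below does this in plain Python. The numbers are the ones in the table, but the field names treatment, month, and blood_pressure are labels chosen here for illustration.

```python
# Values copied from the blood pressure table; field names are assumptions.
months = ["March", "April", "May", "June", "July", "August"]
table = {
    "Control": [165, 162, 164, 162, 166, 163],
    "Placebo": [163, 159, 158, 161, 158, 158],
    "300 mg": [166, 161, 161, 158, 160, 157],
    "450 mg": [168, 163, 153, 160, 148, 150],
}

# One case per (treatment, month) cell: two factors and one measurement.
cases = [
    {"treatment": treatment, "month": month, "blood_pressure": value}
    for treatment, row in table.items()
    for month, value in zip(months, row)
]
```

The 4 x 6 table becomes 24 cases, each a fixed-length record, so the same data also fits the relational model.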
Sepal and petal lengths and widths for three species of iris (Fisher).
Format of the data in Appendix 14, pp. 365-366 Chambers, Cleveland, Kleiner, Tukey, Graphical Methods for Data Analysis
Types
Physical types
Characterized by storage
Characterized by machine operations
Example: bool, short, int32, float, double, string, ...
Abstract types
Characterized by methods/attributes
Organized into a class hierarchy
Example: nominal, ordinal, cardinal, ...; plants, animals, metazoans, ...
Measurements
N - Nominal (labels or types)
O - Ordered
Q - Interval (Location of 0 arbitrary)
Q - Ratio (0 fixed)

Examples:
Fruits: Apples, oranges, ...
Days: Mon, Tue, Wed, Thu, Fri, Sat, Sun
Quality of meat: Grade A, AA, AAA
Periods of time: second, minute, ...
Counts
Physical measurement: Kelvin, L, M, R, ...
S. S. Stevens, On the theory of scales of measurement, 1946
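The four scales are cumulative: each one permits the comparisons of the scales below it plus one more. The sketch below encodes that reading. The Scale enum, the OPERATIONS table, and the operator strings are illustrative names chosen here, not notation from the slides.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Measurement scales, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Comparison that each scale adds on top of the weaker scales.
OPERATIONS = {
    Scale.NOMINAL: {"=", "!="},   # same label or not
    Scale.ORDINAL: {"<", ">"},    # rank order
    Scale.INTERVAL: {"-"},        # differences are meaningful
    Scale.RATIO: {"/"},           # ratios are meaningful (zero is fixed)
}


def permitted_operations(scale):
    """Return every comparison that is meaningful on the given scale."""
    ops = set()
    for level in Scale:
        if level <= scale:
            ops |= OPERATIONS[level]
    return ops
```

For example, permitted_operations(Scale.INTERVAL) includes "-" but not "/", which is why two dates can be subtracted but not divided, while a temperature in Kelvin (zero fixed) supports both.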
Q O N
Dimensions and Measures
Infer causality
Independent vs. dependent variables
Example: y = f(x, a)
Response ~ factors
Functional dependency in databases [Ullman]
Summarize
Extrinsic vs. intrinsic variables
Example: mass vs. density (mass/vol)
Group by dimensions and aggregate measures
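Grouping by a dimension and aggregating a measure can be sketched in a few lines of Python. The example reuses the blood pressure study from the Statistical Data Model slide, treating treatment and month as dimensions and blood pressure as the measure. The aggregate and mean helpers and the field names are hypothetical names chosen here.

```python
from collections import defaultdict

# Values from the blood pressure study; field names are assumptions.
months = ["March", "April", "May", "June", "July", "August"]
table = {
    "Control": [165, 162, 164, 162, 166, 163],
    "Placebo": [163, 159, 158, 161, 158, 158],
    "300 mg": [166, 161, 161, 158, 160, 157],
    "450 mg": [168, 163, 153, 160, 148, 150],
}
cases = [
    {"treatment": t, "month": m, "blood_pressure": v}
    for t, row in table.items()
    for m, v in zip(months, row)
]


def mean(values):
    return sum(values) / len(values)


def aggregate(cases, dimension, measure, func):
    """Group cases by one dimension and reduce the measure within each group."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[dimension]].append(case[measure])
    return {key: func(values) for key, values in groups.items()}


by_treatment = aggregate(cases, "treatment", "blood_pressure", mean)
by_month = aggregate(cases, "month", "blood_pressure", mean)
```

Swapping the dimension changes which axis of the table is collapsed; swapping func (for example min or max) changes the summary.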
Data Cube
[Figure: data cube of the iris data with three axes: Measure (Width, Length), Organ (Petal, Sepal), and Species (I. setosa, I. versicolor, I. virginica)]
Summary of Basic Properties
Multidimensional - Number of columns
Type - Type of column (N, O, Q)
Cardinality (levels) - Number of different column values
The Image
Image Information
Graphical primitives and attributes (Marks)
Attributes are parameters that control the appearance of geometric primitives
Visual channels
Separable channels of information flowing from the retina to the brain
Visual Language is a Sign System
Image is perceived as a set of signs
Sender encodes information in these signs
Receiver decodes information from these signs
8 Visual Variables
J. Bertin, Semiology of Graphics, 1967
[x,y]: Position
[z]: Size, Value, Color, Texture, Orientation, Shape
[Bertin, Graphics, 1983]
Note: Bertin does not consider 3D or time
Note: Card and Mackinlay extend the number of vars.
Information in Position
1. A, B, C are distinguishable.
2. B is between A and C.
3. BC is twice as long as AB.
[Figure: positions of the points A, B, and C]
"Resemblance, order and proportion are the three signifieds in graphics." - Bertin
Information in Color and Value
Value is perceived as ordered
Encode ordinal variables (O)
Encode continuous variables (Q) [not as well]
Hue is normally perceived as unordered
Encode nominal variables (N) using color
Bertin's Levels of Organization
Position     N  O  Q
Size         N  O  Q
Value        N  O  Q
Texture      N
Color        N
Orientation  N
Shape        N

N = Nominal, O = Ordered, Q = Quantitative
Note: Q < O < N
Note: Bertin actually breaks visual variables down into differentiating and associating
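The levels of organization can be treated as a lookup table from visual variable to the data types it can encode. The sketch below assumes one reading of the slide: all seven variables support N, and the O and Q marks belong to the first three variables (Position, Size, Value), with Value carrying Q only weakly. The LEVELS dictionary and variables_for function are illustrative names chosen here.

```python
# Assumed reading of the slide: N for all variables; O and Q for the first three.
LEVELS = {
    "Position": "NOQ",
    "Size": "NOQ",
    "Value": "NOQ",  # Q is encoded less well by value than by position or size
    "Texture": "N",
    "Color": "N",
    "Orientation": "N",
    "Shape": "N",
}


def variables_for(data_type):
    """List the visual variables able to encode data of type 'N', 'O', or 'Q'."""
    return [name for name, levels in LEVELS.items() if data_type in levels]
```

Under this reading, variables_for("Q") is a subset of variables_for("O"), which is a subset of variables_for("N"), matching the note Q < O < N.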
The Rules
Design Space = Visual Metaphors
[Bertin, Semiology, 1967]
Bertin's Specification
[Bertin, Semiology, 1967]
Polaris C. Stolte
Fields Create Tables and Graphs
Ordinal fields: interpret field as a sequence that partitions table into rows and columns:
Quarter = {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}
Quantitative fields: treat field as single element sequence and encode as an axis:
Profit = {(Profit)}
Union (+) Operator
Quarter + ProductType = {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}+{(Coffee),(Espresso)} = {(Qtr1),(Qtr2),(Qtr3),(Qtr4),(Coffee),(Espresso)}
Profit + Sales = {(Profit),(Sales)}
Cross (×) Operator
Quarter × ProductType = {(Qtr1, Coffee), (Qtr1, Tea), (Qtr2, Coffee), (Qtr2, Tea), (Qtr3, Coffee), (Qtr3, Tea), (Qtr4, Coffee), (Qtr4, Tea)}
ProductType × Profit = {(Coffee, Profit), (Tea, Profit)}
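The two operators can be mimicked with ordinary Python lists of tuples. This is a minimal sketch of the sequence semantics shown on the slides, not the Polaris implementation. The field, union, and cross helpers are hypothetical names, and ProductType uses the Coffee and Tea values from the cross example.

```python
def field(*values):
    """Interpret a field as a sequence of single-element tuples."""
    return [(value,) for value in values]


def union(a, b):
    """The + operator: concatenate two sequences."""
    return a + b


def cross(a, b):
    """The cross operator: pair every tuple of a with every tuple of b, in order."""
    return [x + y for x in a for y in b]


# An ordinal field partitions the table; a quantitative field is a single axis.
quarter = field("Qtr1", "Qtr2", "Qtr3", "Qtr4")
product_type = field("Coffee", "Tea")
profit = field("Profit")
sales = field("Sales")

axes = union(profit, sales)            # two axes, one per measure
panes = cross(quarter, product_type)   # eight panes, one per (quarter, product)
labeled = cross(product_type, profit)  # a Profit axis for each product type
```

Because both operators return sequences of tuples, expressions nest: cross(quarter, union(profit, sales)) yields a Profit and a Sales axis within each quarter.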
Combinatorics of Encodings
Challenge: Pick the best encoding from the exponential number of possibilities, (n+1)^8
Principle of Consistency: The properties of the image should match the properties of the data.
Principle of Importance Ordering: Encode the most important information in the most effective way.
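One way to arrive at the (n+1)^8 count is to assume that each of the 8 visual variables is either left unused or bound to exactly one of the n data fields, giving n+1 choices per variable. The sketch below counts and, for small cases, enumerates encodings under that assumption. The variable names in VISUAL_VARIABLES and both function names are chosen here for illustration.

```python
from itertools import product

# Assumed list of the 8 visual variables (position counted as x and y).
VISUAL_VARIABLES = [
    "x", "y", "size", "value", "texture", "color", "orientation", "shape",
]


def count_encodings(n_fields, n_variables=len(VISUAL_VARIABLES)):
    """Each variable is unused or bound to one of n fields: (n+1)**variables."""
    return (n_fields + 1) ** n_variables


def enumerate_encodings(fields, variables):
    """List every assignment of a field (or None) to each visual variable."""
    choices = [None] + list(fields)
    return [
        dict(zip(variables, combo))
        for combo in product(choices, repeat=len(variables))
    ]


small = enumerate_encodings(["a", "b"], ["x", "y"])  # (2+1)**2 = 9 encodings
```

Even three fields give 4^8 = 65,536 candidate encodings, which is why the two principles are needed to prune the search.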
Mackinlay's Expressiveness Criteria
Expressiveness: A set of facts is expressible in a visual language if the sentences (i.e. the visualizations) in the language express all the facts in the set of data, and only the facts in the data.
Cannot Express the Facts
A one-to-many (1:N) relation cannot be expressed in a single horizontal dot plot because multiple tuples are mapped to the same position
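A rough way to test this condition is to check whether the relation is functional, that is, whether every key maps to exactly one value. The sketch below assumes that a single dot plot can show one value per key. The function name and the (key, value) tuple layout are illustrative, and the check is a simplification of Mackinlay's criterion rather than his formulation.

```python
def expressible_in_single_dot_plot(relation):
    """Return True if every key in the (key, value) pairs has a single value.

    Assumption: a single horizontal dot plot shows one value per key, so a
    key with several distinct values cannot be drawn without losing facts.
    """
    seen = {}
    for key, value in relation:
        if key in seen and seen[key] != value:
            return False
        seen[key] = value
    return True
```

For example, [("a", 1), ("b", 2)] passes the check, while [("a", 1), ("a", 2)] does not.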
Expresses Facts Not in the Data
A length is interpreted as a quantitative value; length therefore says something untrue about nominal (N) data
[Mackinlay, APT, 1986]
Mackinlay's Criteria 2
Effectiveness: A visualization is more effective than another visualization if the information conveyed by one visualization is more readily perceived than the information in the other visualization.
The subject of the next lecture.
Summary
Formal approach to picture specification
Declare the picture you want to see
Compile query, analysis, and rendering commands needed to make the picture
Automatically generate presentations by searching over the space of designs
Bertin's vision is still not complete
Formalize data model
Formalize the specifications
Experimentally test perceptual assumptions
Much more research to be done in this area