0% found this document useful (0 votes)

41 views65 pages

Lec3 1

Uploaded by

nathandyoung1

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

41 views65 pages

Lec3 1

Uploaded by

nathandyoung1

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 65

Database Systems I

CMPT 354 Summer 2024

Zhengjie Miao

15 May 2024
Course Setup
• OH slots:
• Tue 2pm-3pm, Zhengjie@TASC 1 9407
• Mon 11am-12pm, Xinyi@ASB9812
• Wed 1pm-2pm, Obumneme@ASB9812
• Thu 3pm-4pm, Zahra@ASB9812

• Sign up for Piazza!

• The place to ask course-related questions
• https://piazza.com/sfu.ca/summer2024/cmpt354d100
• Access code: qc76kdzsz9l
Recap: MapReduce
• Which design is better for OLAP?
• MapReduce vs. Relational DBMS?
• Many problems can be processed in MapReduce pattern:
• Given a lot of unsorted data
• Map: extract something of interest from each record
• Shuffle: group the intermediate results in some way
• Reduce: further process (e.g., aggregate, summarize, analyze, transform)
each group and write final results
(Customize map and reduce for problem at hand)

3
MR Example - Map
1.com 2.com 3.org 4.edu key: document id
ABC BBC CBC AAC value: document contents

A, 1 C, 3 B, 2 A, 4 map(String key, String value):

B, 1 B, 3 C, 2 C, 4
C, 1 for each word w in value:
(Local disk)
Emit_Intermediate(w, document id);

4
MR Example - Shuffle & Reduce
A, 1 C, 3 B, 2 A, 4
B, 1 B, 3 C, 2 C, 4
C, 1

Values exchanged by shuffle process

key: a word
A, { 1, 4 } values: a list of doc id
B, { 1, 3, 2 }
C, { 3, 1, 2, 4 }

A, [1, 4] The reduce function accepts all pairs for a

B, [1, 2, 3] given word, sorts the corresponding
C, [1, 2, 3, 4] document IDs

5
Trends
• 1 Hybrid Transactional and Analytical Processing

• 2. Cloud-Native Databases

• 3. Lakehouse = Data Lake + Data Warehouse

• 4. Specialized Systems
• E.g. Time-series Database, GPU Acceleration, Vector Database

6
Outline
• An overview of data models

• Basics of the Relational Model

• Relational Algebra
• Core operators & Derived operators
• Extension
• Relational calculus

7
Recap: Data Models
• Query: Who have accounts with 0 balance managed by a branch in
Springfield? Branch
Account Id Location
234 Brockway
Id Name Balance Branch
1000 Ned Flanders 0 234
Id Location
1001 Homer Simpson 0 653
653 Springfield
1002 Montgomery Burns 1000000000 781

Id Location
781 Shelbyville

• Programmer controls “navigation”: accounts → branches

• How about branches → accounts? (e.g., find branches managing accounts
with 0 balance)
8
Data Model
• Data Model
• mathematical formalism (or conceptual way) for describing the data

• The description generally consists of three parts:

• Structure of the data
• Operations on the data
• Constraints on the data

9
Structure of the data
• Schema (e.g., table names, attribute names)
• Describe the conceptual structure of the data

• Different from data structure (e.g., list, array)

• Data structure can be seen as a physical data model

10
Operations on the data
• Query language (e.g., RA, SQL)
• Describe what operations that can be performed on data

• Two kinds of operations

• operations that retrieve information (“query” the data)
• operations that change the database (“update” the data)

• Different from programming languages (e.g., C, Java)

• Support a set of limited operations
• Allow for query optimizations

11
Constraints on the data.
• Constraints (e.g., balance >= 0, account id is unique)
• describe limitations on what the data can be.

• Different kinds of constraints

• Domain constraints
• Integrity constraints

• Why does it matter?

• Ensure the correctness of data

12
Commonly Used Data Models
• Relational Data Model

• Key-Value Data Model

• Semi-structured Data Model (e.g., Json, XML)

13
The Relational Model in Brief
Id Name Balance Branch Id Location
1000 Ned Flanders 0 234 234 Brockway
1001 Homer Simpson 0 653 653 Springfield
1002 Montgomery Burns 1000000000 781 781 Shelbyville

• Structure of the Data

• Table structure
• Query language
• RA, RC, SQL
• Constraints on the data
• E.g., id is unique, balance >= 0, name is not NULL
14
The Key-Value Model in Brief
Key à Value
1000 à (Ned Flanders, 0, 234)
1001 à (Homer Simpson, 0, 653)
1002 à (Montgomery Burns, 1000000000, 781)

• Structure of the Data

• (Key, Value) pairs
• Key is an integer/string, value can be any object
• Query language
• get(key), put(key, value)
• Constraints on the data
• E.g., key is unique, value is not NULL
15
The Semistructured Model in Brief
• Structure of the Data
• Tree structure (e.g., XML, json)
• Query language
• XQuery, MongoDB QL
• Constraints
• E.g., each <Account> has a <Id> element
and a <name> element nested within

Resembles the hierarchical model

“what goes around comes around”

16
Edgar F. Codd (1923-2003)

• Pilot in the Royal Air Force in WW2

• Inventor of the relational model
and algebra while at IBM
• Turing Award, 1981

17
http://en.wikipedia.org/wiki/File:Edgar_F_Codd.jpg
Relational data model
• A database is a collection of relations (or tables)
• Each relation has a set of attributes (or columns)
• Each attribute has a name and a domain (or type)
• Set-valued attributes are not allowed
• Each relation contains a set of tuples (or rows)
• Each tuple has a value for each attribute of the relation
• Duplicate tuples are not allowed
• Two tuples are duplicates if they agree on all attributes

• Simplicity is a virtue!

18
Example Group
User gid name
uid name age pop abc Book Club
142 Bart 10 0.9 gov Student Government
123 Milhouse 10 0.2 dps Dead Putting Society
857 Lisa 8 0.7 … …
456 Ralph 8 0.3
… … … …
Member uid gid
142 dps
123 gov
Ordering of rows doesn’t matter
857 abc
(even though output is
always in some order) 857 gov
456 abc
456 gov
… …
19
Schema vs. instance
• Schema (metadata)
• Specifies the logical structure of data
• Is defined at setup time
• Rarely changes
• Instance
• Represents the data content
• Changes rapidly, but always conforms to the schema
• Compare to types vs. collections of objects of these types in a
programming language

20
Example
• Schema
• User (uid int, name string, age int, pop float)
• Group (gid string, name string)
• Member (uid int, gid string)
• Instance
• User: {〈142, Bart, 10, 0.9〉, 〈857, Milhouse, 10, 0.2〉, …}
• Group: {〈abc, Book Club〉, 〈gov, Student Government〉, …}
• Member: {〈142, dps〉, 〈123, gov〉, …}

21
Relational algebra
A language for querying relational data based on “operators”

RelOp
RelOp

• Core operators:
• Selection, projection, cross product, union, difference, and renaming
• Additional, derived operators:
• Join, natural join, intersection, etc.
• Compose operators to make complex queries
22
Selection
• Input: a table 𝑅
• Notation: 𝜎! 𝑅
• 𝑝 is called a selection condition (or predicate)
• Purpose: filter rows according to some criteria
• Output: same columns as 𝑅, but only rows or 𝑅 that satisfy 𝑝

23
Selection example
• Users with popularity higher than 0.5
𝜎!"!#$.& 𝑈𝑠𝑒𝑟

uid name age pop uid name age pop

142 Bart 10 0.9 142 Bart 10 0.9
123 Milhouse 10 0.2
𝜎!"!#$.&
857 Lisa 8 0.7 857 Lisa 8 0.7
456 Ralph 8 0.3
… … … … … … … …

24
More on selection
• Selection condition can include any column of 𝑅, constants,
comparison (=, ≤, etc.) and Boolean connectives (∧: and, ∨: or, ¬:
not)
• Example: users with popularity at least 0.9 and age under 10 or above 12
𝜎"#"$%.' ∧ *+,-.% ∨ *+,0.1 𝑈𝑠𝑒𝑟
• You must be able to evaluate the condition over each single row of
the input table!
• Example: the most popular user
𝜎"#" $ ,2,34 "#" 56 78,3 𝑈𝑠𝑒𝑟
RO N G!
W
25
Projection
• Input: a table 𝑅
• Notation: 𝜋' 𝑅
• 𝐿 is a list of columns in 𝑅
• Purpose: output chosen columns
• Output: same rows, but only the columns in 𝐿

26
Projection example
• IDs and names of all users
𝜋()*,,-./ 𝑈𝑠𝑒𝑟

uid name age pop uid name

142 Bart 10 0.9 142 Bart
123 Milhouse 10 0.2 123 Milhouse
𝜋'(),+,-.
857 Lisa 8 0.7 857 Lisa
456 Ralph 8 0.3 456 Ralph
… … … … … …

27
More on projection
• Duplicate output rows are removed (by definition)
• Example: user ages
𝜋*+, 𝑈𝑠𝑒𝑟
uid name age pop age
142 Bart 10 0.9 10
123 Milhouse 10 0.2 10
𝜋,/.
857 Lisa 8 0.7 8
456 Ralph 8 0.3 8
… … … … …

28
Composing Select and Project
• Names of users with popularity higher than 0.5
𝜋,-./ (𝜎!"!#$.& 𝑈𝑠𝑒𝑟)

uid name age pop uid name age pop name

142 Bart 10 0.9 142 Bart 10 0.9 Bart
123 Milhouse 10 0.2
857 Lisa 8 0.7 Lisa
857 Lisa 8 0.7
… … … … …
456 Ralph 8 0.3
… … … …
𝜎%&%'(.* 𝜋!"#$

• What about: 𝜎!"!#$.& (𝜋'()* (𝑈𝑠𝑒𝑟))

• Invalid types. Input to 𝜎!"!#$.& does not contain pop.
Cross product
• Input: two tables 𝑅 and 𝑆
• Natation: 𝑅×𝑆
• Purpose: pairs rows from two tables
• Output: for each row 𝑟 in 𝑅 and each 𝑠 in 𝑆, output a row 𝑟𝑠
(concatenation of 𝑟 and 𝑠)

30
Cross product example
𝑈𝑠𝑒𝑟×𝑀𝑒𝑚𝑏𝑒𝑟 uid gid
uid name age pop
123 gov
123 Milhouse 10 0.2
857 abc
857 Lisa 8 0.7
857 gov
… … … … × … …

uid name age pop uid gid

• How many rows in result? 123 Milhouse 10 0.2 123 gov

123 Milhouse 10 0.2 857 abc
• |R|*|S|
123 Milhouse 10 0.2 857 gov
• Schema compatability? 857 Lisa 8 0.7 123 gov
• Not needed. 857 Lisa 8 0.7 857 abc

• Duplicates? 857 Lisa 8 0.7 857 gov

… … … … … …
• None generated. 31
A note on column ordering
• Ordering of columns is unimportant as far as contents are concerned
uid name age pop uid gid uid gid uid name age pop
123 Milhouse 10 0.2 123 gov 123 gov 123 Milhouse 10 0.2
123 Milhouse 10 0.2 857 abc 857 abc 123 Milhouse 10 0.2
123 Milhouse 10 0.2 857 gov 857 gov 123 Milhouse 10 0.2
857 Lisa 8 0.7 123 gov = 123 gov 857 Lisa 8 0.7
857 Lisa 8 0.7 857 abc 857 abc 857 Lisa 8 0.7
857 Lisa 8 0.7 857 gov 857 gov 857 Lisa 8 0.7
… … … … … … … … … … … …

• So cross product is commutative, i.e., for any 𝑅 and 𝑆, 𝑅×𝑆 = 𝑆×𝑅 (up
to the ordering of columns)

32
Derived operator: join
(A.k.a. “theta-join”)
• Input: two tables 𝑅 and 𝑆
• Notation: 𝑅 ⋈! 𝑆
• 𝑝 is called a join condition (or predicate)
• Purpose: relate rows from two tables according to some
criteria
• Output: for each row 𝑟 in 𝑅 and each row 𝑠 in 𝑆, output a
row 𝑟𝑠 if 𝑟 and 𝑠 satisfy 𝑝
• Shorthand for 𝜎! 𝑅×𝑆
33
Join example
• Info about users, plus IDs of their groups
𝑈𝑠𝑒𝑟 ⋈!"#$.&'()*#+,#$.&'( 𝑀𝑒𝑚𝑏𝑒𝑟
uid gid
uid name age pop
123 gov
123 Milhouse 10 0.2
857 abc
857
…
Lisa
…
8
…
0.7
…
⋈
×
!"#$.&'()
*#+,#$.&'( 857 gov
… …

uid name age pop uid gid

Prefix a column reference 123 Milhouse 10 0.2 123 gov
with table name and “.” to 123 Milhouse 10 0.2 857 abc
disambiguate identically named 123 Milhouse 10 0.2 857 gov
columns from different tables 857 Lisa 8 0.7 123 gov
857 Lisa 8 0.7 857 abc
857 Lisa 8 0.7 857 gov
… … … … … …
34
Derived operator: natural join
• Input: two tables 𝑅 and 𝑆
• Notation: 𝑅 ⋈ 𝑆
• Purpose: relate rows from two tables, and
• Enforce equality between identically named columns
• Eliminate one copy of identically named columns
• Shorthand for 𝜋" 𝑅 ⋈! 𝑆 , where
• 𝑝 equates each pair of columns common to 𝑅 and 𝑆
• 𝐿 is the union of column names from 𝑅 and 𝑆 (with duplicate columns
removed)

35
Natural join example
𝑈𝑠𝑒𝑟 ⋈ 𝑀𝑒𝑚𝑏𝑒𝑟 = 𝜋? 𝑈𝑠𝑒𝑟 ⋈? 𝑀𝑒𝑚𝑏𝑒𝑟
= 𝜋&'(,/0+#,01#,232,1'( 𝑈𝑠𝑒𝑟 ⋈ !"#$.&'() 𝑀𝑒𝑚𝑏𝑒𝑟
*#+,#$.&'(
uid gid
uid name age pop
123 gov
123 Milhouse 10 0.2
857 abc
857 Lisa 8 0.7 ⋈
⋈
!"#$.&'()
*#+,#$.&'( 857 gov
… … … …
… …

uid name age pop uid gid

123 Milhouse 10 0.2 123 gov

857 Lisa 8 0.7 857 abc

857 Lisa 8 0.7 857 gov
… … … … … …
36
Union
• Input: two tables 𝑅 and 𝑆
• Notation: 𝑅 ∪ 𝑆
• 𝑅 and 𝑆 must have identical schema
• Output:
• Has the same schema as 𝑅 and 𝑆
• Contains all rows in 𝑅 and all rows in 𝑆 (with duplicate rows removed)

uid gid uid gid uid gid

123 gov ∪ 123 gov = 123 gov
857 abc 857 gov 857 abc
857 gov

37
Difference
• Input: two tables 𝑅 and 𝑆
• Notation: 𝑅 − 𝑆
• 𝑅 and 𝑆 must have identical schema
• Output:
• Has the same schema as 𝑅 and 𝑆
• Contains all rows in 𝑅 that are not in 𝑆

uid gid uid gid uid gid

123 gov − 123 gov = 857 abc
857 abc 857 gov

38
Derived operator: intersection
• Input: two tables 𝑅 and 𝑆
• Notation: 𝑅 ∩ 𝑆
• 𝑅 and 𝑆 must have identical schema
• Output:
• Has the same schema as 𝑅 and 𝑆
• Contains all rows that are in both 𝑅 and 𝑆
• Shorthand for?
uid gid uid gid uid gid
123 gov ∩ 123 gov = 123 gov
857 abc 857 gov

39
Intersection
R ∩ S
• R∩S=R–?

= R – ?

40
Intersection
• R ∩ S = R – (R – S)
R ∩ S
• Also = S – (S – R)

= R –( R
? – S )
• How can you write it without –
𝑅⋈𝑆
41
Renaming
• Input: a table 𝑅
• Notation: 𝜌4 𝑅, 𝜌 5!,5",… 𝑅, or 𝜌4 5!,5",… 𝑅
• Purpose: “rename” a table and/or its columns
• Output: a table with the same rows as 𝑅, but called differently

Member M1
uid gid uid gid
123 gov 𝜌,- ./0-,2/0- 𝑀𝑒𝑚𝑏𝑒𝑟 = 123 gov
857 abc 857 gov

42
Renaming
• Used to
• Avoid confusion caused by identical column names
• Create identical column names for natural joins
• As with all other relational operators, it doesn’t modify the database
• Think of the renamed table as a copy of the original
Member M1
uid gid uid gid
123 gov 𝜌,- ./0-,2/0- 𝑀𝑒𝑚𝑏𝑒𝑟 = 123 gov
857 abc 857 gov

43
Renaming example
• IDs of users who belong to at least two groups
𝑀𝑒𝑚𝑏𝑒𝑟 ⋈? 𝑀𝑒𝑚𝑏𝑒𝑟
uid gid uid gid
123 gov 123 gov
857 abc ⋈? 857 abc
857 gov 857 gov
uid gid uid gid
123 gov 123 gov Same uid
123 gov 857 abc
123 gov 857 gov
857 abc 123 gov Different gids
857 abc 857 abc
857 abc 857 gov
857 gov 123 gov
857 gov 857 abc
857 gov 857 gov 44
Renaming example
• IDs of users who belong to at least two groups
𝑀𝑒𝑚𝑏𝑒𝑟 ⋈? 𝑀𝑒𝑚𝑏𝑒𝑟

𝜋&'( 𝑀𝑒𝑚𝑏𝑒𝑟 ⋈#+,#$.&'()#+,#$.&'( ∧ 𝑀𝑒𝑚𝑏𝑒𝑟

N G! *#+,#$.1'(9*#+,#$.1'(
WRO

𝜌 &'(!,1'(! 𝑀𝑒𝑚𝑏𝑒𝑟
𝜋&'(! ⋈&'(!)&'(" ∧ 1'(!91'("
𝜌 &'(",1'(" 𝑀𝑒𝑚𝑏𝑒𝑟

45
Expression tree notation

𝜋#$%,
⋈#$%, (#$%- ∧ '$%, +'$%-

𝜌 #$%, ,'$%, 𝜌 #$%- ,'$%-

𝑀𝑒𝑚𝑏𝑒𝑟 𝑀𝑒𝑚𝑏𝑒𝑟

46
Summary of core operators
• Selection: 𝜎2 𝑅
• Projection: 𝜋: 𝑅
• Cross product: 𝑅×𝑆
• Union: 𝑅 ∪ 𝑆
• Difference: 𝑅 − 𝑆
• Renaming: 𝜌4 5!,5",… 𝑅
• Does not really add “processing” power

47
Summary of derived operators
• Join: 𝑅 ⋈2 𝑆
• Natural join: 𝑅 ⋈ 𝑆
• Intersection: 𝑅 ∩ 𝑆

• Many more
• Semijoin, anti-semijoin, quotient, …

48
User (uid int, name string, age int, pop float)

An exercise
Group (gid string, name string)
Member (uid int, gid string)

• Names of users in Lisa’s groups

Writing a query bottom-up: Their names 𝜋,-./ name
Milhouse
gid uid
Lisa
abc
gov
123
857
⋈
Users in
𝜋#$%
gov 857

uid name age pop gid

Lisa’s groups 𝑈𝑠𝑒𝑟
857
857
Lisa
Lisa
8
8
0.7
0.7
abc
gov
⋈
uid name age pop
Lisa’s groups 𝜋'$% 𝑀𝑒𝑚𝑏𝑒𝑟
857 Lisa 8 0.7
Who’s Lisa? ⋈ uid
123
gid
gov
𝜎,-./(""$1-" 𝑀𝑒𝑚𝑏𝑒𝑟 857 abc
857 gov
𝑈𝑠𝑒𝑟 … …
49
User (uid int, name string, age int, pop float)

Another exercise
Group (gid string, name string)
Member (uid int, gid string)

• IDs of groups that Lisa doesn’t belong to

Writing a query top-down:
−
All group IDs IDs of Lisa’s groups

𝜋'$% 𝜋'$%

𝐺𝑟𝑜𝑢𝑝 ⋈

𝑀𝑒𝑚𝑏𝑒𝑟 𝜎,-./(""$1-"

𝑈𝑠𝑒𝑟 50
User (uid int, name string, age int, pop float)

A trickier exercise
Group (gid string, name string)
Member (uid int, gid string)

• Who are the most popular?

• Who do NOT have the highest pop rating?
• Whose pop is lower than somebody else’s?
−
𝜋#$% 𝜋21/3, .#$%

𝑈𝑠𝑒𝑟 ⋈21/3, .!5!621/3- .!5!

𝜌21/3, 𝜌21/3-
A deeper question:
When (and why) is “−” needed?
𝑈𝑠𝑒𝑟 𝑈𝑠𝑒𝑟 51
Monotone operators
RelOp What happens
Add more rows to the output?
to the input...
• If some old output rows may need to be removed
• Then the operator is non-monotone
• Otherwise the operator is monotone
• That is, old output rows always remain “correct” when more rows are
added to the input
• Formally, for a monotone operator 𝑜𝑝:
𝑅 ⊆ 𝑅 ; implies 𝑜𝑝 𝑅 ⊆ 𝑜𝑝 𝑅 ; for any 𝑅, 𝑅 ;

52
Monotone operators
RelOp What happens
Add more rows to the output?
to the input...
• If some old output rows may need to be removed
• Then the operator is non-monotone
• Otherwise the operator is monotone
• That is, old output rows always remain “correct” when more rows are
added to the input
uid gid uid gid uid gid
123 gov − 123 gov = 857 abc
857 abc 857 gov
857 abc
53
Monotone operators
RelOp What happens
Add more rows to the output?
to the input...
• If some old output rows may need to be removed
• Then the operator is non-monotone
• Otherwise the operator is monotone
• That is, old output rows always remain “correct” when more rows are
added to the input The old row is always
uid gid uid gid uid gid “correct” no matter
123 gov − 123 gov = 857 abc what is added to R
857 abc 857 gov 693 abc
857 gov
693 abc 54
Classification of relational operators
• Selection: 𝜎2 𝑅 Monotone
• Projection: 𝜋: 𝑅 Monotone
• Cross product: 𝑅×𝑆 Monotone
• Join: 𝑅 ⋈2 𝑆 Monotone
• Natural join: 𝑅 ⋈ 𝑆 Monotone
• Union: 𝑅 ∪ 𝑆 Monotone
• Difference: 𝑅 − 𝑆 Monotone w.r.t. 𝑅; non-monotone w.r.t 𝑆
• Intersection: 𝑅 ∩ 𝑆 Monotone?

55
Is intersection monotonic?
R ∩ S
•R ∩ S = R – (R – S)

= R –( R
? – S )
• Yes!
R1 ⊆ R2 ⇒ S ∩ R1 ⊆ S ∩ R2
Why is “−” needed for “highest”?
• Composition of monotone operators produces a monotone query
• Old output rows remain “correct” when more rows are added to the input
• Is the “highest” query monotone?
• No!
• Current highest pop is 0.9
• Add another row with pop 0.91
• Old answer is invalidated
• So it must use difference!

57
Why do we need core operator 𝑋?
• Difference
• The only non-monotone operator
• Projection
• The only operator that removes columns
• Cross product
• The only operator that adds columns
• Union
• The only operator that allows you to add rows?
• A more rigorous argument?
• Selection?
• Left as an exercise for the viewer

58
Extensions to relational algebra
• Duplicate handling (“bag algebra”)
• Grouping and aggregation

• All these will come up when we talk about SQL

• But for now we will stick to standard relational algebra without
these extensions

59
Why is RA a good query language?
• Simple
• A small set of core operators
• Semantics are easy to grasp
• Declarative?
• Yes, compared with older languages like CODASYL
• Though assembling operators into a query does feel somewhat
“procedural”
• Complete?
• With respect to what?

60
User (uid int, name string, age int, pop float)

Relational Calculus
Group (gid string, name string)
Member (uid int, gid string)

• 𝑢. 𝑢𝑖𝑑 𝑢 ∈ 𝑈𝑠𝑒𝑟 ∧
¬ ∃𝑢. ∈ 𝑈𝑠𝑒𝑟: 𝑢. 𝑝𝑜𝑝 < 𝑢. . 𝑝𝑜𝑝 }, or
• 𝑢. 𝑢𝑖𝑑 𝑢 ∈ 𝑈𝑠𝑒𝑟 ∧
∀𝑢. ∈ 𝑈𝑠𝑒𝑟: 𝑢. 𝑝𝑜𝑝 ≥ 𝑢. . 𝑝𝑜𝑝 }
• Relational algebra = “safe” relational calculus
• Every query expressible as a safe relational calculus query is also expressible as a
relational algebra query
• And vice versa
• Example of an “unsafe” relational calculus query
• 𝑢. 𝑛𝑎𝑚𝑒 ¬ 𝑢 ∈ 𝑈𝑠𝑒𝑟
• Cannot evaluate it just by looking at the database

61
Relational Calculus
• RA and RC form the basis for “real” query languages (e.g. SQL),
and for implementation
• Relational Algebra
• Queries are expressed by applying a sequence of operations to relations
• More operational/procedural, very useful for representing how query executes
• Relational Calculus
• Let’s users describe WHAT they want in first-order logic
• Rather than HOW to compute it
• Non-operational, declarative

62
Codd’s Theorem
• Established equivalence in expressivity between :
• Relational Calculus
• Relational Algebra

• Why an important result?

• Connects declarative representation of queries with operational
description
• Constructive: we can “compile” SQL into relational algebra

63
Limits of relational algebra
• Relational algebra has no recursion
• Example: given relation Friend(uid1, uid2), who can Bart reach in his social
network with any number of hops?
• Writing this query in r.a. is impossible!
• So RA is not as powerful as general-purpose languages (not Turing-
complete)
• But why not?
• Optimization becomes undecidable
• Simplicity is empowering
• Besides, you can always implement it at the application level (and
recursion is added to SQL nevertheless 😱)

65
What’s Next
• Next class:
• Database design in E/R model (recording)
• Sign up Piazza
• https://piazza.com/sfu.ca/summer2024/cmpt354d100
• Access code: qc76kdzsz9l
• Assignment 1 releasing soon
• Stay tuned for RA help tools

Dbms Ptu
No ratings yet
Dbms Ptu
245 pages
CECS 323: Mimi Opkins
No ratings yet
CECS 323: Mimi Opkins
372 pages
An Introduction To Database Systems Bipin C.desaI
No ratings yet
An Introduction To Database Systems Bipin C.desaI
849 pages
Dbms Unit3 Part1
No ratings yet
Dbms Unit3 Part1
35 pages
Unit 1 - DBMS-II BSC
No ratings yet
Unit 1 - DBMS-II BSC
97 pages
Database
No ratings yet
Database
14 pages
Unit 1 - DBMS-II BSC
No ratings yet
Unit 1 - DBMS-II BSC
74 pages
M1 - Intro
No ratings yet
M1 - Intro
56 pages
Database
No ratings yet
Database
11 pages
Lecture 012
No ratings yet
Lecture 012
33 pages
Database SMY PQ
No ratings yet
Database SMY PQ
15 pages
UNIT - I Dbms
No ratings yet
UNIT - I Dbms
50 pages
DB1 Lecture3
No ratings yet
DB1 Lecture3
45 pages
Database Management
No ratings yet
Database Management
7 pages
CH - 11 - Data Base Concepts
No ratings yet
CH - 11 - Data Base Concepts
7 pages
Intro To Data Science Prelims Reviewer
No ratings yet
Intro To Data Science Prelims Reviewer
12 pages
Chapter 9
No ratings yet
Chapter 9
17 pages
Database Management Systems Week 1
No ratings yet
Database Management Systems Week 1
20 pages
Master of Computer Application: Centre For Distance Education Anna University
No ratings yet
Master of Computer Application: Centre For Distance Education Anna University
251 pages
Sending WhatsApp Messages Directly From An Oracle Database Involves Integrating The Database With An External API That Provides WhatsApp Messaging Services
100% (2)
Sending WhatsApp Messages Directly From An Oracle Database Involves Integrating The Database With An External API That Provides WhatsApp Messaging Services
4 pages
Database Management System: Introduction To DBMS Ms. Deepikkaa.S
No ratings yet
Database Management System: Introduction To DBMS Ms. Deepikkaa.S
45 pages
Database Concepts Till Features of MySQL
No ratings yet
Database Concepts Till Features of MySQL
13 pages
01 Relationalmodel
No ratings yet
01 Relationalmodel
5 pages
CHAP 3 Relational Database
No ratings yet
CHAP 3 Relational Database
41 pages
Lec02 Data Models
No ratings yet
Lec02 Data Models
16 pages
Computer Models
No ratings yet
Computer Models
15 pages
Relational Model and Algebra: Introduction To Databases Compsci 316 Fall 2014
No ratings yet
Relational Model and Algebra: Introduction To Databases Compsci 316 Fall 2014
41 pages
Database Concept
No ratings yet
Database Concept
6 pages
CS121Lec01 PDF
No ratings yet
CS121Lec01 PDF
37 pages
Query Execution
No ratings yet
Query Execution
87 pages
Oracle 8i DBA Bible
No ratings yet
Oracle 8i DBA Bible
1,040 pages
Moocs
No ratings yet
Moocs
25 pages
Database
No ratings yet
Database
47 pages
Document
No ratings yet
Document
55 pages
Ramya - Syllabus-& Unit - 1.
No ratings yet
Ramya - Syllabus-& Unit - 1.
111 pages
CS121 Lec 01
No ratings yet
CS121 Lec 01
42 pages
Bajwa A C
No ratings yet
Bajwa A C
4 pages
Lecture 2
No ratings yet
Lecture 2
15 pages
Lecture 5
No ratings yet
Lecture 5
54 pages
01 Introduction
No ratings yet
01 Introduction
4 pages
Chapter 03
No ratings yet
Chapter 03
43 pages
Data Storage and Relational Databases
No ratings yet
Data Storage and Relational Databases
14 pages
Database Systems The Complete Book
No ratings yet
Database Systems The Complete Book
25 pages
Database Management Systems: Instructor: Murali Mani Mmani@cs - Wpi.edu
100% (1)
Database Management Systems: Instructor: Murali Mani Mmani@cs - Wpi.edu
22 pages
Lecture No 04: By: Syed Aun Irtaza
No ratings yet
Lecture No 04: By: Syed Aun Irtaza
10 pages
DBMS Lec No 3
No ratings yet
DBMS Lec No 3
41 pages
Chapter 2 - Database System Concepts and Architecture
No ratings yet
Chapter 2 - Database System Concepts and Architecture
45 pages
CSC301 - Chapter 9
No ratings yet
CSC301 - Chapter 9
36 pages
Lecture #01: Relational Model & Relational Algebra: 1 Databases
No ratings yet
Lecture #01: Relational Model & Relational Algebra: 1 Databases
4 pages
4 PySpark Exercises
No ratings yet
4 PySpark Exercises
7 pages
Chapter 2
No ratings yet
Chapter 2
37 pages
Lab Manual of Database Management System (3330703)
No ratings yet
Lab Manual of Database Management System (3330703)
90 pages
Introduction To Database Systems: Ruoming Jin TTH 9:15 - 10:30pm Spring 2009 RM MSB115
No ratings yet
Introduction To Database Systems: Ruoming Jin TTH 9:15 - 10:30pm Spring 2009 RM MSB115
54 pages
Q. State The Basic Database Concepts
No ratings yet
Q. State The Basic Database Concepts
7 pages
ADM Pranit Micro
100% (1)
ADM Pranit Micro
28 pages
Slides1 Introduction-Merged
No ratings yet
Slides1 Introduction-Merged
96 pages
DWM Unit 1 (2023)
No ratings yet
DWM Unit 1 (2023)
38 pages
Airline Reservation Complete
No ratings yet
Airline Reservation Complete
5 pages
DBMS Unit1
No ratings yet
DBMS Unit1
30 pages
Irt 2 Marks With Answer
No ratings yet
Irt 2 Marks With Answer
15 pages
DBMS Module 2
No ratings yet
DBMS Module 2
125 pages
Lesson 2
No ratings yet
Lesson 2
50 pages
SAP ABAP I3 IT SOLUTIONZ
No ratings yet
SAP ABAP I3 IT SOLUTIONZ
12 pages
DCICT 2 Databases
No ratings yet
DCICT 2 Databases
26 pages
102 Copies Adv Lesson 1
No ratings yet
102 Copies Adv Lesson 1
5 pages
Chapter 8 Data Structures and Caats For Data Extraction
No ratings yet
Chapter 8 Data Structures and Caats For Data Extraction
4 pages
Sana Saeed ROLL # BSMS12143: "Some People Feel The Rain. Others Just Get Wet."
No ratings yet
Sana Saeed ROLL # BSMS12143: "Some People Feel The Rain. Others Just Get Wet."
24 pages
Dbms Lab Experiment List
No ratings yet
Dbms Lab Experiment List
3 pages
Ktustudents - In: 1. Hierarchical Methods
No ratings yet
Ktustudents - In: 1. Hierarchical Methods
21 pages
4.alumini Management System
No ratings yet
4.alumini Management System
8 pages
Installation of MFP-8.0
No ratings yet
Installation of MFP-8.0
18 pages
Solution Assignment 1 - Search Engine Optimization
No ratings yet
Solution Assignment 1 - Search Engine Optimization
5 pages
Database Concepts For MBA
No ratings yet
Database Concepts For MBA
49 pages
Ubd 6
No ratings yet
Ubd 6
9 pages
Radarr Logs-1
No ratings yet
Radarr Logs-1
3 pages
IT1090 IS and DM
No ratings yet
IT1090 IS and DM
7 pages
Checkvote Is 1 (Voted) or 0 (Not Voted)
No ratings yet
Checkvote Is 1 (Voted) or 0 (Not Voted)
9 pages
HND Computing UNIT 38: Database Management: Faisal Saghir
No ratings yet
HND Computing UNIT 38: Database Management: Faisal Saghir
38 pages
4nf &5nf
No ratings yet
4nf &5nf
4 pages
Exams Collection: Exams4Collection Exam Dumps & Exams4Collection Exam Study Material
No ratings yet
Exams Collection: Exams4Collection Exam Dumps & Exams4Collection Exam Study Material
4 pages
DP - 14 - 2 - Practice FAZRULAKMALFADILA - C2C022001
No ratings yet
DP - 14 - 2 - Practice FAZRULAKMALFADILA - C2C022001
4 pages
SQL Server Specialist Self-Test
No ratings yet
SQL Server Specialist Self-Test
2 pages
CS8481 - Set2
No ratings yet
CS8481 - Set2
5 pages
Completed Unit 2 Dbms
No ratings yet
Completed Unit 2 Dbms
34 pages
Kanna Chaitanya - CV
No ratings yet
Kanna Chaitanya - CV
2 pages
Database Systems - Concepts, Languages and Architectures-Paolo Atzeni
No ratings yet
Database Systems - Concepts, Languages and Architectures-Paolo Atzeni
10 pages
What's A Database System?
No ratings yet
What's A Database System?
5 pages
DBMS MASTER: Become Pro in Database Management System
From Everand
DBMS MASTER: Become Pro in Database Management System
Ummed Singh
No ratings yet
100 Puzzles to Learn Data Warehousing
From Everand
100 Puzzles to Learn Data Warehousing
Cristian Scutaru
No ratings yet
SQL Tutorial For Beginners
From Everand
SQL Tutorial For Beginners
HAU DANG
No ratings yet

Lec3 1

Uploaded by

Lec3 1

Uploaded by

Database Systems I

CMPT 354 Summer 2024

• Sign up for Piazza!

A, 1 C, 3 B, 2 A, 4 map(String key, String value):

Values exchanged by shuffle process

A, [1, 4] The reduce function accepts all pairs for a

• 3. Lakehouse = Data Lake + Data Warehouse

• Basics of the Relational Model

• Programmer controls “navigation”: accounts → branches

• The description generally consists of three parts:

• Different from data structure (e.g., list, array)

• Two kinds of operations

• Different from programming languages (e.g., C, Java)

• Different kinds of constraints

• Why does it matter?

• Key-Value Data Model

• Semi-structured Data Model (e.g., Json, XML)

• Structure of the Data

• Structure of the Data

Resembles the hierarchical model

• Pilot in the Royal Air Force in WW2

uid name age pop uid name age pop

uid name age pop uid name

uid name age pop uid name age pop name

• What about: 𝜎!"!#$.& (𝜋'()* (𝑈𝑠𝑒𝑟))

uid name age pop uid gid

• How many rows in result? 123 Milhouse 10 0.2 123 gov

• Duplicates? 857 Lisa 8 0.7 857 gov

uid name age pop uid gid

uid name age pop uid gid

857 Lisa 8 0.7 857 abc

uid gid uid gid uid gid

uid gid uid gid uid gid

𝜋&'( 𝑀𝑒𝑚𝑏𝑒𝑟 ⋈*#+,#$.&'()*#+,#$.&'( ∧ 𝑀𝑒𝑚𝑏𝑒𝑟

𝜌 #$%, ,'$%, 𝜌 #$%- ,'$%-

• Names of users in Lisa’s groups

uid name age pop gid

• IDs of groups that Lisa doesn’t belong to

• Who are the most popular?

𝑈𝑠𝑒𝑟 ⋈21/3, .!5!621/3- .!5!

• All these will come up when we talk about SQL

• Why an important result?

You might also like

𝜋&'( 𝑀𝑒𝑚𝑏𝑒𝑟 ⋈#+,#$.&'()#+,#$.&'( ∧ 𝑀𝑒𝑚𝑏𝑒𝑟