Database Systems CMP3101
Lecture #1 October 4th , 2011
Recommended and Reference Books
[1] Hector Garcia-Molina, Jeff Ullman, and Jennifer Widom, 2008. Database Systems: The Complete Book (DS:CB), 2nd Edition , Prentice Hall, ISBN-10:
[2] Jeffrey D. Ullman, Jennifer Widom, 2007. A First Course in Database Systems. Prentice Hall [3] Rebecca Riordan, 2005. Designing Effective Database Systems. Addison- Wesley ISBN 0321290933.
2
Course Format
Lectures Tuesday , 10:00 am - 12:00 pm Tutorials Friday 08:00 am 10: 00 am
Software Tools
MySQL DBMS
Free Downloadable community edition from oracle Better installed as part of a bundle LAMP, XAMP or WAMP
Postgres: download from
download http://www.postgresql.org/download/ Is also installed on lab machines
Xquery: download one interpreter from
Zorba: http://www.zorba-xquery.com/ ( day installation) Galax: http://galax.sourceforge.net/ (great in the past, seems less well maintained) Saxon: http://saxon.sourceforge.net/ (from apache; very popular)
Pig Latin: download from
http://hadoop.apache.org/pig/
Course Outline
History and Overview of Database Systems
(Assignment I Due 11th October, 2011 2-3 pages)
Fundamentals of Database Systems Database Query Languages Data Modelling Relational Database Design Transaction Processing Distributed Databases
5
Rest of Todays Lecture
Overview of DBMS SQL
Dan Suciu -- p544 Fall 2010
Database
What is a database ?
Give examples of databases
Dan Suciu -- p544 Fall 2010
Database
What is a database ? A collection of files storing related data
Give examples of databases Accounts database; payroll database; Makerere students database; Amazons products database; airline reservation database
8
Database Management System
What is a DBMS ?
Give examples of DBMS
Dan Suciu -- p544 Fall 2010
Database Management System
What is a DBMS ? An application (typically written in C ) that provides tools and utilities used to efficiently and effectively manage databases Give examples of DBMS DB2 (IBM), SQL Server (MS), Oracle, Sybase MySQL, Postgres, SQLite, Hadoop
10
An Example
The Internet Movie Database http://www.imdb.com Entities: Actors (800k), Movies (400k), Directors, Relationships: who acted where, who directed what,
11
Tables
Actor:
id fName lName gende r
Cast:
pid mid
195428 Tom
Hanks M
645947 Amy ... Movie:
id
Hanks F
195428 ...
337166
Name
year
337166
...
Toy Story ...
1995 . ..
12
SQL
SELECT *
FROM Actor
13
SQL
SELECT count(*)
FROM Actor
This is an aggregate query
14
SQL
SELECT *
FROM Actor
WHERE lName = Hanks
This is a selection query
15
SQL
SELECT *
FROM Actor, Casts, Movie WHERE lname='Hanks' and Actor.id = Casts.pid and Casts.mid=Movie.id and Movie.year=1995
This query has selections and joins
817k actors, 3.5M casts, 380k movies; How can it be so fast ?
Dan Suciu -- p544 Fall 2010
16
How Can We Evaluate the Query ?
Actor:
id
... ...
Cast:
lName
Hanks
Movie:
mid id ... ... Name year 1995
fNam e
gende r
pid ... ...
17
Optimization and Query Execution
Indexes: on Actor.lName, on Movie.year Query optimization
Access path selection Join order
Statistics
18
Recovery
Transfer 1,000,000 from account #4662 to #7199: X = Read(Account_1); X.amount = X.amount 1,000,000; Write(Account_1, X); Y = Read(Account_2); Y.amount = Y.amount + 1,000,000; Write(Account_2, Y);
19
Recovery
Transfer 1,000,000 from account #4662 to #7199: X = Read(Account_1); X.amount = X.amount 1,000,000; Write(Account_1, X);
CRASH !
Y = Read(Account_2); Y.amount = Y.amount + 1,000,000; Write(Account_2, Y);
What Dan the-- problem ? is Suciu p544 Fall 2010
20
Concurrency Control
How to overdraft your account:
User 1
X = Read(Account); if (X.amount > 100) { dispense_money( ); X.amount = X.amount 100; } else error(Insufficient funds);
User 2
X = Read(Account); if (X.amount > 100) { dispense_money( ); X.amount = X.amount 100; } else error(Insufficient funds);
WhatDan Suciu -- p544 Fall 2010 can go wrong ?
21
Transactions
Recovery Concurrency control
ACID = Atomicity ( = recovery) Consistency Isolation ( = concurrency control) Durability
22
Client/Server Database Architecture
There is one single server that stores the database (called DBMS or RDBMS):
Usually a beefed-up system, e.g. IPROJSRV But can be your own desktop or a huge cluster running a parallel dbms
Many clients running apps and connecting to DBMS
E.g. Microsofts Management Studio Or psql (for postgres) Always: some elses big Java or C++ program
The client talks to the server using JDBC protocol
23
Types of Usages for Databases
OLTP (online-transaction-processing)
Many updates Many point queries: retrieve the record with a given key.
Decision-Support
Many aggregate/group-by queries. Sometimes called data warehouse
24
SQL v.s. noSQL
Reading: SQL Databases v. NoSQL Databases, by Mike Stonebraker, CACM 53(4), 2010.
25
Data Management
Data Management is more than databases ! Here is an example of a problem: Alice sends Bob in random order all the numbers 1, 2, 3, , 100000000000000000000 She does not repeat any number But she misses exactly one Help Bob find out which one is missing !
After you solve it, make it a bit harder: Alice misses exactly ten numbers Help Bob find out which ones are missing !
26
SQL
Data Definition Language (DDL)
Create/alter/delete tables and their attributes Following lectures...
Data Manipulation Language (DML)
Query one or more tables discussed next ! Insert/delete/modify tuples in tables
27
Table name
Tables in SQL
Key
Attribute names
Product
PName
Gizmo Powergizmo SingleTouch MultiTouch Tuples or rows
Price
$19.99 $29.99 $149.99 $203.99
Category
Gadgets Gadgets Photography Household
Manufacturer
GizmoWorks GizmoWorks Canon Hitachi
28
Data Types in SQL
Atomic types:
Characters: CHAR(20), VARCHAR(50) Numbers: INT, BIGINT, SMALLINT, FLOAT Others: MONEY, DATETIME,
Record (tuple)
Has atomic attributes
Table (relation)
A set of tuples
29
Simple SQL Query
Product
PName
Gizmo Powergizmo SingleTouch MultiTouch
Price
$19.99 $29.99 $149.99 $203.99
Category
Gadgets Gadgets Photography Household
Manufacturer
GizmoWorks GizmoWorks Canon Hitachi
SELECT * FROM Product WHERE category=Gadgets
PName Gizmo Price $19.99 $29.99 Category Gadgets Gadgets Manufacturer GizmoWorks GizmoWorks
30
selection
Powergizmo
Simple SQL Query
Product
PName Price Category Manufacturer
Gizmo
Powergizmo SingleTouch MultiTouch
$19.99
$29.99 $149.99 $203.99
Gadgets
Gadgets Photography Household
GizmoWorks
GizmoWorks Canon Hitachi
SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100
selection and projection
PName SingleTouch MultiTouch
Price $149.99 $203.99
Manufacturer Canon Hitachi
31
Details
Case insensitive:
SELECT = Select = select Product = product
BUT: Seattle seattle
Constants:
abc - yes
abc - no
32
Eliminating Duplicates
SELECT DISTINCT category FROM Product
Category Gadgets Photography Household
Compare to:
Category
SELECT category FROM Product
Gadgets Gadgets Photography Household
33
Ordering the Results
SELECT pname, price, manufacturer FROM Product WHERE category=gizmo AND price > 50 ORDER BY price, pname Ties are broken by the second attribute on the ORDER BY list. Ordering is ascending, unless you specify the DESC keyword.
34
PName
Price
Category
Manufacturer
Gizmo
Powergizmo SingleTouch MultiTouch
$19.99
$29.99 $149.99 $203.99
Gadgets
Gadgets Photography Household
GizmoWorks
GizmoWorks Canon Hitachi
SELECT DISTINCT category FROM Product ORDER BY category SELECT Category FROM Product ORDER BY PName SELECT DISTINCT category FROM Product ORDER BY PName
? ? ?
35
Keys and Foreign Keys
Company
CName StockPrice 25 65 15 Country USA Japan Japan
Key
GizmoWorks Canon Hitachi
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Manufacturer GizmoWorks GizmoWorks Canon Hitachi
36
Foreign key
Joins
Product (PName, Price, Category, Manufacturer) Company (CName, stockPrice, Country) Find all products under $200 manufactured in Japan; Join return their names and prices.
SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country=Japan AND Price <= 200
37
between Product and Company
Joins
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Manufacturer GizmoWorks GizmoWorks Canon Hitachi
Company
Cname GizmoWorks Canon Hitachi StockPrice 25 65 15 Country USA Japan Japan
SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country=Japan AND Price <= 200
PName SingleTouch
Price $149.99
38
Tuple Variables
Person(pname, address, worksfor) Company(cname, address)
SELECT DISTINCT pname, address FROM Person, Company WHERE worksfor = cname SELECT DISTINCT Person.pname, Company.address FROM Person, Company WHERE Person.worksfor = Company.cname SELECT DISTINCT x.pname, y.address FROM Person AS x, Company AS y WHERE x.worksfor = y.cname
39
Which address ?
In Class
Product (pname, price, category, manufacturer) Company (cname, stockPrice, country) Find all Chinese companies that manufacture products both in the toy category SELECT cname
FROM
WHERE
40
In Class
Product (pname, price, category, manufacturer) Company (cname, stockPrice, country) Find all Chinese companies that manufacture products both in the electronic and toy categories SELECT cname
FROM
WHERE
Dan Suciu -- p544 Fall 2010 41
Meaning (Semantics) of SQL Queries
SELECT a1, a2, , ak FROM R1 AS x1, R2 AS x2, , Rn AS xn WHERE Conditions Answer = {} for x1 in R1 do for x2 in R2 do .. for xn in Rn do if Conditions then Answer = Answer {(a1,,ak)} return Answer
42
Using the Formal Semantics
What do these queries compute ?
SELECT DISTINCT R.A FROM R, S WHERE R.A=S.A SELECT DISTINCT R.A FROM R, S, T WHERE R.A=S.A OR R.A=T.A Returns R S
If S and T then returns R (S T) else returns
43
Joins Introduce Duplicates
Product (pname, price, category, manufacturer) Company (cname, stockPrice, country) Find all countries that manufacture some product in the Gadgets category. SELECT Country FROM Product, Company WHERE Manufacturer=CName AND Category=Gadgets
44
Joins Introduce Duplicates
Product
Name Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Manufacturer GizmoWorks GizmoWorks Canon Hitachi
Company
Cname GizmoWorks Canon Hitachi StockPrice 25 65 15 Country USA Japan Japan
SELECT Country FROM Product, Company WHERE Manufacturer=CName AND Category=Gadgets
Country
Duplicates ! Remember to add DISTINCT
USA USA
45
Subqueries
A subquery is another SQL query nested inside a larger query Such inner-outer queries are called nested queries A subquery may occur in:
1. A SELECT clause 2. A FROM clause 3. A WHERE clause
Rule of thumb: avoid writing nested queries when possible; keep in mind that sometimes its impossible
46
1. Subqueries in SELECT
Product ( pname, price, company) Company(cname, city)
For each product return the city where it is manufactured
SELECT X.pname, (SELECT Y.city FROM Company Y WHERE Y.cname=X.company) FROM Product X
What happens if the subquery returns more than one city ?
47
1. Subqueries in SELECT
Product ( pname, price, company) Company(cname, city)
Whenever possible, dont use a nested queries:
SELECT pname, (SELECT city FROM Company WHERE cname=company) FROM Product
=
SELECT pname, city FROM Product, Company WHERE cname=company Dan Suciu -- p544 Fall 2010
We have unnested the query
48
1. Subqueries in SELECT
Product ( pname, price, company) Company(cname, city)
Compute the number of products made in each city
SELECT DISTINCT city, (SELECT count(*) FROM Product WHERE cname=company) FROM Company
Better: we can unnest by using a GROUP BY (next lecture)
49
2. Subqueries in FROM
Product ( pname, price, company) Company(cname, city)
Find all products whose prices is > 20 and < 30
SELECT X.city FROM (SELECT * FROM Product AS Y WHERE Y.price > 20) AS X WHERE X.price < 30
Unnest this query !
50
3. Subqueries in WHERE
Product ( pname, price, company) Existential quantifiers Company( cname, city) Find all cities that make some products with price < 100
Using EXISTS:
SELECT DISTINCT Company.city FROM Company WHERE EXISTS (SELECT * FROM Product WHERE company = cname and Produc.price < 100)
51
3. Subqueries in WHERE
Product ( pname, price, company) Existential quantifiers Company( cname, city) Find all cities that make some products with price < 100
Predicate Calculus (a.k.a. First Order Logic) { y | x. Company(x,y) (z. p. Product(z,p,x) p < 100) }
52
3. Subqueries in WHERE
Product ( pname, price, company) Existential quantifiers Company( cname, city) Find all cities that make some products with price < 100
Using IN
SELECT DISTINCT Company.city FROM Company WHERE Company.cname IN (SELECT Product.company FROM Product WHERE Produc.price < 100)
53
3. Subqueries in WHERE
Product ( pname, price, company) Existential quantifiers Company( cname, city) Find all cities that make some products with price < 100
Using ANY:
SELECT DISTINCT Company.city FROM Company WHERE 100 > ANY (SELECT price FROM Product WHERE company = cname)
54
3. Subqueries in WHERE
Product ( pname, price, company) Existential quantifiers Company( cname, city) Find all cities that make some products with price < 100
Now lets unnest it:
SELECT DISTINCT Company.cname FROM Company, Product WHERE Company.cname = Product.company and Product.price < 100
Existential quantifiers are easy !
55
3. Subqueries in WHERE
Product ( pname, price, company) Universal quantifiers Company( cname, city)
Find all cities with companies that make only products with price < 100
Universal quantifiers are hard !
56
3. Subqueries in WHERE
Product ( pname, price, company) Universal quantifiers Company( cname, city)
Find all cities with companies that make only products with price < 100 Predicate Calculus (First Order Logic) { y | x. Company(x,y) (z. p. Product(z,p,x) p < 100) }
57
3. Subqueries in WHERE
De Morgans Laws: (A B) = A B (A B) = A B x. P(x) = x. P(x) x. P(x) = x. P(x)
(A B) = A B
{ y | x. Company(x,y) (z. p. Product(z,p,x) p < 100) } = { y | x. Company(x,y) (zp. Product(z,p,x) p 100) } = { y | x. Company(x,y)) } { y | x. Company(x,y) Dan Suciu -- p544 Fall 2010 (zp. Product(z,p,x) p 100) } 58
3. Subqueries in WHERE
1. Find the other companies: i.e. s.t. some product 100
SELECT DISTINCT Company.city FROM Company WHERE Company.cname IN (SELECT Product.company FROM Product WHERE Produc.price >= 100
2. Find all companies s.t. all their products have price < 100
SELECT DISTINCT Company.city FROM Company WHERE Company.cname NOT IN (SELECT Product.company FROM Product WHERE Produc.price >= 100
Dan Suciu -- p544 Fall 2010
59
3. Subqueries in WHERE
Product ( pname, price, company) Universal quantifiers Company( cname, city)
Find all cities with companies that make only products with price < 100 Using EXISTS:
SELECT DISTINCT Company.city FROM Company WHERE NOT EXISTS (SELECT * FROM Product WHERE company = cname and Produc.price >= 100)
60
3. Subqueries in WHERE
Product ( pname, price, company) Universal quantifiers Company( cname, city)
Find all cities with companies that make only products with price < 100 Using ALL:
SELECT DISTINCT Company.city FROM Company WHERE 100 > ALL (SELECT price FROM Product WHERE company = cname)
61
Monotone Queries
A query Q is monotone if:
Whenever we add tuples to one or more of the tables the answer to the query cannot contain fewer tuples
Fact: all unnested queries are monotone
Proof: using the nested for loops semantics
Fact: A query a universal quantifier is not monotone
Consequence: we cannot unnest a query with a universal quantifier
62
Queries that must be nested
Queries with universal quantifiers or with negation The drinkers-bars-beers example next This is a famous example from textbook on databases by Ullman
Rule of Thumb: Non-monotone queries cannot be unnested. In particular, queries with a universal quantifier cannot be unnested
63
The drinkers-bars-beers example
Likes(drinker, beer) Frequents(drinker, bar) Serves(bar, beer)
Challenge: write these in SQL
Find drinkers that frequent some bar that serves some beer they like. x: y. z. Frequents(x, y)Serves(y,z)Likes(x,z)
Find drinkers that frequent only bars that serves some beer they like. x: y. Frequents(x, y) (z. Serves(y,z)Likes(x,z))
Find drinkers that frequent some bar that serves only beers they like. x: y. Frequents(x, y)z.(Serves(y,z) Likes(x,z))
Find drinkers that frequent only bars that serves only beer they like. x: y. Frequents(x, y) z.(Serves(y,z) Likes(x,z)) Dan Suciu -- p544 Fall 2010
64
Aggregation
SELECT avg(price) FROM Product WHERE maker=Toyota SELECT count(*) FROM Product WHERE year > 1995
SQL supports several aggregation operations: sum, count, min, max, avg
Except count, all aggregations apply to a single attribute
65
Aggregation: Count
COUNT applies to duplicates, unless otherwise stated: SELECT Count(category) FROM Product WHERE year > 1995 We probably want: SELECT Count(DISTINCT category) FROM Product WHERE year > 1995
66
same as Count(*)
More Examples
Purchase(product, date, price, quantity)
SELECT Sum(price * quantity) FROM Purchase
What do they mean ? SELECT Sum(price * quantity) FROM Purchase WHERE product = bagel
67
Simple Aggregations
Purchase
Product Bagel Bagel Banana Price 3 1.50 0.5 Quantity 20 20 50
Banana
Banana
2
4
10
10
90 (= 60+30)
68
SELECT Sum(price * quantity) FROM Purchase WHERE product = Bagel
Grouping and Aggregation
Purchase(product, price, quantity) Find total quantities for all sales over $1, by product.
SELECT FROM WHERE GROUP BY
product, Sum(quantity) AS TotalSales Purchase price > 1 product
Lets see what this means
69
Grouping and Aggregation
1. Compute the FROM and WHERE clauses.
2. Group by the attributes in the GROUPBY
3. Compute the SELECT clause, including aggregates.
70
1&2. FROM-WHERE-GROUPBY
Product Bagel Bagel Banana Banana Banana Price 3 1.50 0.5 2 4 Quantity 20 20 50 10 10
71
3. SELECT
Product Bagel Bagel Banana Banana Banana Price 3 1.50 0.5 2 4 Quantit y 20 20 50 10 10
Product TotalSales
Bagel Banana 40 20
SELECT FROM WHERE GROUP BY
product, Sum(quantity) AS TotalSales Purchase price > 1 product
72
GROUP BY v.s. Nested Quereis
SELECT FROM WHERE GROUP BY product, Sum(quantity) AS TotalSales Purchase price > 1 product
SELECT DISTINCT x.product, (SELECT Sum(y.quantity) FROM Purchase y WHERE x.product = y.product AND price > 1) AS TotalSales FROM Purchase x WHERE price > 1 Why twice ? 73
Another Example
SELECT product, sum(quantity) AS SumSales max(price) AS MaxQuantity FROM Purchase GROUP BY product
What does it mean ?
Rule of thumb: Every group in a GROUP BY is non-empty ! If we want to include empty groups in the output, then we need either a subquery, or a left outer join (see later)
74
HAVING Clause
Same query, except that we consider only products that had at least 100 buyers. SELECT product, Sum(quantity) FROM Purchase WHERE price > 1 GROUP BY product HAVING Sum(quantity) > 30 HAVING clause contains conditions on aggregates.
75
General form of Grouping and Aggregation
SELECT S FROM R1,,Rn WHERE C1 GROUP BY a1,,ak HAVING C2
Why ?
S = may contain attributes a1,,ak and/or any aggregates but NO OTHER ATTRIBUTES C1 = is any condition on the attributes in R1,,Rn C2 = is any condition on aggregate expressions
76
General form of Grouping and Aggregation
SELECT S FROM R1,,Rn WHERE C1 GROUP BY a1,,ak HAVING C2
Evaluation steps: 1. Evaluate FROM-WHERE, apply condition C1
2.
3. 4.
Group by the attributes a1,,ak
Apply condition C2 to each group (may have aggregates) Compute aggregates in S and return the result
77
Advanced SQLizing
1. Unnesting Aggregates
2. Finding witnesses
78
Unnesting Aggregates
Product ( pname, price, company) Company(cname, city)
Find the number of companies in each city
SELECT DISTINCT city, (SELECT count(*) FROM Company Y WHERE X.city = Y.city) FROM Company X
SELECT city, count(*) FROM Company GROUP BY city
Equivalent queries
Note: no need for DISTINCT (DISTINCT is the same as GROUP BY) 79
Unnesting Aggregates
Product ( pname, price, company) Company(cname, city)
Find the number of products made in each city
SELECT DISTINCT X.city, (SELECT count(*) FROM Product Y, Company Z WHERE Y.cname=Z.company AND Z.city = X.city) FROM Company X SELECT X.city, count(*) FROM Company X, Product Y WHERE X.cname=Y.company GROUP BY X.city
They are NOT equivalent ! (WHY?)
80
More Unnesting
Author(login,name) Wrote(login,url) Find authors who wrote 10 documents:This is SQL by Attempt 1: with nested queries
a novice
SELECT DISTINCT Author.name FROM Author WHERE count(SELECT Wrote.url FROM Wrote WHERE Author.login=Wrote.login) > 10
81
More Unnesting
Find all authors who wrote at least 10 documents: Attempt 2: SQL style (with GROUP BY)
SELECT Author.name FROM Author, Wrote WHERE Author.login=Wrote.login GROUP BY Author.name HAVING count(wrote.url) > 10 This is SQL by an expert
82
Finding Witnesses
Store(sid, sname) Product(pid, pname, price, sid)
For each store, find its most expensive products
83
Finding Witnesses
Finding the maximum price is easy
SELECT Store.sid, max(Product.price) FROM Store, Product WHERE Store.sid = Product.sid GROUP BY Store.sid
But we need the witnesses, i.e. the products with max price
84
Finding Witnesses
To find the witnesses, compute the maximum price in a subquery SELECT Store.sname, Product.pname FROM Store, Product, (SELECT Store.sid AS sid, max(Product.price) AS p FROM Store, Product WHERE Store.sid = Product.sid GROUP BY Store.sid, Store.sname) X WHERE Store.sid = Product.sid and Store.sid = X.sid and Product.price = X.p
85
Finding Witnesses
There is a more concise solution here:
SELECT Store.sname, x.pname FROM Store, Product x WHERE Store.sid = x.sid and x.price >= ALL (SELECT y.price FROM Product y WHERE Store.sid = y.sid)
86
NULLS in SQL
Whenever we dont have a value, we can put a NULL Can mean many things:
Value does not exists Value exists but is unknown Value not applicable Etc.
The schema specifies for each attribute if can be null (nullable attribute) or not How does SQL cope with tables that have NULLs ?
87
Null Values
If x= NULL then 4*(3-x)/7 is still NULL If x= NULL then x=Joe is UNKNOWN In SQL there are three boolean values:
FALSE = 0 UNKNOWN = 0.5 TRUE =1
88
Null Values
C1 AND C2 = min(C1, C2) C1 OR C2 = max(C1, C2) NOT C1 = 1 C1
SELECT * FROM Person WHERE (age < 25) AND (height > 6 OR weight > 190)
E.g. age=20 heigth=NULL weight=200
Rule in SQL: include only tuples that yield TRUE
89
Null Values
Unexpected behavior:
SELECT * FROM Person WHERE age < 25 OR age >= 25 Some Persons are not included !
90
Null Values
Can test for NULL explicitly:
x IS NULL x IS NOT NULL
SELECT * FROM Person WHERE age < 25 OR age >= 25 OR age IS NULL
Now it includes all Persons
91
Outerjoins
Product(name, category) Purchase(prodName, store) join: An inner
SELECT Product.name, Purchase.store FROM Product, Purchase WHERE Product.name = Purchase.prodName Same as: SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName But Products that never sold will be lost !
92
Outerjoins
Product(name, category) Purchase(prodName, store)
If we want the never-sold products, need an outerjoin:
SELECT Product.name, Purchase.store FROM Product LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName
93
Product
Name Gizmo Camera OneClick Category gadget Photo Photo Name Gizmo Camera Camera OneClick
Purchase
ProdName Gizmo Camera Camera Store Wiz Ritz Wiz NULL
94
Store Wiz Ritz Wiz
Application
Compute, for each product, the total number of sales in September Product(name, category) Purchase(prodName, month, store) SELECT Product.name, count(*) FROM Product, Purchase WHERE Product.name = Purchase.prodName and Purchase.month = September GROUP BY Product.name
Whats wrong ?
95
Application
Compute, for each product, the total number of sales in September Product(name, category) Purchase(prodName, month, store)
SELECT Product.name, count(store) FROM Product LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName and Purchase.month = September GROUP BY Product.name Now we also get the products who sold in 0 quantity
96
Outer Joins
Left outer join:
Include the left tuple even if theres no match
Right outer join:
Include the right tuple even if theres no match
Full outer join:
Include the both left and right tuples even if theres no match
97