FUNDAMENTALS OF
DATABASES
More Queries in SQL
NGUYEN Hoang Ha
Email: nguyen-hoang.ha@usth.edu.vn
Recall: SELECT Statements →
π σ ρ ⋈c
Projection:You can use the projection capability in SQL to choose the columns in
a table that you want returned by your query.
Selection:You can use the selection capability in SQL to choose the rows in a table
that you want returned by a query (with WHERE clause)
Joining:You can use the join capability in SQL to bring together data that is stored
in
2
different tables by creating a link between them.
QUERIES INVOLVING MORE THAN
ONE TABLES
Reference: Section 6.2 Jeffrey D. Ullman, Jennifer Widom: A First Course in Database
Systems, Pearson, 3rd Edition (2007)
A Cartesian Products example
4
INNER JOIN
SELECT *
FROM student A, class B
WHERE A.class_id = B.class_id
SELECT *
FROM student A INNER JOIN class B ON A.class_id = B.class_id
To determine a student’s class name, you compare the value in the CLASS_ID
column in the CLASS table with the CLASS_ID values in the STUDENT table.
The relationship between the STUDENT and CLASS tables is an INNER JOIN,
a.k.a equijoin — that is, values in the CLASS_ID column on both tables must be
equal.
5
Express INNER JOIN in RA
δA.class_id = B.class_id(A x B)
SELECT *
FROM student A, class B
WHERE A.class_id = B.class_id
6
Non-INNER JOIN
A non-INNER JOIN is a join condition containing
something other than an equality operator.
SELECT A.student_id, B.grade
FROM student A, ranking B
WHERE A.marks >= B.MIN
AND A.marks < B.MAX
7
Express Non-INNER JOIN with Relational Algebra?
δA.mark >= B.min AND A.mark < B.max(A x B)
SELECT A.student_id, B.grade
FROM student A, ranking B
WHERE A.marks >= B.MIN
AND A.mark < B.MAX
8
Outer Joins
If a row does not satisfy a join condition, the row will not appear in the query
result.
The missing rows can be returned if an outer join operator is used in the join
condition.
CLASS STUDENT
CLASS_ID CLASS_NAME CLASS_ID ID NAME
106 Lop 106 106 1A
107 Lop 107 106 2B
201 Lop 201 107 3C
202 Lop 202 107 4D
5E
6F
7G
8H
9
Left Outer Join
List CLASS_NAME of all the class, regardless of whether or not
they have any student or not when joining CLASS and
STUDENT STUDENT
CLASS_ID ID NAME
106 1A
CLASS
106 2B
CLASS_ID CLASS_NAME
106 Lop 106
107 3C
107 Lop 107 107 4D
201 Lop 201 5E
202 Lop 202 6F
7G
8H
SELECT A.class_name
FROM class A LEFT OUTER JOIN student B
10 ON A.class_id = B.class_id
Output of LEFT OUTER JOIN
CLASS STUDENT
CLASS_ID CLASS_NAME CLASS_ID ID NAME
106 Lop 106 106 1A
106 Lop 106 106 2B
107 Lop 107 107 3C
107 Lop 107 107 4D
201 Lop 201
202 Lop 202
The red box is on the Left
11
Right Outer Join
List NAME of all the student, regardless of whether or not they
have any class or not when joining CLASS and STUDENT
STUDENT
CLASS_ID ID NAME
106 1A
CLASS 106 2B
CLASS_ID CLASS_NAME 107 3C
106 Lop 106
107 4D
107 Lop 107
5E
201 Lop 201
202 Lop 202 6F
7G
8H
SELECT B.name
FROM class A RIGHT OUTER JOIN student B
12
ON A.class_id = B.class_id
Output of RIGHT OUTER JOIN
CLASS STUDENT
CLASS_IDCLASS_NAME CLASS_ID ID NAME
106 Lop 106 106 1 A
106 Lop 106 106 2 B
107 Lop 107 107 3 C
107 Lop 107 107 4 D
5 E
6 F
7 G
8 H
The red box is on the Right
13
Full Outer Join
List NAME of all the student and ALL classes, regardless
of whether or not they have any class or not when
joining CLASS and STUDENT
STUDENT
CLASS
CLASS_ID ID NAME
CLASS_ID CLASS_NAME
106 1A
106 Lop 106
106 2B
107 Lop 107
201 Lop 201 107 3C
202 Lop 202 107 4D
5E
6F
7G
8H
SELECT B.name
FROM class A FULL OUTER JOIN student B
14
ON A.class_id = B.class_id
Output of FULL OUTER JOIN
CLASS STUDENT
CLASS_IDCLASS_NAME CLASS_ID ID NAME
106 Lop 106 106 1 A
106 Lop 106 106 2 B
107 Lop 107 107 3 C
107 Lop 107 107 4 D
5 E
6 F
7 G
8 H
201 Lop 201
202 Lop 202
LEFT OUTER JOIN RIGHT OUTER JOIN
15
Why it’s called inner/outter?
❖ SELECT A.class_name, B.name What is the
FROM class A, student B Domain Values of
WHERE A.class_id = B.class_id B.class_id?
What is the
Domain
Values of
A.class_id?
16
Self Joins
Sometimes you need to join a
table to itself.
SELECT A.employee_id + ‘ MANAGED BY ‘ +
B.employee_id
FROM employee A, employee B
WHERE A.Manager_id = B.Employee_ID
17
Cartesian Products
A Cartesian product is formed when the join condition is
omitted → All rows in the first table are joined to all rows
in the second table
To avoid a Cartesian product, always include a valid join
condition in a WHERE clause.
18
Disambiguating attributes
Sometimes we ask a query involving several relations, and
among these relations are two or more attributes with the
same name. If so, we need a way to indicate which of these
attributes is meant by a use of their shared name.
SQL solves this problem by allowing us to place a relation
name and a dot in front of an attribute, thus R.A refers to
the attribute A of relation R
19
Disambiguating attributes example
Two relations:
MovieStar (name, address, gender, birthdate)
MovieExec (name, address, cert#, netWorth)
Both relations have attributes “name” and “address”. Look at the
following query to see the way of dis-ambiguating:
SELECT MovieStar.Name, MovieExec.Name
FROM MovieStar, MovieExec
WHERE MovieStar.Address = MovieExec.Address
20
SET operators
21
UNION & UNION ALL
The UNION operator eliminates any duplicated rows.
But UNION ALL still returns duplicated rows
22
INTERSECT
{A ∩ B} = {a | a is in A and B}
23
Difference
SELECT employee_id, job_id FROM employees
EXCEPT
SELECT employee_id, job_id FROM job_history
SQL Server
Oracle
24
FULL RELATION OPERATIONS
Reference: Section 6.4 Jeffrey D. Ullman, Jennifer Widom: A First Course in Database
Systems, Pearson, 3rd Edition (2007)
SELECT distinct - Eliminating duplicates
When we do projection or select data from a bag-relation,
the output may have duplicate tuples
If we do not want duplicates in the result, then we may
follow the key-word SELECT by the key-word DISTINCT
26
SELECT distinct - Eliminating duplicates
27
Duplicates in Set operators
Unlike the SELECT statement, which preserves duplicates as
a default and only eliminates them when instructed to by the
DISTINCT keyword, the set operators (union, intersection,
difference) normally eliminate duplicates
In order to prevent the elimination of duplicates, we must
follow the operator UNION, INTERSECT or EXCEPT by
the keyword ALL
28
Duplicates in Set operators
The UNION, INTERSECT, EXCEPT operator eliminates any
duplicated rows.
But UNION ALL, INTERSECT ALL, EXCEPT ALL still
returns duplicated rows
29
Grouping and Aggregation Operators
STUDENT
CLASS_ID ID NAME
106 1A
106 2B
107 3C
107 4D
5E
6F
7G
8H
Group functions (or Aggregate Functions) operate on sets of rows
to give one result per group. These sets may be the whole table or
the table split into groups.
Eg: If we want all rows with the same CLASS_ID value will be
groupped: GROUP BY class_id
30
Aggregate Operators
AVG
COUNT
MAX
MIN
SUM
STDDEV
VARIANCE
❖ DISTINCT makes the function consider only nonduplicate values;
❖ ALL makes it consider every value including duplicates.
❖ The default is ALL and therefore does not need to be specified.
31 ❖ All group functions ignore NULL values.
Example: AVG, SUM, MIN, MAX, COUNT
32
Example: Aggregate Functions and DISTINCT
33
Grouping
Until now, all group functions (demonstrated in above
examples) have treated the table as one large group of
information.
At times, you need to divide the table of information into
smaller groups. This can be done by using the GROUP BY
clause
The keyword GROUP BY is followed by a list of grouping
attributes.
34
Grouping
Syntax: Divide rows in a table into smaller groups
by using the GROUP BY clause.
SELECT column, group_function (column)
FROM table
[WHERE conditions]
[GROUP BY group_by_expression]
[ORDER BY {column [ASC | DESC] ,…} ]
group_by_expression: specifies columns whose
values determine the basis for grouping rows
35
Example: Aggregate Functions and GROUP BY
36
Example: Aggregate Functions and GROUP BY
37
Example: Aggregate Functions and GROUP BY
38
Grouping: rules to remember
SELECT column, group_function (column)
FROM table
[WHERE conditions]
[GROUP BY group_by_expression]
[ORDER BY {column [ASC | DESC] ,…} ]
When aggregate functions are used in a select list,
the select list can contain only:
Aggregate functions.
Grouping columns from a GROUP BY clause.
An expression that returns the same value for every
row in the result set of each group, such as a constant
39
Grouping, Aggregation, and NULLs
When tuples have nulls, there are a few rules we
must remember:
The value NULL is ignored in any aggregation: it
does not contribute to a SUM, AVG or COUNT of an
attribute, nor can it be the minimum or maximum in its
column.
Exp: COUNT(*) is always a count of the number of tuples
in a relation; but COUNT(A) is the number of tuples with
non-NULL values for attribute A
On the other hand, NULL is treated as an
ordinary value when forming groups: that is, we can
have a group in which one or more of the grouping
attributes are assigned the value NULL
When we perform any aggregation except COUNT over
an empty bag of values, the result is NULL (the COUNT
of an empty bag is 0)
40
HAVING clauses
Sometimes, we want to exclude some groups from displaying result.
The solution is using HAVING clause
41
HAVING clauses
Syntax:
SELECT column, group_function (column)
FROM table
[WHERE conditions]
[GROUP BY group_by_expression]
[HAVING conditions]
[ORDER BY {column [ASC | DESC] ,…} ]
❖The WHERE clause is used to restrict the rows that
you select
❖But the HAVING clause is used to restrict groups.
42
HAVING clauses
SELECT column, group_function (column)
FROM table
[WHERE conditions]
[GROUP BY group_by_expression]
[HAVING conditions]
[ORDER BY {column [ASC | DESC] ,…} ]
Groups are formed and group functions are calculated
before the HAVING clause is applied to the groups.
In Oracle (not SQL Server), the HAVING clause can
precede the GROUP BY clause, but it is recommended
that you place the GROUP BY clause first because it is
more logical.
43
HAVING clauses
1
Select ROWS (with WHERE clause) first
2
ROWS are grouped (GROUP BY clause)
3 Groups matching the HAVING clause are
displayed
44
Example: HAVING
The example displays department numbers and
maximum salaries for those departments whose
maximum salary is greater than $10,000.
45
Example: HAVING
The example displays the job ID and total monthly salary
for each job with a total payroll exceeding $13,000.
46
Example: HAVING
Group functions can be nested to a depth of two. The
example displays the maximum average salary.
47