SC4x W2L2 v2
SC4x W2L2 v2
ctl.mit.edu
Motivating questions
• How can we narrow the sets of records that get
returned from a query?
3
WHERE IN
• WHERE attribute IN is used to select rows that are
satisfied by a set of WHERE conditions on the same attribute:
SELECT *
FROM Offices
WHERE State IN ('CO', 'UT', 'TX');
• Equivalent to:
SELECT *
FROM Offices
WHERE State = 'CO' OR State = 'UT'
OR State = 'TX';
OfficeNbr City State Region Target Sales Phone
1 Denver CO West 3000000.00 130000.00 970.586.3341
57 Dallas TX West 0.00 0.00 214.781.5342
4
BETWEEN keyword
• Select records where the attribute value is between two
numbers using BETWEEN
• Range is inclusive and also works with time and date data
SELECT *
FROM SalesReps
WHERE Quota BETWEEN 50000 AND 100000;
RepNbr Name RepOffice Quota Sales
53 Bill Smith 1 100000.00 0.00
89 Jen Jones 2 50000.00 130000.00
SELECT *
FROM SalesReps
WHERE Quota NOT BETWEEN 50000 AND 100000;
RepNbr Name RepOffice Quota Sales
5
NULL values
• Empty or missing values are stored in tables as NULL
6
NULLs
• The following two sets of queries will not return all sales reps:
7
NULLs
• Check for NULLS using IS:
SELECT Name
FROM SalesReps
WHERE Quota IS NULL;
SELECT Name
FROM SalesReps
WHERE Quota IS NOT NULL;
8
Key points from lesson
• WHERE IN statements are used to identify records in a table
with an attribute matching a value from a specified set of
values
9
Grouping Data
and Statistical Functions
10
GROUP BY with COUNT(*)
• Find the number of sales for each customer:
11
GROUP BY with COUNT(*) and HAVING
• Return the number of parts from each vendor:
GROUP BY Vendor A 4
HAVING COUNT(*) > 2;
12
Aggregate Statistical Functions in SQL
• Statistical functions are available in SQL to perform analytics
• Commonly used functions include:
Name Description
AVG() Return the average value of the argument
COUNT() Return a count of the number of rows returned
COUNT(DISTINCT) Return the count of a number of different values
MAX() Return the maximum value
MIN() Return the minimum value
STD() Return the population standard deviation
STDDEV_SAMP() Return the sample standard deviation
SUM() Return the sum
VAR_POP() Return the population standard variance
VAR_SAMP() Return the sample variance
VARIANCE() Return the population standard variance
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html 13
Advanced Statistical Functions in SQL
• More advanced statistical functions can be created using the
basic statistical functions built into SQL
• Calculate the weighted average of student scores from the
table of grades:
SELECT SUM(student_score*score_weight)/
SUM(score_weight)
FROM grades;
14
Key points from lesson
• Built-in statistical functions exist in many implementations of
SQL
15
Sorting and Sampling Data
16
ORDER BY
• Order by one column, ascending, ASC, or descending, DESC
SELECT *
FROM Customers
ORDER BY Country DESC;
SELECT *
FROM Customers
ORDER BY Country ASC, CustomerName DESC;
17
LIMIT the number of returned records
• Example to return 5 people from the Persons table:
SELECT *
FROM Persons
ORDER BY Last_Name ASC
LIMIT 5;
18
Randomly select and order records
• RAND function can generate random numbers for various uses
19
Randomly select and order records
• Generate a random number in the output results:
20
Key points from lesson
• The ORDER BY clause specifies that the results from a query
should be returned in ascending or descending order
21
Creating New Tables
and Aliases
22
AS Keyword (Aliases)
• AS can be used to create an alias for a field
23
CREATE TABLE AS
• Use CREATE TABLE with AS to create a new table in the
database using a select query
24
SELECT INTO
• Take the results of a select statement and put them in an
existing table or database:
SELECT column_name(s)
INTO newtable [IN externaldb]
FROM table1;
• Example:
25
Key points from lesson
• The AS keyword creates an alias for an attribute or result of a
function that is returned in a query
26
Joining Multiple Tables
27
Introduction to JOINs
• Last week, our focus was on taking large datasets and
normalizing them by separating the data into different tables
28
JOIN details
• Joins effectively merge two tables together based on a
relationship between different columns in each table
• Both tables will still contain the normal separated data, and data
can still be added to the individual tables
29
Columns in a JOIN
• Joins may be done on any columns in two tables, as long as the
merge operation makes logical sense
30
JOIN Example
• List all orders, showing order number and amount along with
the name and credit limit of each customer
◼ Orders table has order number and amount, but no
order info
• Cust = CustNbr OrderNbr Cust Prod Qty Amt Disc
1 211 Bulldozer 7 31000.00 0.20
2 522 Riveter 2 4000.00 0.30
3 522 Crane 1 500000.00 0.40
SELECT *
FROM tb1, tb2;
SELECT *
FROM tb1, tb2
WHERE tb1.bg = tb2.bg;
SELECT grey, pink
FROM tb1, tb2
WHERE tb1.bg = tb2.bg;
36
Further notes on JOINs
• Use * carefully in joins – it returns all columns from all tables
being joined
• If a field has the same name in the tables being joined, specify
the table name along with the field name:
◼ table1.fieldname, table2.fieldname
◼ Customers.CustNbr, Orders.Amt
37
Key points from lesson
• The relational database model allows us join multiple tables
to build new and unanticipated relationships
38
Types of JOINs and VIEWs
39
Join with 3 tables
SELECT OrderNbr, Amt, Company, Name
FROM Orders, Customers, SalesReps
WHERE Cust = CustNbr AND CustRep = RepNbr AND
Amt > 25000;
-- (Implicit syntax) OrderNbr Cust Prod Qty Amt Disc
1 211 Bulldozer 7 31000.00 0.20
-- (SQL-92 syntax)
• Advantages:
◼ User queries are simpler on views constructed for them
◼ Security: can restrict access to data in views for users
◼ Independence: user or program are not affected by small
changes in underlying tables
45
VIEWs
CREATE VIEW CustomerOrders AS
SELECT CustNbr, Company, Name, OrderNbr, Prod, Qty, Amt
FROM Customers, SalesReps, Orders
WHERE CustRep = RepNbr AND CustNbr = Cust;
-- (Implicit syntax)
OrderNbr Cust Prod Qty Amt Disc
1 211 Bulldozer 7 31000.00 0.20
2 522 Riveter 2 4000.00 0.30
3 522 Crane 1 500000.00 0.40
48
Course update and closing
• We’ve learned
how to SELECT
data from two or
more tables that
we JOIN together
49