SQL 1
SQL 1
• Database
– is collection of related data and its metadata organized in a structured
format
– for optimized information management
• Database System
– is an integrated system of hardware, software, people, procedures,
and data
– that define and regulate the collection, storage, management, and use
of data within a database environment
1
Database Management System
- manages interaction between end users and database
2
Database System Environment
▪ Hardware
▪ Software
- OS
- DBMS
- Applications
▪ People
▪ Procedures
▪ Data
3
Database: Data Models
• Importance
– Abstraction of complex real-word data structures in
relative simple (graphical) representations
– Facilitate interaction among the designer, the applications
programmer, and the end user
5
Relational Database: Example
■ Provides a logical “human-level” view of the data and associations
among groups of data (i.e., tables)
6
Relational Database: Pros & Cons
• Advantages
– Structural independence
• Separation of database design and physical data storage/access
• Easier database design, implementation, management, and use
– Ad hoc query capability with Structured Query Language (SQL)
• SQL translates user queries to codes
• Disadvantages
– Substantial hardware and system software overhead
• more complex system
– Poor design and implementation is made easy
• ease-of-use allows careless use of RDBMS
7
Entity Relationship Model
• Peter Chen’s Landmark Paper in 1976
– “The Relationship Model: Toward a Unified View of Data”
– Graphical representation of entities and their relationships
8
Entity Relationship Model
• Entity Relationship (ER) Model
9
E-R Diagram: Chen Model
• Entity
– represented by a rectangle
with its name in capital
letters.
• Relationships
– represented by an active or
passive verb inside the
diamond that connects the
related entities.
• Connectivities
– i.e., types of relationship
– written next to each entity
box.
Database Systems: Design, Implementation, & Management: Rob & Coronel
10
E-R Diagram: Crow’s Foot Model
• Entity
– represented by a rectangle
with its name in capital
letters.
• Relationships
– represented by an active or
passive verb that connects
the related entities.
• Connectivities
– indicated by symbols next to
entities.
• 2 vertical lines for 1
• “crow’s foot” for M
11
Crow’s Foot Model Symbols
12
E-R Model: Pros & Cons
• Advantages
• Disadvantages
– Incomplete model on its own
• Limited representational power
– cannot model data constraints not tied to entity relationships
» e.g. attribute constraints
– cannot represent relationships between attributes within
entities
• No data manipulation language (e.g. SQL)
– Loss of information content
• Hard to include attributes in ERD 13
Web Database
• Internet is emerging as a prime business tool
– Shift away from models (e.g. relational)
– Emphasis on interfacing with the Internet
14
15
SQL Introduction
Standard language for querying and manipulating data
16
SQL
• Data Definition Language (DDL)
– Create/alter/delete tables and their attributes
• Data Manipulation Language (DML)
– Query one or more tables
– Insert/delete/modify rows/tuples in tables
• Transact-SQL
– Idea: package a sequence of SQL statements server
17
Data in SQL
1. Atomic types, a.k.a. data types
2. Tables built from atomic types
18
Data Types in SQL
• Characters:
– CHAR(20) -- fixed length
– VARCHAR(40)-- variable length
• Numbers:
– BIGINT, INT, SMALLINT, TINYINT
– REAL, FLOAT -- differ in precision
– MONEY
• Times and dates:
– DATE
– DATETIME -- SQL Server
• Others... All are simple
19
Table name Attribute names
Tables in SQL
Product
Tuples or rows
20
Tables Explained
• A tuple = a record
– Restriction: all attributes are of atomic type
• A table = a set of tuples
– Like a list…
– …but it is unordered: no first(), no next(), no last().
21
Tables Explained
• The schema of a table is the table name and
its attributes:
Product(PName, Price, Category, Manfacturer)
22
SQL Query
SELECT attributes
FROM relations (possibly multiple)
WHERE conditions (selections)
23
Simple SQL Query
SELECT *
FROM Product
WHERE category=‘Gadgets’
24
Simple SQL Query
25
A Notation for SQL Queries
Input Schema
Answer(PName, Price,
Manfacturer)
Output Schema
26
Selections
What goes in the WHERE clause:
• x = y, x < y, x <= y, etc
– For number, they have the usual meanings
– For CHAR and VARCHAR: lexicographic ordering
• Expected conversion between CHAR and VARCHAR
– For dates and times, what you expect...
• Pattern matching on strings...
27
The LIKE operator
• s LIKE p: pattern matching on strings
• p may contain two special symbols:
▪ % = any sequence of characters
▪ _ = any single character
SELECT *
FROM Products
WHERE PName LIKE
‘%gizmo%’
28
Eliminating Duplicates
PName Price Category Manufacturer
Produc
Gizmo $19.99 Gadgets GizmoWorks
t
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Category
Gadgets
SELECT category
Gadgets
FROM Product
Photography
Household
Compare to:
Category
SELECT DISTINCT category Gadgets
FROM Product
Photography
Household 29
Ordering the Results
Ordering is ascending, unless you specify the DESC keyword (at the end).
Ties are broken by the second attribute on the ORDER BY list, etc.
30
Ordering the Results
SELECT category
FROM Product
ORDER BY pname
?
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
31
Ordering the Results
Category
SELECT DISTINCT category Gadgets
FROM Product
Household
ORDER BY category
Photography
Compare to:
SELECT category
FROM Product
ORDER BY pname
?
32
Joins in SQL
• Connect two or more tables:
34
Joins in SQL
Produc
Compan
t
y
PName Price Category Manufacturer Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi
PName Price
SingleTouch $149.99
35
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
SELECT country
FROM Product, Company
WHERE manufacturer=cname AND category=‘Gadgets’
36
Joins in SQL
Produc
Compan
t
y
Name Price Category Manufacturer Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi
SELECT country
FROM Product, Company
WHERE manufacturer=cname AND
category=‘Gadgets’
Country
What is
??
the problem
??
?
What’s the
solution ? 37
Joins
Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)
38
How to know two tables related?
• You guess they are
• I tell you so
• Foreign keys are a method for schema designers to
tell you so
– A foreign key states that a column is a reference to the key
of another table
ex: Product.manufacturer is foreign key of Company
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
39
Disambiguating Attributes
• Sometimes two relations have the same attr:
Person(pname, address, worksfor)
Company(cname, address)
Which
SELECT DISTINCT pname, address address ?
FROM Person, Company
WHERE worksfor = cname
Find all stores that sold at least one product that the store
‘BestBuy’ also sold:
Answer (store)
41
Meaning (Semantics) of SQL
Queries
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions
1. Nested loops:
Answer = {}
for x1 in R1 do
for x2 in R2 do
…..
for xn in Rn do
if Conditions
then Answer = Answer ∪ {(a1,…,ak)}
return Answer
42
Example table creation
Employee
Emp_Name Dept_no Gender Age salary
● When you insert a record and you have values for all attributes, there is no need to
specify the attributes names.
– Insert into employee
Values (‘Suzy Alan’, 10, ‘F’, 1200);
The main condition in this case, that both tables have the same attributes
and ordered in the same order
Delete
● Delete certain rows (depending on a
condition)
– Delete from employee where age<30;
• Truncate Syntax:
Truncate Table TableName, eg. : Truncate table employee
– Quicker way for deleting all the rows from a table
– It releases the space used by the table
Drop Syntax:
Drop TABLE TableName, eg. : Drop table employee
– Remove the table completely from the database
Relational Database Schema
• Examples:
U4A: DELETE FROM EMPLOYEE
WHERE LNAME='Brown’
• In this request, the modified SALARY value depends on the original SALARY
value in each tuple
• The reference to the SALARY attribute on the right of = refers to the old SALARY
value before modification
• The reference to the SALARY attribute on the left of = refers to the new SALARY
value after modification
AGGREGATE FUNCTIONS
• Include COUNT, SUM, MAX, MIN, and AVG
• Query 15: Find the maximum salary, the minimum salary, and
the average salary among all employees.
Slide 8-54
AGGREGATE FUNCTIONS (cont.)
• Query 16: Find the maximum salary, the minimum salary,
and the average salary among employees who work for
the 'Research' department.
Slide 8-55
AGGREGATE FUNCTIONS (cont.)
• Queries 17 and 18: Retrieve the total number of employees
in the company (Q17), and the number of employees in the
'Research' department (Q18).
Slide 8-56
GROUPING
• In many cases, we want to apply the aggregate
functions to subgroups of tuples in a relation
• Each subgroup of tuples consists of the set of tuples
that have the same value for the grouping
attribute(s)
• The function is applied to each subgroup
independently
• SQL has a GROUP BY-clause for specifying the
grouping attributes, which must also appear in the
SELECT-clause
Slide 8-57
GROUPING (cont.)
• Query 20: For each department, retrieve the department number, the
number of employees in the department, and their average salary.
Slide 8-58
GROUPING (cont.)
• Query 21: For each project, retrieve the project number, project
name, and the number of employees who work on that project.
– In this case, the grouping and functions are applied after the joining of
the two relations
Slide 8-59
THE HAVING-CLAUSE
• Sometimes we want to retrieve the values of
these functions for only those groups that
satisfy certain conditions
• The HAVING-clause is used for specifying a
selection condition on groups (rather than on
individual tuples)
Slide 8-60
THE HAVING-CLAUSE (cont.)
• Query 22: For each project on which more than two
employees work , retrieve the project number, project
name, and the number of employees who work on
that project.
Slide 8-61
Why Transactions?
• Database systems are normally being accessed
by many users or processes at the same time.
– For queries and modifications.
• a DBMS needs to keep processes from
troublesome interactions.
62
Example: Bad Interaction
63
Example: Bad Interaction
• You and your friend has a joint account. Both
trying to withdraw $100 from different ATM’s
at about the same time.
– The DBMS better make sure one account
deduction doesn’t get lost.
64
Transactions
• Transaction = process involving database
queries and/or modification.
• Normally with some strong properties
regarding concurrency.
• SQL – auto commit after every statement or
explicit programmer control.
– You can control using:
• SET AUTOCOMMIT ON/OFF
65
ACID Transactions
• ACID transactions are:
– Atomic : Whole transaction or none is done.
– Consistent : Database constraints preserved.
– Isolated : It appears to the user as if only one process
executes at a time.
– Durable : Effects of a process survive a crash.
66
COMMIT
67
ROLLBACK
68
Implementation Options
• SQLite
• Real Database (MySQL, Oracle, DB2, etc.)
69
SQLite mySQL
1. SQLite is a server-less database and is self-contained. 1. MySQL requires a server to run. MySQL will require a
This is also referred to as an embedded database which client and server architecture to interact over a network.
means the DB engine runs as a part of the app.
3. The SQLite library is about 250 KB in size. 3. MySQL server is about 600 MB.
The SQLite directly stores info in a single file, making it
easy to copy. Before Export additional steps needed.
4. Multiple Access and Scalability 4. MySQL has a well-constructed user management system
SQLite does not have any specific user management which can handle multiple users and grant various levels of
functionality and hence is not suitable for multiple user permission.
access.
5. On the contrary, MySQL is easily scalable and can handle
5. SQLite is suitable for smaller databases . Performance a bigger database with less effort.
optimization is harder when using SQLite.
6. Security and Ease of Setup 6. MySQL comes with a lot of inbuilt security features. This
SQLite does not have an inbuilt authentication mechanism. includes authentication with a username, password,
and SSH.
SQLite is easy to set up.
MySQL requires more setup needed.
70