[go: up one dir, main page]

0% found this document useful (0 votes)
18 views70 pages

SQL 1

The document provides an overview of databases, including definitions of databases, database management systems (DBMS), and database systems. It discusses data models, particularly the relational database model, and introduces the Entity Relationship (ER) model for conceptual data modeling. Additionally, it covers SQL, its components, and how to perform queries, including selections, joins, and handling attributes.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views70 pages

SQL 1

The document provides an overview of databases, including definitions of databases, database management systems (DBMS), and database systems. It discusses data models, particularly the relational database model, and introduces the Entity Relationship (ER) model for conceptual data modeling. Additionally, it covers SQL, its components, and how to perform queries, including selections, joins, and handling attributes.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 70

Database: What

• Database
– is collection of related data and its metadata organized in a structured
format
– for optimized information management

• Database Management System (DBMS)


– is a software that enables easy creation, access, and modification of
databases
– for efficient and effective database management

• Database System
– is an integrated system of hardware, software, people, procedures,
and data
– that define and regulate the collection, storage, management, and use
of data within a database environment
1
Database Management System
- manages interaction between end users and database

Database Systems: Design, Implementation, & Management: Rob & Coronel

2
Database System Environment

▪ Hardware
▪ Software
- OS
- DBMS
- Applications
▪ People
▪ Procedures
▪ Data

Database Systems: Design, Implementation, & Management: Rob & Coronel

3
Database: Data Models
• Importance
– Abstraction of complex real-word data structures in
relative simple (graphical) representations
– Facilitate interaction among the designer, the applications
programmer, and the end user

• Basic Building Blocks


– Entity
• thing about which data are to be collected and stored
– Attribute
• a characteristic of an entity
– Relationship
• describes an association among entities
– Constraint
• restrictions placed on the data 4
Relational Database
– E. F. Codd’s Relational Model proposal
• Separated the notion of physical representation (machine-view)
from logical representation (human-view)

– Relational Database Model


• Dominant database model of today
• Eliminated pointers and used tables to represent data
• Tables
– flexible logical structure for data representation
– a series of row/column intersections
– related by sharing common entity characteristic(s)

5
Relational Database: Example
■ Provides a logical “human-level” view of the data and associations
among groups of data (i.e., tables)

6
Relational Database: Pros & Cons
• Advantages
– Structural independence
• Separation of database design and physical data storage/access
• Easier database design, implementation, management, and use
– Ad hoc query capability with Structured Query Language (SQL)
• SQL translates user queries to codes

• Disadvantages
– Substantial hardware and system software overhead
• more complex system
– Poor design and implementation is made easy
• ease-of-use allows careless use of RDBMS

7
Entity Relationship Model
• Peter Chen’s Landmark Paper in 1976
– “The Relationship Model: Toward a Unified View of Data”
– Graphical representation of entities and their relationships

• Entity Relationship (ER) Model

– Based on Entity, Attributes & Relationships


• Entity is a thing about which data are to be collected and stored
– e.g. EMPLOYEE
• Attributes are characteristics of the entity
– e.g. SSN, last name, first name
• Relationships describe an associations between entities
– i.e. 1:M, M:N, 1:1

8
Entity Relationship Model
• Entity Relationship (ER) Model

– Complements the relational data model concepts


• Helps to visualize structure and content of data groups
– entity is mapped to a relational table
• Tool for conceptual data modeling (higher level representation)

– Represented in an Entity Relationship Diagram (ERD)


• Formalizes a way to describe relationships between groups of
data

9
E-R Diagram: Chen Model
• Entity
– represented by a rectangle
with its name in capital
letters.

• Relationships
– represented by an active or
passive verb inside the
diamond that connects the
related entities.

• Connectivities
– i.e., types of relationship
– written next to each entity
box.
Database Systems: Design, Implementation, & Management: Rob & Coronel

10
E-R Diagram: Crow’s Foot Model
• Entity
– represented by a rectangle
with its name in capital
letters.

• Relationships
– represented by an active or
passive verb that connects
the related entities.

• Connectivities
– indicated by symbols next to
entities.
• 2 vertical lines for 1
• “crow’s foot” for M
11
Crow’s Foot Model Symbols

12
E-R Model: Pros & Cons
• Advantages

– Exceptional conceptual simplicity


• easily viewed and understood representation of database
• facilitates database design and management
– Integration with the relational database model
• enables better database design via conceptual modeling

• Disadvantages
– Incomplete model on its own
• Limited representational power
– cannot model data constraints not tied to entity relationships
» e.g. attribute constraints
– cannot represent relationships between attributes within
entities
• No data manipulation language (e.g. SQL)
– Loss of information content
• Hard to include attributes in ERD 13
Web Database
• Internet is emerging as a prime business tool
– Shift away from models (e.g. relational)
– Emphasis on interfacing with the Internet

• Characteristics of “Internet age” databases


– Flexible, efficient, and secure Internet access
– Support for complex data types & relationships
– Seamless interfaces with multiple data sources and structures
– Ease of use for end-user, database architect, and database
administrator
• Simplicity of conceptual database model
• Many database design, implementation, and application
development tools
• Powerful DBMS GUI

14
15
SQL Introduction
Standard language for querying and manipulating data

Structured Query Language

Many standards out there:


• ANSI SQL
• SQL92 (a.k.a. SQL2)
• SQL99 (a.k.a. SQL3)
• :
• SQL:2016 row pattern matching, JSON
• SQL:2019 multidimensional arrays
• Vendors support various subsets of these
• What we discuss is common to all of them

16
SQL
• Data Definition Language (DDL)
– Create/alter/delete tables and their attributes
• Data Manipulation Language (DML)
– Query one or more tables
– Insert/delete/modify rows/tuples in tables
• Transact-SQL
– Idea: package a sequence of SQL statements server

17
Data in SQL
1. Atomic types, a.k.a. data types
2. Tables built from atomic types

18
Data Types in SQL
• Characters:
– CHAR(20) -- fixed length
– VARCHAR(40)-- variable length
• Numbers:
– BIGINT, INT, SMALLINT, TINYINT
– REAL, FLOAT -- differ in precision
– MONEY
• Times and dates:
– DATE
– DATETIME -- SQL Server
• Others... All are simple
19
Table name Attribute names
Tables in SQL
Product

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

MultiTouch $203.99 Household Hitachi

Tuples or rows
20
Tables Explained
• A tuple = a record
– Restriction: all attributes are of atomic type
• A table = a set of tuples
– Like a list…
– …but it is unordered: no first(), no next(), no last().

21
Tables Explained
• The schema of a table is the table name and
its attributes:
Product(PName, Price, Category, Manfacturer)

• A key is an attribute whose values are


unique;
we underline a key
Product(PName, Price, Category, Manfacturer)

22
SQL Query

Basic form: (plus many many more bells and whistles)

SELECT attributes
FROM relations (possibly multiple)
WHERE conditions (selections)

23
Simple SQL Query

Produc PName Price Category Manufacturer


t Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

SELECT *
FROM Product
WHERE category=‘Gadgets’

PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks

“selection” Powergizmo $29.99 Gadgets GizmoWorks

24
Simple SQL Query

Produc PName Price Category Manufacturer


t Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

SELECT PName, Price, Manufacturer


FROM Product
WHERE Price > 100

PName Price Manufacturer


“selection” and SingleTouch $149.99 Canon
“projection” MultiTouch $203.99 Hitachi

25
A Notation for SQL Queries
Input Schema

Product(PName, Price, Category,


Manfacturer)
SELECT PName, Price, Manufacturer
FROM Product
WHERE Price > 100

Answer(PName, Price,
Manfacturer)

Output Schema
26
Selections
What goes in the WHERE clause:
• x = y, x < y, x <= y, etc
– For number, they have the usual meanings
– For CHAR and VARCHAR: lexicographic ordering
• Expected conversion between CHAR and VARCHAR
– For dates and times, what you expect...
• Pattern matching on strings...

27
The LIKE operator
• s LIKE p: pattern matching on strings
• p may contain two special symbols:
▪ % = any sequence of characters
▪ _ = any single character

Product(PName, Price, Category, Manufacturer)


Find all products whose name mentions ‘gizmo’:

SELECT *
FROM Products
WHERE PName LIKE
‘%gizmo%’
28
Eliminating Duplicates
PName Price Category Manufacturer
Produc
Gizmo $19.99 Gadgets GizmoWorks
t
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

Category
Gadgets
SELECT category
Gadgets
FROM Product
Photography
Household
Compare to:

Category
SELECT DISTINCT category Gadgets
FROM Product
Photography
Household 29
Ordering the Results

SELECT pname, price, manufacturer


FROM Product
WHERE category=‘gizmo’ AND price > 50
ORDER BY price, pname

Ordering is ascending, unless you specify the DESC keyword (at the end).

Ties are broken by the second attribute on the ORDER BY list, etc.

30
Ordering the Results

SELECT category
FROM Product
ORDER BY pname

PName Price Category Manufacturer

?
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

31
Ordering the Results
Category
SELECT DISTINCT category Gadgets
FROM Product
Household
ORDER BY category
Photography

Compare to:

SELECT category
FROM Product
ORDER BY pname
?
32
Joins in SQL
• Connect two or more tables:

Produc PName Price Category Manufacturer


t Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

Compan Cname StockPrice Country


y
What is GizmoWorks 25 USA
the
connection Canon 65 Japan
between
them ? Hitachi 15 Japan 33
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)

Find all products under $200 manufactured in Japan;


return their names and prices.
Join
between Product
and Company
SELECT pname, price
FROM Product, Company
WHERE manufacturer=cname AND country=‘Japan’
AND price <= 200

34
Joins in SQL
Produc
Compan
t
y
PName Price Category Manufacturer Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi

SELECT pname, price


FROM Product, Company
WHERE manufacturer=cname AND country=‘Japan’
AND price <= 200

PName Price
SingleTouch $149.99

35
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)

Find all countries that manufacture some product in the ‘Gadgets’


category.

SELECT country
FROM Product, Company
WHERE manufacturer=cname AND category=‘Gadgets’

36
Joins in SQL
Produc
Compan
t
y
Name Price Category Manufacturer Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi

SELECT country
FROM Product, Company
WHERE manufacturer=cname AND
category=‘Gadgets’
Country
What is
??
the problem
??
?
What’s the
solution ? 37
Joins
Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)

Find names of people living in Seattle that bought some product in


the ‘Gadgets’ category, and the names of the stores they bought
such product from

SELECT DISTINCT persname, store


FROM Person, Purchase, Product
WHERE persname=buyer AND product = pname AND
city=‘Seattle’ AND category=‘Gadgets’

38
How to know two tables related?
• You guess they are
• I tell you so
• Foreign keys are a method for schema designers to
tell you so
– A foreign key states that a column is a reference to the key
of another table
ex: Product.manufacturer is foreign key of Company
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)

– Gives information and enforces constraint

39
Disambiguating Attributes
• Sometimes two relations have the same attr:
Person(pname, address, worksfor)
Company(cname, address)
Which
SELECT DISTINCT pname, address address ?
FROM Person, Company
WHERE worksfor = cname

SELECT DISTINCT Person.pname, Company.address


FROM Person, Company
WHERE Person.worksfor = Company.cname
40
Tuple Variables
Purchase (buyer, seller, store,
product)

Find all stores that sold at least one product that the store
‘BestBuy’ also sold:

SELECT DISTINCT x.store


FROM Purchase AS x, Purchase AS y
WHERE (x.product = y.product) AND (y.store = ‘BestBuy’ )
AND (x.store != ‘BestBuy’)

Answer (store)
41
Meaning (Semantics) of SQL
Queries
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions

1. Nested loops:
Answer = {}
for x1 in R1 do
for x2 in R2 do
…..
for xn in Rn do
if Conditions
then Answer = Answer ∪ {(a1,…,ak)}
return Answer

42
Example table creation
Employee
Emp_Name Dept_no Gender Age salary

Sara John 2 M 27 1000


Sally Wood 2 F 27 2600
John Smith 1 M 32 5000
Mary Smith 10 F 42 1550

CREATE TABLE Employee ( ❖Varchar2 treats NULL and empty


Emp_Name VARCHAR2(12), string same.
Dept_no numeric(2), ❖SQL is case insensitive
Gender CHAR(1),
Age Numeric(3),
Salary numeric(8,2));

● After creating the table, you can view it using the


command: desc tableName
ALTER TABLE
• ALTER TABLE table_name
ADD column_name datatype
– Adds a column to the table
Ex : Alter table employee add address varchar2(40);
• ALTER TABLE table_name
DROP COLUMN column_name
– Removes a column (and all its data) from the table
Ex : Alter table employee drop column address;
• ALTER TABLE table_name
MODIFY (column_name newType/length)
Ex : Alter table employee modify gender varchar2(5);
INSERT INTO (DML)
• Adds data to a table
• Syntax:
INSERT INTO table_name (column, …, column)
VALUES (value, …, value);

• The columns are the names of columns you are


putting data into, and the values are that data
• String data must be enclosed in single quotes
• Numbers are not quoted
• You can omit the column names if you supply a value
for every column
• Important Note: Only the constraints specified in the
DDL commands are automatically enforced by the
DBMS when updates are applied to the database
INSERT INTO (Cont.)
● Inserting into a table
– Insert into employee (emp_Name, Dept_no, gender, salary)
Values (‘Sara johns’, 1, ‘F’, 1440);

● When you insert a record and you have values for all attributes, there is no need to
specify the attributes names.
– Insert into employee
Values (‘Suzy Alan’, 10, ‘F’, 1200);

● Inserting from another table


– INSERT INTO emp_senior
select * from employee where age > 60;

The main condition in this case, that both tables have the same attributes
and ordered in the same order
Delete
● Delete certain rows (depending on a
condition)
– Delete from employee where age<30;

● Delete all rows


– Delete from employee;
DELETE (cont.)

• Truncate Syntax:
Truncate Table TableName, eg. : Truncate table employee
– Quicker way for deleting all the rows from a table
– It releases the space used by the table
Drop Syntax:
Drop TABLE TableName, eg. : Drop table employee
– Remove the table completely from the database
Relational Database Schema
• Examples:
U4A: DELETE FROM EMPLOYEE
WHERE LNAME='Brown’

U4B: DELETE FROM EMPLOYEE


WHERE SSN='123456789’

U4C: DELETE FROM EMPLOYEE


WHERE DNO IN (SELECT
DNUMBER
FROM DEPARTMENT
WHERE DNAME='Research')

U4D: DELETE FROM EMPLOYEE


UPDATE
• Used to modify attribute values of one or more
selected tuples
• A WHERE-clause selects the tuples to be modified
• An additional SET-clause specifies the attributes to
be modified and their new values
• Each command modifies tuples in the same relation
(Can not operate on multiple tables.
Solution: Use transaction)
• Referential integrity should be enforced
UPDATE (cont.)
• Example: Change the location (plocation) and controlling
department number(dnum) of project number 10 to
'Bellaire' and 5, respectively.

U5: UPDATE PROJECT


SET PLOCATION = 'Bellaire', DNUM = 5
WHERE PNUMBER=10
UPDATE (cont.)
• Example: Give all employees in the 'Research' department a 10% raise
in salary.

U6: UPDATE EMPLOYEE


SET SALARY = SALARY *1.1
WHERE DNO IN (SELECT DNUMBER
FROM DEPARTMENT
WHERE DNAME='Research')

• In this request, the modified SALARY value depends on the original SALARY
value in each tuple
• The reference to the SALARY attribute on the right of = refers to the old SALARY
value before modification
• The reference to the SALARY attribute on the left of = refers to the new SALARY
value after modification
AGGREGATE FUNCTIONS
• Include COUNT, SUM, MAX, MIN, and AVG
• Query 15: Find the maximum salary, the minimum salary, and
the average salary among all employees.

Q15: SELECT MAX(SALARY),


MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE

– Some SQL implementations may not allow more than one


function in the SELECT-clause

Slide 8-54
AGGREGATE FUNCTIONS (cont.)
• Query 16: Find the maximum salary, the minimum salary,
and the average salary among employees who work for
the 'Research' department.

Q16: SELECT MAX(SALARY), MIN(SALARY),


AVG(SALARY)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND
DNAME='Research'

Slide 8-55
AGGREGATE FUNCTIONS (cont.)
• Queries 17 and 18: Retrieve the total number of employees
in the company (Q17), and the number of employees in the
'Research' department (Q18).

Q17: SELECT COUNT (*)


FROM EMPLOYEE

Q18: SELECT COUNT (*)


FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER AND
DNAME='Research’

Slide 8-56
GROUPING
• In many cases, we want to apply the aggregate
functions to subgroups of tuples in a relation
• Each subgroup of tuples consists of the set of tuples
that have the same value for the grouping
attribute(s)
• The function is applied to each subgroup
independently
• SQL has a GROUP BY-clause for specifying the
grouping attributes, which must also appear in the
SELECT-clause

Slide 8-57
GROUPING (cont.)
• Query 20: For each department, retrieve the department number, the
number of employees in the department, and their average salary.

Q20: SELECT DNO, COUNT (*), AVG (SALARY)


FROM EMPLOYEE
GROUP BY DNO

– In Q20, the EMPLOYEE tuples are divided into groups--each group


having the same value for the grouping attribute DNO
– The COUNT and AVG functions are applied to each such group of
tuples separately
– The SELECT-clause includes only the grouping attribute and the
functions to be applied on each group of tuples
– A join condition can be used in conjunction with grouping

Slide 8-58
GROUPING (cont.)
• Query 21: For each project, retrieve the project number, project
name, and the number of employees who work on that project.

Q21: SELECT PNUMBER, PNAME, COUNT (*)


FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME

– In this case, the grouping and functions are applied after the joining of
the two relations

Slide 8-59
THE HAVING-CLAUSE
• Sometimes we want to retrieve the values of
these functions for only those groups that
satisfy certain conditions
• The HAVING-clause is used for specifying a
selection condition on groups (rather than on
individual tuples)

Slide 8-60
THE HAVING-CLAUSE (cont.)
• Query 22: For each project on which more than two
employees work , retrieve the project number, project
name, and the number of employees who work on
that project.

Q22: SELECT PNUMBER, PNAME, COUNT (*)


FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
HAVING COUNT (*) > 2

Slide 8-61
Why Transactions?
• Database systems are normally being accessed
by many users or processes at the same time.
– For queries and modifications.
• a DBMS needs to keep processes from
troublesome interactions.

62
Example: Bad Interaction

63
Example: Bad Interaction
• You and your friend has a joint account. Both
trying to withdraw $100 from different ATM’s
at about the same time.
– The DBMS better make sure one account
deduction doesn’t get lost.

64
Transactions
• Transaction = process involving database
queries and/or modification.
• Normally with some strong properties
regarding concurrency.
• SQL – auto commit after every statement or
explicit programmer control.
– You can control using:
• SET AUTOCOMMIT ON/OFF

65
ACID Transactions
• ACID transactions are:
– Atomic : Whole transaction or none is done.
– Consistent : Database constraints preserved.
– Isolated : It appears to the user as if only one process
executes at a time.
– Durable : Effects of a process survive a crash.

66
COMMIT

• The SQL statement COMMIT causes a


transaction to complete.
– It’s database modifications are now permanent
in the database.

67
ROLLBACK

• The SQL statement ROLLBACK also causes


the transaction to end, but by aborting.
– No effects on the database.
• Failures like division by 0 or a constraint
violation can also cause rollback, even if
the programmer does not request it.

68
Implementation Options
• SQLite
• Real Database (MySQL, Oracle, DB2, etc.)

69
SQLite mySQL
1. SQLite is a server-less database and is self-contained. 1. MySQL requires a server to run. MySQL will require a
This is also referred to as an embedded database which client and server architecture to interact over a network.
means the DB engine runs as a part of the app.

2. Datatypes supported: 2. Datatypes supported:


SQLite supports these datatypes: Blob, Integer, Null, Text, Tinyint, Smallint, Mediumint, Int, Bigint, Double, Float,
Real Real, Decimal, Double precision, Numeric, Timestamp,
Date, Datetime, etc.

3. The SQLite library is about 250 KB in size. 3. MySQL server is about 600 MB.
The SQLite directly stores info in a single file, making it
easy to copy. Before Export additional steps needed.

4. Multiple Access and Scalability 4. MySQL has a well-constructed user management system
SQLite does not have any specific user management which can handle multiple users and grant various levels of
functionality and hence is not suitable for multiple user permission.
access.
5. On the contrary, MySQL is easily scalable and can handle
5. SQLite is suitable for smaller databases . Performance a bigger database with less effort.
optimization is harder when using SQLite.

6. Security and Ease of Setup 6. MySQL comes with a lot of inbuilt security features. This
SQLite does not have an inbuilt authentication mechanism. includes authentication with a username, password,
and SSH.
SQLite is easy to set up.
MySQL requires more setup needed.
70

You might also like