Unit 1
Unit 1
Unit 1
Outline
Introduction
Traditional File based Systems
Database Approach
Roles in Database Environment
History of Database management systems
Advantages and Disadvantages of DBMS’s.
Structure of Relational Databases
Database Schema
Keys
Schema Diagram
Relational Query Languages.
Relational Operations
Introduction
Data
Information
DBMS (Database Management System)
Relational Queries
SQL (Structured Query Language)
Indexing
Hashing
Transactions
Concurrency Control
Recovery Systems
Database Management System
(DBMS)
DBMS contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and efficient to use
Database Applications:
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized
recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax
deductions
Databases can be very large.
Databases touch all aspects of our lives
University Database Example
Application program examples
Add new students, instructors, and courses
Register students for courses, and generate
class rosters
Assign grades to students, compute grade
point averages (GPA) and generate transcripts
In the early days, database applications were
built directly on top of file systems
Traditional File based Systems
File-based systems were an early attempt to
computerize the manual filing system that we are all
familiar with.
We may have divisions in the filing system or separate
folders for different types of item that are in some way
logically related.
The manual filing system works well while the number
of items to be stored is small.
It even works quite adequately when there are large
numbers of items and we have only to store and retrieve
them.
Drawbacks of using file systems to
store data
Atomicity of updates
Failures may leave database in an inconsistent state with
partial updates carried out
Example: Transfer of funds from one account to another
should either complete or not happen at all
Concurrent access by multiple users
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the same
time
Security problems
Hard to provide user access to some, but not all, data
D 1 x D 2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where
The current values (relation instance) of a relation are
each ai Di
specified by a table
An element t of r is a tuple, represented by a row in a
table
For each attribute of a relation, there is a set of
permitted values, called the domain of that attribute.
We require that, for all relations r, the domains of all
attributes of r be atomic.
atomic
A domain is atomic if elements of the domain are
considered to be indivisible units.
Database Schema
Database schema is the logical design of the database,
and the database instance,
instance which is a snapshot of the
data in the database at a given instant in time.
In general, a relation schema consists of a list of
attributes and their corresponding domains.
The schema of a relation does not generally change.
Example Schema
Keys
Super key
An attribute, or set of attributes, that uniquely
identifies a tuple within a relation.
Candidate Key
A super key such that no proper subset is a super key
within the relation.
When a key consists of more than one attribute, we
call it a composite key.
Primary
The candidate key that is selected to identify tuples
uniquely within the relation.
Foreign
An attribute, or set of attributes, within one relation
that matches the candidate key of some (possibly the
same) relation.
Keys
Let K R
K is a superkey of R if values for K are sufficient to identify a
unique tuple of each possible relation r(R)
Example: {ID} and {ID,name} are both superkeys of
instructor.
Superkey K is a candidate key if K is minimal
Example: {ID} is a candidate key for Instructor
One of the candidate keys is selected to be the primary key.
which one?
Foreign key constraint: Value in one relation must appear in
another
Referencing relation
Referenced relation
Example – dept_name in instructor is a foreign key from
instructor referencing department
Schema Diagrams
A database schema, along with primary key and foreign
key dependencies, can be depicted by schema diagrams
Relational Query Languages
A query language is a language in which a user requests
information from the database. These languages are
usually on a level higher than that of a standard
programming language.
Query languages can be categorized as either
procedural or nonprocedural.
In a procedural language, the user instructs the system
to perform a sequence of operations on the database to
compute the desired result.
In a nonprocedural language, the user describes the
desired information without giving a specific procedure
for obtaining that information.
Relational Operations
RELATIONAL ALGEBRA
The relational algebra defines a set of operations on
relations, paralleling the usual algebraic operations
such as addition, subtraction or multiplication, which
operate on numbers.
Select Operation – selection of rows
(tuples)
Relation r
Relation r:
A,C (r)
Union of two relations
Relations r, s:
r s:
Set difference of two relations
Relations r, s:
r – s:
Set intersection of two relations
Relation r, s:
rs
Note: r s = r – (r – s)
joining two relations -- Cartesian-
product
Relations r, s:
r x s:
Cartesian-product – naming issue
Relations r, s: B
r x s: r.B s.B
Notes about Relational Languages
Each Query input is a table (or set of tables)
Each query output is a table.
All data in the output table appears in one of the input
tables
Relational Algebra is not Turning complete
Can we compute:
SUM
AVG
MAX
MIN
Summary of Relational Algebra
Operators
Symbol (Name) Example of Use
σ
σ
(Selection) salary > = 85000 (instructor)
Return rows of the input relation that satisfy the predicate.
Π
Π
(Projection) ID, salary (instructor)
Output specified attributes from all rows of the input relation. Remove
duplicate tuples from the output.
x
(Cartesian Product) instructor x department
Output pairs of rows from the two input relations that have the same value on
all attributes that have the same name.
∪
Π ∪ Π
(Union) name (instructor) name (student)
Output the union of tuples from the two input relations.
-
Π -- Π
(Set Difference) name (instructor) name (student)
Output the set difference of tuples from the two input relations.
⋈
(Natural Join) instructor ⋈ department
Output pairs of rows from the two input relations that have the same value on
all attributes that have the same name.
End of Unit 1