Data Base Management Systems
(DBMS) – (1)
Code: INF311
Lecture 3
In this Lecture
Data Models
Relational Model
Relational Data structure
Relational Data manipulation
Relational Data integrity
Model
A Model is a representation of reality, ’real world’
objects and events and their associations.
A database model is an organizing principle that
specifies particular mechanism of data storage
and retrieval.
The model explains, in terms of services available
to an interfacing application, how to access a
data element when other related elements are
known.
A schema is a description of a particular collection
of data, using the given data model.
Components of Data Models
Structure Part
– Consisting of set of rules according to which
databases can be constructed.
Manipulative Part
– Define the types of operation that are
allowed on the data.
Set of Integrity Rules
– Which ensures that data is accurate.
Purpose of Data Model
To represent data.
To make the data understandable.
Types of Data Models I
Object Based Data Models
– use concepts such as entities, attributes
and relationships.
– Types of Object Based Data Models
Entity Relationship
Object Oriented
Semantic
Functional
Types of Data Models II
Physical Data Models
– Describe how data is stored in the
computer, representing information
such as record structures, record
ordering and access paths.
Types of Data Models III
Record Based Logical Data Models
– It is used to specify the overall logical
structure of the database and to
provide a higher-level description of the
implementation.
– Structured database in fixed formats.
– Types
Hierarchical Model
Network Model
Relational Model
Relational
The Data is perceived by the user as tables
(and nothing but tables)
The operators available to the user are
operators that drive new tables from old
ones
The foundation of modern database
technology is without question the
relational model.
Relational Database: Definitions
Relational database: a set of relations.
Relation: made up of 2 parts:
– Schema : specifies name of relation, plus name and
type of each column.
E.g. Students
(sid: string, name: string, login: string, age: integer, gpa: real)
– Instance : a table, with rows and columns.
#rows = cardinality
#fields = degree
Example of a Relation
attributes
(or columns)
tuples
(or rows)
--------
---------
---------
Domain
Relational model Aspects
Data structure
Data manipulation
– Relational Algebra
– Relational Calculus
Data integrity
Relational Data structure
Data are organized in two-dimensional tables with
rows and columns.
Table consists of rows (records) and named
columns (attribute or field)
Requirements for a table to qualify as a relation:
– It must have a unique name
– Every attribute value must be atomic (not multivalued)
– Every row must be unique (can’t have two rows with
exactly the same values for all their fields)
– Attributes (columns) in tables must have unique names
– The order of the columns must be irrelevant
– The order of the rows must be irrelevant
Structures
attribute types
entity types
relationship types
FLIGHT-SCHEDULE DEPT-AIRPORT
FLIGHT# AIRLINE WEEKDAY PRICE FLIGHT# AIRPORT-CODE
101 delta mo 156 101 atl
545 americE we 110 912 Egy
912 gyair fr 450 545 lax
242 usair mo 231
Keys (special fields)
Primary key
– an attribute that uniquely identifies each row of the relation
in question.
– Examples include employee numbers, social security
numbers, etc. Can be composite
Candidate key
– An attribute that could be a primary key
Alternate key
– A candidate key that is not selected as the primary key
Foreign key
– an attribute in a relation of a database that serves as the
primary key of another relation in the same database.
– Used to represent relationship between two tables. Can be
composite
Schema for four relations
Primary Key
Foreign Key
(implements 1:N relationship
between customer and order)
Combined, these are a composite
primary key (uniquely identifies the
order line)…individually they are
foreign keys (implement M:N
relationship between order and product)
16
Data structure terminology
Formal relational term Informal equivalents
Relation table
Tuple Row or record
Cardinality Number of rows
Attribute Column or field
Degree Number of columns
Primary key Unique identifier
domain Pool of legal values
Relational data manipulation
Data stored in the tables may be
manipulated through the use of a command
language.
Formal languages associated with the
relational model:
– Relational algebra
– Relational calculus
Relational data manipulation
Informally, relational algebra is a (high-level)
procedural language and relational calculus a
non-procedural language.
However, formally both are equivalent to one
another.
A language that produces a relation that can
be derived using relational calculus is
relationally complete.
Relational Algebra
Relational algebra operations work on one or more
relations to define another relation without changing
the original relations.
Both operands and results are relations, so output
from one operation can become input to another
operation.
Allows expressions to be nested, just as in
arithmetic. This property is called closure.
Relational algebra is the foundation of SQL
Relational Algebra
Five basic operations in relational algebra:
Selection, Projection, Cartesian product, Union, and
Set Difference.
These perform most of the data retrieval operations
needed.
Also have Join, Intersection, and Division operations,
which can be expressed in terms of 5 basic
operations.
Selection (or Restriction)
Extracts rows from a single relation
T
W X Y Z
Example - Selection (or Restriction)
List all staff with a salary greater than £10,000.
salary > 10000 (Staff)
Projection
Extracts columns from a single relation
T
W X Y Z
Example - Projection
Produce a list of salaries for all staff, showing only
staffNo, fName, lName, and salary details.
staffNo, fName, lName, salary(Staff)
Union
Creates a new relation containing rows
appearing in one or both relations
Duplicate rows are automatically eliminated
Relations must be union compatible
A B A UNION B
X Y X Y X Y
x1 y1 x2 y2 x1 y1
x2 y2 x4 y4 x2 y2
x3 y3 x3 y3
x4 y4
Example - Union
List all cities where there is either a branch office
or a property for rent.
city(Branch) city(PropertyForRent)
Set Difference
Creates a relation containing rows in the first
relation but not in the second
Relations must be union compatible
A B A MINUS B
X Y X Y X Y
x1 y1 x1 y1
x2 y2
x2 y2 x3 y3
x4 y4
x3 y3
Example - Set Difference
List all cities where there is a branch office but no
properties for rent.
city(Branch) – city(PropertyForRent)
Intersection
Creates a new relation containing rows
appearing in both relations
Relations must be union compatible
A INTERSECT B
A B
X Y X Y
X Y
x2 y2 x2 y2
x1 y1
x4 y4
x2 y2
x3 y3
Example - Intersection
List all cities where there is both a branch office and at
least one property for rent.
city(Branch) city(PropertyForRent)
Cartesian product
Creates a new relation from all possible combinations
of rows in two other relations
A B
V W X Y Z
v1 w1 x1 y1 z1
v2 w2 x2 y2 z2
v3 w3
A TIMES B
V W X Y Z
v1 w1 x1 y1 z1
v1 w1 x2 y2 z2
v2 w2 x1 y1 z1
v2 w2 x2 y2 z2
v3 w3 x1 y1 z1
v3 w3 x2 y2 z2
Example - Cartesian product
List the names and comments of all clients who have
viewed a property for rent.
(clientNo, fName, lName(Client)) X (clientNo, propertyNo, comment (Viewing))
Join Operations
Creates a new relation from all combinations of
rows satisfying the join condition
A join B where W = Z
A B
V W X Y Z
v1 wz1 x1 y1 wz1
v2 wz2 x2 y2 wz3
v3 wz3
A EQUIJOIN B
V W X Y Z
v1 wz1 x1 y1 wz1
v3 wz3 x2 y2 wz3
Join Operations
Join is a derivative of Cartesian product.
Equivalent to performing a Selection, using join predicate
as selection formula, over Cartesian product of the two
operand relations.
One of the most difficult operations to implement
efficiently in an RDBMS and one reason why RDBMSs
have intrinsic performance problems.
Join Operations
Various forms of join operation
– Theta join
– Equijoin (a particular type of Theta join)
– Natural join
– Outer join
– Semijoin
Division
Is there a value in the X column of A (e.g., x1) that has a value
in the Y column of A for every value of y in the Y column of B?
A B
X Y Y
x1 y1 y1
x1 y2 y2
x1 y3
x2 y1
x2 y3
A DIVIDE B
x1
Example - Division
Identify all clients who have viewed all properties with
three rooms.
(clientNo, propertyNo(Viewing))
(propertyNo(rooms = 3 (PropertyForRent)))
Relational Calculus
Relational calculus query specifies what is to be retrieved
rather than how to retrieve it.
– No description of how to evaluate a query.
In first-order logic (or predicate calculus), predicate is a
truth-valued function with arguments.
When we substitute values for the arguments, function
yields an expression, called a proposition, which can be
either true or false.
Relational Calculus
If predicate contains a variable (e.g. ‘x is a member of
staff’), there must be a range for x.
When we substitute some values of this range for x,
proposition may be true; for other values, it may be
false.
When applied to databases, relational calculus has
forms: tuple and domain.
Relational Integrity
Databases are used to store data
The data is used to create information which is
needed for making decisions
Therefore, we need to make sure that the data
which is stored in the database is correct and
consistent
This is known as data integrity
Business rules may be defined that maintain the
integrity of data when they are manipulated.
Relational Integrity
Domain constraints: the set of values that can be
assigned: data type, size, allowable values
Entity integrity: no key attribute may be null
Referential integrity: the value of a non-null foreign
key must be an actual key value in some relation
Operational constraints: business rules for which
logic must be embedded in the system
Entity Integrity
The entity integrity rule applies to Primary Keys
The entity integrity rule says that the value of a
Primary Key in a table must be unique and it can
never have no value (null)
Operations on the database which insert new data,
update existing data, or delete data must follow this
rule
43
Referential Integrity
The referential integrity key applies to Foreign Keys
The referential integrity key says that the value of a
Foreign key must either be null (i.e have no value) or
be equal to the value in the linked table where the
Foreign Key is the Primary Key
44
Next Lecture
ER diagram
SQL
– DDL
– DML
– DCL