NF PDF
NF PDF
NF PDF
and
Normalization
Design Steps
A database is a collection of information that is organized so that it
can easily be accessed, managed, and updated.
Step (3) to step (4) is based on a “design theory” for relations and is
called “normalization”.
Database design lifecycle
Requirements analysis
User needs; what must database do?
Conceptual design
High-level description; often using E/R model
Logical design
Translate E/R model into relational schema
Schema refinement
Check schema for redundancies and anomalies
Physical design/tuning
Consider typical workloads, and further optimise
Normalization or Schema Refinement
*Normalization or Schema Refinement is a technique of
organizing the data in the database.
*It is a systematic approach of decomposing tables to
eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion
Anomalies.
*The best technique of schema refinement is
decomposition.
*The Basic Goal of Normalization is used to eliminate
redundancy.
*Redundancy refers to repetition of same data or duplicate
copies of same data stored in different locations.
Redundancy:
Insertion anomalies
Deletion anomalies
Updation anomalies
Problems Caused by Redundancy
Storing the same information redundantly, that is, in more than
one place within a database, can lead to several problems:
That is, we need not perform joins of the smaller relations to check
whether a constraint on the original relation is violated.
PROPERTIES OF DECOMPOSITION
Following are the properties of Decomposition,
1. Lossless Decomposition
2. Dependency Preservation
The set of all FDs implied by a given set F of FDs is called the closure
of F and is denoted as F+.
Example:
Contracts(cid,sid,jid,did,pid,qty,value), and:
C is the key: C CSJDPQV
Project purchases each part using single contract: JP
C
Dept purchases at most one part from a supplier: SD
P
JP C, C CSJDPQV imply JP CSJDPQV
SD P implies SDJ JP
SDJ JP, JP CSJDPQV imply SDJ
2. ATTRIBUTE CLOSURE (AC)
Closure of a Set of Attributes X (Attribute closure), from
functional dependencies F, is the set of attributes which
are functionally dependent from the set of Attributes X
and it is denoted with X+.
Properties of a Key:
A key X, which belongs in relation R has the following
properties:
1. X ⊆ R
2. X → R (complete key)
3. There is no X′ ⊂ X such that X′ →R (minimal key)
Example for producing a FD based on AA
Given a relation R with attributes W, U, V, X, Y, Z and functional
dependencies: W UV, U Y, VX YZ. Prove that it holds: WX
Z.
Solution:
1. (with decomposition) From W V we take W V.
2. (with augmentation) WX VX.
3. (with transitivity) WX YZ.
4. (with decomposition) WX Z.
Algorithm to find the Closure of a Set of attributes X
1. Let X=A.
2. Among the functional Dependencies of F+, we
search for dependencies C D, where C ⊆ X. If
we found such a dependency, then we add D in X.
3. We repeat Step 2 till we cannot add additional
attributes in X.
Example 1
Let R= (V, Y, Z, W) and F+ = {V Z, VZ W, W Y, VY W}
Find the closure of attribute V.
Solution:
Step 1: X=V.
Step 2: X=VZ because of V Z.
Step 3: X=VZW because of VZ W.
Step 3: X=VZWY because of W Y.
Step 3: No more repeats can be made.
Example 2
Let R = ( V, Y, Z, W) and F+ = {V Υ, W Y, V W}
Find the closure of attribute V.
Solution:
Step 1: X=V.
Step 2: X=VY because of V Y.
Step 3: X=VYW because of V W.
Step 3: no more repeats can be made.
Other Kinds of Dependencies
(OR) Types of Functional Dependencies:
Multivalued dependency
Partial Dependency
Transitive dependency
Trivial functional dependency
Non-trivial functional dependency
Multivalued Dependency
Multivalued Attributes (or repeating groups): non-key
attributes or groups of non-key attributes the values
of which are not uniquely identified by (directly or
indirectly) (not functionally dependent on) the value
of the Primary Key (or its part).
S TUDENT
S tu d _ I D Nam e C o u r se _ ID U n i ts
C u st_ I D Nam e O rd e r_ ID
125 C is c o 1250
Transitive Dependency
Transitive Dependency – A functional dependency is said to be
transitive if it is indirectly formed by two functional dependencies.
The advantage of removing transitive dependency is -
• Amount of data duplication is reduced.
• Data integrity achieved.
Transitive
Dependency
EMPLO YEE
E m p _ ID F_Nam e L_Nam e D e p t_ I D D e p t_ N a m e
111 M a ry Jo n e s 1 A cct
122 S a ra h S m it h 2 M k tg
Trivial Functional dependency
The dependency of an attribute on a set of attributes is known as trivial
functional dependency if the set of attributes includes that attribute.
Symbolically: A ->B is trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A->A & B->B
Example:
Consider a table with two columns: Student_id and Student_Name.
Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial
dependencies too.
Non - Trivial Functional
dependency
If a functional dependency X->Y holds true where Y is not a subset of X then this dependency is
called non trivial Functional dependency.
Example:
An employee table with three attributes: emp_id, emp_name, emp_address.
If a FD X->Y holds true where X intersection Y is null then this dependency is said to be
completely non trivial function dependency.
EXERCISE
Suppose you are given the following functional
dependencies:
fd1: name → address, gender
fd2: address → rank
fd3: rank, gender → salary
FDs are constraints that are derived from the meaning and
interrelationships of the data attributes
X Y holds if whenever two tuples have the same value for X, they must
have the same value for Y
For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X],
then t1[Y]=t2[Y]
Closure of a set F of FDs is the set F+ of all FDs that can be inferred
from F.
Normal form: Condition using keys and FDs of a relation to certify whether a
relation schema is in a particular normal form
Department
DNAME DNUMBER DMGRSSN DLOCATIONS
There are three main techniques to achieve first normal form for such a
relation:
2. Expand the key so that there will be a separate tuple in the original
DEPARTMENT relation for each location of a DEPARTMENT. In this
case, the primary key becomes the combination {DNUMBER,
DLOCATION}.
3.
EMP_PROJ1
ENO ENAME
EMP_PROJ2
ENO PNUMBER HOURS
First Normal Form: Problem
Person
ENO CAR_LIC PHONE
Here the student Jeet is used twice in the table and subject PHY is repeated Another
method is
to divide the relation into 2
Second Normal Form(2NF):
EMP_PROJ
FD2
FD3
Second Normal Form(2NF): Testing
The test for 2NF involves testing for functional dependencies whose left-
hand side attributes are part of the primary key.
If the primary key contains a single attribute, the test need not be applied.
The functional dependencies FD2 and FD3 make ENAME, PNAME, and
PLOCATION partially dependent on the primary key {SSN, PNUMBER}.
EMP_PROJ1 EMP_PROJ2
ENO PNUMBER HOURS ENO ENAME
FD1 FD2
EMP_PROJ3
PNUMBER PNAME HOURS
FD3
Example
Consider the following relation, not in 2NF
Is R in 1NF, if no decompose.
Is R in 2NF, if no decompose.
Third Normal Form(3NF):
EMP_PROJ
EMP_PROJ1
EMP_PROJ2
Is R in 3 NF, if no decompose.
Example
-With exam_name and total_marks added to our Score table, it saves more data now.
Primary key for our Score table is a composite key, which means it's made up of two
attributes or columns → student_id + subject_id.
-And what about our second new column total_marks? Does it depend on our Score
table's primary key?
-Well, the column total_marks depends on exam_name as with exam type the total score
changes. For example, practicals are of less marks while theory exams are of more marks.
-But, exam_name is just another column in the score table. It is not a primary key or even a
part of the primary key, and total_marks depends on it.
The only difference between the definitions of BCNF and 3NF is that
condition (b) of 3NF, which allows A to be prime, is absent from BCNF.
3NF Table Not in BCNF
Figure 4.7
Decomposition of Table
Structure to Meet BCNF
BCNF Conversion Results
Boyce-Codd Normal Form(BCNF): Example
Consider a relation R with simple attributes {A, B, C, D, E}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A BCD
BC AD
D B
A 3NF table which does not have multiple overlapping candidate keys
is said to be in BCNF.
if (FX union FY ) + = F +
Formally, if the MVD X Y holds over R and Z = R−XY , the following must
be true for every legal instance r of R:
C T
Does this hold on the above relation?
Multivalued Dependencies: Armstrong Axioms
Replication: If X Y, then X Y.
a) Y X or XY = R, or
b) X is a superkey.
Fourth Normal Form(4NF):