NF PDF

Functional Dependencies
and
Normalization
Design Steps
A database is a collection of information that is organized so that it
can easily be accessed, managed, and updated.
The design steps:

1. Real-World 2. ER model 3. Relational Schema
4. Better relational Schema 5. Relational DBMS
Step (3) to step (4) is based on a “design theory” for relations and is
called “normalization”.
Database design lifecycle
Requirements analysis
User needs; what must database do?
Conceptual design
High-level description; often using E/R model
Logical design
Translate E/R model into relational schema
Schema refinement
Check schema for redundancies and anomalies
Physical design/tuning
Consider typical workloads, and further optimise
Normalization or Schema Refinement
*Normalization or Schema Refinement is a technique of
organizing the data in the database.
*It is a systematic approach of decomposing tables to
eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion
Anomalies.
*The best technique of schema refinement is
decomposition.
*The Basic Goal of Normalization is used to eliminate
redundancy.
*Redundancy refers to repetition of same data or duplicate
copies of same data stored in different locations.
Redundancy:
Problems with redundancy:
Information is stored redundantly wasting storage
Insertion anomalies
Deletion anomalies
Updation anomalies
Problems Caused by Redundancy
Storing the same information redundantly, that is, in more than
one place within a database, can lead to several problems:
Redundant storage: Some information is stored repeatedly.
Update anomalies: If one copy of such repeated data is updated,

an inconsistency is created unless all copies are similarly
updated.
Insertion anomalies: It may not be possible to store some

information unless some other information is stored as well.
Deletion anomalies: It may not be possible to delete some

information without losing some other information as well.
Updation/Modification Anomaly
If there is updation in the fee from 5000 to 7000, then we
have to update FEE column in all the rows, else data will
become inconsistent.
Insertion Anomaly and Deletion Anomaly
These anamolies exist only due to redundancy, otherwise
they
do not exist.
Insertion Anomaly :
New course is introduced C4, But no student is there
who is having C4 subject.
Cont……
Deletion Anomaly :
- Deletion of S3 student cause the deletion of course.
- Because of deletion of some data forced to delete
some
other useful data.
DECOMPOSITION
Decomposition is the process of breaking down in parts or
elements.
It replaces a relation with a collection of smaller relations.
It breaks the table into multiple tables in a database.
It should always be lossless, because it confirms that the
information in the original relation can be accurately
reconstructed based on the decomposed relations.
If there is no proper decomposition of the relation, then it
may lead to problems like loss of information.
Types of Decomposition
There are two types of decomposition :
1). Lossy Decomposition

2). Lossless Join Decomposition
1)Lossy Decomposition
"The decomposition of relation R into R1 and R2 is lossy
when the join of R1 and R2 does not yield the same
relation as in R.”
One of the disadvantages of decomposition into two or

more relational schemes (or tables) is that some
information is lost during retrieval of original relation or
table.
EXAMPLE: Consider that we have table STUDENT with three attribute roll_no ,
sname
and department.
2)Lossless Join Decomposition
"The decomposition of relation R into R1 and R2 is lossless
when the join of R1 and R2 yield the same relation as in R.
“
A relational table is decomposed (or factored) into two or
more smaller tables, in such a way that the designer can
capture the precise content of the original table by joining
the decomposed parts.
This is called lossless-join (or non-additive join)
decomposition.
This is also referred as non-additive decomposition.
The lossless-join decomposition is always defined with
respect to a specific set F of dependencies.
EXAMPLE: Consider that we have table STUDENT with three attribute roll_no ,
sname and department.
Problems Related to
Decomposition
Unless we are careful, decomposing a relation schema can create
more problems than it solves.
Two important questions must be asked repeatedly:
1.Do we need to decompose a relation?
If a relation schema is in one of these normal forms, we know that

certain kinds of problems cannot arise. Considering the normal form
of a given relation schema can help us to decide whether or not to
decompose it further.
If we decide that a relation schema must be decomposed further, we

must choose a particular decomposition(i.e., a particular collection
of smaller relations to replace the given relation).
Cont……
2. What problems (if any) does a given decomposition cause?
The lossless-join property enables us to recover any instance of the

decomposed relation from corresponding instances of the smaller
relations.
The dependency-preservation property enables us to enforce any

constraint on the original relation by simply enforcing some
constraints on each of the smaller relations.
That is, we need not perform joins of the smaller relations to check
whether a constraint on the original relation is violated.
PROPERTIES OF DECOMPOSITION
Following are the properties of Decomposition,
1. Lossless Decomposition
2. Dependency Preservation
3. Lack of Data Redundancy

1. Lossless Decomposition
Decomposition must be lossless.
It means that the information should not get lost from the relation that
is decomposed.
It gives a guarantee that the join will result in the same relation as it
was decomposed.
Example:
Let's take 'E' is the Relational Schema, With instance 'e';
is decomposed into: E1, E2, E3, . . . . En;
With instance: e1, e2, e3, . . . . en, If e1 ⋈ e2 ⋈ e3 . . . . ⋈ en, then it
is called as 'Lossless Join Decomposition'.
------- In the above example, it means that, if natural joins of all

the decomposition give the original relation, then it is said to be
lossless join decomposition.
Decompositio
n.
2. Dependency Preservation
Dependency is an important constraint on the database.
Every dependency must be satisfied by at least one
decomposed table.
If {A → B} holds, then two sets are functional dependent.
And, it becomes more useful for checking the
dependency easily if both sets in a same relation.
This decomposition property can only be done by
maintaining the functional dependency.
In this property, it allows to check the updates without
computing the natural join of the database structure.
3. Lack of Data Redundancy
Lack of Data Redundancy is also known as a Repetition of
Information.
The proper decomposition should not suffer from any data
redundancy.
The careless decomposition may cause a problem with the
data.
The lack of data redundancy property may be achieved by
Normalization process.
FUNCTIONAL DEPENDENCIES (FDs)
• It is a form of integrity constraint that can identify schemas
with such problems and suggest refinements.
A functional dependency X  Y holds over relation schema R if,

for every allowable instance r of R:
t1  r, t2  r, X (t1) = X (t2)
implies Y (t1) = Y (t2)
(where t1 and t2 are tuples; X and Y are sets of attributes)
In other words: X  Y means

Given any two tuples in r, if the X values are the same, then the Y
values must also be the same. (but not vice versa)
Read “” as “determines”.
Cont……
An FD is a statement about all allowable relations.
Must be identified based on semantics of application.
Given some instance r1 of R, we can check if r1 violates
some FD f, but we cannot determine if f holds over R.
Question: How related to keys?

If “K  all attributes of R” then K is a superkey for R
(does not require K to be minimal.)
FDs are a generalization of keys.

FUNCTIONAL DEPENDENCIES
(FDs)
A functional dependency (FD) is a relationship between two
attributes, typically between the PK and other non-key
attributes within a table.
For any relation R, attribute Y is functionally dependent on
attribute X (usually the PK), if for every valid instance of X, that
value of X uniquely determines the value of Y.
This relationship is indicated by the representation below :
X Y
The left side of the above FD diagram is called the determinant,
and the right side is the dependent.
Reasoning About FDs
The reasons about FDs are in two ways:
1. CLOSURE OF A SET OF FDs
2. ATTRIBUTE CLOSURE
1. CLOSURE OF A SET OF FDs (FDC):
The set of all FDs implied by a given set F of FDs is called the closure
of F and is denoted as F+.
An important question is how we can infer, or compute, the closure of a

given set F of FDs.
The following three rules, called Armstrong's Axioms, can be applied

repeatedly to infer all FDs implied by a set F of FDs.
1. CLOSURE OF A SET OF FDs
Given some FDs, we can usually infer or compute
additional FDs:
ssn did, did lot implies ssn lot
An FD f is implied by a set of FDs F if f holds whenever all
FDs in F hold.
F + = closure of F is the set of all FDs that are implied by F.
Armstrong’s Axioms (OR) AA (X, Y, Z are sets of attributes)
:
Reflexivity: If X C Y, then Y X
Augmentation: If X Y, then XZ YZ for any Z
Transitivity: If X Y and Y Z, then X Z
These are sound and complete inference rules for FDs!
Cont…..
Couple of additional rules (that follow from AA):
Union: If X Y and X Z, then X YZ
Decomposition: If X YZ, then X Y and X
Z
Example:
Contracts(cid,sid,jid,did,pid,qty,value), and:
C is the key: C CSJDPQV
Project purchases each part using single contract: JP
C
Dept purchases at most one part from a supplier: SD
P
JP C, C CSJDPQV imply JP CSJDPQV
SD P implies SDJ JP
SDJ JP, JP CSJDPQV imply SDJ
2. ATTRIBUTE CLOSURE (AC)
Closure of a Set of Attributes X (Attribute closure), from
functional dependencies F, is the set of attributes which
are functionally dependent from the set of Attributes X
and it is denoted with X+.
Properties of a Key:
A key X, which belongs in relation R has the following
properties:
1. X ⊆ R
2. X → R (complete key)
3. There is no X′ ⊂ X such that X′ →R (minimal key)
Example for producing a FD based on AA
Given a relation R with attributes W, U, V, X, Y, Z and functional
dependencies: W UV, U Y, VX YZ. Prove that it holds: WX
Z.
Solution:
1. (with decomposition) From W V we take W V.
2. (with augmentation) WX VX.
3. (with transitivity) WX YZ.
4. (with decomposition) WX Z.
Algorithm to find the Closure of a Set of attributes X
Given a relation R και its functional dependencies

F+, find the closure of attribute A.
1. Let X=A.
2. Among the functional Dependencies of F+, we
search for dependencies C D, where C ⊆ X. If
we found such a dependency, then we add D in X.
3. We repeat Step 2 till we cannot add additional
attributes in X.
Example 1
Let R= (V, Y, Z, W) and F+ = {V Z, VZ W, W Y, VY W}
Find the closure of attribute V.
Solution:
Step 1: X=V.
Step 2: X=VZ because of V Z.
Step 3: X=VZW because of VZ W.
Step 3: X=VZWY because of W Y.
Step 3: No more repeats can be made.
Example 2
Let R = ( V, Y, Z, W) and F+ = {V Υ, W Y, V W}
Find the closure of attribute V.
Solution:
Step 1: X=V.
Step 2: X=VY because of V Y.
Step 3: X=VYW because of V W.
Step 3: no more repeats can be made.
Other Kinds of Dependencies
(OR) Types of Functional Dependencies:
Multivalued dependency
Partial Dependency
Transitive dependency
Trivial functional dependency
Non-trivial functional dependency
Multivalued Dependency
Multivalued Attributes (or repeating groups): non-key
attributes or groups of non-key attributes the values
of which are not uniquely identified by (directly or
indirectly) (not functionally dependent on) the value
of the Primary Key (or its part).
S TUDENT
S tu d _ I D Nam e C o u r se _ ID U n i ts
101 Lennon M S I 250 3.00
101 Lennon M S I 415 3.00
125 Jo h n s o n M S I 331 3.00

Partial Dependency
Partial Dependency – when an non-key attribute is
determined by a part, but not the whole, of a
COMPOSITE primary key.
Partial
CUS TO MER Dependency
C u st_ I D Nam e O rd e r_ ID
101 A T& T 1234
101 A T& T 156
125 C is c o 1250
Transitive Dependency
Transitive Dependency – A functional dependency is said to be
transitive if it is indirectly formed by two functional dependencies.
The advantage of removing transitive dependency is -
• Amount of data duplication is reduced.
• Data integrity achieved.
Transitive
Dependency
EMPLO YEE
E m p _ ID F_Nam e L_Nam e D e p t_ I D D e p t_ N a m e
111 M a ry Jo n e s 1 A cct
122 S a ra h S m it h 2 M k tg
Trivial Functional dependency
The dependency of an attribute on a set of attributes is known as trivial
functional dependency if the set of attributes includes that attribute.
Symbolically: A ->B is trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A->A & B->B
Example:
Consider a table with two columns: Student_id and Student_Name.
{Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as

Student_Id is a subset of {Student_Id, Student_Name}.
That makes sense because if we know the values of Student_Id and

Student_Name then the value of Student_Id can be uniquely determined.
Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial
dependencies too.
Non - Trivial Functional
dependency
If a functional dependency X->Y holds true where Y is not a subset of X then this dependency is
called non trivial Functional dependency.
Example:
An employee table with three attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)

emp_id -> emp_address (emp_address is not a subset of emp_id)
On the other hand, the following dependencies are trivial:
{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}]
Completely non trivial FD:
If a FD X->Y holds true where X intersection Y is null then this dependency is said to be
completely non trivial function dependency.
EXERCISE
Suppose you are given the following functional
dependencies:
fd1: name → address, gender
fd2: address → rank
fd3: rank, gender → salary
--- Give a primary key of the relation r(name, address,

gender, rank, salary). Prove your answer formally using
Armstrong's Axioms.
SOLUTION
name is the primary key. To prove this, first prove that name is a superkey, then prove it is a candidate key and finally a primary key.
To prove that name is a superkey:
1 name → address decomposition on fd1

2 name → gender decomposition on fd1
3 name → rank 3transitivity on 1. and fd2
4 name, gender → salary 3. & pseudo-transitivity on fd2
5 name → name, gender augmentation of 2.
6 name → salary transitivity on 4. and 5.
7 name → name, address, union of 1., 2., 3., 6. &
rank, gender, salary augmentation w. name
From the given FD's, it is not possible to have an FD that only has 'name’ on its right-hand side, therefore, the attribute name must be part of any
super key. As the definition of a candidate key is a minimal super key and {name} is a superkey, and the empty set is not a superkey, 'name'
must be
a candidate key.
Since name is a candidate key and it is the only candidate key, it is also the primary key.
Functional Dependencies:
FDs are constraints that are derived from the meaning and
interrelationships of the data attributes
A set of attributes X functionally determines a set of attributes Y if the

value of X determines a unique value for Y
X Y holds if whenever two tuples have the same value for X, they must
have the same value for Y
For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X],
then t1[Y]=t2[Y]
X Y in R specifies a constraint on all relation instances r(R)

Armstrong's inference rules:
IR1. (Reflexive) If Y is subset-of X, then X Y
IR2. (Augmentation) If X Y, then XZ YZ

(Notation: XZ stands for X U Z)
IR3. (Transitive) If X Y and Y Z, then X Z
Some additional inference rules that are useful:
(Decomposition) If X YZ, then X Y and X Z
(Union) If X Y and X Z, then X YZ
(Psuedotransitivity) If X Y and WY Z, then WX Z

Functional Dependencies:
Closure of a set F of FDs is the set F+ of all FDs that can be inferred
from F.
Closure of a set of attributes X with respect to F is the set X + of all

attributes that are functionally determined by X
Two sets of FDs F and G are equivalent if:
- every FD in F can be inferred from G, and
- every FD in G can be inferred from F
Hence, F and G are equivalent if F + =G +
Definition: F covers G if every FD in G can be inferred from F

(i.e., if G + subset-of F +)
F and G are equivalent if F covers G and G covers F

Minimal Sets of FDs
A set of FDs is minimal if it satisfies the following conditions:
(1) Every dependency in F has a single attribute for its RHS.
(1) We cannot replace any dependency X A in F with a dependency Y

A, where Y proper-subset-of X ( Y subset-of X) and still have a set of
dependencies that is equivalent to F.
(1) We cannot remove any dependency from F and have a set of

dependencies that is equivalent to F.
Normal Forms
Normalization of Relations
Normalization: The process of decomposing unsatisfactory "bad" relations

by breaking up their attributes into smaller relations
Normal form: Condition using keys and FDs of a relation to certify whether a
relation schema is in a particular normal form
Relationship Between Normal Forms

Types of Normal Form
First Normal Form (1NF) - included in the definition of a relation.
Second Normal Form (2NF) defined in terms of

functional dependencies
Third Normal Form(3NF)
Boyce-Codd Normal Form ( BCNF)
Fourth Normal Form (4NF) - defined using multivalued

dependencies
Fifth Normal Form (5NF) or Project Join Normal Form (PJNF) -

defined using join dependencies
Normal Forms:
First Normal Form

‘A relation R is in first normal form (1NF) if and only if all underlying
domains contain atomic values only.’
Second Normal Form

‘A relation R is in second normal form (2NF) if and only if it is in 1NF and
every non-key attribute is fully dependent on the primary key.’
Third Normal Form

‘A relation R is in third normal form (3NF) if and only if it is in 2NF and
every non-key attribute is non transitively dependent on the primary key.’
Boyce-Codd Normal Form

‘A relation R is in Boyce-Codd normal form (BCNF) if and only if every
determinant is a candidate key.’
First Normal Form(1NF):
‘A relation R is in first normal form (1NF) if and only if all underlying

domains contain atomic values only.’
Disallows composite attributes, multivalued attributes, and nested

relations; attributes whose values for an individual tuple are non-atomic
First Normal Form: Example Multivalued attributes
Department
DNAME DNUMBER DMGRSSN DLOCATIONS

Research 5 333445555 Bellaire, Sugarland, Houston
Administration 4 987654321 Stanfford
Headquarters 1 888665555 Houston
First Normal Form: Solutions
There are three main techniques to achieve first normal form for such a
relation:
1. Remove the attribute DLOCATIONS that violates 1NF and place it in a

separate relation DEPT_LOCATIONS along with the primary key
DNUMBER of DEPARTMENT.
2. Expand the key so that there will be a separate tuple in the original
DEPARTMENT relation for each location of a DEPARTMENT. In this
case, the primary key becomes the combination {DNUMBER,
DLOCATION}.
3. If a maximum number of values is known for the attribute-for example,

if
it is known that at most three locations can exist for a department-
replace the DLOCATIONS attribute by three atomic attributes:
DLOCATIONl, DLOCATION2, and DLOCATION3.
First Normal Form: Solutions to example
There are three main techniques to achieve first normal form for such a
relation:
Department Department_Loc
1. DNAME DNUMBER DMGRSSN DNUMBER DLOCATIONS

2.
Research 5 333445555 Bellaire
Research 5 333445555 Sugarland
Research 5 333445555 Houston
Administration 4 987654321 Stanfford
Headquarters 1 888665555 Houston
3.
DNAME DNUMBER DMGRSSN DLOCATIONS1 DLOCATIONS2 DLOCATIONS3

First Normal Form: Best solution
Of the three solutions above, the first is generally considered best

because it does not suffer from redundancy and it is completely general,
having no limit placed on a maximum number of values.
First Normal Form: Nested relations
First normal form also disallows multivalued attributes that are
themselves composite.
These are called nested relations because each tuple can have a
relation within it.
PROJS
EMP_PROJ ENO ENAME PNUMBER HOURS
ENO ENAME PNUMBER HOURS

11111 Mahesh 1 32.5
2 7.5
22222 Ramesh 2 40.5
3 35.6
33333 Kiran 2 25.8
10 45
20 38.7
44444 Karthik 30 9.7
First Normal Form: Solution
To normalize this into INF,

we remove the nested relation attributes into a new relation and
propagate the primary key into it;
the primary key of the new relation will combine the partial key with the
primary key of the
original relation.
EMP_PROJ1
ENO ENAME
EMP_PROJ2
ENO PNUMBER HOURS
First Normal Form: Problem
Person
ENO CAR_LIC PHONE
Where CAR_LIC and PHONE are multivalued attributes
Is this in 1NF, if no decompose.

Example
Example
A row of data cannot contain repeating group of data i.e atomic value
Cid Cname Subject
101 Jeet PHY
101 Jeet CHE
102 Seet PHY
103 Swet SOCIAL
Here the student Jeet is used twice in the table and subject PHY is repeated Another
method is
to divide the relation into 2
Second Normal Form(2NF):
A relation R is in second normal form (2NF) if and only if it is in 1NF and

every non-key attribute is fully functionally dependent on the primary key
(or candidate key).
A functional dependency X Y is a full functional dependency if removal

of any attribute A from X means that the dependency does not hold any
more.
Second Normal Form(2NF): Examples
EMP_PROJ
ENO PNUMBER HOURS ENAME PNAME PLOCATION

FD1
FD2
FD3
Second Normal Form(2NF): Testing
The test for 2NF involves testing for functional dependencies whose left-
hand side attributes are part of the primary key.
If the primary key contains a single attribute, the test need not be applied.
If a relation schema is not in 2NF, it can be "second normalized" or "

2NFnormalized" into a number of 2NF relations in which nonprime
attributes are associated only with the part of the primary key on which they
are fully functionally dependent.
Second Normal Form(2NF): Solution to example
In functional dependency FD1, the attribute HOURS is fully functionally

dependent on the primary key {SSN, PNUMBER}.
The functional dependencies FD2 and FD3 make ENAME, PNAME, and
PLOCATION partially dependent on the primary key {SSN, PNUMBER}.
Hence, decompose EMP_PROJ into the three relation schemas

EMP_PROJ1, EMP_PROJ 2, and EMP_PROJ 3
EMP_PROJ1 EMP_PROJ2
ENO PNUMBER HOURS ENO ENAME
FD1 FD2
EMP_PROJ3
PNUMBER PNAME HOURS
FD3
Example
Consider the following relation, not in 2NF
-HerHere Cid & Order_id is PK

-It is in 1NF
-Not in 2NF, there are partial dependencies of columns on Primary key
-Cname is only dependent on Cid
-Order_name is dependent on order_id
-There is no link between Cname & Sale_details
-To reduce this table into 2NF, break the table into 3 different tables
e Cid & Order_id is PK
-It is in 1NF
-Not in 2NF, there are partial dependencies of columns on Primary key
Cont……
Here Cid & Order_id is PK

-It is in 1NF
-Not in 2NF, there are partial dependencies
of columns on Primary key
-There is no link between Cname &
Sale_details
-To reduce this table into 2NF, break the
table into 3 different tables
Second Normal Form(2NF): Example2
Consider a relation R with simple attributes {A, B, C, D, E, F}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A BCDEF
BC ADEF
B F
D E
Is R in 1NF, if no decompose.
Is R in 2NF, if no decompose.
Third Normal Form(3NF):
A relation R is in third normal form (3NF) if and only if it is in 2NF and

every non-key attribute is non transitively dependent on the primary key.
A relation schema R is in third normal form (3NF) if, whenever a

nontrivial functional dependency X A holds in R, either (a) X is a super
key of R, or (b) A is a prime attribute of R.
Third Normal Form(3NF): Example
EMP_PROJ
ENO ENAME DOB ADDRESS DNUMBER DNAME DMGRENO
Here the attributes DNAME and DMGRENO are transitively dependent

on ENO.
Third Normal Form(3NF): Solution to example
Decompose the relation into two relations.
EMP_PROJ1
ENO ENAME DOB ADDRESS DNUMBER
EMP_PROJ2
DNUMBER DNAME DMGRENO

Third Normal Form(3NF): Example2
Consider a relation R with simple attributes {A, B, C, D, E}.
A BCDE
BC ADE
D E
Is R in 3 NF, if no decompose.
Example
-With exam_name and total_marks added to our Score table, it saves more data now.
Primary key for our Score table is a composite key, which means it's made up of two
attributes or columns → student_id + subject_id.
-Our new column exam_name depends on both student and subject. For example, a

mechanical engineering student will have Workshop exam but a computer science
student won't. And for some subjects you have Prctical exams and for some you don't. So
we can say that exam_name is dependent on both student_id and subject_id.
-And what about our second new column total_marks? Does it depend on our Score
table's primary key?
-Well, the column total_marks depends on exam_name as with exam type the total score
changes. For example, practicals are of less marks while theory exams are of more marks.
-But, exam_name is just another column in the score table. It is not a primary key or even a
part of the primary key, and total_marks depends on it.
-This is Transitive Dependency. When a non-prime attribute depends on other non-prime

attributes rather than depending upon the prime attributes or primary key.
Boyce-Codd Normal Form(BCNF):
A relation R is in Boyce-Codd normal form (BCNF) if and only if every

determinant is a candidate key.
A relation schema R is in BCNF if whenever a nontrivial functional

dependency X A holds in R, then X is a super key of R.
The only difference between the definitions of BCNF and 3NF is that
condition (b) of 3NF, which allows A to be prime, is absent from BCNF.
3NF Table Not in BCNF
Figure 4.7
Decomposition of Table
Structure to Meet BCNF
BCNF Conversion Results
Boyce-Codd Normal Form(BCNF): Example
Consider a relation R with simple attributes {A, B, C, D, E}.
A BCD
BC AD
D B
The relation R is in 3NF but not in BCNF.

After BCNF normalization
R1{A,C,D}
R2{D,B}
Boyce-Codd Normal Form(BCNF): Example2
Consider a relation TEACH with the following dependencies:

FD1: {STUDENT, COURSE} INSTRUCTOR
FD2: INSTRUCTOR COURSE
Note that {STUDENT, COURSE} is a candidate key for this relation

Boyce and Codd Normal Form
(BCNF)
Boyce and Codd Normal Form is a higher version of the Third Normal
form. This form deals with certain type of anomaly that is not
handled by 3NF.
A 3NF table which does not have multiple overlapping candidate keys
is said to be in BCNF.
For a table to be in BCNF, following conditions must be satisfied:

- R must be in 3rd Normal Form.
- and, for each functional dependency ( X → Y ), X should be a
super Key.
Example
Explanation about example
Cont……
Why the table is not in BCNF?

In the table above, student_id, subject form primary key, which
means subject column is a prime attribute.
But, there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute,
which is not allowed by BCNF.
How to satisfy BCNF?

To make this relation(table) satisfy BCNF, we will decompose this table
into two tables, student table and professor table.
Cont……
And now, this relation satisfy Boyce-Codd Normal Form.

Problems with Decompositions:
Three problems to consider:
 Some queries become more expensive.
 Given instances of the decomposed relations, we may not be able

to reconstruct the corresponding instance of the original relation!
 Checking some dependencies may require joining the instances of

the decomposed relations.
Lossless Join Decompositions:
Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for

every instance r that satisfies F:
Definition extended to decomposition into 3 or more relations in a

straightforward way.
It is essential that all decompositions used to deal with redundancy be

lossless!
Dependency Preserving Decomposition:
If R is decomposed into X, Y and Z, and we enforce the FDs that hold on

X, on Y and on Z, then all FDs that were given to hold on R must also
hold.
Decomposition of R into X and Y is dependency preserving
if (FX union FY ) + = F +
e.g., CSJDPQV, key C, JP C, SD P, J S

To deal with SD P, decompose into SDP, CSJDQV.
To deal with J S, decompose CSJDQV into JS and CJDQV
To ensure dependency preservation, one idea:
If X Y is not preserved, add relation XY.
Problem is that XY may violate 3NF!
e.g., consider the addition of CJP to `preserve’ JP C.
Refinement: Instead of the given set of FDs F, use a minimal cover for F.
Multivalued Dependencies:
Multivalued Dependencies: Definition
Let R be a relation schema and let X and Y be subsets of the attributes of R.
The multivalued dependency X Y is said to hold over R if, in every legal

instance r of R, each X value is associated with a set of Y values and this set
is independent of the values in the other attributes.
Formally, if the MVD X Y holds over R and Z = R−XY , the following must
be true for every legal instance r of R:
If t1 є r, t2 є r and t1:X = t2:X, then there must be some t3 є r such that

t1:XY = t3:XY and t2:Z = t3:Z.
Multivalued Dependencies:
C T
Does this hold on the above relation?
Multivalued Dependencies: Armstrong Axioms
MVD Complementation: If X Y, then X R − XY .
MVD Augmentation: If X Y and W Z, then WX YZ.
MVD Transitivity: If X Y and Y Z, then X (Z − Y )
Replication: If X Y, then X Y.
Coalescence: If X Y and there is a W such that W ∩ Y is empty,

W Z, and Y Z, then X Z.
Fourth Normal Form(4NF):
Let R be a relation schema, X and Y be nonempty subsets of the

attributes of R, and F be a set of dependencies that includes both FDs
and MVDs.
R is said to be in fourth normal form (4NF) if for every MVD X Y that

holds over R, one of the following statements is true:
a) Y X or XY = R, or
b) X is a superkey.
Fourth Normal Form(4NF):
C T We can eliminate the resulting redundancy

by decomposing CTB into CT and CB; each
Is this in 4NF?
of these relations is then in 4NF.
Join Dependencies:
Fifth Normal Form:

NF PDF

Uploaded by

Document Informationclick to expand document information

Copyright:

Available Formats

NF PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

NF PDF

Uploaded by

Copyright:

Available Formats

Functional Dependencies

The design steps:

Problems with redundancy:

Information is stored redundantly wasting storage

Redundant storage: Some information is stored repeatedly.

Update anomalies: If one copy of such repeated data is updated,

Insertion anomalies: It may not be possible to store some

Deletion anomalies: It may not be possible to delete some

There are two types of decomposition :

1). Lossy Decomposition

One of the disadvantages of decomposition into two or

1.Do we need to decompose a relation?

If a relation schema is in one of these normal forms, we know that

If we decide that a relation schema must be decomposed further, we

The lossless-join property enables us to recover any instance of the

The dependency-preservation property enables us to enforce any

3. Lack of Data Redundancy

------- In the above example, it means that, if natural joins of all

A functional dependency X  Y holds over relation schema R if,

In other words: X  Y means

Question: How related to keys?

FDs are a generalization of keys.

1. CLOSURE OF A SET OF FDs (FDC):

An important question is how we can infer, or compute, the closure of a

The following three rules, called Armstrong's Axioms, can be applied

Given a relation R και its functional dependencies

101 Lennon M S I 250 3.00

101 Lennon M S I 415 3.00

125 Jo h n s o n M S I 331 3.00

101 A T& T 1234

101 A T& T 156

{Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as

That makes sense because if we know the values of Student_Id and

The following functional dependencies are non-trivial:

emp_id -> emp_name (emp_name is not a subset of emp_id)

On the other hand, the following dependencies are trivial:

{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}]

Completely non trivial FD:

--- Give a primary key of the relation r(name, address,

To prove that name is a superkey:

1 name → address decomposition on fd1

A set of attributes X functionally determines a set of attributes Y if the

X Y in R speciﬁes a constraint on all relation instances r(R)

IR1. (Reﬂexive) If Y is subset-of X, then X Y

IR2. (Augmentation) If X Y, then XZ YZ

IR3. (Transitive) If X Y and Y Z, then X Z

Some additional inference rules that are useful:

(Decomposition) If X YZ, then X Y and X Z

(Union) If X Y and X Z, then X YZ

(Psuedotransitivity) If X Y and WY Z, then WX Z

Closure of a set of attributes X with respect to F is the set X + of all

Two sets of FDs F and G are equivalent if:

- every FD in F can be inferred from G, and

- every FD in G can be inferred from F

Hence, F and G are equivalent if F + =G +

Deﬁnition: F covers G if every FD in G can be inferred from F

F and G are equivalent if F covers G and G covers F

A set of FDs is minimal if it satisﬁes the following conditions: