[go: up one dir, main page]

0% found this document useful (0 votes)
52 views98 pages

NF PDF

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 98

Functional Dependencies

and
Normalization
Design Steps
A database is a collection of information that is organized so that it
can easily be accessed, managed, and updated.

The design steps:


1. Real-World 2. ER model 3. Relational Schema
4. Better relational Schema 5. Relational DBMS

Step (3) to step (4) is based on a “design theory” for relations and is
called “normalization”.
Database design lifecycle
Requirements analysis
User needs; what must database do?
Conceptual design
High-level description; often using E/R model
Logical design
Translate E/R model into relational schema
Schema refinement
Check schema for redundancies and anomalies
Physical design/tuning
Consider typical workloads, and further optimise
Normalization or Schema Refinement
*Normalization or Schema Refinement is a technique of
organizing the data in the database.
*It is a systematic approach of decomposing tables to
eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion
Anomalies.
*The best technique of schema refinement is
decomposition.
*The Basic Goal of Normalization is used to eliminate
redundancy.
*Redundancy refers to repetition of same data or duplicate
copies of same data stored in different locations.
Redundancy:

Problems with redundancy:

Information is stored redundantly wasting storage

Insertion anomalies

Deletion anomalies

Updation anomalies
Problems Caused by Redundancy
Storing the same information redundantly, that is, in more than
one place within a database, can lead to several problems:

Redundant storage: Some information is stored repeatedly.

Update anomalies: If one copy of such repeated data is updated,


an inconsistency is created unless all copies are similarly
updated.

Insertion anomalies: It may not be possible to store some


information unless some other information is stored as well.

Deletion anomalies: It may not be possible to delete some


information without losing some other information as well.
Updation/Modification Anomaly
If there is updation in the fee from 5000 to 7000, then we
have to update FEE column in all the rows, else data will
become inconsistent.
Insertion Anomaly and Deletion Anomaly
These anamolies exist only due to redundancy, otherwise
they
do not exist.
Insertion Anomaly :
New course is introduced C4, But no student is there
who is having C4 subject.
Cont……
Deletion Anomaly :
- Deletion of S3 student cause the deletion of course.
- Because of deletion of some data forced to delete
some
other useful data.
DECOMPOSITION
Decomposition is the process of breaking down in parts or
elements.
It replaces a relation with a collection of smaller relations.
It breaks the table into multiple tables in a database.
It should always be lossless, because it confirms that the
information in the original relation can be accurately
reconstructed based on the decomposed relations.
If there is no proper decomposition of the relation, then it
may lead to problems like loss of information.
Types of Decomposition

There are two types of decomposition :

1). Lossy Decomposition


2). Lossless Join Decomposition
1)Lossy Decomposition
"The decomposition of relation R into R1 and R2 is lossy
when the join of R1 and R2 does not yield the same
relation as in R.”

 One of the disadvantages of decomposition into two or


more relational schemes (or tables) is that some
information is lost during retrieval of original relation or
table.
EXAMPLE: Consider that we have table STUDENT with three attribute roll_no ,
sname
and department.
2)Lossless Join Decomposition
"The decomposition of relation R into R1 and R2 is lossless
when the join of R1 and R2  yield the same relation as in R.

A relational table is decomposed (or factored) into two or
more smaller tables, in such a way that the designer can
capture the precise content of the original table by joining
the decomposed parts.
This is called lossless-join (or non-additive join)
decomposition.
This is also referred as non-additive decomposition.
The lossless-join decomposition is always defined with
respect to a specific set F of dependencies.
EXAMPLE: Consider that we have table STUDENT with three attribute roll_no ,
sname and department.
Problems Related to
Decomposition
Unless we are careful, decomposing a relation schema can create
more problems than it solves.
Two important questions must be asked repeatedly:

1.Do we need to decompose a relation?

If a relation schema is in one of these normal forms, we know that


certain kinds of problems cannot arise. Considering the normal form
of a given relation schema can help us to decide whether or not to
decompose it further.

If we decide that a relation schema must be decomposed further, we


must choose a particular decomposition(i.e., a particular collection
of smaller relations to replace the given relation).
Cont……
2. What problems (if any) does a given decomposition cause?

The lossless-join property enables us to recover any instance of the


decomposed relation from corresponding instances of the smaller
relations.

The dependency-preservation property enables us to enforce any


constraint on the original relation by simply enforcing some
constraints on each of the smaller relations.

That is, we need not perform joins of the smaller relations to check
whether a constraint on the original relation is violated.
PROPERTIES OF DECOMPOSITION
Following are the properties of Decomposition,

1. Lossless Decomposition

2. Dependency Preservation

3. Lack of Data Redundancy


1. Lossless Decomposition
Decomposition must be lossless.
It means that the information should not get lost from the relation that
is decomposed.
It gives a guarantee that the join will result in the same relation as it
was decomposed.
Example:
Let's take 'E' is the Relational Schema, With instance 'e';
is decomposed into: E1, E2, E3, . . . . En;
With instance: e1, e2, e3, . . . . en, If e1 ⋈ e2 ⋈ e3 . . . . ⋈ en, then it
is called as 'Lossless Join Decomposition'.

------- In the above example, it means that, if natural joins of all


the decomposition give the original relation, then it is said to be
lossless join decomposition.
Decompositio
n.
2. Dependency Preservation
Dependency is an important constraint on the database.
Every dependency must be satisfied by at least one
decomposed table.
If {A → B} holds, then two sets are functional dependent.
And, it becomes more useful for checking the
dependency easily if both sets in a same relation.
This decomposition property can only be done by
maintaining the functional dependency.
In this property, it allows to check the updates without
computing the natural join of the database structure.
3. Lack of Data Redundancy
Lack of Data Redundancy is also known as a Repetition of
Information.
The proper decomposition should not suffer from any data
redundancy.
The careless decomposition may cause a problem with the
data.
The lack of data redundancy property may be achieved by
Normalization process.
FUNCTIONAL DEPENDENCIES (FDs)
• It is a form of integrity constraint that can identify schemas
with such problems and suggest refinements.

A functional dependency X  Y holds over relation schema R if,


for every allowable instance r of R:
t1  r, t2  r, X (t1) = X (t2)
implies Y (t1) = Y (t2)
(where t1 and t2 are tuples; X and Y are sets of attributes)

In other words: X  Y means


Given any two tuples in r, if the X values are the same, then the Y
values must also be the same. (but not vice versa)
Read “” as “determines”.
Cont……
An FD is a statement about all allowable relations.
Must be identified based on semantics of application.
Given some instance r1 of R, we can check if r1 violates
some FD f, but we cannot determine if f holds over R.

Question: How related to keys?


If “K  all attributes of R” then K is a superkey for R
(does not require K to be minimal.)

FDs are a generalization of keys.


FUNCTIONAL DEPENDENCIES
(FDs)
A functional dependency (FD) is a relationship between two
attributes, typically between the PK and other non-key
attributes within a table.
For any relation R, attribute Y is functionally dependent on
attribute X (usually the PK), if for every valid instance of X, that
value of X uniquely determines the value of Y.
This relationship is indicated by the representation below :
X Y
The left side of the above FD diagram is called the determinant,
and the right side is the dependent.
Reasoning About FDs
The reasons about FDs are in two ways:
1. CLOSURE OF A SET OF FDs
2. ATTRIBUTE CLOSURE

1. CLOSURE OF A SET OF FDs (FDC):

The set of all FDs implied by a given set F of FDs is called the closure
of F and is denoted as F+.

An important question is how we can infer, or compute, the closure of a


given set F of FDs.

The following three rules, called Armstrong's Axioms, can be applied


repeatedly to infer all FDs implied by a set F of FDs.
1. CLOSURE OF A SET OF FDs
Given some FDs, we can usually infer or compute
additional FDs:
ssn did, did lot implies ssn lot
An FD f is implied by a set of FDs F if f holds whenever all
FDs in F hold.
F + = closure of F is the set of all FDs that are implied by F.
Armstrong’s Axioms (OR) AA (X, Y, Z are sets of attributes)
:
Reflexivity: If X C Y, then Y X
Augmentation: If X Y, then XZ YZ for any Z
Transitivity: If X Y and Y Z, then X Z
These are sound and complete inference rules for FDs!
Cont…..
Couple of additional rules (that follow from AA):
Union: If X Y and X Z, then X YZ
Decomposition: If X YZ, then X Y and X
Z

Example:
Contracts(cid,sid,jid,did,pid,qty,value), and:
C is the key: C CSJDPQV
Project purchases each part using single contract: JP
C
Dept purchases at most one part from a supplier: SD
P
JP C, C CSJDPQV imply JP CSJDPQV
SD P implies SDJ JP
SDJ JP, JP CSJDPQV imply SDJ
2. ATTRIBUTE CLOSURE (AC)
Closure of a Set of Attributes X (Attribute closure), from
functional dependencies F, is the set of attributes which
are functionally dependent from the set of Attributes X
and it is denoted with X+.

Properties of a Key:
A key X, which belongs in relation R has the following
properties:
1. X ⊆ R
2. X → R (complete key)
3. There is no X′ ⊂ X such that X′ →R (minimal key)
Example for producing a FD based on AA
Given a relation R with attributes W, U, V, X, Y, Z and functional
dependencies: W UV, U Y, VX YZ. Prove that it holds: WX
Z.

Solution:
1. (with decomposition) From W V we take W V.
2. (with augmentation) WX VX.
3. (with transitivity) WX YZ.
4. (with decomposition) WX Z.
Algorithm to find the Closure of a Set of attributes X

Given a relation R και its functional dependencies


F+, find the closure of attribute A.

1. Let X=A.
2. Among the functional Dependencies of F+, we
search for dependencies C D, where C ⊆ X. If
we found such a dependency, then we add D in X.
3. We repeat Step 2 till we cannot add additional
attributes in X.
Example 1
Let R= (V, Y, Z, W) and F+ = {V Z, VZ W, W Y, VY W}
Find the closure of attribute V.

Solution:

Step 1: X=V.
Step 2: X=VZ because of V Z.
Step 3: X=VZW because of VZ W.
Step 3: X=VZWY because of W Y.
Step 3: No more repeats can be made.
Example 2
Let R = ( V, Y, Z, W) and F+ = {V Υ, W Y, V W}
Find the closure of attribute V.

Solution:

Step 1: X=V.
Step 2: X=VY because of V Y.
Step 3: X=VYW because of V W.
Step 3: no more repeats can be made.
Other Kinds of Dependencies
(OR) Types of Functional Dependencies:

Multivalued dependency
Partial Dependency
Transitive dependency
Trivial functional dependency
Non-trivial functional dependency
Multivalued Dependency
Multivalued Attributes (or repeating groups): non-key
attributes or groups of non-key attributes the values
of which are not uniquely identified by (directly or
indirectly) (not functionally dependent on) the value
of the Primary Key (or its part).

S TUDENT

S tu d _ I D Nam e C o u r se _ ID U n i ts

101 Lennon M S I 250 3.00

101 Lennon M S I 415 3.00

125 Jo h n s o n M S I 331 3.00


Partial Dependency
Partial Dependency – when an non-key attribute is
determined by a part, but not the whole, of a
COMPOSITE primary key.
Partial
CUS TO MER Dependency

C u st_ I D Nam e O rd e r_ ID

101 A T& T 1234

101 A T& T 156

125 C is c o 1250
Transitive Dependency
Transitive Dependency – A functional dependency is said to be
transitive if it is indirectly formed by two functional dependencies.
The advantage of removing transitive dependency is -
• Amount of data duplication is reduced.
• Data integrity achieved.

Transitive
Dependency

EMPLO YEE

E m p _ ID F_Nam e L_Nam e D e p t_ I D D e p t_ N a m e

111 M a ry Jo n e s 1 A cct

122 S a ra h S m it h 2 M k tg
Trivial Functional dependency
The dependency of an attribute on a set of attributes is known as trivial
functional dependency if the set of attributes includes that attribute.
Symbolically: A ->B is trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A->A & B->B

Example:
Consider a table with two columns: Student_id and Student_Name.

{Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as


Student_Id is a subset of {Student_Id, Student_Name}. 

That makes sense because if we know the values of Student_Id and


Student_Name then the value of Student_Id can be uniquely determined.

Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial
dependencies too.
Non - Trivial Functional
dependency
If a functional dependency X->Y holds true where Y is not a subset of X then this dependency is
called non trivial Functional dependency.
Example:
An employee table with three attributes: emp_id, emp_name, emp_address.

The following functional dependencies are non-trivial:

emp_id -> emp_name (emp_name is not a subset of emp_id)


emp_id -> emp_address (emp_address is not a subset of emp_id)

On the other hand, the following dependencies are trivial:

{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}]

Completely non trivial FD:

If a FD X->Y holds true where X intersection Y is null then this dependency is said to be
completely non trivial function dependency.
EXERCISE
Suppose you are given the following functional
dependencies:
fd1: name → address, gender
fd2: address → rank
fd3: rank, gender → salary

--- Give a primary key of the relation r(name, address,


gender, rank, salary). Prove your answer formally using
Armstrong's Axioms.
SOLUTION
name is the primary key. To prove this, first prove that name is a superkey, then prove it is a candidate key and finally a primary key.

To prove that name is a superkey:

1 name → address decomposition on fd1


2 name → gender decomposition on fd1
3 name → rank 3transitivity on 1. and fd2
4 name, gender → salary 3. & pseudo-transitivity on fd2
5 name → name, gender augmentation of 2.
6 name → salary transitivity on 4. and 5.
7 name → name, address, union of 1., 2., 3., 6. &
rank, gender, salary augmentation w. name
From the given FD's, it is not possible to have an FD that only has 'name’ on its right-hand side, therefore, the attribute name must be part of any
super key. As the definition of a candidate key is a minimal super key and {name} is a superkey, and the empty set is not a superkey, 'name'
must be
a candidate key.
Since name is a candidate key and it is the only candidate key, it is also the primary key.
Functional Dependencies:

FDs are constraints that are derived from the meaning and
interrelationships of the data attributes

A set of attributes X functionally determines a set of attributes Y if the


value of X determines a unique value for Y

X Y holds if whenever two tuples have the same value for X, they must
have the same value for Y

For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X],
then t1[Y]=t2[Y]

X Y in R specifies a constraint on all relation instances r(R)


Armstrong's inference rules:

IR1. (Reflexive) If Y is subset-of X, then X Y

IR2. (Augmentation) If X Y, then XZ YZ


(Notation: XZ stands for X U Z)

IR3. (Transitive) If X Y and Y Z, then X Z

Some additional inference rules that are useful:

(Decomposition) If X YZ, then X Y and X Z

(Union) If X Y and X Z, then X YZ

(Psuedotransitivity) If X Y and WY Z, then WX Z


Functional Dependencies:

Closure of a set F of FDs is the set F+ of all FDs that can be inferred
from F.

Closure of a set of attributes X with respect to F is the set X + of all


attributes that are functionally determined by X

Two sets of FDs F and G are equivalent if:

- every FD in F can be inferred from G, and

- every FD in G can be inferred from F

Hence, F and G are equivalent if F + =G +

Definition: F covers G if every FD in G can be inferred from F


(i.e., if G + subset-of F +)

F and G are equivalent if F covers G and G covers F


Minimal Sets of FDs

A set of FDs is minimal if it satisfies the following conditions:

(1) Every dependency in F has a single attribute for its RHS.

(1) We cannot replace any dependency X A in F with a dependency Y


A, where Y proper-subset-of X ( Y subset-of X) and still have a set of
dependencies that is equivalent to F.

(1) We cannot remove any dependency from F and have a set of


dependencies that is equivalent to F.
Normal Forms
Normalization of Relations

Normalization: The process of decomposing unsatisfactory "bad" relations


by breaking up their attributes into smaller relations

Normal form: Condition using keys and FDs of a relation to certify whether a
relation schema is in a particular normal form

Relationship Between Normal Forms


Types of Normal Form
First Normal Form (1NF) - included in the definition of a relation.

Second Normal Form (2NF) defined in terms of


functional dependencies
Third Normal Form(3NF)

Boyce-Codd Normal Form ( BCNF)

Fourth Normal Form (4NF) - defined using multivalued


dependencies

Fifth Normal Form (5NF) or Project Join Normal Form (PJNF) -


defined using join dependencies
Normal Forms:

First Normal Form


‘A relation R is in first normal form (1NF) if and only if all underlying
domains contain atomic values only.’

Second Normal Form


‘A relation R is in second normal form (2NF) if and only if it is in 1NF and
every non-key attribute is fully dependent on the primary key.’

Third Normal Form


‘A relation R is in third normal form (3NF) if and only if it is in 2NF and
every non-key attribute is non transitively dependent on the primary key.’

Boyce-Codd Normal Form


‘A relation R is in Boyce-Codd normal form (BCNF) if and only if every
determinant is a candidate key.’
First Normal Form(1NF):

‘A relation R is in first normal form (1NF) if and only if all underlying


domains contain atomic values only.’

Disallows composite attributes, multivalued attributes, and nested


relations; attributes whose values for an individual tuple are non-atomic
First Normal Form: Example Multivalued attributes

Department
DNAME DNUMBER DMGRSSN DLOCATIONS

DNAME DNUMBER DMGRSSN DLOCATIONS


Research 5 333445555 Bellaire, Sugarland, Houston
Administration 4 987654321 Stanfford
Headquarters 1 888665555 Houston
First Normal Form: Solutions

There are three main techniques to achieve first normal form for such a
relation:

1. Remove the attribute DLOCATIONS that violates 1NF and place it in a


separate relation DEPT_LOCATIONS along with the primary key
DNUMBER of DEPARTMENT.

2. Expand the key so that there will be a separate tuple in the original
DEPARTMENT relation for each location of a DEPARTMENT. In this
case, the primary key becomes the combination {DNUMBER,
DLOCATION}.

3. If a maximum number of values is known for the attribute-for example,


if
it is known that at most three locations can exist for a department-
replace the DLOCATIONS attribute by three atomic attributes:
DLOCATIONl, DLOCATION2, and DLOCATION3.
First Normal Form: Solutions to example
There are three main techniques to achieve first normal form for such a
relation:
Department Department_Loc
1. DNAME DNUMBER DMGRSSN DNUMBER DLOCATIONS

DNAME DNUMBER DMGRSSN DLOCATIONS


2.
Research 5 333445555 Bellaire
Research 5 333445555 Sugarland
Research 5 333445555 Houston
Administration 4 987654321 Stanfford
Headquarters 1 888665555 Houston

3.

DNAME DNUMBER DMGRSSN DLOCATIONS1 DLOCATIONS2 DLOCATIONS3


First Normal Form: Best solution

Of the three solutions above, the first is generally considered best


because it does not suffer from redundancy and it is completely general,
having no limit placed on a maximum number of values.
First Normal Form: Nested relations
First normal form also disallows multivalued attributes that are
themselves composite.
These are called nested relations because each tuple can have a
relation within it.
PROJS
EMP_PROJ ENO ENAME PNUMBER HOURS

ENO ENAME PNUMBER HOURS


11111 Mahesh 1 32.5
2 7.5
22222 Ramesh 2 40.5
3 35.6
33333 Kiran 2 25.8
10 45
20 38.7
44444 Karthik 30 9.7
First Normal Form: Solution

To normalize this into INF,


we remove the nested relation attributes into a new relation and
propagate the primary key into it;
the primary key of the new relation will combine the partial key with the
primary key of the
original relation.

EMP_PROJ1
ENO ENAME

EMP_PROJ2
ENO PNUMBER HOURS
First Normal Form: Problem
Person
ENO CAR_LIC PHONE

Where CAR_LIC and PHONE are multivalued attributes

Is this in 1NF, if no decompose.


Example
Example
A row of data cannot contain repeating group of data i.e atomic value
Cid Cname Subject
101 Jeet PHY
101 Jeet CHE
102 Seet PHY
103 Swet SOCIAL

Here the student Jeet is used twice in the table and subject PHY is repeated Another
method is
to divide the relation into 2
Second Normal Form(2NF):

A relation R is in second normal form (2NF) if and only if it is in 1NF and


every non-key attribute is fully functionally dependent on the primary key
(or candidate key).

A functional dependency X Y is a full functional dependency if removal


of any attribute A from X means that the dependency does not hold any
more.
Second Normal Form(2NF): Examples

EMP_PROJ

ENO PNUMBER HOURS ENAME PNAME PLOCATION


FD1

FD2

FD3
Second Normal Form(2NF): Testing

The test for 2NF involves testing for functional dependencies whose left-
hand side attributes are part of the primary key.

If the primary key contains a single attribute, the test need not be applied.

If a relation schema is not in 2NF, it can be "second normalized" or "


2NFnormalized" into a number of 2NF relations in which nonprime
attributes are associated only with the part of the primary key on which they
are fully functionally dependent.
Second Normal Form(2NF): Solution to example

In functional dependency FD1, the attribute HOURS is fully functionally


dependent on the primary key {SSN, PNUMBER}.

The functional dependencies FD2 and FD3 make ENAME, PNAME, and
PLOCATION partially dependent on the primary key {SSN, PNUMBER}.

Hence, decompose EMP_PROJ into the three relation schemas


EMP_PROJ1, EMP_PROJ 2, and EMP_PROJ 3

EMP_PROJ1 EMP_PROJ2
ENO PNUMBER HOURS ENO ENAME
FD1 FD2

EMP_PROJ3
PNUMBER PNAME HOURS
FD3
Example
Consider the following relation, not in 2NF

-HerHere Cid & Order_id is PK


-It is in 1NF
-Not in 2NF, there are partial dependencies of columns on Primary key
-Cname is only dependent on Cid
-Order_name is dependent on order_id
-There is no link between Cname & Sale_details
-To reduce this table into 2NF, break the table into 3 different tables
e Cid & Order_id is PK
-It is in 1NF
-Not in 2NF, there are partial dependencies of columns on Primary key
-Cname is only dependent on Cid
-Order_name is dependent on order_id
Cont……

Here Cid & Order_id is PK


-It is in 1NF
-Not in 2NF, there are partial dependencies
of columns on Primary key
-Cname is only dependent on Cid
-Order_name is dependent on order_id
-There is no link between Cname &
Sale_details
-To reduce this table into 2NF, break the
table into 3 different tables
Second Normal Form(2NF): Example2
Consider a relation R with simple attributes {A, B, C, D, E, F}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A BCDEF
BC ADEF
B F
D E

Is R in 1NF, if no decompose.
Is R in 2NF, if no decompose.
Third Normal Form(3NF):

A relation R is in third normal form (3NF) if and only if it is in 2NF and


every non-key attribute is non transitively dependent on the primary key.

A relation schema R is in third normal form (3NF) if, whenever a


nontrivial functional dependency X A holds in R, either (a) X is a super
key of R, or (b) A is a prime attribute of R.
Third Normal Form(3NF): Example

EMP_PROJ

ENO ENAME DOB ADDRESS DNUMBER DNAME DMGRENO

Here the attributes DNAME and DMGRENO are transitively dependent


on ENO.
Third Normal Form(3NF): Solution to example

Decompose the relation into two relations.

EMP_PROJ1

ENO ENAME DOB ADDRESS DNUMBER

EMP_PROJ2

DNUMBER DNAME DMGRENO


Third Normal Form(3NF): Example2
Consider a relation R with simple attributes {A, B, C, D, E}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A BCDE
BC ADE
D E

Is R in 3 NF, if no decompose.
Example
-With exam_name and total_marks added to our Score table, it saves more data now.
Primary key for our Score table is a composite key, which means it's made up of two
attributes or columns → student_id + subject_id.

-Our new column exam_name depends on both student and subject. For example, a


mechanical engineering student will have Workshop exam but a computer science
student won't. And for some subjects you have Prctical exams and for some you don't. So
we can say that exam_name is dependent on both student_id and subject_id.

-And what about our second new column total_marks? Does it depend on our Score
table's primary key?

-Well, the column total_marks depends on exam_name as with exam type the total score
changes. For example, practicals are of less marks while theory exams are of more marks.

-But, exam_name is just another column in the score table. It is not a primary key or even a
part of the primary key, and total_marks depends on it.

-This is Transitive Dependency. When a non-prime attribute depends on other non-prime


attributes rather than depending upon the prime attributes or primary key.
Boyce-Codd Normal Form(BCNF):

A relation R is in Boyce-Codd normal form (BCNF) if and only if every


determinant is a candidate key.

A relation schema R is in BCNF if whenever a nontrivial functional


dependency X A holds in R, then X is a super key of R.

The only difference between the definitions of BCNF and 3NF is that
condition (b) of 3NF, which allows A to be prime, is absent from BCNF.
3NF Table Not in BCNF

Figure 4.7
Decomposition of Table
Structure to Meet BCNF
BCNF Conversion Results
Boyce-Codd Normal Form(BCNF): Example
Consider a relation R with simple attributes {A, B, C, D, E}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A BCD
BC AD
D B

The relation R is in 3NF but not in BCNF.


After BCNF normalization
R1{A,C,D}
R2{D,B}
Boyce-Codd Normal Form(BCNF): Example2

Consider a relation TEACH with the following dependencies:


FD1: {STUDENT, COURSE} INSTRUCTOR
FD2: INSTRUCTOR COURSE

Note that {STUDENT, COURSE} is a candidate key for this relation


Boyce and Codd Normal Form
(BCNF)
Boyce and Codd Normal Form is a higher version of the Third Normal
form. This form deals with certain type of anomaly that is not
handled by 3NF.

A 3NF table which does not have multiple overlapping candidate keys
is said to be in BCNF.

For a table to be in BCNF, following conditions must be satisfied:


- R must be in 3rd Normal Form.
- and, for each functional dependency ( X → Y ), X should be a
super Key.
Example
Explanation about example
Cont……

Why the table is not in BCNF?


In the table above, student_id, subject form primary key, which
means subject column is a prime attribute.
But, there is one more dependency, professor → subject.
And while subject is a prime attribute, professor is a non-prime attribute,
which is not allowed by BCNF.

How to satisfy BCNF?


To make this relation(table) satisfy BCNF, we will decompose this table
into two tables, student table and professor table.
Cont……

And now, this relation satisfy Boyce-Codd Normal Form.


Problems with Decompositions:

Three problems to consider:

 Some queries become more expensive.

 Given instances of the decomposed relations, we may not be able


to reconstruct the corresponding instance of the original relation!

 Checking some dependencies may require joining the instances of


the decomposed relations.
Lossless Join Decompositions:

Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for


every instance r that satisfies F:

Definition extended to decomposition into 3 or more relations in a


straightforward way.

It is essential that all decompositions used to deal with redundancy be


lossless!
Dependency Preserving Decomposition:

If R is decomposed into X, Y and Z, and we enforce the FDs that hold on


X, on Y and on Z, then all FDs that were given to hold on R must also
hold.
Decomposition of R into X and Y is dependency preserving

if (FX union FY ) + = F +

e.g., CSJDPQV, key C, JP C, SD P, J S


To deal with SD P, decompose into SDP, CSJDQV.
To deal with J S, decompose CSJDQV into JS and CJDQV
To ensure dependency preservation, one idea:
If X Y is not preserved, add relation XY.
Problem is that XY may violate 3NF!
e.g., consider the addition of CJP to `preserve’ JP C.
Refinement: Instead of the given set of FDs F, use a minimal cover for F.
Multivalued Dependencies:
Multivalued Dependencies: Definition

Let R be a relation schema and let X and Y be subsets of the attributes of R.

The multivalued dependency X Y is said to hold over R if, in every legal


instance r of R, each X value is associated with a set of Y values and this set
is independent of the values in the other attributes.

Formally, if the MVD X Y holds over R and Z = R−XY , the following must
be true for every legal instance r of R:

If t1 є r, t2 є r and t1:X = t2:X, then there must be some t3 є r such that


t1:XY = t3:XY and t2:Z = t3:Z.
Multivalued Dependencies:

C T
Does this hold on the above relation?
Multivalued Dependencies: Armstrong Axioms

MVD Complementation: If X Y, then X R − XY .

MVD Augmentation: If X Y and W Z, then WX YZ.

MVD Transitivity: If X Y and Y Z, then X (Z − Y )

Replication: If X Y, then X Y.

Coalescence: If X Y and there is a W such that W ∩ Y is empty,


W Z, and Y Z, then X Z.
Fourth Normal Form(4NF):

Let R be a relation schema, X and Y be nonempty subsets of the


attributes of R, and F be a set of dependencies that includes both FDs
and MVDs.

R is said to be in fourth normal form (4NF) if for every MVD X Y that


holds over R, one of the following statements is true:

a) Y X or XY = R, or

b) X is a superkey.
Fourth Normal Form(4NF):

C T We can eliminate the resulting redundancy


by decomposing CTB into CT and CB; each
Is this in 4NF?
of these relations is then in 4NF.
Join Dependencies:
Fifth Normal Form:

You might also like