Chapter 7 - Relational Database Design
Functional Dependency
f : α -> β
α β α β α β
a 1 a 1 a 1
b 2 a 2 b 1
c 3 c 3 c 3
d 4 d 4 d 4
Given the value of α, one can search value of β
Functional Dependency
α -> β
Trivial Non Trivial
AB -> A β ⊄α
β⊆α AB -> ABC
Functional Dependency
R
A B C D E
a 2 3 4 5 α -> β
2 a 3 4 5
If α is unique, no need to check β
a 2 3 6 5 If β is same, no need to check α
a 2 3 6 6
Find which of the following is true
1. A -> BC
2. DE -> C
3. C -> DE
4. BC -> A
Functional Dependency
R
A B C
1 2 4
3 5 4
3 7 2
1 4 2
Find which of the following is true
1. A -> B && BC -> A
2. C -> B && CA -> B
3. B -> C && AB -> C
4. A -> C && BC -> A
Closure of Attribute Set
Attribute closure of an attribute set ‘A’ can be defined
as a set of attributes which can be functionally
determined from it denoted by F+
R(ABC)
A-> B
B -> C
Find (A) +
(A) + = A
= AB
= ABC
Closure of Attribute Set
R(ABCDEF)
A -> B
C -> DE
AC -> F
D -> AF
E -> CF
Find (D) +
(D) + = D
= DAF
= DABF
Find (DE) +
(DE) + = DE
= DAFE
= DAFCE
= DABFCE
Closure of Attribute Set
R(ABCDEFG) Find (AC) + = ABCDE
A -> B
BC -> DE
AEG -> G
R(ABCDE)
A -> BC Find (B) + = BD
CD -> E
B -> D
E -> A
R(ABCDEF)
AB-> C Find (AB) + = ABCDE
BC -> AD
D -> E
CF -> B
R(ABCDEFGH)
A -> BC Find whether BCD-> H
= yes
CD -> E
E -> C
ED-> AEH
ABH -> BD
DH -> BC
Equivalence of Functional Dependency
R(ACDEH)
F: A C G: A -> CD
AC -> D E -> AH
E -> AD
E -> H
Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
Soln :- A+ = ACD A+ = ACD
AC + = ACD E + = ACDEH
E + = ACDEH
F⊆G G⊆F
Therefore, c. F=G is true
Equivalence of Functional Dependency
R(PQRS)
X: P Q Y: P -> QR
Q -> R R -> S
R -> S
Find which one holds true
a. X ⊆ Y
b. Y ⊆ X
c. X = Y
d. X ≠ Y
Soln :- P+ = PQRS P+ = PQRS
Q+ = Q R + = RS
R + = RS
X ⊄ Y therefore X≠Y
Therefore, b. Y ⊆ X is correct
Equivalence of Functional Dependency
R(VWXYZ)
F: WX G: W -> XY
WX -> Y Z -> WX
Z -> WY
Z -> V
Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
Soln :- ?
Armstrong’s Axiom
Primary Rules(RAT)
1. Reflexivity :- if y ⊆ x AB ⊆ ABC
then, x-> y ABC -> AB
2. Augmentation :- if x -> y
then, xz -> yz
3. Transitivity :- if x-> y && y-> z
then x -> z
Secondary Rules
1. Union:- if x-> y && x-> z
then, x -> yz
2. Decomposition :- if x-> yz
then, x-> y && x-> z
Eg:- AB -> CD UnionAB -> C
AB -> C A -> C Decomposition
AB -> D B -> C
Armstrong’s Axiom
Secondary Rules
3. Pseudo transitivity:- if x-> y && wy-> z
then, wx -> z
Proof:- x -> y
wx -> wy (Augmentation)
wy -> z (given)
therefore, wx -> z (Transitivity)
4. Composition :- if x-> y && z-> w
then, xz -> yw
Proof :- x -> y xz -> yz (Augmentation)
z -> w xz -> xw (Augmentation)
therefore, xz -> yzxw
xz -> yw
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X
WZ -> Y
Y -> W
Y -> X
Y -> Z
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X
WZ -> Y
Y -> W
Y -> X
Y -> Z
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X WZ + = WZXY WZ + = WZXY
WZ -> Y WZ + = WZYX WZ + = WZ
Y -> WY + = YWXZ Y + = YWXZ
Y -> X Y + = YXZW Y + = YZ
Y -> Z Y + = YXZW Y + = YWX
Canonical Cover
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 3: Consider remaining dependencies
X -> W
WZ -> Y
Y -> X
Y -> Z
STEP 4: Decompose LHS
WZ + = WZYX
W+ = W
Z+ = Z
OUTPUT:
X -> W
WZ -> Y
Y -> XZ
Find Canonical Cover
R(ABCD)
A -> B
C -> B
D -> ABC
AC -> D
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
A -> B A + = AB A + = A
C -> B C + = CB C + = C
D -> A D + = DABC D + = DBC
D -> B D + = DABC D + = DABC
D -> C D + = DACB D + = DAB
AC -> D AC + = ACDB AC + = ACB
Find Canonical Cover
R(ABCD)
A -> B
C -> B
D -> ABC
AC -> D
STEP 3: Consider remaining dependencies
A -> B
C -> B
D -> A
D -> C
AC -> D
STEP 4: Decompose LHS
AC + = ACDB
A + = AB
C + = CB
OUTPUT:
A -> B
C -> B
D -> AC
AC -> D
Find Canonical Cover
R(vwxyz)
V -> W
VW -> X
Y -> VXZ
Soln :- ?
Keys
R(ABCD)
A -> BC
A + = ABC
A + ≠ ABCD ‘A’ CANNOT BE A KEY
(Key) + = R
A super key is a set of one or more attributes (columns), which can
uniquely identify a row in a table
R(ABCD)
ABC -> D (ABC) + = ABCD Super Key
AB -> CD (AB) + = ABCD Super Key
Super Key Candidate
A -> BCD (A) + = ABCD
Key
A candidate key is a minimal set of super key
A primary key can be any candidate key selected by DBA.
Keys
R(ABCD)
B -> ACD
ACD -> B
Find Super key, candidate key and primary key
(B) + = ABCD Super Key Candidate
Key
(ACD) = ABCD
+
Super Key Candidate
Find Super key, candidate key and primaryKey
key
R(ABCD)
AB -> C (AB) + = ABCD
C -> BD (C) + = ABCD
D -> A (D) + = AD
Compute Candidate keys
R(ABCD)
A -> B
B -> C
C -> A
Find essential attribute, and then combine it with LHS
attributes
Here D is essential attribute
(AD) + = ABCD
(BD) + = ABCD
(CD) + = ABCD
Here candidate keys are AD, BD, CD
Compute Candidate keys
R(ABCD)
AB -> CD
D -> A
Find essential attribute, and then combine it with LHS
attributes
Here B is essential attribute
(AB) + = ABCD
(BC) + = BC
(BD) + = ABCD
Here candidate keys are AB and BD.
Compute Candidate keys
R(ABCDEF)
AB -> C
C -> D
B -> AE
Here B and F are essential attributes
(BF) + = BFAECD
Here candidate key is BF.
Compute Candidate keys
Q) R(ABCD)
AB -> CD
C -> A
D -> B
Here, no essential attribute present.
Check individual attributes
A+ = A AB + = ABCD
B+ = B AC + = AC
C + = CA AD + = ADBC
D + = DB BC + = BCAD
CK are AB,AD,BC and CD. BD +
= BD
CD + = CDBA
Compute Candidate keys
Q) R(ABCDE)
AB -> CD
D -> A
BC -> DE
Here, no essential attribute present.
Check individual attributes
A+ = A AB + = ABCD
B+ = B AC + = AC
C + = CA AD + = ADBC
D + = DB BC + = BCAD
CK are AB,AD,BC and CD. BD +
= BD
CD + = CDBA
Compute Candidate keys
Q) R(ABCDEF)
AB -> C
DC -> AE
E -> F
CK = ABD,BCD
Q) R(ABCDEF)
AB -> C
C -> D
D -> BE
E -> F
F -> A
CK = C,D,AB,BE,BF
Data Redundancy
Stu ID Name Age Sem Branch ID Branch HOD
Name
1 A 17 1 101 IT XY
2 B 17 1 101 IT XY
3 C 18 1 101 IT XY
4 D 18 1 102 CE PQ
5 E 19 1 102 CE PQ
6 F 17 1 103 IC AB
• Data Redundancy :- when same data is stored multiple times
unnecessarily in a database
• Disadvantages
• Insertion Anomalies, Deletion Anomalies and Updation Anomalies
• Inconsistent Data
• Database size increases, time of search also increases.
Normalization
Stu ID Name Age Sem Branch ID Branch HOD
Name
1 A 17 1 101 IT XY
2 B 17 1 101 IT XY
3 C 18 1 101 IT XY
4 D 18 1 102 CE PQ
5 E 19 1 102 CE PQ
6 F 17 1 103 IC AB
Stu ID Name Age Sem Branch Branch ID Branch Name HOD
ID
101 IT XY
1 A 17 1 101
102 CE PQ
2 B 17 1 101
103 IC AB
3 C 18 1 101
4 D 18 1 102
5 E 19 1 102
6 F 17 1 103
Normalization - 1NF
A relation is in first normal form if and only if the
domain of each attribute contains only atomic
(indivisible) values, and the value of each attribute
contains only a single value from that domain.
Stu ID Stu Name Course Stu ID Stu Name Course
101 ABC DBMS, 101 ABC DBMS
TAFL
101 ABC TAFL
102 PQR CJT,
AMP 102 PQR CJT
102 PQR AMP
Stu_id
Stud Course
Stu Name
Normalization - 1NF
Emp ID Emp Name Contact No
101 ABC 8412332142,
8456321111
102 PQR 9676543212,
9698765412
Emp ID Emp Name Contact No
101 ABC 8412332142
101 ABC 8456321111
102 PQR 9676543212
102 PQR 9698765412
Normalization - 2NF
Must be in 1NF
No partial dependency should be present
R(A B C D)
AB->D
B -> C Partial Dependency exists
Find candidate key
ans :- AB
Prime Attributes = A ,B
Non Prime Attributes = C,D
Normalization - 2NF
A B
R(A B C D)
-- 1 AB->D
2 -- B -> C B -> C
-- -- CK = AB
3 4
C is dependent on B
How to decompose
R1(ABD)
R2(BC)
Normalization - 2NF
A B C R(ABC)
a 1 X B -> C
b 2 Y
CK = ?
a 3 z
c 3 z A B B C
d 3 z a 1 1 X
e 3 z b 2 2 Y
a 3 3 z
• CK = AB c 3
• Prime Attributes = AB d 3
• Non Prime Attributes = C e 3
• Partial dependency exists as C is only
dependent on B instead of AB
R1 R2
• Soln:- ?
• R1(AB)
• R2(BC)
Normalization - 2NF
R(ABCDE)
A -> B Partial Dependency
B -> E
C -> D Partial Dependency
Find whether it is in 2NF, if not then convert it into 2NF
CK = AC
Prime attributes = AC
Non prime attributes = BDE
Soln :-
R1(ABE)
R2(CD)
R3(AC)
Normalization - 2NF
R(ABCDE)
AB -> C
D -> E
Find whether is it in 2NF, if not then convert it into 2NF
Sol = R1(ABC)
R2(DE)
R3(ABD)
Normalization - 2NF
Stud_No Course_No Course_Fee Stud_No Course_No
1 C1 1000 1 C1
2 C2 1500 2 C2
1 C4 2000
1 C4 R1
4 C3 1000
4 C3
4 C1 1000
4 C1
2 c5 2000
2 c5
CK = Stud_No, Course_No
Course_No -> Course_Fee Partial Dependency
R1(Stud_No, Course_No) Course_No Course_Fee
C1 1000
R2(Course_No, Course_Fee)
C2 1500
C3 1000 R2
C4 2000
C5 2000
Normalization - 2NF
Stud_Id Stud_Name P_Id Project_Name
1 Abc 101 HMS
2 Def 102 LMS
3 Pqr 101 HMS
4 Xyz 103 BMS
Find whether it is in 2NF, if not then convert it into 2NF
Find dependency
PD
Stud_Id -> Stud_Name
PD
P_Id -> Project Name
CK = Stud_Id, P_Id
Sol = R1(Stud_Id, Stud_Name)
R2(P_Id, Project_Name)
R3(Stud_Id, P_Id)
Normalization - 3NF
• Must be in 2NF
• No Transitive dependency should be present
• R(ABC)
A -> B
B -> C TD
• Find CK
• CK = A
• Transitive dependency exists if non prime attribute
depends on another non prime attribute
x -> y
• x,y both are non prime
Normalization - 3NF
A B C A B
a 1 x a 1
b 1 x b 1
c 1 x c 1 R1
d 2 y d 2
e 2 y e 2
f 3 z
f 3
g 3 z
g 3
• R(ABC) B C
A -> B 1 x R2
B -> C
2 y
• Find CK
3 z
• CK = A
• Soln : R1(AB)
R2(BC)
Normalization - 3NF
For every dependency from x -> y, transitive dependency
does not exists if
Either x is super key
Or y is a prime attribute
Partial dependency :- Prime -> non prime
Transitive dependency :- non Prime -> non prime
Normalization - 3NF
Q. R(ABCDE)
A -> B
B -> E Not in 3rd NF
C -> D
CK = AC
Prime Attributes = AC
Non Prime Attributes = BDE
Soln = R1(ABE) => R1(AB), R2(BE)
R3(CD)
R4(AC)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C
A -> DE
Not in 3rd NF
B -> F
F -> GH
D -> IJ
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDEFGHIJ
Soln = R1(ABC)
R2(ADE)
R3(DIJ)
R4(BF)
R5(FGH)
Normalization - 3NF
Q. R(ABCDE)
AB -> C
B -> D Not in 3rd NF
D -> E
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDE
Soln = R1(ABC)
R2(BD)
R3(DE)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C A -> I
AD -> GH H -> J
BD -> EF
CK = ABD
Prime Attributes = ABD
Non Prime Attributes = CEFGHIJ
Soln = R1(ABC)
R2(ADGH)
R3(HJ)
R4(BDEF)
R5(AI)
R6(ABD)
Normalization - 3NF
Q. Cust(CID, Cname, Accno, BankCode, Branch)
Find dependency
CID -> Cname, Accno, BankCode
BankCode -> Branch
CK = CID
Prime Attributes = CID
Non Prime Attributes = Cname, Accno, BankCode,
Branch
Soln = R1(CID, Cname, Accno, BankCode)
R2(BankCode, Branch)
Normalization – BCNF or 3.5NF
Boyce-Codd Normal Form
Advanced version of 3NF or stricter version of 3NF
Every relation in BCNF is also in 3NF
Partial dependency :- Prime -> non prime
Transitive dependency :- non Prime -> non prime
Dependency :- prime / non prime -> prime
R(ABC)
CK = AB, AC
AB -> C Prime Attributes = ABC
C -> B Non Prime Attributes = none
No partial dependency
No Transitive dependency
Normalization – BCNF or 3.5NF
R(ABC)
Not in BCNF
AB -> C
C -> B
It is in 3NF,
A relation is in BCNF if and only if there are no non-trivial
functional dependencies of attributes on anything other
than a superset of a candidate key.
A relation is in BCNF if
relation is in 3NF, and
every functional dependency
X → Y, X is the super key of the table.
Normalization – BCNF or 3.5NF
R(ABC)
AB -> C Not in BCNF
C -> B CK = AB, AC
A B C A Soln:-
C R1(CB)
C B
a 1 x a x
x R2(AB) 1
Not considered
b 2 y b y y 2
R3(AC)
z 2
c 2 z c z
c 3 w
Final Soln:-R1(CB)
c w w 3
d 3 w d w R2(AC)
R2
e 3 w e w
R R1
Normalization – BCNF or 3.5NF
R(ABCD) x -> y
BCNF => x should be a super key
A -> BCD BCNF
3NF => x is super key or y is prime
BC -> AD BCNF 2NF => y is non prime and x is super key
D -> B 3NF
Find CK = A , BC
Prime Attributes = A, B, C
Non Prime Attributes = D
Soln :- R1(DB)
R2(ABCD)=>R(ACD)
Normalization – BCNF or 3.5NF
Stu_id Sub
Prof -> Sub Prof 3 NF
101Stu_id,
DBMSSub ->ANV
Prof
101 CJT HBP Prof Sub Stu_id Prof
102 CJT HBP ANV DBMS 101 ANV
102 TAFL NPD HBP CJT 101 HBP
103 DBMS ANV NPD TAFL 102 HBP
104 DBMS VBS VBS DBMS 102 NPD
103 ANV
R1
CK = Stu_id, Sub
104 VBS
Prime Attributes = stu_id, Sub
Non Prime Attributes = Prof R2
Soln : R1(Prof, Sub)
R2(Stu_id, Sub, Prof) => R2(Stu_id, Prof)
Find Normal Forms
BCNF(3 NF, 2NF, 1NF)
3NF(2NF, 1NF)
2NF(1NF)
1NF
Find Normal Forms x -> y
R(ABCDE) BCNF => x should be a super key
3NF => x is super key or y is prime
CE -> D BCNF 2NF => x is super key and y is non prime
or prime
D -> B Not in BCNF Not in 3NF
C -> A Not in 2NF
CK = CE
Prime attribute = CE
Non Prime Attribute = ABD
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDEF) BCNF => x should be a super key
3NF => x is super key or y is prime
AB -> C Not in BCNF 3NF 2NF => x is super key and y is non prime
or prime
DC -> AE Not in 3NF Not in 2NF
E -> F
CK = ABD , BCD
Prime attribute = ABCD
Non Prime Attribute = EF
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDEFGHI) BCNF => x should be a super key
Not in BCNF 3NF => x is super key or y is prime
AB -> C 2NF => x is super key and y is non prime
Not in 3NF
or prime
Not in 2NF
BD -> EF
AD -> GH
A -> I
CK = ABD
Prime attribute = ABD
Non Prime Attribute = CEFGHI
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDE) BCNF => x should be a super key
BCNF 3NF => x is super key or y is prime
AB -> CD 2NF => x is super key and y is non prime
or prime
D -> A Not in BCNF 3NF
BC -> DE
CK = AB, BD, BC
Prime attribute = ABCD
Non Prime Attribute = E
Conclusion is :- Relation is in 3NF
Find Normal Forms x -> y
R(ABCDEF) BCNF => x should be a super key
3NF => x is super key or y is prime
A -> BCDEF BCNF 2NF => x is super key and y is non prime
or prime
BC -> ADEF BCNF
DEF -> ABC BCNF
CK = ?
Prime attribute =
Non Prime Attribute =
Conclusion is :- Relation is in BCNF
Lossless Join Decomposition
Non Additive property
This property guarantees that the extra or loss tuple
generation problem does not occur after
decomposition
It is a mandatory property that must always hold true
A B C D
1 a p X
2 b q y R1, R2
R =>
A B
R1 , R2 => R
C D
1 a p X
2 b q y
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
A B C D
1 a p R =>
X R1, R2
2 b q R1 y, R2 => R
R
A B D
1 a X
2 b y
R1 R2
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅
A B C D A B C D
1 a p X 1 a p X
2 b q y 1 a q Y
R 2 b p x
A B C D 2 b q y
1 a p X R1 X R2
2 b q y
R1 R2
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅
3. Attribute(R1) ∩ Attribute(R2) -> Attribute(R1) or
Attribute(R1) ∩ Attribute(R2) -> Attribute(R1)
A B C A B C
A B B C
1 a p 1 a p
R1 1 a a p R2
R 1 a r R1 ⋈ R2
2 b q 2 b b q
3 a r 2 b q
3 a a r
3 a p
3 a r
Lossless or Lossy decomposition
Find whether the decomposition is Lossless or Lossy
A B C D E
a 122 1 p w
R
b 234 2 q x
a 568 1 r y
c 347 3 s z
1. R1(AB) , R2(CD) Lossy
2. R1(ABC) , R2(DE) Lossy
3. R1(ABCD) , R2(ACDE) Lossless
4. R1(ABCD) , R2(DE) Lossless
5. R1(ABC) , R2(BCD), R3(DE) Lossless
R12(ABCD)
Lossless or Lossy decomposition
Find whether the decomposition is Lossless or Lossy
R(VWXYZ)
Z -> Y
Y -> Z
X -> YV
VW -> X
1. R1(VW) , R2(WXYZ) Nothing can be said
2. R1(VWX) , R2(YZ) Lossy
3. R1(VW) , R2(YZ) Lossy
4. R1(VWX) , R2(XYZ) Lossless
Dependency Preservation
If a Table R is having functional dependency set F, is decomposed into two
tables R1 and R2 having functional dependency set F1 and F2, then
F1 ⊆ F
F2 ⊆ F
(F1 U F2)+ = F+
R(ABC)
A -> B
B -> C
C -> A
R1(AB), R2(BC)
R1(AB) R2(BC)
A -> B B -> C
B -> A C-> B
(C)+ = CBA
There fore, dependency is preserved for given relation
Dependency Preservation
If a Table R is having functional dependency set F, is decomposed into two tables R1 and
R2 having functional dependency set F1 and F2, then
F1 ⊆ F
F2 ⊆ F
(F1 U F2)+ = F+
R(ABCD)
AB -> CD
D -> A
R1(AD), R2(BCD)
R1(AD) R2(BCD)
A -> A B -> B
D -> A C -> C
D -> D
BC -> BC
BD -> BDC
CD-> CD
(AB)+ = AB No Dependency Preservation
BCNF VS 3NF
It is always possible to decompose a relation into
relations in 3NF and
the decomposition is lossless
the dependencies are preserved
It is always possible to decompose a relation into
relations in BCNF and
the decomposition is lossless
it may not be possible to preserve dependencies.
Normalize Relation Table
R(ABCDEFGHIJ)
AB -> C
A -> DE
B -> F
F -> GH
D -> IJ
Find CK = AB
Find Prime and Non Prime Attributes = AB , = CDEFGHIJ
Find in which Normal form relation is = 1NF
Decompose = R1(ABC)
R2(ADEIJ) =>R21(ADE), R22(DIJ)
R3(BFGH) =>R31(BF), R32(FGH)
Find decomposition is lossy or lossless
Find whether dependency is preserved or not.
Design Goals
Goal for a relational database design is:
BCNF.
Lossless join.
Dependency preservation.
If we cannot achieve this, we accept one of
Lack of dependency preservation
Redundancy due to use of 3NF
Interestingly, SQL does not provide a direct way of specifying
functional dependencies other than super keys.
Can specify FDs using assertions, but they are expensive to test
Even if we had a dependency preserving decomposition, using SQL
we would not be able to efficiently test a functional dependency
whose left hand side is not a key.
Normalization – 4NF
Must be in BCNF
Not more than one Multivalued dependency should be present.
Multivalued Dependency :-
A table is said to have multi-valued dependency, if the following
conditions are true,
For a dependency A ->-> B, if for a single value of A, multiple
value of B exists, then the table may have multi-valued
dependency.
A table should have at-least 3 columns for it to have a multi-
valued dependency.
And, for a relation R(A,B,C), if there is a multi-valued
dependency between, A and B, then B and C should be
independent of each other.
Symbol is ->-> or ->>
Normalization – 4NF
Course Teacher Book
DBMS ANV
R(Course,
Korth
Teacher, Book)
Course ->-> Teacher
DBMS ANV Ullman
Course ->-> Book
DBMS VBC Korth
This relation is not in 4NF
DBMS VBC Ullman
Soln :- R1(Course, Teacher)
OS VKP William
R2(Course, Book)
OS VKP Shaw
Course Teacher Course Book
DBMS ANV DBMS Korth
DBMS VBC DBMS Ullman R2
OS VKP OS William
R1 OS Shaw
Normalization – 4NF
A relation schema R is in 4NF with respect to a set D
of functional and multivalued dependencies if for all
multivalued dependencies in D+ of the form ,
where R and R, at least one of the following
hold:
is trivial (i.e., or = R)
is a superkey for schema R
If a relation is in 4NF it is in BCNF
Normalization – 4NF
R =(A, B, C, G, H, I)
• Is dependency preserved in
F ={ A B following decomposition ?
B HI
CG H }
R is not in 4NF since A B and A is not a superkey for R
Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF)
c) R21 = (C, G, H) (R21 is in 4NF)
d) R3 = (A, C, G, I) (R3 is not in 4NF)
Since A B and B HI, A HI, A I
e) R31 = (A, I) (R31 is in 4NF)
f )R4 = (A, C, G) (R4 is in 4NF)
Normalization – 5NF
Must be in 4NF
It should have no join dependency and also the joining must be
lossless R2
A relation in 5NF cannot be decomposed further without any kind of
SID MNo
modification in the meaning or facts.
101 9876543210
5NF is also known as Project Join Normal Form (PJNF).
102 9812345678
SID MNo Activity MNo Activity
102 7777733333
101 9876543210 Dancing 9876543210 Dancing
103 9865432176
101 9876543210 Singing 9876543210 Singing
102 9812345678 Cricket 9812345678 Cricket SID Activity
102 7777733333 Cricket 7777733333 Cricket 101 Dancing
103 9865432176 Singing 9865432176 Singing 101 Singing
R R1 102 Cricket
R3
103 Singing
R = R1 R2 R3
Normalization – 5NF
Sub Lecturer Sem
Comp Adam Sem1
Comp John Sem1
Math John Sem1
Math Smith Sem2
Chemistry Ely Sem1
Denormalization for performance
May want to use non-normalized schema for performance
E.g. displaying customer-name along with account-number and
balance requires join of account with depositor
Alternative 1: Use denormalized relation containing attributes of
account as well as depositor with all above attributes
faster lookup
Extra space and extra execution time for updates
extra coding work for programmer and possibility of error in
extra code
Alternative 2: use a materialized view defined as
account depositor
Benefits and drawbacks same as above, except no extra coding
work for programmer and avoids possible errors
Some design issues
Some aspects of database design are not caught by
normalization
Examples of bad database design, to be avoided:
Instead of earnings(company-id, year, amount), use
earnings-2000, earnings-2001, earnings-2002, etc., all on the
schema (company-id, earnings).
Above are in BCNF, but make querying across years difficult
and needs new table each year
company-year(company-id, earnings-2000, earnings-2001,
earnings-2002)
Also in BCNF, but also makes querying across years difficult
and requires new attribute each year.
Is an example of a crosstab, where values for one attribute
become column names
Used in spreadsheets, and in data analysis tools