[go: up one dir, main page]

0% found this document useful (0 votes)
43 views17 pages

Review - Normal Forms2

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views17 pages

Review - Normal Forms2

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 17

CMSC325

Normal Forms
DR. RICHA SHARMA
L O C K H AV E N U N I V E R S I T Y

1
Introduction
 Process for correcting the table structures to minimize
data redundancies and hence, reduce data anomalies!
 We progressively break table into new set of relations
based on identified functional dependencies!

 The normal forms that we try to get include:


 First Normal Form (1 NF)
 Second Normal Form (2 NF)
 Third Normal Form (3 NF)
 Boyce-Codd Normal Form (BCNF)
 Fourth Normal Form (4 NF)
 Fifth Normal Form (5 NF)
2
Introduction (ctd.)
 For most practical business purposes, 3NF is the most
desirable normal form.

 Higher levels of normalization can lead to more complex


database designs and queries – leading to poor performance
of the database!

 Overall normalization process should ensure:


 Each table represents a single entity or object.
 Each row/column intersection contains only one value!
 No data item will be unnecessarily stored in more than
one table.
 All attributes in a table are dependent on the primary key.
 Each table has no insertion, update, deletion anomalies.
3
First Normal Form
 Most basic level of normalization:

 Each table cell should contain only a single value!

 We should eliminate nulls, repeating groups or any


collection type of values to convert a relation to 1NF.

 Example:
 name, address attribute should be separated into distinct
columns.
 multiple data values separated by comma are there on
any attribute, say a list of employees on a project – add
rows for such repeating groups.

4
Functional Dependency
 An attribute B is said to be fully functionally dependent on the
attribute A if each value of A determines one and only one
value of B.
 Example: Employee Id (determinant attribute) determines
Employee name (dependent attribute)
 Attribute A determines attribute B if all of the rows in the table
that agree in value for attribute A also agree in value for
attribute B.
 What if A is a composite key? - If attribute B is functionally
dependent on a composite key A but not on any subset of that
composite key, the attribute B is fully functionally dependent
on A.
5
Functional Dependency
 Partial Dependency – exists when there is a functional
dependence in which the determinant is only part of the
primary key.
 Example: if (A, B) (C, D), B C, and (A, B) is the primary
key, then the functional dependence B C is a partial
dependency!
 Transitive Dependency – exists when there are functional
dependencies such that X Y, Y Z, and X is the primary key,
then the dependency X Z is a transitive dependency
because X determines the value of Z via Y.

6
Second Normal Form
 Eliminates redundant data by requiring that each non-key
attribute should be dependent on the primary key.

 Conversion to 2NF is required when 1NF has a composite


primary key.
 For a relation to be in 2NF, it must be in 1NF and must not
contain any partial dependency! i.e., no non-prime attribute
(attributes which are not part of any candidate key) is
dependent on any proper subset of any candidate key of the
table.

7
Third Normal Form
 3NF builds on 2NF by requiring that all non-key attributes are
independent of each other.
 A table is in 3NF if it is in 2NF and contains no transitive
dependencies.
 Is this table in 1NF, 2NF, 3NF?
Student Id Course Code Name City Course Name Semester Year Grade

S1 C1 ABC Lock Haven Python Prog Fall 2022 A

S1 C2 ABC Lock Haven DB Fundamentals Spring 2023 B

S2 C3 XYZ New York Advanced SQL Fall 2023

S2 C1 XYZ New York Python Prog Fall 2022 B

8
Third Normal Form
 Is this table in 1NF, 2NF, 3NF?

Branch Code City Contact Number Mgr Id Mgr Name

B001 Lock Haven 1111111100 E04 ABC11

B002 Jersey Shore 2222111100 E05 XYZ11

B003 State College 4444411100 E15 ABC12

B004 Williamsport 7771111100 E24 XYZ12

9
BCNF
 BCNF is a stricter form of 3NF that ensures that each
determinant in a table is a candidate key.
 BCNF ensures that each non-key attribute is dependent
only on the candidate key!

Image Source: Chapter 6: Database Systems by Coronel & Morris


10
BCNF (ctd.)
 Here, we have following functional dependencies with (A, B)
as primary key:
(A, B) C, D
(A, C) B, D
C B
 (A, C) too is a candidate key!
 Dependencies indicate there is neither partial dependency nor
transitive dependency – this relation is in 3NF but not in
BCNF!
 Here, one key attribute is determining part of the primary key
– this cannot be called partial or transitive dependency as the
dependent is a prime attribute!
11
BCNF (ctd.)
 To get this corrected, lets consider (A, C) as primary key, then
we’ll have the problem of partial dependency with C
determining B!

 To fix that problem, we can split the relation into relations:

 (A, C, D)
 (C, B)
 Both of these relations are in 3NF as well as BCNF!

12
Fourth Normal Form
 4NF is a further refinement of BCNF that ensures that a table
does not contain any multi-valued dependencies.
 For a table to satisfy the 4NF, it should satisfy the following
two conditions:
 It should be in BCNF.
 The table should not have any Multi-valued
Dependency.

 For a dependency A -> B, if for a single value of A, multiple


values of B exist, then the table is said to have a multi-valued
dependency.
 The table should have at least 3 attributes and B and C
should be independent for A ->> B multivalued dependency.

13
Fifth Normal Form
 5NF is the highest level of normalization, also known as
Project-join NF (PJ/NF).
 It involves decomposing a table into smaller tables to remove
data redundancy, improve data integrity and ensure lossless
join property (no join dependency)
 Lossless join property - ensures that no spurious or extra
tuples are generated when relations are reunited through a
natural join.

 A relation R is said to be in 5NF if:


 R is already in 4NF.
 R does not have join dependency.

14
Closure of a set of attributes
 Closure of an Attribute: it is a set of attributes that can be
functionally determined from it using the given functional
dependencies and the inference rules from FDs [closure of
FDs].

 Given R( A, B, C, D) and F = { A → B, B → CD} then closure


of attribute A and {F}+ can be obtained as:

{F}+ = { A → B, B → CD, A → CD }

(A)+ = ABCD

15
Closure to find candidate keys
 Given relational schema R(X, Y, Z, W) and FD= { XYZ → W, XY
→ ZW and X → YZW }. Let’s work out the candidate keys for R.

 We’ll find closure of each FD assuming antecedent as


potential candidate key:

 (XYZ)+ = XYZW
 (XY) + = XYZW
 (X) + = XYZW
 Since antecedent of each FD is determining all the attributes of
the table, therefore each of these is a superkey.
 Candidate key is the minimal superkey, therefore X is the
candidate key here!
16
Closure - exercise
 Given relational schema R( P, Q, R, S, T, U, V) having set of
functional dependencies. Let’s find (QR)+ and (PR)+

F = { P->Q, QR->ST, PTV->V }.

 Given relational schema R( X, Y, Z, W) having set of functional


dependencies: { XY → Z, Z → YW, and W → X }. Let’s find the
superkey and candidate key for R.

17

You might also like