Database Normalisation (WEEK 5)
Outline
Concept of Functional dependency (FD)
The characteristics of FD
Identifying FD for a given relation
FD to identify the primary key for a relation
Normalization Process
Various Normal Forms
First Normal Form(1NF)
Second Normal Form(2NF)
Third Normal Form(3NF)
Case Study : DreamHome
Functional Dependencies (FD)
An important concept associated with normalization.
Given a relation R, a set of attributes B in R is functionally dependent on another set of
attributes A in R, if each value of A is associated with exactly one value of B. Written as A → B
Diagrammatic representation.
A is called the determinant
B is dependant attribute(s)
Depends on the semantics of a relation – hold on all instances of a relation (all the time)
Functional Dependency that holds for all Time
Based on sample data in Staff relation, the following two functional dependencies appear to
hold.
1. staffNo → sName
2. sName → staffNo
However, the only functional dependency that remains true for all possible values for the
staffNo and sName attributes of the Staff relation is:
staffNo → sName
There could be two staffs with the same name
Characteristics of Functional Dependencies
Full functional dependency
Determinants should have the minimal number of attributes necessary to maintain the
functional dependency with the attribute(s) on the right hand-side.
if A and B are attributes of a relation, B is fully functionally dependent on A, if B is
functionally dependent on A, but not on any proper subset of A.
Main characteristics of functional dependencies used in normalization:
There is a one-to-one relationship between the attribute(s) on the left-hand side
(determinant) and those on the right-hand side of a functional dependency.
Holds for all time.
The determinant has the minimal number of attributes necessary to maintain the
dependency with the attribute(s) on the right hand-side.
Transitive Dependencies
A transitive dependency in a relation can potentially cause update anomalies. Transitive
dependency describes a condition where A, B, and C are attributes of a relation such that If A →
B and B → C, then A → C That is C is transitively dependent on A via B (provided that A is not
functionally dependent on B or C).
Identifying Functional Dependencies
Identifying all functional dependencies between a set of attributes is relatively simple if
The meaning of each attribute and the relationships between the attributes are well
understood.
This information should be provided by the enterprise in the form of documentation.
Very often database designer use their common sense and/or experience if the
information is missing
Purpose of Normalization
Normalization is a technique for producing a set of suitable relations that support the data
requirements of an enterprise. Characteristics of a suitable set of relations include:
the minimal number of attributes necessary to support the data requirements of the
enterprise;
minimal redundancy with each attribute represented only once with the important
exception of attributes that form all or part of foreign keys.
Benefits of Normalization
The benefits of using a database that has a suitable set of relations is that the database
will be:
easier for the user to access and maintain the data;
take up minimal storage space on the computer.
Potential benefits for implemented database include:
Updates to the data stored in the database are achieved with a minimal
number of operations us reducing the opportunities for data
inconsistencies.
Reduction in the file storage space required by the base relations thus
minimizing costs.
Normal Forms (NF)
Normal form (NF):
Condition using keys and FDs of a relation to certify whether a relation
schema is in a particular normal form
Three initial Normal Forms are
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
A stronger definition of 3NF
Boyce-Codd Normal Form (BCNF)
Higher Normal Forms
Forth Normal Form (4NF)
Fifth Normal Form (5NF)
Relationship between various normal forms?
Practical Use of Normal Forms
Normalization is carried out in practice so that the resulting designs are of high quality and
meet the desirable properties. The database designers need not normalize to the highest
possible normal form (usually up to 3NF, BCNF or 4NF). The higher normal forms deal with very
rare situations
Denormalization: The process of storing the join of higher normal form relations as a base
relation—which is in a lower normal form.
Diagrammatic Illustration of Normalization
Summary of first three Normal Forms
Normal Form Condition
First Normal Form (1NF) A relation is in 1NF when it contains no repeating groups
of data
Second Normal Form (2NF) A relation is in 2NF when it is in 1NF and when all of its
non-key attributes are fully dependent on its primary key.
Third Normal Form (3NF) A relation is in 3NF when it is in 2NF and all the non-key
attributes are directly dependent on the primary key.
General Definitions of 2NF and 3NF
Second normal form (2NF)
A relation that is in first normal form and every non-primary-key attribute is fully
functionally dependent on any candidate key.
Third normal form (3NF)
A relation that is in first and second normal form and in which no non-primary-key
attribute is transitively dependent on any candidate key.