NIT 4
U
chema Refinement (Normalization)
S
Purpose of Normalization
efinition: Normalization is the process of organizinga database to reduce redundancy and
D
ensure data integrity.
bjectives:
O
Reduce Data Redundancy: Avoids storing duplicate data, which saves storage and reduces
the chance of errors.
Minimize Data Anomalies: Prevents issues during data operations:
Insertion Anomaly: Problems when adding data, often requiring extra fields or incomplete data.
Update Anomaly: Errors from updating redundant data in multiple places.
Deletion Anomaly: Losing crucial data when deleting rows with dependencies.
Uses: Normalization improves data consistency, efficiency,and ease of maintenance.
Functional Dependency (FD)
efinition: A functional dependency exists whenone attribute (or set of attributes) uniquely
D
determines another attribute in a relation.
Notation: Written as \( X \rightarrow Y \), meaningthat if two tuples (rows) share the same
value of \( X \), they must also share the same value of \( Y \).
xample: In a Student table, if StudentID determinesStudentName, then the dependency can
E
be written as StudentID → StudentName. This implies that knowing a student's ID allows us to
uniquely identify their name.
ses in Normalization: Functional dependencieshelp identify and structure data to prevent
U
redundancy and improve data organization.
Normal Forms Based on Functional Dependency
a.First Normal Form (1NF) :
efinition: A table is in 1NF if:
D
Each cell contains only atomic (indivisible) values.
Each row holds unique values, with no repeating groups.
bjective: Ensure the table structure is simple,with single values per cell and each attribute
O
representing one type of data
Example: A table with multiple phone numbers in one cell (like "123-4567, 765-4321") violates
1NF. To meet 1NF, each phone number should be in a separate row or column.
.Second Normal Form (2NF):
b
Definition: A table is in 2NF if:
It is in 1NF.
All non-key attributes are fully dependent on the entire primary key, not just a part of it.
Objective: Remove partial dependencies (where non-keyattributes depend only on part of a
composite primary key).
Example: In an Orders table, if OrderID and ProductIDtogether form a composite primary
key, and ProductName depends only on ProductID, this violates 2NF. ProductName should be
stored in a separate table with ProductID as the key.
c.Third Normal Form (3NF):
Definition: A table is in 3NF if:
It is in 2NF.
All non-primary-key attributes are not transitively dependent on the primary key.
Objective: Remove transitive dependencies (wherenon-key attributes depend on other
non-key attributes).
Example: In a Student table with StudentID, StudentName, DepartmentName, and
DepartmentHead, the dependency DepartmentName → DepartmentHead violates 3NF.
DepartmentHead should be moved to a separate Department table to remove the transitive
dependency.
Lossless Join and Dependency Preserving Decomposition
ossless Join: Ensures that, when a table is decomposed into smaller tables, it can be
L
recombined (or joined) without any data loss.
Dependency Preservation: Ensures that all functional dependencies in the original table are
preserved in the decomposed tables, allowing the database to maintain the same constraints
and integrity.
Importance:
Lossless join guarantees no data loss during table recombination.
Dependency preservation ensures that decomposing a table doesn’t affect the original
relationships and dependencies.
Example:
Surrogate Key
Definition: A surrogate key is an artificial or substitute key used as the primary key, typically
created when there is no suitable natural key.
Characteristics:
Often an auto-incremented integer (like an ID number).
Does not hold any real-world business meaning.
Example: In a Customer table, using CustomerIDas a surrogate key instead of relying on
customer name or email (which might not be unique or stable).
Purpose: Useful for simplifying data management and ensuring unique, stable identifiers in
each table.
Boyce-Codd Normal Form (BCNF)
Definition: A table is in BCNF if:
It is in 3NF.
Every functional dependency \( X \rightarrow Y \) requires that \( X \) be a superkey (a unique
identifier for each row).
Objective: BCNF addresses redundancies that can remain in 3NF by requiring all
dependencies to be supported by superkeys.
Example: In a Classroom table with attributes RoomNumberand Instructor, if Instructor
depends on RoomNumber but RoomNumber isn’t a superkey, it violates BCNF. This table
should be decomposed to meet BCNF.
Multivalued Dependency (MVD) and Fourth Normal Form(4NF)
Definition(MVD):
Occurs when one attribute determines multiple independent values of another attribute, often
causing unnecessary duplication.
Example: In a Project table, if ProjectID determinesboth TeamMembers and Tools, with
TeamMembers and Tools independent of each other, this creates an MVD.
Fourth Normal Form (4NF):
A table is in 4NF if it is in BCNF and has no multivalued dependencies.
Objective: Eliminate tables with MVDs to furtherreduce redundancy.
Fifth Normal Form (5NF)
efinition: A table is in 5NF if it is in 4NF andhas no join dependencies (complex
D
relationships requiring multiple joins for proper data representation).
bjective: 5NF decomposes tables to eliminate complex dependencies, ensuring each data
O
fact stands independently.
xample: Consider a table where a combination of Supplier, Part, and Project needs to be
E
recorded. To achieve 5NF, this information might be split into three tables to avoid needing
multiple joins.
------------------------------------ THE END ---------------------------------------