Normalization in Database Management Systems (DBMS)
Normalization is a process of organizing data in a database to minimize
redundancy and improve data integrity. It involves decomposing a large table
into smaller, related tables and defining relationships between them.
Goals of Normalization
1. Eliminate Redundancy: Reduce duplicate data to save storage and maintain
consistency.
2. Ensure Data Integrity: Prevent anomalies and maintain logical consistency
in the data.
3. Improve Query Performance: Simplify the structure of the database to
make data retrieval efficient.
Normalization Levels (Normal Forms)
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
First Normal Form (1NF):
● A table is in 1NF if:
○ It has only atomic (indivisible) values.
○ Each column contains unique values (no repeating groups or arrays).
● Example:
Unnormalized Table:
StudentID | Name | Subjects
1 | Alice | Math, Physics
2 | Bob | Chemistry
1NF Table:
StudentID | Name | Subject
1 | Alice | Math
1 | Alice | Physics
2 | Bob | Chemistry
Second Normal Form (2NF):
● A table is in 2NF if:
○ It is in 1NF.
○ All non-key attributes are fully functionally dependent on the primary
key.
● Example:
1NF Table:
StudentID | CourseID | StudentName | CourseName
1 | 101 | Alice | Math
1 | 102 | Alice | Physics
2NF Tables:
Student Table:
StudentID | StudentName
1 | Alice
Course Table:
CourseID | CourseName
101 | Math
102 Physics
Enrollment Table:
StudentID | CourseID
1 | 101
1 | 102
3. Third Normal Form (3NF)
● A table is in 3NF if:
○ It is in 2NF.
○ There are no transitive dependencies (non-key attributes are not
dependent on other non-key attributes).
Example:
2NF Table:
EmployeeID | DepartmentID | EmployeeName | DepartmentName | ManagerID | ManagerName
1 | 101 | John | Sales | 201 | Alice
2 | 102 | Mary HR | 202 | Bob
Explanation:
● The primary key is (EmployeeID).
● Non-key attributes (DepartmentName, ManagerID, ManagerName) are
dependent on DepartmentID, not directly on EmployeeID.
Decomposed 3NF Tables:
Employee Table:
EmployeeID | EmployeeName | DepartmentID
1 | John | 101
2 | Mary | 102
Department Table:
DepartmentID | DepartmentName | ManagerID
101 | Sales | 201
102 | HR | 202
Manager Table:
ManagerID | ManagerName
201 | Alice
202 | Bob
4. Boyce-Codd Normal Form (BCNF)
● A table is in BCNF if:
○ It is in 3NF.
○ Every determinant is a candidate key.
● Example: Consider the following table:
StudentID | CourseID | InstructorID
1 | 101 | 201
1 | 102 | 202
2 | 101 | 201
2 | 103 | 203
Here, the composite primary key is (StudentID, CourseID). However,
InstructorID depends only on CourseID, not on the entire primary key, leading to
a violation of BCNF.
Decomposed Tables in BCNF:
Course-Instructor Table:
CourseID | InstructorID
101 | 201
102 | 202
103 | 203
Student-Course Table:
StudentID | CourseID
1 | 101
1 | 102
2 | 101
2 | 103
In the decomposed tables, every determinant is a candidate key, and the schema is
now in BCNF.
Summary of Normal Forms
Normal Form Requirement
1NF Atomic values, no repeating groups
2NF 1NF + no partial dependency
3NF 2NF + no transitive dependency
BCNF 3NF + every determinant is a candidate key
ACID Properties in DBMS
ACID is an acronym that represents four essential properties of a
transaction to ensure data integrity and reliability. These properties are
Atomicity, Consistency, Isolation, and Durability. Transactions in DBMS
are sequences of operations performed as a single logical unit of work.
1. Atomicity (All or Nothing)
● What it Means:
A transaction is all or nothing. Either the entire transaction happens,
or nothing happens at all.
● Example:
You’re transferring $100 from Account A to Account B:
1. $100 is deducted from Account A.
2. $100 is added to Account B.
If something goes wrong (like a power cut), the database will
undo everything and ensure no money is lost or partially
transferred.
2. Consistency (Data Rules Always Followed)
● What it Means:
After a transaction, the database must still follow all its rules and
constraints.
● Example:
If you transfer $100 between accounts, the total money in the system
(Account A + Account B) must remain the same. The database won’t
allow changes that break its rules.
3. Isolation (No Overlapping Transactions)
● What it Means:
If multiple people are using the database at the same time, their
transactions won’t interfere with each other.
● Example:
○ You are transferring $100 from Account A to Account B.
○ At the same time, someone else is withdrawing $50 from
Account A.
Isolation ensures that these two actions don’t cause errors, like
losing money or showing the wrong balance.
4. Durability (Changes are Permanent)
● What it Means:
Once a transaction is completed, the changes are saved
permanently, even if there’s a system crash.
● Example:
After transferring $100 from Account A to Account B, the new
balances will still be there even if the server goes down right after.