Normalization Lesson
Normalization Lesson
Normalization is a database design technique that reduces data redundancy and eliminates
undesirable characteristics like Insertion, Update and Deletion Anomalies. Normalization rules
divides larger tables into smaller tables and links them using relationships. The purpose of
Normalization in SQL is to eliminate redundant (repetitive) data and ensure data is stored
logically.
Normalization is used for mainly two purposes,
Updation Anomaly
What if Mr. X leaves the college? or is no longer the HOD of computer science department?
In that case all the student records will have to be updated, and if by mistake we miss any
record, it will lead to data inconsistency. This is Updation anomaly.
Deletion Anomaly
In our Student table, two different information’s are kept together, Student information and
Branch information. Hence, at the end of the academic year, if student records are deleted, we
will also lose the branch information. This is Deletion anomaly.
Normalization rules are divided into the following normal forms:
Database Normalization Example can be easily understood with the help of a case study.
Assume, a video library maintains a database of movies rented out. Without any normalization
in database, all information is stored in one table as shown below. Let's understand
Normalization in database with tables example:
Table 1
Here you see Movies Rented column has multiple values. Now let's move into 1st Normal
Forms:
1NF Example
Hence, we require both Full Name and Address to identify a record uniquely. That is a
composite key.
Let's move into second normal form 2NF
Table 1
Table 2
We have divided our 1NF table into two tables viz. Table 1 and Table2. Table 1 contains
member information. Table 2 contains information on movies rented.
We have introduced a new column called Membership_id which is the primary key for table 1.
Records can be uniquely identified in Table 1 using membership id
Database - Foreign Key
In Table 2, Membership_ID is the Foreign Key
Foreign Key references the primary key of another Table! It helps connect your Tables
A foreign key can have a different name from its primary key
It ensures rows in one table have corresponding rows in another
Unlike the Primary key, they do not have to be unique. Most often they aren't
Foreign keys can be null even though primary keys can not
Why do you need a foreign key?
Suppose, a novice inserts a record in Table B such as
You will only be able to insert values into your foreign key that exist in the unique key in the
parent table. This helps in referential integrity.
The above problem can be overcome by declaring membership id from Table2 as foreign key
of membership id from Table1
Now, if somebody tries to insert a value in the membership id field that does not exist in the
parent table, an error will be shown!
Subject Table
Score Table
1 10 1 70
2 10 2 75
3 11 1 80
In the Score table, we need to store some more information, which is the exam name
and total marks, so let's add 2 more columns to the Score table.
score_id student_id subject_id marks exam_name total_marks
1 Workshop 200
2 Mains 70
3 Practicals 30
If all these conditions are true for any relation(table), it is said to have multi-valued
dependency.
Time for an Example
Below we have a college enrolment table with columns s_id, course and hobby.
1 Science Cricket
1 Maths Hockey
2 C# Cricket
2 Php Hockey
As you can see in the table above, student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.
You must be thinking what problem this can lead to, right?
Well the two records for student with s_id 1, will give rise to two more records, as shown
below, because for one student, two hobbies exists, hence along with both the courses, these
hobbies should be specified.
s_id course hobby
1 Science Cricket
1 Maths Hockey
1 Science Hockey
1 Maths Cricket
And, in the table above, there is no relationship between the columns course and hobby. They
are independent of each other.
So there is multi-value dependency, which leads to un-necessary repetition of data and other
anomalies as well.
s_id course
1 Science
1 Maths
2 C#
2 Php
s_id hobby
1 Cricket
1 Hockey
2 Cricket
2 Hockey