0% found this document useful (0 votes)

393 views26 pages

Normalization

The document discusses normalization of databases to reduce data redundancy and inconsistencies. It begins by defining normalization and its purposes. It then walks through normalizing a sample database of projects and employees to transform it from zero normal form to 1st, 2nd, and 3rd normal form through a process of identifying dependencies and moving data to separate tables. The result is a set of tables with minimized redundancy and inconsistencies.

Uploaded by

Hemant Sharma

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

393 views26 pages

Normalization

Uploaded by

Hemant Sharma

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 26

Normalization

Normalization is performed to reduce or eliminate Insertion, Deletion or Update anomalies. However, a completely normalized database may not be the most efficient or effective implementation. Denormalization is sometimes used to improve efficiency. Normalization splits database information across multiple tables. To retrieve complete information from a normalized database, the JOIN operation must be used. JOIN tends to be expensive in terms of processing time, and very large joins are very expensive.

Normalization -Contd..
Introduction In this exercise we are looking at the optimisation of data

structure. The example system we are going to use as a model is a database to keep track of employees of an organisation working on different projects.
Objectives By the end of the exercise you should be able to: Show understanding of why we normalize data Give formal definitions of 1NF, 2NF & 3NF Apply the process of normalization to your own project

The Scenario
The data we would want to store could be expressed as:
Project No Project Name Employee No Employee Name Jessica Brookes
Andy Evans Max Fat Jessica Brookes Alex Branton

Rate category A
B C A B

Rate

1203

Madagascar travel site

11
12 16

90
80 70 90 80

1506

Online estate agency

11 17

Why Normalization?
Three problems become apparent with our current model: Tables in a RDBMS use a simple grid structure Each project has a set of employees so we cant even use this format to enter data into a table. How would you construct a query to find the employees working on each project? All tables in an RDBMS need a key Each record in a RDBMS must have a unique identity. Which field should be the primary key? Data entry should be kept to a minimum Our main problem is that each project contains repeating groups, which lead to redundancy and inconsistency.

Why Normalization? Contd..

We could place the data into a table called: tblProjects_Employees
Project No. 1203 1203 Project Name Employee No. Employee Name Jessica Brookes Andy Evans A B Rate category Rate

Madagascar travel site Madagascar travel site

11 12

90 80

1203
1506 1506

Madagascat travel site

Online estate agency Online estate agency

16
11 17

Max Fat
Jessica Brookes Alex Branton

C
A B

70
90 70

Why Normalization? Contd..

Addressing our three problems: Tables in a RDBMS use a simple grid structure We can find members of each project using a simple SQL or QBE search on either Project Number or Project Name
All tables in an RDBMS need a key We CAN uniquely identify each record. Although no primary key exists we can use two or more fields to create a composite key. Data entry should be kept to a minimum Our main problem that each project contains repeating groups still remains. To create a RDBMS we have to eliminate these groups or sets.

Why Normalization? Contd..

Did you notice that Madagascar was misspelled in the 3rd record! Imagine trying to spot this error in thousands of records. By using this structure (flat filing) we create: Redundant data Duplicate copies of data we would have to key in Madagascar travel site 3 times. Not only do we waste storage space we risk creating Inconsistent data;
Inconsistent data The more often we have to key in data the more likely we are to make mistakes.

Normalization Process
The solution is simply to take out the duplication. We do this by: Identifying a key In this case we can use the project no and employee no to uniquely identify each row
Project No 1203 1203 1203 1506 1506 Employee No 11 12 16 11 17

Unique Identifier
120311

120312
120316 150611

150617

Normalization Process Contd..

We look for partial dependencies We look for fields that depend on only part of the key and not the entire key.
Field Project Name Project No

Employee No

Employee Rate Category

Rate

We remove partial dependencies The fields listed are only dependent on part of the key so we remove them from the table.

Normalization Process Contd..

We create new tables Clearly we cant take the data out and leave it out of our database. We put it into a new table consisting of the field that has the partial dependency and the field it is dependent on. Looking at our example we will need to create two new tables:
Dependent On Partially Dependent Project Name Dependent On
Employee No

Partially Dependent
Employee Name Rate category Rate

Project No

Normalization Process Contd..

We now have 3 tables:
tblProjects_Employees
Project No Employee No

tblProjects
Project No
1203

Project Name
Madagascar travel site

1203 1203
1203 1506 1506

11 1506 12
16 11 17

tblEmployees
Employee No 11 12 16 Employee Name Jessica Brookes Andy Evans Max Fat Rate Category A B C

Online estate agency

Rate 90 80 70

Alex Branton

Normalization Process Contd..

Looking at the project, note the reduction in: Redundant data The text Madagascar travel site is stored once only, not for each occurrence of an employee working on the project.
Inconsistent data Because we only store the project name once we are less likely to enter Madagascat The link is made through the key, Project No. Obviously there is no way to remove this duplication without losing the relation altogether, but it is far more efficient storing a short number repeatedly, than a large chunk of text.

Normalization Process Contd..

Our model has improved but is still far from perfect. There is still room for inconsistency.
Employee No Employee Name Rate Category Rate Alex Branton is being paid 80 while Jessica Brookes gets 90 but theyre in the same rate category!

11
12 16 17

Jessica Brookes
Andy Evans Max Fat Alex Branton

A
B C A

90
80 70 80

Again, we have stored redundant data: the hourly rate- rate category relationship is being stored in its entirety i.e. We have to key in both the rate category AND the hourly rate.

Normalization Process Contd..

The solution, as before, is to remove this excess data to another table. We do this by:
Looking for Transitive Relationships Relationships where a non-key attribute is dependent on another non-key attribute. Hourly rate should depend on rate category BUT rate category is not a key. Removing Transitive Relationships As before we remove the redundant data and place it in a separate table. In this case we create a new table tblRates and add the fields rate category and hourly rate. We then delete hourly rate from the employees table.

Normalization Process Contd..

We now have 4 tables:
tblProjects_Employees
Project No
1023 1023

tblProjects
Project No Project Name

Employee No
11

1023
1056

Madagascar travel site

Online estate agency

tblEmployees
Employee No 11
12 16

1023
1056 1056

16
11 17

Employee Name Jessica Brookes

Andy Evans Max Fat

Rate Category A
B

tblRates
Rate Category A B Rate 90 80 70

Alex Branton

Normalization Process Contd..

Again, we have cut down on redundancy and it is now impossible to assume Rate category A is associated with anything but 90. Our model is now in its most efficient format with: Minimum REDUNDANCY
Minimum INCONSISTENCY

Normalization Process Contd..

What we have formally done is NORMALIZE the database: At the beginning we had a data structure:
Project No Project Name Employee No (1n) Employee name (1n) Rate Category (1n) Hourly Rate (1n)

(1n indicates there are many occurrences of the field it is a repeating group).

To begin the normalization process we start by moving from zero normal form to 1st normal form.

st 1

Normal Form

The definition of 1st normal form; There are no repeating groups All the key attributes are defined All attributes are dependent on the primary key So far, we have no keys, and there are repeating groups. So we remove the repeating groups and define the keys and are left with: Employee Project table Project number part of key Project name Employee number part of key Employee name Rate category Hourly rate This table is in first normal form (1NF)

2nd Normal Form

A table is in 2nd normal form if: Its already in first normal form It includes no partial dependencies (where an attribute is dependent on only part of the key) We look through the fields: Project name is dependent only on project number Employee name, rate category and hourly rate are dependent only on employee number. So we remove them, and place these fields in a separate table, with the key being that part of the original key they are dependent on. We are left with the following three tables:

2nd Normal Form-Contd..

Employee Project table Project number part of key Employee number part of key Employee table Employee number - primary key Employee name Rate category Hourly rate Project table Project number - primary key Project name

The tables are now in 2nd normal form (2NF). Are they in 3rd normal form?

rd 3

Normal Form

A table is in 3rd normal form if: Its already in second normal form It includes no transitive dependencies (where a non-key attribute is dependent on another non-key attribute) We can narrow our search down to the Employee table, which is the only one with more than one non-key attribute. Employee name is not dependent on either Rate category or Hourly rate, the same applies to Rate category, but Hourly rate is dependent on Rate category.

So, as before, we remove it, placing it in it's own table, with the attribute it was dependent on as key.

3rd Normal Form Contd..

Employee project table Project number part of key Employee number part of key Employee table Employee number - primary key Employee name Rate Category Rate table Rate category - primary key Hourly rate

Project Table Project number - primary key Project name These tables are all now in 3rd normal form, and ready to be implemented.

Other Normal Forms

Boyce-Codd Normal Form (BCNF) Strengthens 3NF by requiring the keys in the functional dependencies to be superkeys (a column or columns that uniquely identify a row)
A 3NF relation is NOT in BCNF if: Candidate keys in the relation are composite keys (they are not single attributes) There is more than one candidate key in the relation, and The keys are not disjoint, that is, some attributes in the keys are common

Other Normal Forms Contd..

Fourth Normal Form (4NF) Eliminate trivial multivalued dependencies. Eliminate non-trivial multivalued dependencies by projecting into simpler tables
Fifth Normal Form (5NF) Eliminate dependencies not determined by keys. A relation is in 5NF if every join dependency in the relation is implied by the keys of the relation Implies that relations that have been decomposed in previous NF can be recombined via natural joins to recreate the original relation. These Forms are very rarely used for business applications. In most cases, tables in 3rd normal form are already in these normal forms anyway

Summary
What normal form is this table in? Giving it a quick glance, we see:

no repeating groups, and a primary key defined, so it's at least in 1st normal form. There's only one key, so we needn't even look for partial dependencies, so it's at least in 2nd normal form. How about transitive dependencies? Well, it looks like Town might be determined by Postcode. And in most parts of the world that's usually the case.
So we should remove Town, and place it in a separate table, with Postcode as the key?

Summary Contd..
No! Although this table is not technically in 3rd normal form, removing this information is not worth it. Creating more tables increases the load slightly, slowing processing down. This is often counteracted by the reduction in table sizes, and redundant data. But in this case, where the town would almost always be referenced as part of the address, it isn't worth it. Perhaps a company that uses the data to produce regular mailing lists of thousands of customers should normalize fully. It always comes down to how the data is going to be used. Normalization is just a helpful process that usually results in the most efficient table structure, and not a rule for database design.

Database Normalization Guide
No ratings yet
Database Normalization Guide
31 pages
IM Module 3, Lesson 3
No ratings yet
IM Module 3, Lesson 3
51 pages
Lec 5 Normalization
No ratings yet
Lec 5 Normalization
25 pages
Database Normalization Guide
No ratings yet
Database Normalization Guide
62 pages
Normalization
No ratings yet
Normalization
31 pages
LESSON 7. Normalization of Database Tables
No ratings yet
LESSON 7. Normalization of Database Tables
34 pages
Chapter 5 Normalization of Database Tables
No ratings yet
Chapter 5 Normalization of Database Tables
58 pages
Lecture 7 - 8 - Normalization
No ratings yet
Lecture 7 - 8 - Normalization
30 pages
Study Material: Vivekananda College Thakurpukur
No ratings yet
Study Material: Vivekananda College Thakurpukur
10 pages
12 Normalization
No ratings yet
12 Normalization
41 pages
Database System Lect 07
No ratings yet
Database System Lect 07
75 pages
Chapter 4 DB
No ratings yet
Chapter 4 DB
30 pages
L6 - Normalization (Student)
No ratings yet
L6 - Normalization (Student)
28 pages
Normalization
No ratings yet
Normalization
30 pages
NORMALIZATION
No ratings yet
NORMALIZATION
6 pages
Unit 3 Relational Database Designs
No ratings yet
Unit 3 Relational Database Designs
36 pages
The Process of Normalisation: Studentnum Coursenum Student Name Address Course
No ratings yet
The Process of Normalisation: Studentnum Coursenum Student Name Address Course
10 pages
Chapter 4
No ratings yet
Chapter 4
53 pages
Week 2
No ratings yet
Week 2
34 pages
Unit IV
No ratings yet
Unit IV
65 pages
Normalization of Database Tables
No ratings yet
Normalization of Database Tables
23 pages
Lecture Note 4 - Dependencies and Normalization
No ratings yet
Lecture Note 4 - Dependencies and Normalization
35 pages
Dependency
No ratings yet
Dependency
47 pages
Database Normalization Updated
No ratings yet
Database Normalization Updated
22 pages
Lesson5 NORMALIZATION (Midtrem)
No ratings yet
Lesson5 NORMALIZATION (Midtrem)
29 pages
Data Normalization
No ratings yet
Data Normalization
41 pages
Database Normalization Guide
No ratings yet
Database Normalization Guide
43 pages
Data Normalization
No ratings yet
Data Normalization
25 pages
Final Explanation
No ratings yet
Final Explanation
211 pages
Topic 5 Normalization - Summary
No ratings yet
Topic 5 Normalization - Summary
13 pages
Database Normalization Guide
No ratings yet
Database Normalization Guide
192 pages
What Is A Relational Database?
No ratings yet
What Is A Relational Database?
13 pages
Chapter 6
No ratings yet
Chapter 6
10 pages
Adbms (Bca) 1 1744715686575
No ratings yet
Adbms (Bca) 1 1744715686575
39 pages
4th Module DBMS Notes
No ratings yet
4th Module DBMS Notes
23 pages
Lecture 6
No ratings yet
Lecture 6
23 pages
Normalization
No ratings yet
Normalization
5 pages
Normalization: Normalization Is A Method For Organizing Data Elements in A Database Into Tables
No ratings yet
Normalization: Normalization Is A Method For Organizing Data Elements in A Database Into Tables
4 pages
Database Normalization Guide
No ratings yet
Database Normalization Guide
50 pages
DBMS Normalization
No ratings yet
DBMS Normalization
53 pages
Normalization Module3 Complete
No ratings yet
Normalization Module3 Complete
12 pages
Unit 3 Normalization
No ratings yet
Unit 3 Normalization
157 pages
Unit 3 Normalization
No ratings yet
Unit 3 Normalization
157 pages
Normalization
No ratings yet
Normalization
21 pages
Dbms Theory Notes Unit IV
No ratings yet
Dbms Theory Notes Unit IV
73 pages
Database Design With Normalization
No ratings yet
Database Design With Normalization
30 pages
Unit 6 DBMS
No ratings yet
Unit 6 DBMS
14 pages
DB Week 10 Lec 1
No ratings yet
DB Week 10 Lec 1
32 pages
Redundancy Dependency Loss of Information
No ratings yet
Redundancy Dependency Loss of Information
61 pages
Description of The Database Normalization Basics
No ratings yet
Description of The Database Normalization Basics
5 pages
Data Modelling 2 Normalisation: by Haik Richards
No ratings yet
Data Modelling 2 Normalisation: by Haik Richards
28 pages
Normalization
No ratings yet
Normalization
7 pages
DBMS MP
No ratings yet
DBMS MP
15 pages
Normalization
No ratings yet
Normalization
30 pages
4.what Is Normalization PDF
No ratings yet
4.what Is Normalization PDF
9 pages
Topic 07
No ratings yet
Topic 07
56 pages
Assignment 2
No ratings yet
Assignment 2
2 pages
Lab Notes RDS v1 02
100% (1)
Lab Notes RDS v1 02
48 pages
Nihal Ashik - PythonSQL Program of The BOISSS PDF
No ratings yet
Nihal Ashik - PythonSQL Program of The BOISSS PDF
13 pages
Oracle: Oracle Database 11g: Administration I
No ratings yet
Oracle: Oracle Database 11g: Administration I
8 pages
EDB Postgres Advanced Server Guide v10
100% (1)
EDB Postgres Advanced Server Guide v10
328 pages
SQL Step by Step
No ratings yet
SQL Step by Step
2 pages
Dbms Lab
No ratings yet
Dbms Lab
36 pages
Lab 1 Indexing
No ratings yet
Lab 1 Indexing
2 pages
Database Project Final
No ratings yet
Database Project Final
15 pages
Oracle Performance Checks
No ratings yet
Oracle Performance Checks
6 pages
Cef Log
No ratings yet
Cef Log
36 pages
ACID Properties in Databases
No ratings yet
ACID Properties in Databases
17 pages
NCC Database ASSIGNMENT Wai Yan Ye Yint 00185279 DBS
No ratings yet
NCC Database ASSIGNMENT Wai Yan Ye Yint 00185279 DBS
31 pages
Muhammad Syamil Muwahhid - 24 JOIN UNION
No ratings yet
Muhammad Syamil Muwahhid - 24 JOIN UNION
25 pages
Kviz 12
No ratings yet
Kviz 12
10 pages
24 To 26
No ratings yet
24 To 26
3 pages
SQL Assignment
No ratings yet
SQL Assignment
7 pages
Database - SOP 3
No ratings yet
Database - SOP 3
6 pages
Linking Oracle Tables To Microsoft Access
No ratings yet
Linking Oracle Tables To Microsoft Access
15 pages
Evolution of Data Analytics
No ratings yet
Evolution of Data Analytics
2 pages
01 QuanLyNguoiDung
No ratings yet
01 QuanLyNguoiDung
15 pages
RSLTE058 - Timing advance-NPLB 0283
No ratings yet
RSLTE058 - Timing advance-NPLB 0283
72 pages
19 ETL Testing Interview Questions and Answers - ETL in Cts
50% (2)
19 ETL Testing Interview Questions and Answers - ETL in Cts
3 pages
Caching Strategies in Databases
No ratings yet
Caching Strategies in Databases
21 pages
Emerging Trends in Database
No ratings yet
Emerging Trends in Database
4 pages
Micro-Project Report - PHP
No ratings yet
Micro-Project Report - PHP
32 pages
Database Management System
No ratings yet
Database Management System
86 pages
1z0 931 23
No ratings yet
1z0 931 23
6 pages
Unit 3 Relational Data Model
No ratings yet
Unit 3 Relational Data Model
14 pages
Oracle 12c Database Architecture Guide
No ratings yet
Oracle 12c Database Architecture Guide
15 pages

Normalization

Uploaded by

Normalization

Uploaded by

Normalization

Madagascar travel site

Online estate agency

Why Normalization? Contd..

Madagascar travel site Madagascar travel site

Madagascat travel site

Why Normalization? Contd..

Why Normalization? Contd..

Normalization Process Contd..

Employee Rate Category

Normalization Process Contd..

Normalization Process Contd..

Online estate agency

Normalization Process Contd..

Normalization Process Contd..

Normalization Process Contd..

Normalization Process Contd..

Madagascar travel site

Employee Name Jessica Brookes

Normalization Process Contd..

Normalization Process Contd..

2nd Normal Form

2nd Normal Form-Contd..

3rd Normal Form Contd..

Other Normal Forms

Other Normal Forms Contd..

You might also like