DBMS ch-4
DBMS ch-4
DBMS ch-4
Unit 4
Data Normalization
NORMALIZATION:
Normalization is a process of efficiently organizing data in a database. Normalization is the
way by which a given set of relations are transformed to a new set of relations with a more solid
structure.
Constraint: Constraints are certain properties that data and database should satisfy
with. They can be apply to column, row ,table or entire schema.
Domain constraints
Key constraints
Integrity constraints
Entity constraints
Referential Constraints
1.Semantics of attributes
The main goal of this is to minimize the storage space used by the base relations
by avoiding anomalies
Database Anomalies:
Anomalies are the problems in database relations that occur due to redundancy in
database. These anomalies affect the process of inserting, deleting and modifying
data in relations.
In the above table , until the new faculty member vasudha is assigned to teach
atleast one course, her details cannot be recorded.
In the above table, all information about ashwini is lost when she temporarily
ceased to assign any course.
In the above table, faculty 407 is having different addresses on different records as
shown.
If attributes does not apply to the tuples in the relation we end up with
many nulls in those tuples. This may lead to wastage of memory. It usually
happens when
Decomposition:
The decomposition of a relation schema R=(A1,A2……An)is its replacement by a set of relation
schemes (R1,R2……Rm),such that R1<=R FOR 1<=i<=m and R1 U R2 U R m=R.
Example of decomposition
R=(x,y)U(y,z)
R=(x,y,z)
R(x,y,z)
GOALS OF DECOMPOSITION:
Given instances of the decomposed relations,we may not be able to reconstruct the
corresponding instance of the original relation-information loss.
LOSSY DECOMPOSITION:
Additional tuples are obtained along with original tuples,although there are more tuples,this
leads to less information.Due to the loss of information,this decomposition is called lossy
decomposition.
FUNCTIONAL DEPENDENCIES
A functional dependency (FD) is a constraint between two sets of attributes in a relation from a
database . Given a relation R, a set of attributes X in R is said to functionally determine another
attribute Y also in R,(written X->Y) if and only if each Xvalue is associated with precisely one
Y value . Here X the determinant set and Y the dependent attribute.
FUNCTIONAL DEPENDENCY:[DEFINITION]
A functional dependency is denoted by X-> Y between two sets of attributes X and Y that are
subsets of R specifies a constraint on the possible tuple that can form a relation state r of R.
The constraint is for any two tuples t1 and t2 in r if t1[x]=t2[x] then they have t1[y]=t2[y].This
means the value of X component of a tuple uniquely determines the value of component Y.
A functional dependency is an association between two attributes of the same relational database
table. One of the attributes is called the determinant and the other attribute is called the
dependent. For each value of the determinant there is associated one and only one value of the
dependent.
If A is the determinant and B is the dependent then we say that A functionally determines B and
graphically represent this can be written as A -> B. The symbols A ->B can also be expressed as B
is functionally dependent on A.
Example
Functional depending set S is irreducible if the set has three following properties:
Given that X,Y,and Z are sets of attributes in a relation R,one can derive several propreties of
functional dependencies .Amongthe most important are Armstrong’s axioms, which are used in
database normalisation:
Equivalent sets of functional dependencies are called covers of each other . Every set of
functional dependencies has a canonical cover.
EXAMPLE: Suppose that we specify the following set F of obvious functional dependencies on
the relation schema.
Trivial dependency:
Non trivial dependency: If a functional dependency X->Y holds true where Y is not a subset of
X then this dependency is called non trivial Functional dependency.
If there is a functional dependency from X toY , a single attribute of X uniquely determines the
attribute in Y
AB
Multi-valued dependency:
AB,C,D.
EXAMPLE:
{SSN,PNO}-HOURS:FULL FD.
If Y is functionally dependet on X ie XY, then any proper subset of X can uniquely determine
attribute in Y
Transitive dependency:
What is Normalization?
– Absence of redundancy.
NORMAL FORMS:
The data in the database can be considered to be in one of the number of ‘normal
forms’.Basically the normal form of the data indicates how much redundancy is in
that data.
4.BCNF
Properties of Normalization:
1.No data value should be duplicated in different rows unnecessarily.
Normalization Benefits:
>Facilitates data integration.
Goals of Normalization(purposes):
There are two goals of the normalization process:
>Eliminating redundant data(for example,storing the same data in more than one
table) and
These are worthy goals as they reduce the amount of space a database consumes
and ensure that is logically stored.
Advantages of Normalization:
>Avoid redundancy(same data stored many times in same/different tables).
>All the update anomalies and does not have any loss of data or inefficient data
update process.
>In this,a well organized database where all the tables are inter-related maintain
integrity and consistency of data.
>The entire database system remains over time as the database grows with least
redundancy and much durability.
Disadvantages of Normalization:
>Maintaining more tables is a bit tough.
A table is in 1NF if and only if all the key attributes have been defined and it
contains no repeating groups. A relation R is in 1NF if and only if it has only
single –valued attributes(atomic value/data).
>It is smallest piece of data that cannot or should not be further divided.
There are basically two rules associated with First Normal Form.
Rule 1:A column with atomic data cannot have several values of the
same type of data in that column.
Example 1: Let us consider the below employee table:
In the above table, there are multiple values for children’s name and birthday
columns.
In the above table satisfies Rule1 but not First Normal Form .It violates Rule2.We
should apply Rule2 to the value table to be in First normal Form.
Rule2:A table with atomic data cannot have multiple columns with the
same type of data.
Example :Consider the below Employees table:
A table is in second normal form(2NF) if and only if is in 1NF and every non key
attribute is fully functional dependent on the whole of the primary key (i.e. there
are no partial dependencies.
In slightly more formal terms terms: a 1NF and is in 2NF if and only if all its non-
prime attributes are functionally dependent on the whole of a candidate key.(A
non-prime attribute is one that does not belong to any candidate key.)
Example1:second normal form requires that any non-key field be dependent upon
the entire key. For example, consider the student courses table below, where
StudentID and courseID form a compound primary key.
The student Name field does not depend at all on courseId,but only on StudentID.
Course Location has dependency on StudentID ,but only on CourseID. This data
should be split into three tables as follows:
In the above table, Student ID is primary key and Name column is functionally
dependent on Student ID.
3NF was originally defined by E.F.Codd in 1971. Codd’s definition states a table is
in 3NF if and only if both of the following conditions hold:
The professor is uniquely identified by the CourseID and Section of the course.
However,Professor Name depends on professorID and has no relation to CourseId
or Section. This data is properly stored as follows:
By splitting the data into two tables, the transitive dependency is removed.
Was developed in1974 by Raymond F.Boyce and Edgar F.Codd to address certain
types of anomaly not dealt with by 3NF as originally defined.Only in rare cases
does a 3NF table not meet the requirements of BCNF.
>Eventually,we select a single candidate key to be the key for the relation.
Ex:
Ex-2
The relation FUNDS is in 1NF, 2NF, 3NF because no multiple values, no partial
dependencies and no transitive relation.
3. for any determinant that is not a candidate key, create a new rew relation from
the functional dependency. Retain the determinant in the original relation
1.Determinants are
iii. Manager
2. The determinant which act as keys are i and ii but not Manager because using
only that we can’t identify a tuple in relation.
We have retained the determinant Manager int the MANAGERS and each of
relations are in 1NF , 2NF , 3NF AND BCNF