DBMS Final Sem Ppt (3)
DBMS Final Sem Ppt (3)
DBMS Final Sem Ppt (3)
2 ER MODEL 9 – 24
3 UNIT – 1 CONSTRIANTS 25 – 29
4 ENHANCED ER MODEL 30 – 36
5 UNIT – 2 NORMALIZATION 37 – 49
6 KEYS 50 – 65
8 UNIT – 4 HASHING 79 – 93
1. Entities
2. Attributes
3. Relationship
i. Entity:
An entity is a “thing” or “object” in real world that is
distinguishable from all other objects.
Example: If we want to create a database for college, in
that case ‘student’, ‘teachers’, ‘courses’, ‘building’ etc
are the entities.
Representation: A very simple rectangular box.
Entity set:
An entity set is a set of entities of the same type that
share similar properties, or attributes.
Representation: A vertical oval(ellipse) listing all
entities.
Student
Courses
Teachers
Building
Weak Entity:
The weak entity is the one that depends on other
entities in instance.
A weak entity cannot be identified uniquely.
Representation: Double rectangle.
Strong Entity:
An entity is the one that is independent of other entity
is called strong entity.
A strong entity can be uniquely identified using its
primary key attributes.
Representation:
ii Attributes:
The characteristics or properties which describe an entity are called as its attributes.
An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each member of an entity set.
Representation: A Horizontal Oval ( ellipse).
Attribute
name
Types of Attributes:
1. Key attributes
2. Composite attribute
3. Multivalued attribute
4. Derived attribute
Key attributes:
• A key attribute is the one which will uniquely identify
an entity.
• Ex: For a student, studentID is the key attribute.
• Representation: Underlined attribute name.
Composite attribute:
A composite attribute can be division of subparts.
Which composes of multiple units of attributes forming
a larger one.
Ex: Address ( hno,street,city,state,country etc)
Representation: Horizontal ovals mapped with further
composite vertical ovals.
Multivalued attribute:
An attribute that can hold multiple values is known as
multivalued attribute.
Ex: A student can have more than one phone numbers
so the phone no attribute is multivalued.
Representation: Double ellipses.
Attribute_Name
Derived attribute:
Derived attribute are those which are derived based on
other attributes.
Ex: The age of the student can be calculated from
date_of_birth present as an attribute.
Representation: Horizontal dashed ellipse.
attribute_name
iii) Relationship:
The association between different entities that are
existing in a database is depicted by relationship.
Representation: Diamond box with entities
connected to its edges.
Relationship Set: A set of relationships of the
same type.
Types of relationships
1. Unary relationship
2. Binary relationship
3. Ternary relationship
4. N-ary relationship
Unary relationship:
In this Unary relationship, both the associating entity types are
same and degree of relationship is1.
In other words, when the association is within a single entity.
Example: In a class, we have many students, there are monitors
too. So, here class monitors are also students. Thus, we can say
that only students are participating here. The degree of such type
of relationship is 1.
Binary relationship:
A relationship that associates two different entities is known as
Binary relationship. The degree of relationship is 2.
This is the most used relationship, and one can easily be
converted into relational table.
Example: We have two entity types ‘Student’ and ‘Course’ where
each ‘Student’ enrolled in ‘Course’.
Ternary relationship:
i. One-to-One :
An entity in A is associated with almost one entity in B, and an entity
in B is associated with almost one entity in A.
It is also represented by a 1:1 symbol.
The above example depict that ‘Employees’ can work in many ‘Projects’
and a ‘Project’ can have many employees
ENHANCED ENTITY RELATIONSHIP MODEL ( EER
MODEL)
If a student visits a 'University,' they may inquire not only about
the university but also about the 'Courses' it offers.
The 'University' entity could aggregate 'Courses.'
Normalization
Normalization divides the larger tables into smaller tables and links
them using relationships.
Normalization is used to minimize the redundancy from a set of
relations.
It is used to eliminate undesirable characteristics like Insertion,
Updation and Deletion anomalies(abnormality).
Why do we need normalization?
The main reason for normalizing the relations is to remove these
anomalies.
Failure to eliminate anomalies leads to data redundancy and can
cause data integrity and other problems as the database grows.
Data modification anomalies can be categorized into three
types:
Insertion anomaly: Insertion anomaly refers to when one
cannot insert a new tuple into a relationship due to lack of
data.
Deletion anomaly: The deletion anomaly refers to the
situation where the deletion of data results in the
unintended loss of some other important data.
Updation anomaly: The updation anomaly is when an update
of a single data value requires multiple rows of data to be
updated.
Types of Normal Forms
Normalization works through a series of stages called
Normal forms.
The normal forms apply to individual relations.
The relation is said to be in particular normal form if it
satisfies constraints.
1. 1NF
2. 2NF
3. 3NF
4. BCNF - BOYCE-CODD Normal form
i. First Normal Form (1NF)
A relation will be in 1NF if it contains an atomic value.
An attribute of a table cannot hold multiple values it must hold only
single valued attribute.
Each record needs to be unique.
example: employee table
Relation Employee is not in 1NF because of multi-value attribute
called “Emp_phone”.
We will divide this table in order to remove the multivalued attribute.
ii. Second Normal Form ( 2NF)
In the 2NF, relational must be in 1NF.
There should be no partial dependency.
No partial dependency: No non prime attributes is dependent on any
proper subset of any candidate key of the table.
Non-prime attribute: The attribute which is not part of a candidate
key.
Note: AB->B is a proper subset. AB->AB is a subset.
example: Score table
The above table is in 1NF as it is free from multivalued attributes.
The primary keys from the above table is { stu_id,sub_id} which is
composite primary key.
Candidate key={Stu_id,Sub_id}
Here, non-prime attributes are: {Marks, Teacher}.
Here, the ‘Teacher’ attribute entirely depends on ‘Sub_id’ Sub_id ->
Teacher
Teacher is independent of ‘Stu_id’.
Here, non prime attribute ‘Teacher’ is a proper subset of candidate
key which violates 2NF.
Procedure to convert a table into 2NF:
Identify the partial and full dependencies and apply decomposition
rule.
In the above table column ‘Teacher’ is dependent on
‘Sub_id’.
The simple solution is to create a separate table for ‘sub_id’
and ‘teacher’ and removing it from score table.
Resulting tables are :
iii. Third Normal Form(3NF)
A relation will be in 3NF if it follows 2NF .
There should be no transitive dependency for non prime attributes.
No Transitive dependency: No Non prime attributes depend on non prime
attribute rather than the prime attribute or primary key or A relation is in third
normal form if it holds at least one of the following conditions for every non -
trivial function dependency X->Y.
1. X is a super key.
2. Y is a prime attribute i.e. (each element of Y is part of some candidate key).
In the above employee table, we have dependency as
shown
Project_id ->Date_completion
Here, ‘Date_completion’ is dependent on ‘Project_id’ which
is non prime attribute.
Primary key={Emp_id}
Super key={Emp_id,{emp_id,name},…}
Candidate key={emp_id}
Non_prime attributes={Name,Project_id,date_completion}
It violates 3NF.
Solution:
Transitive dependency is converted into a separate table.
Add primary key of each table resulted from each transitive
dependency as the foreign key in an original table along with all
remaining attributes.
iv. Boyce-Codd Normal Form(BCNF)
A table is said to be in BCNF if it satisfies the following
properties.
1. It should be in 3NF.
2. For every functional dependency A->B, ‘A’ must be a super
key i.e. in any functional dependency L.H.S attribute must
be a super key.
• From the above table super key={Stu_id,{Stu_id,Course_id},….}
• Here, the relation is not in BCNF because ‘Instructor’ determines
‘Course_ID’, but ‘Instructor’ is not a superkey.
Decomposition into BCNF:
To convert the table into BCNF:
1. Identify the violation:
Instructor→CourseID violates BCNF.
2. Decompose the relation:
Create one table for the dependency Instructor→CourseID
Instructor_Course (Instructor, Course_ID)
Create another table for the remaining attributes:
Enrollment (Student_ID, Course_ID, Instructor)
KEYS
Keys are one of the basic requirements of a relational
database model.
It is widely used to identify the tuples(rows) uniquely in the
table.
We also use keys to set up relationships among various
columns and tables of a relational database.
Types of Keys
i. Primary Key
ii. Foreign Key
iii. Candidate key
iv. Super key
v. Composite key
vi. Surrogate key
i. Candidate key
Tuple
Stu_ID NAME PHONE NUMBER AGE
1 AYRA 9966225544 23
2 BRAN 8854628742 25
3 JOHN 6395874501 26
We can uniquely identify the tuple in the above table with the
attribute Stu_Id.
We can uniquely identify using Name and Phone attributes but, our
definition says minimal so we are going to choose Stu_Id.
STU_ID COURSE_ID NAME
2 CSC101 C++
3 CSC101 C++
2 CSE201 DBMS
2 CSC101 C++
3 CSC101 C++
2 CSE201 DBMS
DEPT_ID DEPT_NAME
001 CSE
002 IT
003 MECHANICAL
Instructor table
DEPT_ID INST_ID NAME
002 B005 WALTER
002 B003 JESSE
001 B007 SKYLER
v. Composite Key
Keys which more than one attribute that can uniquely
identify a tuple in a relation is called composite key.
Two or more attributes are used together to make a
composite key.
It acts as a primary key if there is no primary key in a table.
vi. Surrogate Key
A surrogate key in SQL is an artificial key that is used as a unique
identifier for each record in a table.
1. Static Hashing
2. Dynamic Hashing
i. Static Hashing
The number of buckets is fixed.
The same hash function is always used.
If you know the size of the database in advance, static
hashing works well.
Problem: If the data size grows, the fixed number of
buckets might not be enough, causing collisions.
Assume that we have a hash table with 4 buckets and
use a hash function that computes the bucket index
based on the key module the number of buckets, i.e.,
hash(key) % 4.
Initial Setup:
Number of Buckets = 4
Hash Function = key % 4
• Now, we add Key 20, which also hashes to 00. Since Bucket
00 already has Key 8 and Key 12, it overflows.
• When this happens: