Notes Unit-2
Notes Unit-2
1. For each strong entity type E in the ER diagram, we create a relation R containing all the
single attributes of E. The primary key of the relation R will be one of the key attribute of
R.
2. For each weak entity type W in the ER diagram, we create another relation R that contains
all simple attributes of W. If E is an owner entity of W then key attribute of E is also included
in R. This key attribute of R is set as a foreign key attribute of R. Now the combination of
primary key attribute of owner entity type and partial key of the weak entity type will form
the key of the weak entity type
Binary Relationships:
One-to-One Relationship:
For each 1:1 relationship type R in the ER-diagram involving two entities E1 and E2 we
choose one of the entities (say E1) preferably with total participation and add primary key
attribute of another E as a foreign key attribute in the table of entity (E1). We will also include
all the simple attributes of relationship type R in E1 if any, For example, the department
relationship has been extended to include head-id and attribute of the relationship.
One-to-Many Relationship:
For each 1:N relationship type R involving two entities E1 and E2, we identify the entity
type (say E1) at the N-side of the relationship type R and include primary key of the entity
on the other side of the relation (say E2) as a foreign key attribute in the table of E1. We
include all simple attribute (or simple components of a composite attribute of R (if any) in
the table E1)
Many-to-Many Relationship:
For each M:N relationship type R, we create a new table (say S) to represent R, we also
include the primary key attributes of both the participating entity types as a foreign key
attribute in S. Any simple attributes of the M:N relationship type (or simple components as
a composite attribute) is also included as attributes of S.
N-ary Relationship:
For each N-ary relationship type R where n>2, we create a new table S to represent R, We
include the primary keys of the relations as foreign key attributes in S that represent the
participating entity types. We also include any simple attributes of the N-ary relationship
type (or simple components of complete attribute) as attributes of S. The primary key of S
is usually a combination of all the foreign keys that reference the relations representing the
participating entity types.
Customer Loan
Loan -
sanction
Employee
Multi-Valued Attributes:
For each multivalued attribute ‘A’, we create a new relation R that includes an attribute
corresponding to plus the primary key attributes k of the relation that represents the entity
type or relationship that has as an attribute. The primary key of R is then combination of A
and k.
name
Account_n
Account branch
generalisation
specialisation
Is-a
intrest charges
Saving Current
Simplification of conversion rules with examples
Following rules are used for converting an ER diagram into the tables-
A strong entity set with only simple attributes will require only one table in relational model.
Attributes
of the table will be the attributes of the entity set.
The primary key of the table will be the key attribute of the entity set.
Example-
A strong entity set with any number of composite attributes will require only one table in relational
model.
While conversion, simple attributes of the composite attributes are taken into account and not the
composite attribute itself.
Example-
A strong entity set with any number of multi valued attributes will require two tables in relational model.
One table will contain all the simple attributes with the primary key.
Other table will contain the primary key and all the multi valued attributes.
Example-
Roll_no City
Roll_no Mobile_no
NOTE-
If we consider the overall ER diagram, three tables will be required in relational model-
One table for the entity set “Employee”
One table for the entity set “Department”
One table for the relationship set “Works in”
Rule-05: For Binary Relationships with Cardinality Ratios-
NOTE- Here, combined table will be drawn for the entity set B and relationship set R.
Case-03: For Binary Relationship With Cardinality Ratio m:1
NOTE- Here, combined table will be drawn for the entity set A and relationship set R.
Here, two tables will be required. Either combine ‘R’ with ‘A’ or ‘B’
Way-01:
1. AR ( a1 , a2 , b1 )
2. B ( b1 , b2 )
Way-02:
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
While determining the minimum number of tables required for binary relationships with given
cardinality ratios, following thumb rules must be kept in mind-
For binary relationship with cardinality ration m : n , separate and individual tables will be
drawn for each entity set and relationship.
For binary relationship with cardinality ratio either m : 1 or 1 : n , always remember “many side
will consume the relationship” i.e. a combined table will be drawn for many side entity set
and relationship set.
For binary relationship with cardinality ratio 1 : 1 , two tables will be required. You can
combine the relationship set with any one of the entity sets.
Rule-06: For Binary Relationship with Both Cardinality Constraints and Participation Constraints-
Case-01: For Binary Relationship with Cardinality Constraint and Total Participation Constraint From One
Side-
Because cardinality ratio = 1 : n , so we will combine the entity set B and relationship set R.
Then, two tables will be required-
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Because of total participation, foreign key a1 has acquired NOT NULL constraint, so it can’t be null now.
Case-02: For Binary Relationship With Cardinality Constraint and Total Participation Constraint From Both
Sides-
If there is a key constraint from both the sides of an entity set with total participation, then that binary
relationship is represented using only single table.
Here, only one table is required.
ARB ( a1 , a2 , b1 , b2 )
Weak entity set always appears in association with identifying relationship with total participation constraint.
Problem 1: Find the minimum number of tables required for the following ER diagram in relational model-
Solution-
Problem 2: Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Problem 3: Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Problem 4: Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Problem 5: Find the minimum number of tables required to represent the given ER diagram in relational
model-
Solution-
Applying the rules that we have learnt, minimum 6 tables will be required-
Account (Ac_no , Balance , b_name)
Branch (b_name , b_city , Assets)
Loan (L_no , Amt , b_name)
Borrower (C_name , L_no)
Customer (C_name , C_street , C_city)
Depositor (C_name , Ac_no)
ER design issues
Relational database design requires that we find a “good” collection of relation schemas. A bad design
may lead to
★ Repetition of Information.
★ Inability to represent certain information.
■ Design Goals:
★ Avoid redundant data
★ Ensure that relationships among attributes are represented
★ Facilitate the checking of updates for violation of database integrity constraints.
Example:
Consider the relation schema:
Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)
■ Redundancy:
★ Data for branch-name, branch-city, assets are repeated for each loan that a branch makes
★ Wastes space
★ Complicates updating, introducing possibility of inconsistency of assets value
■ Null values:
★ Can’t store information about a branch if no loans exist
★ Can use null values, but they are difficult to handle.
Unit-2
Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain
the result of the query. It uses operators to perform queries.
1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
Notation: σ p(r)
Where:
1
BRANCH_NAME LOAN_NO AMOUNT
Input:
1. σ BRANCH_NAME="perryride" (LOAN)
Output:
2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result.
Rest of the attributes is eliminated from the table.
2
o It is denoted by ∏.
Where
Input:
Output:
NAME CITY
3
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
3. Union Operation:
o Suppose there are two tuples R and S. The union operation contains all the tuples
that are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
Example:
DEPOSITOR RELATION
CUSTOMER_NAME ACCOUNT_NO
Johnson A-101
Smith A-121
4
Mayes A-321
Turner A-176
Johnson A-273
Jones A-472
Lindsay A-284
BORROW RELATION
CUSTOMER_NAME LOAN_NO
Jones L-17
Smith L-23
Hayes L-15
Jackson L-14
Curry L-93
Smith L-11
Williams L-17
Input:
Output:
5
CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
4. Set Intersection:
o Suppose there are two tuples R and S. The set intersection operation contains all
tuples that are in both R & S.
o It is denoted by intersection ∩.
1. Notation: R ∩ S
Input:
6
1. ∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Smith
Jones
5. Set Difference:
o Suppose there are two tuples R and S. The set intersection operation contains all
tuples that are in R but not in S.
o It is denoted by intersection minus (-).
1. Notation: R - S
Input:
Output:
CUSTOMER_NAME
Jackson
Hayes
Willians
7
Curry
6. Cartesian product
o The Cartesian product is used to combine each row in one table with each row in the
other table. It is also known as a cross product.
o It is denoted by X.
1. Notation: E X D
Example:
EMPLOYEE
1 Smith A
2 Harry C
3 John B
DEPARTMENT
DEPT_NO DEPT_NAME
A Marketing
B Sales
C Legal
Input:
8
1. EMPLOYEE X DEPARTMENT
Output:
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
1. ρ(STUDENT1, STUDENT)
9
Passenger (pid, pname, pgender, pcity)
Agency (aid, aname, acity)
Flight (fid, fdate, time, src, dest)
Booking (pid, aid, fid, fdate)
b) Get the details about all flights from Chennai to New Delhi.
c) Find only the flight numbers for passenger with pid 123 for flights to Chennai before
06/11/2020.
Π fid (σ pid = 123 (booking) ⨝ σ dest = “Chennai” ^ fdate < 06/11/2020 (flight)
[Hint: Given conditions are pid, dest, and fdate. To get the flight id for a passenger given a
pid, we have two tables flight and booking to be joined with necessary conditions. From the
result, the flight id can be projected]
d) Find the passenger names for passengers who have bookings on at least one flight.
e) Find the passenger names for those who do not have any bookings in any flights.
10
[Hint: here applied a set difference operation. The set difference operation returns only
pids that have no booking. The result is joined with passenger table to get the passenger
names.]
-----------------------------------------------------------------------------------------------------
f) Find the agency names for agencies that located in the same city as passenger with
passenger id 123.
[Hint: we performed a theta join on equality conditions (equi join) here. This is done
between details of passenger 123 and the agency table to get the valid records where the
city values are same. From the results, aname is projected.]
g) Get the details of flights that are scheduled on both dates 01/12/2020 and 02/12/2020
at 16:00 hours.
h) Get the details of flights that are scheduled on either of the dates 01/12/2020 or
02/12/2020 or both at 16:00 hours.
-----------------------------------------------------------------------------------------------------
i) Find the agency names for agencies who do not have any bookings for passenger with id
123.
j) Find the details of all male passengers who are associated with Jet agency.
11
Π passengers.pid, pname, pcity (σ pgender = “Male” (passengers ⨝ booking) ⨝
agency))
[Hint: To get the link between passengers and agency, we need to join all three tables
passengers, booking, and agency with necessary condition. Here, agency links both
passengers and agency. As we have performed natural join operation between all three
tables, the degree of the result will consist of all attributes from all the three tables. Hence,
we project only passengers details as these are mentioned as required.]
Thus, it generates set of all tuples t, such that Predicate P(t) is true for t.
P(t) may have various conditions logically combined with OR (∨), AND (∧), NOT(¬).
It also uses quantifiers:
∃ t ∈ r (Q(t)) = ”there exists” a tuple in t in relation r such that predicate Q(t) is true.
∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples in relation r.
12
Example:
13
14
Queries-1: Find the loan number, branch, amount of loans of greater than or equal to
10000 amount.
{t| t ∈ loan ∧ t[amount]>=10000}
Resulting relation:
15
In the above query, t[amount] is known as tuple variable.
Queries-2: Find the loan number for each loan of an amount greater or equal to 10000.
{t| ∃ s ∈ loan(t[loan number] = s[loan number]
∧ s[amount]>=10000)}
Resulting relation:
Queries-3: Find the names of all customers who have a loan and an account at the
bank.
{t | ∃ s ∈ borrower( t[customer-name] = s[customer-name])
Resulting relation:
Queries-4: Find the names of all customers having a loan at the “ABC” branch.
{t | ∃ s ∈ borrower(t[customer-name] = s[customer-name]
Resulting relation:
16
Domain Relational Calculus
Domain Relational Calculus is a non-procedural query language equivalent in power to
Tuple Relational Calculus. Domain Relational Calculus provides only the description of
the query but it does not provide the methods to solve it. In Domain Relational Calculus,
a query is expressed as,
{ < x1, x2, x3, ..., xn> | P (x1, x2, x3, ..., xn ) }
Where, < x1, x2, x3, …, xn > represents resulting domains variables and P (x1, x2, x3, …, xn )
represents the condition or formula equivalent to the Predicate calculus.
Example:
Table-1: Customer
Customer name Street City
Example:
Table-1: Customer
Customer name Street City
17
Customer name Street City
Table-3: Borrower
Customer name Loan number
Ritu L01
Debomit L08
Soumya L03
Query-1: Find the loan number, branch, amount of loans of greater than or equal to 100
amount.
{≺l, b, a≻ | ≺l, b, a≻∈ loan ∧ (a ≥ 100)}
Resulting relation:
Query-2: Find the loan number for each loan of an amount greater or equal to 150.
{≺l≻ | ∃ b, a (≺l, b, a≻∈ loan ∧ (a ≥ 150)}
Resulting relation:
Loan number
L01
18
Loan number
L03
Query-3: Find the names of all customers having a loan at the “Main” branch and find
the loan amount.
{≺c, a≻ | ∃ l (≺c, l≻∈ borrower ∧∃ b (≺l, b, a≻∈ loan ∧ (b = “Main”)))}
Resulting relation:
Ritu 200
Debomit 60
Soumya 150
Pitfalls in relational DB
Careless Naming Practices
Choosing names is an aspect of database design that is often neglected but can have a
considerable impact on usability and future development. To avoid this, both table and
column names should be chosen to be meaningful and to conform to the established
conventions, ensuring that consistency is maintained throughout a system. A number of
conventions can be used in relational database names, including the following two
examples for a record storing a client name: "client_name" and "clientName."
Lack of Documentation
Creating documentation for a relational database can be a vital step in safeguarding future
development. There are different levels of documentation that can be created for
databases, and some database management systems are able to generate the
documentation automatically. For projects where formal documentation is not considered
necessary, simply including comments within the SQL code can be helpful.
19
Failure to Normalize
Normalization is a technique for analyzing, and improving on, an initial database design. A
variety of techniques are involved, including identifying features of a database design that
may compromise data integrity, for example items of data that are stored in more than one
place. Normalization identifies anomalies in a database design, and can preempt design
features that will cause problems when data is queried, inserted or updated.
Lack of Testing
Failure to test a database design with a sample of real, or realistic, data can cause serious
problems in a database system. Generally, relational database design is started from an
abstract level, using modeling techniques to arrive at a design. The drawback to this
process is that the design sometimes will not relate accurately to the actual data, which is
why testing is so important.
SQL has many capabilities that can improve the usability and success of a database system.
Facilities such as stored procedures and integrity checks are often not used in cases where
they could greatly enhance the stability of a system. Developers often choose not to carry
out these processes during the design stages of a project as they are not a necessity, but
they can help to avoid problems at a later stage.
20