[go: up one dir, main page]

0% found this document useful (0 votes)
31 views74 pages

DBMS Class Notes

The document provides comprehensive lecturer notes on Database Management Systems (DBMS), covering definitions, components, types, history, advantages, and disadvantages of DBMS. It discusses the architecture of DBMS, data independence, and the roles of various database users. Additionally, it outlines centralized and distributed database architectures and introduces principles and techniques for database design, including the waterfall model of system development.

Uploaded by

davidohtieno903
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views74 pages

DBMS Class Notes

The document provides comprehensive lecturer notes on Database Management Systems (DBMS), covering definitions, components, types, history, advantages, and disadvantages of DBMS. It discusses the architecture of DBMS, data independence, and the roles of various database users. Additionally, it outlines centralized and distributed database architectures and introduces principles and techniques for database design, including the waterfall model of system development.

Uploaded by

davidohtieno903
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 74

SIAYA INSTITUTE OF TECHNOLOGY

P.O. BOX 1087 - 40600 SIAYA. TEL: 0703564522


Email: info@siayainstitute.ac.ke

LECTURER NOTES

ON

DATABASE MANAGENEBT SYSTEM


(DBMS)

1 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC ONE: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

INTRODUCTION & DEFINITION

Data
In simple words data can be facts related to any object in consideration, e.g. your name, age,
height, weight, etc. are some data related to you. A picture, image, file, pdf etc. can also be
considered data.

Database
Database is a collection of related data. Databases support storage and manipulation of data.
.
Database Management System (DBMS)
It is a software system that allows users to define, create maintain a database and provides
controlled access to the data. DBMS is basically a collection of programs that enables users
to store, modify and extract information from a database as per the requirements.

COMPONENTS OF DBMS
A database management system (DBMS) consists of several components. Each component
plays very important role in the database management system environment. The major
components of database management system are:
• Software
• Hardware
• Data
• Procedures
• Users

Hardware: The hardware is the actual computer system used for keeping and accessing the
database. It consists of a set of physical electronic devices such as computers (together with
associated I/O devices like disk drives), storage devices, I/O channels, electromechanical
devices that make interface between computers and the real-world systems etc.

Software: The main component of a DBMS is the software. It is the set of programs used to
handle the database and to control and manage the overall computerized database. DBMS
software itself, is the most important software component in the overall system.

Data: The main purpose of DBMS is to process the data. In DBMS, databases are defined,
constructed and then data is stored, updated and retrieved to and from the databases.

Procedures: Procedures refer to the instructions and rules that help to design the database
and to use the DBMS. The users that operate and manage the DBMS require documented
procedures on how use or run the database management system.

2 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Users: The users are the people who manage the databases and perform different operations
on the databases in the database system e.g. Application Programmers, Data Administrators
Database Administrators, End-Users etc.

TYPES OF DBMS
The major types of DBMS are:

Hierarchical - this type of DBMS employs the "parent-child" relationship of storing data.
This type of DBMS is rarely used nowadays. Its structure is like a tree with nodes
representing records and branches representing fields

Network DBMS - this type of DBMS supports many-to many relations. This usually results
in complex database structures. RDBM Server is an example of a database management
system that implements the network model.

Relational DBMS - this type of DBMS defines database relationships in form of tables, also
known as relations. Unlike network DBMS, RDBMS does not support many to many
relationships. Relational DBMS usually have pre-defined data types that they can support.
This is the most popular DBMS type in the market. Examples of relational database
management systems include MySQL, Oracle, and Microsoft SQL Server database.

Object Oriented Relation DBMS - this type supports storage of new data types. The data to
be stored is in form of objects. The objects to be stored in the database have attributes (i.e.
gender, age) and methods that define what to do with the data. PostgreSQL is an example of
an object-oriented relational DBMS.

HISTORY OF DATABASE SYSTEMS


Before-1950s
• Data was stored as paper records.
• Lot of man power involved.
• Lot of time was wasted.e.g. When searching therefore inefficient.

1950s and early 1960s:


• Data processing using magnetic tapes for storage
• Tapes provide only sequential access
• Punched cards for input

Late 1960s and 1970s:


• Hard disks allow direct access to data
• Data stored in files known as File Processing System
• 1960’s Hierarchical and Network models were introduced
• 1970 E.F. Codd introduced the Relational model

1990’s to date: Object Oriented model and other Modern Database Management
Systems are being researched
3 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
File-based System
File-based systems were an early attempt to computerize the manual filing system. File-
based system is a collection of application programs that perform services for the end-users.
Each program defines and manages its data.

Disadvantages
1. Separation and isolation of data
When data is isolated in separate files, it is more difficult for us to access data that should be
available. The application programmer is required to synchronize (harmonize) the processing
of two or more files to ensure the correct data is extracted.

2. Duplication of data
When employing the decentralized file-based approach, the uncontrolled duplication of data
can easily occur.

3. Incompatible file formats


The structures of the file are dependent on the application programming language. The direct
incompatibility makes them difficult to process jointly.

Database Approach:
In order to overcome the limitations of the file-based approach, the concept of database and
the Database Management System (DBMS) emerged in 60s.

Advantages
A number of advantages of applying database approach in application system are obtained
including:

1. Eliminates duplication of data


Database system ensures that there is no duplication of data. Data input and stored only
once but it is used for many purposes.

2. Sharing of data
The data held in the database can be shared by different users. Different users can access
the same data in the database for their own processing applications. This eliminates the
need to duplicate data.

3. Integrity of data is preserved


Database system ensures that only valid data is entered to the database to preserve the
integrity of the database. The data values stored in the database must satisfy certain types
of integrity constraints. For example, the balance of a bank account must never fall below
a prescribed amount say sh. 1000.

4. Data consistency
By eliminating or controlling redundancy, the database approach reduces the risk of
inconsistencies occurring. It ensures all copies of the data are kept consistent.

5. Improved security

4 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Database approach provides a protection of the data from the unauthorized users. It may take
the term of user names and passwords to identify user type and their access right in the
operation including retrieval, insertion, updating and deletion.

6. Improved maintenance
Database approach provides a data independence. As a change of data structure in the
database will not affect the application program, therefore simplifies database application
maintenance.

7. Improved backing and recovery services


Modern database management system provides facilities to minimize the amount of
processing that can be lost following a failure by using the transaction approach.

Disadvantages
In split of a large number of advantages can be found in the database approach, it is not
without any challenge. The following disadvantages can be found including:

1. Complexity
Database management system is an extremely complex piece of software. All memory as
well as a large number amount of disk space in order to make it run parties must be familiar
with its functionality and take full advantage of it. Therefore, training for the administrators,
designers and users is required.

2. Size
The database management system consumes a substantial amount of main memory
efficiently.

3. Cost of DBMS
A multi-user database management system may be very expensive. Even after the
installation, there is a high recurrent annual maintenance cost on the software.

4. Higher impact of a failure


The database approach increases the vulnerability of the system due to the centralization. As
all users and applications reply on the database availability, the failure of any component can
bring operations to a halt and affect the services to the customer seriously.

5. Requires a skilled database administrator


The organization requires to deploy a skilled database administrator to manage the database
which is expensive.

Database systems:
A database system consists of a database, DBMS and a data model. To build a database
system we must use the facilities of some DBMS to define the data structures (scheme) of the
database. We must use the facilities of some DBMS to define integrity constraints and then
populate the database with data.

5 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
DBMS Architecture:
There are three level or layers of DBMS architecture:
1. External level
2. Conceptual level
3. Internal level

1. External level: This is a highest level of abstraction as seen by user. This level of
abstraction describes only the part of entire database. It is based on the conceptual model, is
the end user view of data environment. Each external view described by means of a schema
called an external schema or subschema.

2. Conceptual level: At this level of database abstraction all the database entities and the
relationships among them are included. One conceptual view represents the entire database.
The conceptual schema defines the theoretic view.

3. Internal (physical) level: This lowest level of abstraction. It closest to physical storage
device. It describes how data are actually stored on the storage medium.

Data Independence:
The ability to modify a schema definition in one level without affecting a schema definition
in a higher level is called data independence.

There are two kinds:


1. Physical data independence

6 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
▪ The ability to modify the physical schema without causing application
programs to be rewritten
▪ Modifications at this level are usually to improve performance
2. Logical data independence
▪ The ability to modify the conceptual schema without causing
application programs to be rewritten
▪ Usually done when logical structure of database is altered
Logical data independence is harder to achieve as the application programs are usually
heavily dependent on the logical structure of the data. An analogy is made to abstract data
types in programming languages.

Types of Database Users:


Many persons are involved in the design, use and maintenance of any database.

These persons can be classified into 2 types:


✓ Actors on the scene
✓ Workers behind the scene

Actors on the scene:


The people, whose jobs involve the day-to-day use of a database are called as 'Actors on the
scene', as listed below.

1. Database Administrators (DBA): The DBA is responsible for authorizing access to the
database, for coordinating and monitoring its use and for acquiring software and hardware
resources as needed. These are the people, who maintain and design the database daily.

DBA is responsible for the following issues.

a. Design of the conceptual and physical schemas:


b. Security and Authorization:
c. Storage structure and Access method definition:
d. Data Availability and Recovery from Failures:
e. Database Tuning:
f. Integrity Constraint Specification:

2. Database Designers: are responsible for identifying the data to be stored in the database
and for choosing appropriate structures to represent and store this data.

3. End Users: are the people whose jobs require access to the database for querying,
updating and generating reports.
a) Casual End users: These people occasionally access the database, but they may need
different information each time.
b) Naive or Parametric End Users: Their job function revolves around constantly
querying and updating the database using standard types of queries and updates.
c) Sophisticated End Users: These include Engineers, Scientists, Business analyst and
others familiarize to implement their applications to meet their complex requirements.

7 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
d) Stand-alone End users: These people maintain personal databases by using ready-
made program packages that provide easy to use menu-based interfaces.

4. System Analyst:
These people determine the requirements of end users and develop specifications for
transactions.

5. Application Programmers (Software Engineers):


These people can test, debug, document and maintain the specified transactions.

Workers behind the scene:


1. Database Designers and Implementers: These people who design and implement the
DBMS modules and interfaces as a software package.

2. Tool Developers: Include persons who design and implement tools consisting the
packages for design, performance monitoring, and prototyping and test data generation.

TOPC TWO: DATABASE ORGANITION


The database may be of different types but there are two generic database architectures.
These are:
• Centralized Database
• Distributed Database

Centralized Database
A database that all data is located at a single computer (or site) and multiple users can access
the database i s known as centralized database. A centralized database provides an efficient
way to access and update data. These databases are usually used in computer network
environments. The examples of centralized databases are:
• Personal Computer Databases
• Client/Server Databases
• Central Computer Databases

(i) Personal Computer Databases


A personal computer database is normally created and maintained by a single user on the
personal computer. The personal computer database is commonly used in small business or
organization. If there is a need to share data, this database could be stored on a database
server in a local area network, so that the multiple user can access and update the database.

(ii) Client/Server Databases


The client/server databases are used in small to medium organization or businesses to share
data among multiple users in local area network. The microcomputers are often used in a
local area network.

The client/server architecture is designed for the distribution of work on a computer network
in which many clients may share the data (or services). Here is an example of client/server
database

8 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(iii) Central Computer Databases
The central computer databases are commonly used in central computers in large
organizations. The central computer may be a mainframe or minicomputer. These databases
are accessed by a large number of users. The users at remote locations can also access the
database using remote terminals and data communication links.

Here is an example of a central computer database

Distributed Database
Many organizations/departments have sub-offices in different cities and countries. In such
cases, the distributed databases are used instead of centralized databases. A distributed
database is a single logical database, which is spread physically across computers in multiple
locations (such as cities or countries).

The distributed databases are further divided into two categories:


• Homogeneous Databases
• Heterogeneous Databases

Homogeneous Databases
The homogeneous database means that the database technology is the same at each of the
locations (or sites) and that the data at various locations are also compatible. In a
homogeneous system, all nodes use the same hardware and software for the database system.
The following conditions must be satisfied for homogeneous database.

9 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
1. The operating system used at each location must be same or compatible.
2. The data structures used at each location must be same or compatible.
3. The database application (or DBMS) used at each location must be same or
compatible.

Heterogeneous Databases
The heterogeneous database systems are opposite to homogeneous database systems. In a
heterogeneous system, different nodes may have different hardware and software and data
structures at various nodes or locations are also compatible

10 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC THREE: PRINCIPLES AND TECHNIQUES OF DATABASE DESIGN

Database development is just one part of the much wider field of software engineering, the
process of developing and maintaining software. A core aspect of software engineering is the
subdivision of the development process into a series of phases, or steps, each of which
focuses on one aspect of the development. The collection of these steps is sometimes referred
to as a development life cycle.

Waterfall model: It shows the process as a strict sequence of steps where the output of one
step is the input to the next and all of one step has to be completed before moving onto the
next. However, in reality there is usually some degree of refinement and feedback as the
product proceeds through the development.

A general model of system development: the waterfall model

We can use the above figure as a means of identifying the tasks that are required, together
with the input and output for each activity. What is important is the scope of the activities,
which can be summarized as follows:

• Establishing requirements involves consultation with, and agreement among,


stakeholders as to what they want of a system, expressed as a statement of
requirements.
• Analysis starts by considering the statement of requirements and finishes by
producing a system specification. The specification is a formal representation of what
a system should do, expressed in terms that are independent of how it may be
realized.
• Design begins with a system specification and produces design documents, and
provides a detailed description of how a system should be constructed.

11 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
• Implementation is the construction of a computer system according to a given design
document and taking account of the environment in which, the system will be
operating (for example specific hardware or software available for the development).
Implementation may be staged, usually with an initial system than can be validated
and tested before a final system is released for use.
• Testing compares the implemented system against the design documents and
requirements specification and produces an acceptance report or, more usually, a list
of errors and bugs that require a review of the analysis, design and implementation
processes to correct (testing is usually the task that leads to the waterfall model
iterating through the life cycle).
• Maintenance involves dealing with changes in the requirements, or the
implementation environment, bug fixing or porting of the system to new
environments (for example migrating a system from a standalone PC to a UNIX
workstation or a networked environment). Since maintenance involves the analysis of
the changes required, design of a solution, implementation and testing of that solution
over the lifetime of a maintained software system, the waterfall life cycle will be
repeatedly revisited.

12 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC FOUR: RELATIONAL DATABASE SYSTEM

DBMS Concepts
A Relational Database Management System (RDBMS) is a database management system
based on the relational model introduced by E.F Codd. In relational model, data is stored in
relations (tables) and is represented in form of tuples (rows).

RDBMS is used to manage Relational database which is the most commonly used database
these days. Relational database is a collection of organized set of tables related to each
other, and from which data can be accessed easily.

Table
In Relational database model, a table is a collection of data elements organized in terms of
rows and columns. A table is also considered as a convenient representation of relations.
Table is the simplest form of data storage. Below is an example of an Employee table.

ID Name Age Salary


1 Adam 34 13000
2 Alex 28 15000
3 Stuart 20 18000
4 Ross 42 19020

Tuple
A single entry in a table is called a Tuple or Record or Row. A tuple in a table represents a
set of related data. For example, the above Employee table has 4 tuples/records/rows.

Attribute
A table consists of several records (row), each record can be broken down into several
smaller parts of data known as Attributes. The above Employee table consist of four
attributes, ID, Name, Age and Salary.

Attribute Domain
When an attribute is defined in a relation (table), it is defined to hold only a certain type of
values, which is known as Attribute Domain. Hence, the attribute Name will hold the name
of employee for every tuple.

Degree
The number of fields is called as ‘degree’. This is also called as ‘arity’.

Cardinality
The cardinality of a relation instance is the number of tuples in it.

Relation Schema
A relation schema describes the structure of the relation, with the name of the relation (name
of table), its attributes and their names and type.

Relation Key
13 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
A relation key is an attribute which can uniquely identify a particular tuple (row) in a relation
(table).

Relational Integrity Constraints


Every relation in a relational database model should abide by or follow a few constraints to
be a valid relation, these constraints are called as Relational Integrity Constraints.

The three main Integrity Constraints are:


1. Key Constraints
2. Domain Constraints
3. Referential integrity Constraints

Key Constraints
We store data in tables, to later access it whenever required. In every table one or more than
one attributes together are used to fetch data from tables. The Key Constraint specifies that
there should be such an attribute (column) in a relation (table), which can be used to fetch
data for any tuple (row). The Key attribute should never be NULL or same for two different
rows of data. For example, in the Employee table we can use the attribute ID to fetch data for
each of the employee. No value of ID is null and it is unique for every row, hence it can be
our Key attribute.

Domain Constraint
Domain constraints refer to the rules defined for the values that can be stored for a certain
attribute. Like we explained above, we cannot store Address of employee in the column for
Name. Similarly, a mobile number cannot exceed 10 digits.

Referential Integrity Constraint


If a table reference to some data from another table, then that table and that data should be
present for referential integrity constraint to hold true.

RELATIONAL ALGEBRA
Relational Algebra operators are mathematical functions used to retrieve queries by
describing a sequence operation on tables or even database (schema) involved. With
relational algebra operators a query is always composed of a number of operators, which
each in turn are composed of relations as variables and return an individual abstraction as the
end product.

The following are the main relational algebra operators as applied to SQL:

1. UNARY OPERATORS
(i) The SELECT Operator (σ)

The SELECT Operator is used to choose a sub set of the tuples (rows) from a relation that
satisfies a selection condition, acting as a filter to return only tuples that fulfills a qualifying
requirement.

14 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
The SELECT Operator is denoted by the symbol (σ) (Sigma)
The syntax for the SELECT statement is the written as follows:

(σ) selection condition>(R)


The (σ) would represent the select command
The selection condition> would represent the condition for selection
The (R) would represent the relation or the table from which we are making a selection of
the tuples.

Ex
Reseach_Student
Name University GPA Age E-mail
Joe Harvard 75 23 joe@myuniv.com
Tim Stanford 80 27 tim@myuniv.com
Janie TU 63 26 janie@myuniv.com
Rob Virginia 77 22 rob@myuniv.com

Required:
(i) Write an algebraic expression that would display student(s) who have graduated
from Harvard University and give the output relation
Soln.
(i) (σ)University=’Harvard’(Reseach_Student)
Output
Name University GPA Age E-mail
Joe Harvard 75 23 joe@myuniv.com

(ii) Write an algebraic expression that would display student(s) who have GPA>=75 and
their Age<=25 and give the output relation

Soln.  GPA>=75 AND Age<=25(Reseach_Student)


Output
Name University GPA Age E-mail
Joe Harvard 75 23 joe@myuniv.com
Rob Virginia 77 22 rob@myuniv.com

(ii) The PROJECT Operator (Π)

This operator is used to re order, select and gets rid of attributes from a table. At some point
we might want only certain attributes in a relation and eliminate others from our query result.
Therefore, the PROJECT Operator would be used in such operations.

The symbol used for the PROJECT Operation is Π (pi)


The general syntax for the PROJECT operator is:

15 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Π <attribute list>(R)
Π would represent the PROJECT
< attribute list> would represent the attributes (columns) we want from the relational
(R) Would represent the relation or table we want to choose the attributes from.

Ex. Using Reseach_Student relation


Required: Write an algebraic expression to display Name and E-mail of all students in that
relation and give the output relation.

Soln
Π Name, E-mail (Reseach_Student)

Output
Name E-mail
Joe joe@myuniv.com
Tim tim@myuniv.com
Janie janie@myuniv.com
Rob rob@myuniv.com

Required: Write an algebraic expression to find Name and E-mail of students having
GPA>75 in graduation and give out the output relation

Soln. Combine SELECT and PROJECT

Π Name, E-mail  GPA>75 (Reseach_Student)


Output
Name E-mail
Tim tim@myuniv.com
Rob rob@myuniv.com

Ex. Using Student relation


Student_Id StudentName D.O.B Gender Course_Id Marks
N001 Jane 23.06.98 Female 004 70
N002 Andrew 22.05.96 Male 003 75
N003 Catherine 18.09.98 Female 004 90
N004 Agnes 16.06.99 Female 002 92
N005 Noel 14.07.98 Male 004 65

Required: Write an algebraic expression that would:


(i) display all records of female students who attained more than 70 marks;
(ii) display all the Course_Id of the students;

Sol:
(i)  Gender =’Female’ AND Marks>70 (Student)
(ii) Π Course_Id (Student)

16 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
2. SET OPERATORS
The UNION, INTERSECTION and DIFFERENCE Operators (Set Operations)

SET Operations conditions


✓ The relations must have the same arity (the same number of columns)
✓ The domain attribute of R1 should be equal to the domain attribute of R2
UNION: the UNION operation on relation A UNION relation B designated as AUB, joins or
includes all tuples that are in A or in B, eliminating duplicate tuples.

Ex.
Project_Mgr
EmpID Name E-mail
123 Joe joe@...
256 Janie janie@...
742 Tim tim@...
Engineer_Mgr
EmpID Name E-mail
256 Janie janie@...
742 Tim tim@...
341 Rob rob@...

Required: Write an algebraic expression to display Name and Email of people who are
either Engineer OR Project_Mgr or give the output relation.

Sol.
Π Name, E-mail (Project_Mgr)) U Π (Name, E-mail (Engineer))

Name E-mail
Joe joe@...
Janie janie@...
Tim tim@...
Rob rob@...

INTERSECTION: the intersection operation on a relation A INTERSECTION relation B,


designated A∩ includes tuples that are only in A and B, or shared by both A and B are
included in the result.

Ex.
Required: Write an algebraic expression to display Name, Email of persons who are both
Engineers and Project_Mgr and give the output result.

Soln.
Π Name, E-mail (Project_Mgr)) ∩ Π (Name, E-mail (Engineer))

17 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Name E-mail
Janie janie@...
Tim tim@...

DIFFERENCE: the DIFFERENCE (MINUS) operations includes tuples from one relation
that is not in other relation. Let the relation be A and B, the DIFFRENCE operation A
MINUS B is denoted by A-B, that results in tuples that are A and not in B.

Ex.

3 1 6
2 7 A U B= {3, 4, 5, 1, 2, 6, 7, 8}
A 4 B
5 8 A∩  = {1, 2}
A-B = {3, 4, 5}
B-A = {6, 7, 8}

Required: Write an algebraic expression that would display Name and Email of persons who
are Project_Mgr but Not Engineers and give the output result.

Soln.
Π Name, E-mail (Project_Mgr)) − Π (Name, E-mail (Engineer))
Name E-mail
Joe joe@...
Ex:
Use the following tables to answer the question that follows.

Table X
ID NAME RATING AGE
22 Paul 7 45
31 Linn 8 20
58 Mike 10 35

Table Y
ID NAME RATING AGE
28 James 9 35
31 Linn 8 20
44 Sussan 5 35
58 Mike 10 35

Determine the output for each of the following statements:


(i) XUY

18 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(ii) X∩Y

3. CARTERSIAN PRODUCT Operator


The CARTERSIAN PRODUCT operator, also referred to as the cross product or cross join,
creates relation that has all the attributes combinations of tuples from A and B in the result. It
is symbolized by × as in A×B.

Ex1.
Write an algebraic expression using Cartesian product operator on Emp. and Dept. relations
given below

Employee

EmpID Name DeptName


123 Jim Finance
254 Joame Marketing
678 Tim Finance

Department
DeptName Manager
Finance Rob
Marketing Sean

Soln. Employee×Department 3x2=6 no. of rows


3+2=5 no. of columns

Output

EmpID Name DeptName DeptName Manager


123 Jim Finance Finance Rob
123 Jim Finance Marketing Sean
254 Joame Marketing Finance Rob
254 Joame Marketing Marketing Sean
678 Tim Finance Finance Rob
678 Tim Finance Marketing Sean

Ex2.
Using the following relations, write the algebraic expression to display Name and Grade of
students having Grade A

Student
RollNo Name
1 Joan
19 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
2 Crisp
3 Teddy

Grade
RollNo Grade
1 B
2 A
3 C

Soln.
πName, Grade  Grade=’A’ (Student ×Grade)
4. DIVISION Operator
Division operator takes two relations as input and produces one relation as output. One of the
input relations must be a binary relation. The other input relation must be a unary relation.
The unary relation must be defined on the same domain as one of the attributes in the binary
relation.

It takes the values of the unary relation and checks them off against the associated attribute in
the binary relation. If all the values in the unary relation match with the same value in the
binary relation then it outputs a value to the output relation.

Ex.
Using the following relations, write an algebraic expression that would display names of
students who passed all the required subjects in diploma module 2. Give the output result.

Student
Name Subject
A DBMS
B DBMS
C OOP
C SAD
A OOP
C APPL2
A SAD
C DBMS
B VB
B QT
A QT
A VB
A APPL2
C QT
B SAD
C VB

Subject

20 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Subject
DBMS
OOP
SAD
QT
VB
APPL2

Soln.
ΠName (Student÷Subject)
Name
A
C

Ex2.
Using the following relations, write an algebraic expression that would display name of
players who have goals in all the venues. Give output result.

R1
Player name Venue
Joe XYZ
Joe PQR
Jim LPM
Tom ABC
Joe LPM
Joe ABC
Ric ABC

R2
Venue
XYZ
PRQ
LPM
ABC

Soln.
ΠPlayer_name (R1÷R2)

Player name
Joe

5. JOIN Operator
The JOIN operator is used to combine related tuples from two relations into single tuple. It is
denoted by ⋈ symbol. It is similar to cross/Cartesian product.

JOIN CARTESIAN PRODUCT


Combination of tuples that satisfy All possible combination of tuples

21 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
the filtering/matching condition from the relation

TYPES OF JOIN
1. INNER JOINS: Contains only matching tuples
(i) THETA JOIN
Comparative operators like =, <, >, >=, <=

Ex. Given the relations


Mobile
Model Price
Nokia 10000
Samsung 20000
IPhone 50000

Laptop
Model Price
Dell 30000
Acer 20000
Asus 10000

Required: Write an algebraic expression that would enable the purchase of both mobile and
laptop, but mobile price should be less than laptop.

Soln.
Mobile ⋈ Laptop
(Mobile. Price < laptop. Price)

(ii) EQUI JOIN


Make use of equality (‘=’) comparison as comparative operator

Ex.
Using the above example write an algebraic expression that would enable the purchase of
both mobile and laptop but mobile and laptop price should be the same.

Soln.
Mobile ⋈ laptop
(Mobile. Price = laptop. Price)

(iii) NATURAL JOIN


Natural join operation forms a Cartesian product of its two arguments, performs selection
forcing equality on those attributes that appear in both relations schemas and finally removes
duplicate attribute.

Ex
Using the following relations, write algebraic expression that would perform natural join

22 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Employee
EmpID Name Dept.
1 A Finance
2 B Sales
3 C Sales

Department
Dept. Manager
Finance M1
Sales M2

Soln

Employee ⋈ Department

Output
EmpID Name Dept. Manager
1 A Finance M1
2 B Sales M2
3 C Sales M2

Ex
Using the following relations, write algebraic expression that would perform natural join and
give the output relation.
R
A1 A3 A2
A x10 12
B x20 13
C x30 17

S
A2 A4 A5
17 C X40
12 D X70
13 E X90

Sol:
R⋈S
Output
A1 A3 A2 A4 A5
A x10 12 D x70
B x20 13 E x90
C x30 17 C x40

Ex:
Use the following tables C and D to answer the questions that follow

23 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Sid Sname
001 English
002 Kiswahili
003 Science
C

Pno Name Sid Hours


K0088769 Cate 002 10
K0099221 Jonathan 001 20
K008702 Andrew 002 30
D
(i) Write an algebraic expression that would join the tables C and D.
Sol:
C⋈D
(ii) Write the output generated from the expression in (i) above
Sol:
Sid Sname Pno Name Hours
001 English K0099221 Jonathan 20
002 Kiswahili K0088769 Cate 10
002 Kiswahili K008702 Andrew 30

(iii) Write the output of the following expression:


C
⋈d. hours≥15 D
Sol:

Sid Sname Pno Name Hours


001 English K0099221 Jonathan 20
002 Kiswahili K008702 Andrew 30

2. OUTER JOIN: It used when we want to keep all the tuples in either or both the relation in
the result of the JOIN regardless of whether or not they have matching tuples in other
relation.

(i) LEFT-OUTER JOIN: Keeps every tuple in the first or left relation. It is denoted by ⟕

Ex. Write an algebraic expression to perform the left-outer join and give the resultant relation

R1
ID Name
1 A
2 B
3 C

24 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
R2
ID Dept.
1 D1
2 D2

Soln. R1 ⟕ R2
ID Name Dept.
1 A D1
2 B D2
3 C NULL

(ii) RIGHT-OUTER JOIN: Keeps every tuple in the second or right relation. It is denoted
by ⟖

Ex. Write an algebraic expression to perform the right-outer join and give the resultant
relation.
R1
ID Name
1 A
2 B
1 C

R2
ID Dept.
1 D1
3 D3

Soln. R1 ⟖ R2
Name ID Dept.
A 1 D1
NULL 3 D3
C 1 D1

(iii) FULL-OUTER JOIN: Keeps tuples from both left and right relation

Ex. Write an algebraic expression to perform the full outer join on the following relations and
give the output.
R1
ID Name
1 A
2 B
1 C

R2
ID Dept.
1 D1
3 D3
25 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Soln. R1 ⟗ R2
Name ID Dept.
A 1 D1
B 2 NULL
C 1 D1
NULL 3 D3

Ex: Given the following relations;

A B
20 30
40 50
Tab1

B C
50 60
70 8
Tab2

Perform each of the following in relation algebra:


(i) Tab1⋈ Tabl2;
Sol:
A B C
40 50 60

(ii) Tab1⟕ Tabl2;


Sol:
A B C
20 30 NULL
40 50 60

(iii)Tab1⟖Tabl2;
Sol:
A B C
40 50 60
NULL 70 8

(iv) Tab1⟗ Tabl2;


Sol:
A B C
20 30 NULL
40 50 60
NULL 70 8

26 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Ex: Given the following relations;

Sno sname sage


S1 n1 21
S2 n2 22
S3 n3 23
S4 n4 24
Student

Sno cname fees


S1 C1 1000
S3 C3 3000
S5 C5 5000
Course

Perform each of the following in relation algebra:


(i) Student⋈ Course;
Sol:
Sno sname sage cname fees
S1 n1 21 C1 1000
S3 n3 23 C3 3000

(ii) Student ⟕ Course;


Sol:
Sno sname sage cname fees
S1 n1 21 C1 1000
S2 n2 22 NULL NULL
S3 n3 23 C3 3000
S4 N4 24 NULL NULL

(iii) Student ⟖ Course;


Sol:
Sno sname Sage cname fees
S1 n1 21 C1 1000
S3 n3 23 C3 3000
S5 NULL NULL C5 5000

(iv) Student ⟗ Course;


Sol:
Sno sname Sage cname fees
S1 n1 21 C1 1000
S2 n2 22 NULL NULL
S3 n3 23 C3 3000
S4 N4 24 NULL NULL
S5 NULL NULL C5 5000

27 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
6. RENAME Operator (ρ)
The RENAME operator is used to give a name to results or output of queries, returns of
selection statements, and views of queries that we should like to view at some other point in
time.

The RENAME operator is symbolized by ρ (rho)


Ex.
Given the relation below, Write an algebraic expression to: -
(i) Rename the above EMPLOYEE relation to EMPLOYEE_TABLE
(ii) Rename EmployeeID and EmployeeName attributes as ‘ID’ and ‘Name’ respectively
(iii) Give the output result.

EMPLOYEE
EmployeeID EmployeeName
A 100
B 101
C 102

Soln.
(i) ρ EMLOYEE_TABLE (EMPLOYEE)
(ii) ρ ID/EmployeeID, Name/EmployeeName (EMPLOYEE)

Result:
EMPLOYEE_TABLE
ID Name
A 100
B 101
C 102

AGGREGATE FUNCTIONS
We can also apply Aggregate functions to attributes and tuples:

_ SUM
_ MINIMUM
_ MAXIMUM
_ AVERAGE, MEAN, MEDIAN
_ COUNT

Ex. Assume the relation Employee has the following tuples:


Office Name Dept Salary
CS001 Smith CS 45000
Econ001 Jones Econ 35000
Econ002 Green Econ 50000
CS002 Brown CS 65000
Fin001 Bright Fin 60000

28 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Table: Employee

Required:
(i) Find the minimum salary:

Sol: Ԍ MIN (salary) (Employee)

Results:

MIN (salary)
50000

(ii) Find the average salary


Sol: Ԍ AVG (salary) (Employee)

Results:

AVG (salary)
51000

(iii) Count the number of employees in the CS department

Sol:
Ԍ COUNT (name) σ Dept. = 'CS' (Employee)

Results:

COUNT (name)
2

(iv) Find the total payroll for the Economics department

Sol:
Ԍ SUM (salary) σ Dept. = 'Econ' (Employee)

Results:

SUM (salary)
85000

(ii) Find the average salary in each department

Sol:
Dept. Ԍ AVG (salary) (Employee)

29 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Results:

Dept. AVG (salary)


CS 55000
Econ 42500
Fin 60000

RELATIONAL CALCULUS
Relational Calculus is a query language which is non-procedural and instead of algebra, it
uses mathematical predicate calculus. The relational calculus is not the same like that of
differentiation and integral calculus in mathematics, but takes its name from a branch of
symbolic logic termed as predicate calculus. When applied to database, it found in two
forms:

1. Tuple relational calculus which was originally proposed by Codd in year 1972 and
is used for selecting those tuples that satisfy the given condition.

2. Domain relational calculus which was proposed by Lacroix and Pirotte in the year
1977and is used to selecting records based on the domains

Tuple relational calculus (TRC)


▪ In this, we work on only filtering the tuples which is based on the given condition.
▪ syntax: {T | Condition}
▪ In this, we define a tuple variable; specify the relation name in which the tuple is to
be searched for, along with a condition.
▪ In this, we do also specify column name using a .dot operator, with the tuple
variable to only get a certain attribute in the result.
▪ A tuple variable is nothing but a name, can be anything, generally we use a single
alphabet for this, so let’s say T is a tuple variable.
▪ To specify the name of the relation in which we want to look for data, we do the
following:

Relation (T), where T is our tuple variable.

Example: If our table is a student, we will write it as student (T).


Then comes the condition part to specify a condition application for a particular attribute, we
can use the .dot variable with the tuple variable to specify it, like in table student, if we
want to get data for student with age greater than 17, then we can write it as, T.age>17

Using Tuple Relational Calculus to fetch names of students, from the table student, with the
age greater than 17 then, T being our tuple variable,

T.name | student(T) AND T.age>17

| means such that

30 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Ex:
Student (RollNo, Name, DepartmentNo, Sex)
Query: Find RollNo and Name of Student in DepartmentNo 2

Sol:
{t. RollNo, t. Name | Student(t) ^ t. DepartmentNo=2}
Ex:
Student (RollNo, Name, DepartmentNo, Sex)
Query: Find RollNo and Name of Male Student in DepartmentNo 2

Sol:
{t. RollNo, t. Name | Student(t) ˄ t. DepartmentNo=2 ˄ t. Sex= ‘Male’}

Ex: Given the student relation below;

SID SNAME LNAME MARKS


001 Jane Luke 50
002 Agnes James 70
003 Josephine Eliud 90
004 Catherine Joseph 40
005 Susan Clark 30
Student
Write a tuple relational calculus statement that would:

(i) Display all the details of students who scored above 40 Marks

Sol: {t | Student(t) ^ t. MARKS>40}

(ii) Display the records for the fields SID, SNAME AND LNAME of all students
Sol:
{t. SID, t. SNAME, t. LNAME | student(t)}

(iii) Display the records in the field SNAME, LNAME and MARKS for all students with
MARKS values more than 60.
Sol:
{t. SNAME, t. LNAME, t. MARKS | student(t) ˄ t. MARKS >60}

(iv) Display the MARKS values for a student with SNAME is “Catherine” and LNAME
“Joseph”.
Sol:
{t. MARKS | student(t) ˄ t. SNAME= ‘Catherine’ ˄ t. LNAME ‘Joseph’}

USING QUANTIFIERS in Tuple Relational Calculus

Types of Quantifiers

1) Existential quantifiers (Ǝ)

31 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
2) Universal quantifiers (∀)

Expression Meaning
1. Ǝt for some occurrence of t
2. ∀t for every occurrence of t

Example

R
A B C
10 1 20
20 2 30
30 3 40
40 4 50
50 5 60

(i) Ǝt( t.A>50) //returns false


(ii) Ǝt (t.A>40) //returns true
(iii) ∀t (t.A>40) //returns false
(iv) ∀t (t.B<10) //returns true
(v) ¬ Ǝt (t.C<10) //returns true

Domain Relational Calculus (DRC)

In this, filtering is done based on the domain of the attributes and not based on the tuple
values.

Syntax: { c1, c2, c3, …cn | F (c1, c2, c3, …cn) }


where, c1, c2, c3, … etc. are domain of attributes and F defines the formula including the
condition for fetching the data

For example,

{<name, age > | ∈ student ˄ age > 17}

The above query will return the names and ages of the students in the table who are aged
above 17 years

Ex
Given the student relation below;
Last_name Age
First_name
Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28

32 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Student

Write a domain relational calculus statement that would:

(i) Query to find the first name and age of students where student age is greater than 27
(ii) Give the output result

Sol:

{< First_name, Age > | ∈ Student ∧ Age > 27}

33 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC FIVE: ENTITY RELATIONSHIP

The ER model defines the conceptual view of a database. It works around real-time entities
and the associations among them. At view level, the ER model is considered a good option
for designing databases.

Entity

An entity can be a real-world object, either animate or inanimate, that can be easily
identifiable. For example, in a school database, students, teachers, classes, and courses
offered can be considered as entities. All these entities have some attributes or properties that
give them their identity.

An entity set is a collection of similar types of entities. An entity set may contain entities with
attribute sharing similar values. For example, a Students set may contain all the students of a
school; likewise, a Teachers set may contain all the teachers of a school from all faculties.
Entity sets need not be disjoint.

Attributes

Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.

There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.

Types of Attributes

• Simple attribute − Simple attributes are atomic values, which cannot be divided
further. For example, a student's phone number is an atomic value of 10 digits.
• Composite attribute − Composite attributes are made of more than one simple
attribute. For example, a student's complete name may have first_name and
last_name.
• Derived attribute − Derived attributes are the attributes that do not exist in the
physical database, but their values are derived from other attributes present in the
database. For example, average_salary in a department should not be saved directly in
the database, instead it can be derived. For another example, age can be derived from
data_of_birth.
• Single-value attribute − Single-value attributes contain single value. For example −
Social_Security_Number.
• Multi-value attribute − Multi-value attributes may contain more than one values.
For example, a person can have more than one phone number, email_address, etc.

Entity-Set and Keys

Key is an attribute or collection of attributes that uniquely identifies an entity among entity
set.
34 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
For example, the roll_number of a student makes him/her identifiable among students.

• Super Key − A set of attributes (one or more) that collectively identifies an entity in
an entity set.
• Candidate Key − A minimal super key is called a candidate key. An entity set may
have more than one candidate key.
• Primary Key − A primary key is one of the candidate keys chosen by the database
designer to uniquely identify the entity set.

Relationship

The association among entities is called a relationship. For example, an employee works_at
a department, a student enrolls in a course. Here, Works_at and Enrolls are called
relationships.

Relationship Set

A set of relationships of similar type is called a relationship set. Like entities, a relationship
too can have attributes. These attributes are called descriptive attributes.

Degree of Relationship

The number of participating entities in a relationship defines the degree of the relationship.

• Binary = degree 2
• Ternary = degree 3
• n-ary = degree

Mapping Cardinalities

Cardinality defines the number of entities in one entity set, which can be associated with the
number of entities of other set via relationship set.

• One-to-one − One entity from entity set A can be associated with at most one entity
of entity set B and vice versa.

35 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
• One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B however an entity from entity set B, can be associated with at
most one entity.

• Many-to-one − More than one entity from entity set A can be associated with at most
one entity of entity set B, however an entity from entity set B can be associated with
more than one entity from entity set A.

• Many-to-many − One entity from A can be associated with more than one entity
from B and vice versa.

36 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
ER Diagrams

ER Diagram is a visual representation of data that describes how data is related to each other.

For example: Developer develops a website, whereas a Visitor visits a website.

Symbols and Notations

Entity, Attributes, Relationships etc. form the components of ER Diagram and there are
defined symbols and shapes to represent each one of them.

Entity

Simple rectangular box represents an Entity.

Relationships between Entities - Weak and Strong

Rhombus is used to setup relationships between two or more entities.

37 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Attributes for any Entity

Ellipse is used to represent attributes of any entity. It is connected to the entity.

Weak Entity

A weak Entity is represented using double rectangular boxes. It is generally connected to


another entity.

Key Attribute for any Entity

To represent a Key attribute, the attribute name inside the Ellipse is underlined.

Derived Attribute for any Entity

Derived attributes are those which are derived based on other attributes, for example, age can
be derived from date of birth.

To represent a derived attribute, another dotted ellipse is created inside the main ellipse.

Multivalued Attribute for any Entity

38 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Double Ellipse, one inside another, represents the attribute which can have multiple values.

Composite Attribute for any Entity

A composite attribute is the attribute, which also has attributes.

Example 1
Draw entity relationship diagrams for the following:

(i) A student entity having RollNo as the key attribute, Name and Birthdate
(ii) A student entity having RollNo as the key attribute, Name and Birthdate but
Name is further divided as FirstName and LastName
(iii) A student entity having RollNo as the key attribute, Name and Birthdate but
Name is further divided as FirstName and LastName and a student can have more
than one PhoneNo.
(iv) A student entity having RollNo as the key attribute, Name, Birthdate, PhoneNo
and Age. A StudentName has FirstName and LastName and a student can have
more than one PhoneNo and Age does not exist in physical database.

Soln

(i) RollNo
Birthdate

Student

Name

(ii)
RollNo
Birthdate

Student
39 PREPARED BY,
Name THOMAS O. OCHIENG
FirstName
ICT DEPTMENT, SIT.

LastName
(iii)
RollNo Birthdate

Student PhoneNo

Name
FirstName

LastName

RollNo Birthdate
(iv)

Student PhoneNo
Example 2
Construct an E-R diagram (including attributes, identifiers and mandatory/optional
Name
cardinality) for the following Age
system descriptions, stating any assumptions you make.
FirstName

1) A company keeps records of the products it stores in its warehouses and factories that
make the products. LastName
2) Each product is identified by a unique product number and has name and description.
3) A product is made by a single factory but a factory can make more than one product.
4) A factory is recognized by a factory code and has an address and telephone number.
A product can be stored in one or more warehouses and a warehouse can stock a
number of products. The quantity of a product stored at a warehouse needs to be
recorded.
5) A warehouse is identified by a warehouse code and address is also held.
Soln
Step 1: Identify Entities
(i) Product
(ii) Warehouse
(iii)Factory

Step 2: Identify Relationships


(i) A product is made by a single factory but a factory can make more than one product;
Many-to-one (N:1)
40 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(ii) A product can be stored in one or more warehouses and a warehouse can stock a
number of products; many-to-many (N:N)

Step 3: Identify key attributes


(i) Product number is the key attribute for the Entity “product”
(ii) Factory code is the Key attribute for the Entity “factory”
(iii)Warehouse code is the key attribute for the Entity “warehouse”

Step 4: Identify other relevant attributes


(i) For the product Entity, other attributes are name, description
(ii) For the factory Entity, other attributes are address, telephone number
(iii)For the warehouse Entity, other attributes are address, quantity_stored.

Step 5: Draw complete ER diagram.

product_ number
location

Product

made stored
by

Quantity_stored
telephone_number
Example 3 warehouse
factory
Design an Entity Relationship (ER) model for a college database having the following. address
Warehouse code
1) A college contains manyaddress
factory_code departments
2) Each department can offer any number of courses
3) Many instructions can work in a department
4) An instructor can work only in one in a department
5) For each department there is a HOD
6) An instructor can be HOD of only one department
7) Each instructor can take any number of courses
8) A course can be taken by only one instructor
9) A student can enroll for any number of courses
10) Each course can have any number of students

Soln:

Step 1: Identify the Entities


(i) Department

41 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(ii) Course
(iii) Instructor
(iv) Student

Step 2: Identify the relationship


(i) One dept. offers many courses, but one particular course can be offered by only one
dept., hence the cardinality btw dept. and course is one-to-many (1:N)
(ii) One dept. has multiple instructors, but instructor belongs to only dept., hence the
cardinality btw dept. and instructor is one-to-many (1:N)
(iii)One dept. has only one HOD and one HOD can be head of only one dept., hence the
cardinality is one-to-one (1:1)
(iv) One course can be enrolled by many students and one student can enroll for many
courses, hence the cardinality btw course and student is many-to- many (N:N)
(v) One course is only taught by one only one instructor, but one instructor teaches many
courses, hence the cardinality btw course and instructor is many -to-one (M:1)

Step 3: Identify the key attributes

(i) dept_name can identify a dept uniquely, hence dept_name is the key attribute for the
dept. entity
(ii) courseID is the key attribute for the course entity
(iii) studentID is the key attribute for the student entity
(iv) instructorID is the key attribute for the instructor entity

Step 3: Identify the key attributes


(i) For the dept. entity, other attributes are location
(ii) For the course. entity, other attributes are course_name and duration
(iii)For the student. entity, other attributes are first_name, last_name and phone

Step5: Draw complete ER diagram

dept_name description
-name

Department

1 1

has
offers

courseID duration heade


42 N d by PREPARED BY,
course THOMASN O. OCHIENG
ICT DEPTMENT, SIT.
N first_name
course_name

N
enrolled is taught 1
instructor last_name
by by
TOPIC SIX: NORMALIZATION

Definition
is the formal process of decomposing relations with anomalies to produce smaller, well-
structured and stable relations.

Importance of normalization
The aim of normalization is to eliminate the following three file maintenance anomalies:
– Insertion Anomaly – adding new rows forces user to create duplicate data
– Deletion Anomaly – deleting a row may cause loss of other data representing completely
different facts
– Modification Anomaly – changing data in a row forces changes to other rows because of
duplication

The aim of normalization is twofold: -


(iii) To ensure links between items of data that are related is preserved
(iv) To ensure duplication of data items is avoided so that there are no inconsistencies
between the same data stored in different places.

Steps in normalization
In order to obtain a set of normalized relations, the following steps should be applied in
sequence: -

Step1: Identify the key field


A key field is the item of data within a row which can be used to uniquely identify a row.
You should select a code rather than a text field

Step 2: Convert the unnormalized data to 1NF


A relation is in 1NF if and only if every non-key attribute is functionally dependent upon
the primary key. To convert to 1NF you remove all the repeating groups from the table. A
repeating group is a set of fields which have data in them more than once for a single value of
the key field.

Step 3: Identify key fields for the new tables


At this stage you choose the key fields for the new tables. A key field is a field that can be
used to uniquely identify a record in each table.

Step 4: Convert to 2NF


A relation is in 2NF if and only if it is in 1NF and every non-key attribute is fully
functionally dependent on the primary key. The non-key fields must fully depend on the key
field i.e. remove partial dependencies.

Step 5: Convert to 3NF


A relation is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively
dependent on the primary key. There must be no inter-data dependencies in the table and
also remove calculated or derived key fields in the table.

Ex1.

43 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
In a college, a lecturer may teach many subjects but may not belong to more than one
department. The college maintains information of its lecturers’ subject area as follows;

Lecturer Number, Lecturer Name, Lecturer Grade, Department Code, Department Name,
Subject Code, Subject Name and Subject Level.

Represent this information to 3rd Normal Form.

1NF
Lecturer (Lecturer Number, Lecturer Name, Lecturer Grade)
Lecturer -Department (Lecturer Number, Department Code, Department Name, Subject
Code, Subject Name and Subject Level)

2NF
Lecturer (Lecturer Number, Lecturer Name, Lecturer Grade)
Department (Department Code, Department Name)
Lecturer -Department (Lecturer Number, Department Code, Subject Code, Subject Name and
Subject Level)

3NF
Lecturer (Lecturer Number, Lecturer Name, Lecturer Grade)
Department (Department Code, Department Name)
Subject (Subject Code, Subject Name and Subject Level)
Lecturer -Department (Lecturer Number, Department Code, Subject Code)

Ex2
Table 1 shows Safari company trainee’s details. Use it to answer the question that follows

Trainee Trainee Course Course Name Tutor Tutor Office


ID Surname Code Fname
T101 Brain ACC1 Principles of Peter ADM1
Accounting
T102 William MKT1 Introduction to Cecil ADM3
MKT2 marketing Mary ADM2
International marketing
T103 Sospeter MKT1 Introduction to Cecil ADM3
marketing
Table1

Required: Normalize the table to 3NF


Sol:
1NF
Trainee (Trainee ID, Trainee Surname)
Trainee-Course (Trainee ID, Course Code, Course Name, Tutor Fname, Tutor Office)

2NF
Trainee (Trainee ID, Trainee Surname)
Course (Course Code, Course name)

44 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Trainee-Course (Trainee ID, Course Code, Tutor Fname, Tutor Office)

3NF
Trainee (Trainee ID, Trainee Surname)
Course (Course Code, Course name)
Tutor (Tutor Office, Tutor Fname)
Trainee-Course (Trainee ID, Course Code, Tutor office)

Trainee Tutor Course


Trainee Trainee Course Course Name
ID Surname Tutor Tutor Code
T101 Brain Office Fname ACC1 Principles of Accounting
T102 William ADM1 Peter MKT1 Introduction to marketing
T103 Sospeter ADM2 Mary MKT2 International marketing
ADM3 Cecil

Trainee-Course

Trainee Course Tutor


ID Code Office
T101 ACC1 ADM1
T102 MKT1 ADM3
T102 MKT2 ADM2
T103 MKT1 ADM3

Ex2
Table 2 is an Employee table represented in 1NF. Use it to answer the questions that follow.

Employee Contract Hours Employee Company Company


ID No Name ID Location
616681B SC1025 48 P. White SC111 Nrb
674315A SC1025 48 R. Press SC111 Nrb
323113B SC1026 24 P. Smith SC112 Nrb
616681B SC1026 24 P. White SC112 Nrb

Required: Normalize the table in its 3NF


Sol:
1NF
Employee (Employee ID, Employee Name)
Employee-Contract (Employee ID, Contract No, Hours, Company ID, Company Location)

2NF
Employee (Employee ID, Employee Name)
Contract (Contract No, Hours)

45 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Employee-Contract (Employee ID, Contract No, Company ID, Company Location)

3NF
Employee (Employee ID, Employee Name)
Contract (Contract No, Hours)
Company (Company ID, Company Location)
Employee-Contract (Employee ID, Contract No, Company ID)

Employee Contract

Employee Employee Contract Hours


ID Name No
616681B P. White SC1025 48
674315A R. Press SC1026 24
323113A P. Smith

Company Employee-Contract
Company Company
ID Location Employee Contract Company
SC111 Nrb ID No ID
SC112 Nrb 616681B SC1025 SC111
674315A SC1025 SC111
323113B SC1026 SC112
616681B SC1026 SC112
Ex3

Table 3 below shows a student’s result slip. Use it to answer the questions that follow.
Student No:1022567
Student Name: Alex James
Course Code: F105
Course Title: ICT
Subject Subject Title Number of Grade Result Results
Code Hours Code
BUS119 Business operations 20 10 PA01 Pass
COM110 Introduction to computers 20 8 PA02 Pass
COM112 Application Development 20 2 RE01 Refer Exam
COM114 HCI 10 7 RE01 Refer Exam
Table3.
Required: Normalize to 3NF
Sol:
1NF
Student (Student No, Student Name)
Student-Course (Student No, Course Code, Course Title, Subject Code, Subject Title, Number of Hours,
Grade, Result Code, Results)

2NF

46 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Student (Student No, Student Name)
Course (Course Code, Course Title)
Student-Course (Student No, Course Code, Subject Code, Subject Title, Number of Hours, Grade,
Result Code, Results)

3NF
Student (Student No, Student Name)
Course (Course Code, Course Title)
Subject (Subject Code, Subject Title, Number of Hours)
Result (Result Code, Grade, Results)
Student-Course (Student No, Course Code, Subject Code, Result Code)

Student Course
Student No Student Name Course Code Course Title
1022567 Alex James F105 ICT

Subject Result
Subject Subject Title Number Grade
Code of Hours Result Code Results
BUS119 Business operations 20 10
COM110 Introduction to 20 8 PA01 Pass
computers PA02 Pass
COM112 Application 20 2 RE01 Refer Exam
Development RE01 Refer Exam
COM114 HCI 10 7

Student-Course
Student No Course Code Subject Code Result Code
1022567 F105 BUS119 PA01
1022567 F105 COM110 PA02
1022567 F105 COM112 RE01
1022567 F105 COM114 RE01

47 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC SEVEN: QUERYING A DATABASE

Definition of SQL
SQL stands for structured Query Language. It allows you to access a database. It can execute, retrieve,
insert, delete, and update queries against a database.

DATA DEFINITION LANGUAGE (DDL)


This is the language that is used to create structures for data, delete structures for data and amend
existing data structures. It is used to define tables and structures in the database

Data Types
SQL uses several data types. Data types define the properties of values for a column i.e. they define the
allowable values for a column (domain). Every data value within a column must be of the same type.

The data types are classified into three groups namely:


• String data type
• Numeric data types
• Date/Time data types

String Data Types


String data types which consists of a sequence of characters. The following are string data types:
(i) Character: Stores a character string of fixed length. When a string less than the specified length
is input, space is added (padded) to the end of the string. It is abbreviated as CHAR.
(ii) Character Varying: It is a string of minimum length 1 and maximum length determined by the
system. When a string less than the specified length is input only the actual length is stored. It
abbreviated to as VARCHAR.
(iii) Bit: It is used to store binary digits. It usually used to store graphics and sound data. It is
abbreviated as BIT.
(iv) Bit Varying: It is used to store bit strings of varying length. It is abbreviated as BITVAR.

Numeric Data Types


Numeric data types are used to store numeric data. The following are numeric data types:
(i) Integer: It is used to store whole or exact numeric data. It is abbreviated as INT.
(ii) Smallint: It is used to store small integer depending on the system define-values. It is
abbreviated as SMALLINT.
(iii) Decimal: It used to store numeric data with specified number of decimal places. It is abbreviated
as DEC.

Date/Time Data Types


They are used to store date and time data and used as follows:

DATE: YYYY-MM-DD
TIME: HH:MM: SS
DATETIME: YYYY-MM-DD HH:MM: SS
TIMESTAMP: YYYYMMDDHHMMSS
YEAR: YYYY

NOT NULL AND UNIQUE

48 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Any column in a table can be defined as NOT NULL. This means that the user must type a value in that
column. The default specification is that null values are allowed in a column.
Any column in a table can be defined as UNIQUE. This clause prohibits the user from entering
duplicate values into a column. The combination of NOT NULL and UNQUE is used to specify the
PRIMARY KEY

DDL STATEMENTS

CREATE DATABASE: It is used to create database

Syntax:
CREATE DATABASE database_name;

DROP DATABASE: It is used to delete a database. All objects in the database (tables, views, indexes)
are deleted.

Syntax:
DROP DATABASE database_name;

CREATE TABLE: It is used when you want to create a table in a database. To create a table using SQL
the user needs to specify four components namely:
• Name of the table
• Name of each column in the table
• Data type of each column
• Maximum length of each column.

Syntax:
CREATE TABLE table_name
(
Column_name1 data_type,
Column_name2 data_type,
………………
);

Ex1. Create a table named person with four columns, FirstName, LastName, Address and Age

CREATE TABLE person


(
FirstName VARCHAR,
LastName VARCHAR,
Address VARCHAR,
Age INT
);

Ex2. Adjust the above table created to a specified maximum length for LastName up to 30 characters
and age having up to three digits.

CREATE TABLE person

49 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(
FirstName VARCHAR,
LastName VARCHAR (30),
Address VARCHAR,
Age INT (3)) ;
DROP TABLE: It deletes a table (the table structure attributes, and indexes will also be deleted).

Syntax:
DROP TABLE table_name;

TRUCATE TABLE: It used to get rid of data inside the table and not the table itself.

Syntax:
TRUNCATE TABLE table_name;

ALTER TABLE: The ALTER TABLE statement is used to add or drop columns in an existing table.

Syntax:
ALTER TABLE table_name
ADD COLUMN column_name data type

ALTER TABLE table_name


DROP COLUMN column_name

Ex1. Write an SQL statement to add column named City in the Person table.

ALTER TABLE Person


ADD COLUMN City VARCHAR;

Ex2. Write an SQL statement to drop Address column in the Person table.

ALTER TABLE Person


DROP COLUMN Address;

Ex3. Use the following description of userlogin table given below to answer the following questions.

Field Type Null Key Default Extra


Id int (11) NO PRIMARY NULL auto_increment
unsigned
Username varchar (25) NO NULL
Email Varchar NO NULL

(a) Write SQL statement to CREATE the above table


(b) Add a new COLUMN called passwd after username with the following description: field passwd,
type varchar (25), null NO and default NULL.
(c) Modify email field to have type of varchar (30) and can be NULL.
(d) Change the id field to userid with the same parameters.

50 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Sol.
(a
CREATE TABLE userlogin (
id INT (11) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
username VARCHAR (25) NOT NULL,
email VARCHAR (25) NOT NULL
);

(b
ALTER TABLE userlogin
ADD COLUMN passwd VARCHAR (25) NOT NULL
AFTER username;

(c
ALTER TABLE userlogin
MODIFY COLUMN email VARCHAR (30);

(d
ALTER TABLE userlogin
CHANGE id userid
INT (11) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT;

RENAME TABLE: Syntax: RENAME TABLE table_name TO new_table_name

Ex1. Rename a table called Person to Persons

Sol.
RENAME TABLE Person TO Persons;

CREATE INDEX: This statement is used to create an index in a table. Indices are created in an existing
table to locate rows more quickly and efficiently. It is possible to create an index on one or more
columns of a table, and each index is given a name. The users cannot see the indexes; they are just used
to speed up queries.

Syntax:
CREATE INDEX index_name
ON table_name (column_name)

Ex. Write an SQL statement to create an index, named PersonIndex on the LastName field of the Person
table:

Sol:
CREATE INDEX PersonIndex
ON Person (LastName);

To index the values in a column in descending order, you can add the reserved word DESC after the
column name. To index more than one column, and then list the column names within the parenthesis,
separated by commas.

51 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Ex
CREATE INDEX PersonIndex
ON Person (FirstName, LastName);

CREATE UNIQUE INDEX: It is used to create a unique index on a table. A unique index means that
two rows cannot have the same index value.

Syntax:

CREATE UNIQUE INDEX index_name


ON table_name (column_name);

DROP INDEX: Used to delete an existing index in a table with the DROP statement.

Syntax:
ALTER TABLE table_name
DROP INDEX index_name;

DATA MANIPULATION LANGUAGE (DML)


Data Manipulation Language is used to manipulate the data within those schema objects. That is, it is
used to insert, delete, update and query existing data in the database.

DML Statements
SELECT Statement

Syntax
SELECT column_name(s)
FROM table_name

The table below named Student in database. Use it to answer the questions that follow:

FirstName SirName IDNo DOB Marks


Alice Smith 123980 23-06-1984 95
Alex Maps 879060 14-04-1989 60
Banice Young 879067 12-03-1990 75
John Wallace 312689 18-06-1994 55
Ex
Write an SQL statement that would perform each of the following:
(i) Display all the records from the fields IDNo, FirstName and SirName
(ii) Display all the records from all the fields
Sol:
(i) SELECT IDNo, FirstName, SirName
FROM Student;

(ii) SELECT *
FROM Student;

52 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
The SELECT DISTINCT Statement
The DISTINC keyword is used to return only distinct (different) values)

Inside a table, a column often contains many duplicate values; and sometimes you only want to list the
different (distinct) values.

Syntax
SELECT DISTINCT column_name(s)
FROM table_name

Ex
Using the Employee table given below:
Name Office Dept. Salary
Smith 400 CS 45000
Jones 220 Econ 35000
Green 160 Econ 50000
Brown 420 CS 65000
Smith 500 Fin 60000

Required: Write an SQL statement that display only distinct values from the column Dept.

Sol:
SELECT DISTINC Dept.
FROM Employee

The WHERE Clause


Is used to specify a selection criterion

Syntax

SELECT column_name(s)
FROM table_name
WHERE column_name operator value

With the WHERE clause, the following operators can be used:

Operator Description
= Equal
<> Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between and inclusive range
53 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
LIKE Search for a pattern
IN If you know the exact value you want to
return for a least one of the columns

BETWEEN AND Operators


Is used to search for values that are within a set of values, the MINIMUM value and MAXIMUM
value. It works best for dates and numbers.

Syntax

SELECT column_name FROM table_name


WHERE column_name
BETWEEN value1 AND value2

Ex
Using Employee table given above, write an SQL statement that display records whose salary range
from 40,000 to 60,000.

Sol:
SELECT * FROM Employee
WHERE Salary
BETWEEN 400000 AND 60000;

The LIKE Operator


The LIKE operator is used to compare a value to similar values using wildcards operators. There are two
wildcards used in conjunction with the LIKE operator:
• The percentage sign (%)
• The underscore (-)

Syntax
SELECT column_name FROM table_name
WHERE column_name LIKE pattern

The percentage sign (%), can be used to define wildcards (missing letters in the pattern) both before and
after the pattern while the underscore (-), can be used to find missing single character in the pattern.

Example

WHERE CustomerName LIKE 'a%' Finds any values that starts with "a"
WHERE CustomerName LIKE '%a' Finds any values that ends with "a"
WHERE CustomerName LIKE '%or%' Finds any values that have "or" in any position
WHERE CustomerName LIKE '_r%' Finds any values that have "r" in the second position
WHERE CustomerName LIKE 'a_%_%' Finds any values that starts with "a" and are at least 3
characters in length
WHERE ContactName LIKE 'a%o' Finds any values that starts with "a" and ends with "o"

54 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
The following SQL statement will return persons with first names that start with an ‘O’

SELECT * FROM Persons


WHERE FirstName LIKE ‘O%’;

The following SQL statement will return persons with first names that end with an ‘a’

SELECT * FROM Persons


WHERE FirstName LIKE ‘%a’;

The following SQL statement will return persons with first names that contain the pattern ‘la’

SELECT * FROM Persons


WHERE FirstName LIKE ‘%la%’;

The following SQL Statement will return missing characters in first name field from persons table.

SELECT * FROM Persons


WHERE FirstName LIKE ‘J ⸏ cks⸏_n’;

Ex
Using the given relation named Customers in a database given below:

CustID CustName Contact Town


C001 Alfred 07994692 Lodwar
C002 Annette 07884525 Kisumu
C003 Moreno 07994464 Nairobi
C004 Wendi 07894730 Mombasa

Required: Write SQL statement that would display


(i) All records whose Contact contains the pattern “44”;
(ii) All records with CustName not starting with letter “K” or “O”.

Sol:

(i) SELECT * FROM Customers


WHERE Contact LIKE “%44%”:

(iii) SELECT * FROM Customers


WHERE CustName NOT LIKE “K%” OR CustName NOT LIKE “O%”;

Assign: Write an SQL statement that would display letter ‘T’ and at least 3 characters in length in the
field firstname from table named persons.

Sol

SELECT * FROM Customers


WHERE firstname LIKE “T_%_%”;
55 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
IN Operator
The IN operator may be used if you know the exact value you want to return for a least
one of the columns

Syntax
SELECT column_name FROM table_name
WHERE column_name IN (Value1, Value2…)

Ex
Using Employee table given above, write an SQL statement that display the employees with Name equal
to Smith or Brown.

Sol:
SELECT * FROM Employee
WHERE name IN (‘Smith’, ‘Brown’);

SELECT LIMIT operator


The SELECT LIMIT operator can be used to specify the number of records to return. The
LIMIT operator can be very useful on large tables with thousands of records returning the
top number specified.

Ex.
Given the table below called Persons

FirstName LastName D.O.B Enter in format YYYY-MM-DD


Shelly Johnson 1970-12-12
Bobby Briggs 1967-05-24
Donna Haggard 1970-03-24
Audrey Horne 1965-02-01
James Hurley 1967-01-02

Write an SQL statement to the first three records from the table

Sol:
SELECT * FROM Persons
LIMIT 3;
Ex.
Using the table given above called Persons write an SQL statement to:
(i) Display person(s) that were born after 1965
(ii) Display person(s) who were born in February
56 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(iii) Display person(s) that have birth date greater than 12
(iv) Display the DOB value for Johnson
Sol;
(i) SELECT * FROM Students
WHERE YEAR (D.O.B)>1965;
(ii) SELECT * FROM Students
WHERE MONTH (D.O.B) =2;

(iii) SELECT * FROM Students


WHERE DAY (D.O.B)>12;

(iv) SELECT DOB FROM Students


WHERE LastName = ’Johnson’;

The ORDER BY keyword

Is used to sort the data in ascending or descending order, based on one or more columns

Syntax
SELECT column-list FROM table-name
WHERE condition
ORDER BY Column1…
ASC or DESC

Ex
Using Student table given below, write an SQL statement that display all records from highest to lowest
based on marks.

Table: Student

FirstName SirName IDNo DOB Marks


Alice Smith 123980 23-06-1984 95
Alex Maps 879060 14-04-1989 60
Banice Young 879067 12-03-1990 75
John Wallace 312689 18-06-1994 55

Sol:

SELECT * FROM Student


ORDERED BY Marks DESC;

The INSERT INTO statement


The INSERT INTO statement is used to insert new rows into a table

57 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Syntax
INSER INTO table_name
VALUES (value1, value2….)

You can also specify the columns for which you want to insert data:
INSER INTO table_name (column1, column2….
VALUES (value1, value2….)

Ex
Using Student table given above, write an SQL statement to add a new record; Jonathan, Osteen,
353981, 15-02-1998, 80 in the relation.

Sol:

INSERT INTO Student


VALUES (‘Jonathan’, ‘Osteen’, 353981, ‘15-02-1998’, 80);

DELETE FROM statement


DELETE FROM statement is used to delete data from the table

Syntax

DELETE FROM table_name


WHERE column-name condition

It is also possible to delete all rows in a table without deleting the table. This means that
the table structure, attributes, and indexes will be intact:

DELETE FROM table_name


Or
DELETE * FROM
table_name

You can also delete more than one set of data

DELETE FROM table_name Ex


WHERE condition IN (Value1, value2 Using Student table given above, write an SQL statement to
remove Student

58 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
having IDNo 123980.

Sol:
DELETE FROM Customer
WHERE IDNo = 123980;

The UPDATE statements

The UPDATE statement is used to modify the existing records in a table. Use WHERE
clause with UPDATE statement to update selected rows otherwise all the rows would be
affected.

UPDATE table_name
SET column-name= new-value
WHERE column-name = some-value

Ex
Using Student table given above, write an SQL statement to update Surname to Osteen where FirstName
is Alex.

Sol:
UPDATE Student
SET Sirname = ‘Osteen’
WHERE FirstName = ’Alex’;

SQL Aliases

SQL aliases are used to give a table or a column in a table, a temporary name. Aliases are often used to
make column names more readable. An alias only exists for the duration of the query.

Table Name Alias

Syntax:
SELECT column-name FROM table-name AS table-alias;

Column Name Alias

Syntax
SELECT column-name AS column-alias FROM table-name;
;;
Ex

Using Student table given above, write an SQL statement to:


59 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(i) Display all the details but table name As Student Info
(ii) Display FirstName, Sirname As Surname and DOB
(iii) Display concatenated FirstName and Sirname As Name, DOB and Marks

Sol
(i) SELECT * FROM Student As StudentInf;

(ii) SELECT FirstName, Sirname As Surname, DOB


FROM Student;

(iii) SELECT CONCAT(FirstName, “ “ , Sirname) As Name, DOB, Marks


FROM Student;

SQL | Join

A SQL Join statement is used to combine data or rows from two or more tables based on a common field
between them. Different types of Joins are:

• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN

INNER JOIN: The INNER JOIN keyword selects all rows from both the tables as long as the condition
satisfies. This keyword will create the result-set by combining all rows from both the tables where the
condition satisfies i.e. value of the common field will be same.

Syntax:

SELECT field1, field2, field3


FROM table1
INNER JOIN table2
ON table1.matching column = table2.matching_column;

Note: We can also write JOIN instead of INNER JOIN. JOIN is same as INNER JOIN.

Example

Ex1:
Use the following tables Subject and Teacher to answer the questions that follow:

60 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Sid Sname
001 English
002 Kiswahili
003 Science
Table: Subject

Pno Name Sid Hours


K0088769 Cate 002 10
K0099221 Jonathan 001 20
K008702 Andrew 002 30
Table: Teacher

Write an SQL statement that would perform each of the following


(i) Display all details of the teachers
(ii) Display Pno, Name, Sname and Hours of all the teachers with Hours more than 10

Sol:
(i) SELECT Teacher. Pno, Teacher. Name, Subject. Sid, Subject. Sname, Teacher. Hours
FROM Teacher
INNER JOIN Subject
ON Teacher. Sid=Subject. Sid

(ii) SELECT Teacher. Pno, Teacher. Name, Subject. Sname, Teacher. Hours
FROM Teacher
INNER JOIN Subject
ON Teacher. Sid=Subject. Sid
WHERE Teachers. Hour > 10;

Ex2
Use the following tables Grade and Lecturer to answer the questions that follow:

GradeID Grade Name


002 Tutor
003 Ass. Lecturer
004 Lecturer
005 Professor

Table: Grade

LecturerNo LecturerName GradeID Salary Age YearEmployed


L345675 Eunice 003 30000 45 1982
L347890 Agnes 003 45000 60 1985
L345679 Alex 002 56000 55 1989
L456782 Ted 008 25000 22 1996
L789073 Felix 012 15000 35 1994

61 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
L897032 Gladys 015 50000 1998

Table: Lecturer

Write an SQL statement that would perform each of the following


(i) Display the LecturerNo, LecturerName and Salary of all the lecturers with GradeName
“Ass.Lecturer”
(ii) Display the LecturerNo, GradeID and Salary for all Lecturers above age 50 and YearEmployed
is between 1980 and 1992
(iii) Display all details of the lectures whose age is not captured in the table Lecturer.

Sol

(i) SELECT Lecturer. LecturerNo, Lecturer. LecturerName, Lecturer. Salary


FROM Lecturer
INNER JOIN Grade
ON Lecturer. GradeID = Grade. GradeID
WHERE Grade. GradeName = ‘Ass.Lecturer’;

(ii) SELECT Lecturer. LecturerNo, Grade. GradeID, Lecturer. Salary


FROM Lecturer
INNER JOIN Grade
ON Lecturer. GradeID =Grade. GradeID
WHERE (Lecturer. Age >50) AND (Lecturer.YearEmployeed BETWEEN 1980 AND 1992)

LEFT JOIN: This join returns all the rows of the table on the left side of the join and matching rows for
the table on the right side of join. The rows for which there is no matching row on right side, the result-
set will contain null. LEFT JOIN is also known as LEFT OUTER JOIN.

Syntax:
SELECT table1.column1, table1.column2, table2.column1…
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are same.

Ex
Given the following relations;

Student_no Student_name Student_age


S1 n1 21

62 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
S2 n2 22
S3 n3 23
S4 n4 24
Student

Student_no Course_name Fees


S1 C1 1000
S3 C3 3000
S5 C5 5000
Course
Required:
(i) Write an SQL statement that would perform left join
(ii) Give the output result

Sol

(i) SELECT Student. Student_no, Student.Student_name,Student.Student_age,


Course.Course_name,Course.Fees
FROM Student
LEFT JOIN Course
ON Student. Student_no= Course. Student_no;
(ii)

Student_no Student_name Student_age Course_name Fees


S1 n1 21 C1 1000
S2 n2 22 NULL NULL
S3 n3 23 C3 3000
S4 n4 24 NULL NULL

RIGHT JOIN: RIGHT JOIN is similar to LEFT JOIN. This join returns all the rows of the table on the
right side of the join and matching rows for the table on the left side of join. The rows for which there is
no matching row on left side, the result-set will contain null. RIGHT JOIN is also known as RIGHT
OUTER JOIN.

Syntax:

SELECT table1.column1, table1.column2, table2.column1…


FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;

Note: We can also use RIGHT OUTER JOIN instead of RIGHT JOIN, both are same.

63 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Ex
Using the above given relations
Required:
(iii)Write an SQL statement that would perform right join
(iv) Give the output result

Sol

(i) SELECT Student. Student_no, Student.Student_name,Student.Student_age,


Course.Course_name,Course.Fees
FROM Student
RIGHT JOIN Course
ON Student. Student_no= Course. Student_no;

(ii)

Student_no Student_name Student_age Course_name Fees


S1 n1 21 C1 1000
S3 n2 22 C3 3000
S5 NULL NULL C5 5000

FULL JOIN creates the result-set by combining result of both LEFT JOIN and RIGHT JOIN. The
result-set will contain all the rows from both the tables. The rows for which there is no matching, the
result-set will contain NULL values.

Syntax:

SELECT table1.column1, table1.column2, table2.column1...


FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;

Ex
Using the above given relations
Required:
(v) Write an SQL statement that would perform full join
(vi) Give the output result

64 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Sol

(i) SELECT Student. Student_no, Student.Student_name,Student.Student_age,


Course.Course_name,Course.Fees
FROM Student
FULL JOIN Course
ON Student. Student_no= Course. Student_no;
(ii)

Student_no Student_name Student_age Course_name Fees


S1 n1 21 C1 1000
S2 n2 22 NULL NULL
S3 n3 23 C3 3000
S4 n4 24 NULL NULL
S5 NULL NULL C5 5000

AGGREGATE FUNCTIONS

Aggregate functions operate against a collection of values, but return a single value.

Syntax
SELECT function (Column)
FROM table;

OR

SELECT function (Column)


FROM table
WHERE condition;

Ex.
Assume the relation Employee has the following tuples:
Office Name Dept Salary
CS001 Smith CS 45000
Econ001 Jones Econ 35000
Econ002 Green Econ 50000
CS002 Brown CS 65000
Fin001 Bright Fin 60000
Table: Employee

Required: Write an SQL statement to find:


(i) the total salary:
(ii) average salary
(iii) Count the number of employees in the CS department
(iv) the total payroll for the Economics department

Sol:
65 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
(i)
SELECT SUM (Salary)
FROM Employee;

Results:

SUM (salary)
255000

(ii)
SELECT AVG (Salary)
FROM Employee;

Results:

AVG (salary)
51000

(iii)
SELECT COUNT (Name)
FROM Employee
WHERE Dept=’CS’;

Results:

COUNT (Name)
2

(iv)
SELECT SUM (Salary)
FROM Employee
WHERE Dept=’Econ’;

Results:

SUM (salary)
85000

GROUP BY… STATEMENT

66 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
GROUP BY… was added to SQL because aggregate functions return the aggregate of all column values
every time they are called, and without the GROUP BY function it would be impossible to find the
aggregate for each individual group of column values.

Syntax

SELECT function (column) FROM table


GROUP BY column

Ex 1

Using the above relation employee, write an SQL statement find the average salary in each department.

Sol:
SELECT dept, AVG(Salary)
FROM Employee
GROUP BY dept;
Results:
Dept. AVG (salary)
CS 55000
Econ 42500
Fin 60000

Ex2

Using the below given sales relation

Company Amount
W3Schools 5500
Studytonight 4500
W3Schools 5100
GeeksforGeeks 6400

Table: Sales

Write an SQL statement to calculate the total salary for every company and give the output result.

Sol

SELECT Company, SUM (Amount)


FROM Sales
GROUP BY Company;

67 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Output

Company SUM(Amount)
W3Schools 10600
Studytonight 4500
GeeksforGeeks 6400

HAVING… STATEMENT

HAVING… was added to SQL because the WHERE keyword could not be used against aggregate
functions and without HAVING… it would be impossible to test for result conditions.

Syntax

SELECT column, function (column) FROM table


GROUP BY column
HAVING function (column) condition value

Using the sales table given above, write an SQL statement to calculate the total salary for every
company and display the company having salary more than 10000 giving the output result.

Sol

SELECT Company, SUM (Amount)


FROM Sales
GROUP BY Company
HAVING SUM (Amount)>10000;

Output

Company SUM(Amount)
W3Schools 10600

68 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
TOPIC EIGHT: FUNCTION OF DATABASE MANAGEMENT SYSTEMS

Meaning and importance of database management systems


transaction processing
concurrency controls
database recovery
database security and authorization

DBMS Transaction Management: ACID Properties, Schedule

What is a Database Transaction?

A transaction is a logical unit of processing in a DBMS which entails one or more database access
operation. In a nutshell, database transactions represent real-world events of any enterprise.

All types of database access operation which are held between the beginning and end transaction
statements are considered as a single logical transaction. During the transaction the database is
inconsistent. Only once the database is committed the state is changed from one consistent state to

another.
In this tutorial, you will learn:

• What is a Database Transaction?


• Facts about Database Transactions
• Why do you need concurrency in Transactions?
• States of Transactions
• What are ACID Properties?
• Types of Transactions
• What is a Schedule?

Facts about Database Transactions

• A transaction is a program unit whose execution may or may not change the contents of a
database.
• The transaction is executed as a single unit
• If the database operations do not update the database but only retrieve data, this type of
transaction is called a read-only transaction.
• A successful transaction can change the database from one CONSISTENT STATE to another
• DBMS transactions must be atomic, consistent, isolated and durable
• If the database were in an inconsistent state before a transaction, it would remain in the
inconsistent state after the transaction.

Why do you need concurrency in Transactions?

A database is a shared resource accessed. It is used by many users and processes concurrently. For
example, the banking system, railway, and air reservations systems, stock market monitoring,
supermarket inventory, and checkouts, etc.
69 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
Not managing concurrent access may create issues like:

• Hardware failure and system crashes


• Concurrent execution of the same transaction, deadlock, or slow performance

States of Transactions

The various states of a Database Transaction are listed below

State Transaction types

Active State A transaction enters into an active state when the execution process
begins. During this state read or write operations can be performed.

Partially Committed A transaction goes into the partially committed state after the end of a
transaction.

Committed State When the transaction is committed to state, it has already completed its
execution successfully. Moreover, all of its changes are recorded to the
database permanently.

Failed State A transaction considers failed when any one of the checks fails or if the
transaction is aborted while it is in the active state.

Terminated State State of transaction reaches terminated state when certain transactions
which are leaving the system can't be restarted.

State Transition Diagram for a Database Transaction

Let's study a state transition diagram that highlights how a transaction moves between these various
states.

1. Once a transaction states execution, it becomes active. It can issue READ or WRITE operation.
2. Once the READ and WRITE operations complete, the transactions becomes partially committed

70 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
3. state.
4. Next, some recovery protocols need to ensure that a system failure will not result in an inability
to record changes in the transaction permanently. If this check is a success, the transaction
commits and enters into the committed state.
5. If the check is a fail, the transaction goes to the Failed state.
6. If the transaction is aborted while it's in the active state, it goes to the failed state. The transaction
should be rolled back to undo the effect of its write operations on the database.
7. The terminated state refers to the transaction leaving the system.

71 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
What are ACID Properties?

For maintaining the integrity of data, the DBMS system you have to ensure ACID properties.
ACID stands for Atomicity, Consistency, Isolation, and Durability.

• Atomicity: A transaction is a single unit of operation. You either execute it entirely or do


not execute it at all. There cannot be partial execution.
• Consistency: Once the transaction is executed, it should move from one consistent state
to another.
• Isolation: Transaction should be executed in isolation from other transactions (no
Locks). During concurrent transaction execution, intermediate transaction results from
simultaneously executed transactions should not be made available to each other. (Level
0,1,2,3)
• Durability: · After successful completion of a transaction, the changes in the database
should persist. Even in the case of system failures.

Example of ACID

Transaction 1: Begin X=X+50, Y = Y-50 END


Transaction 2: Begin X=1.1*X, Y=1.1*Y END

Transaction 1 is transferring $50 from account X to account Y.

Transaction 2 is crediting each account with a 10% interest payment.

If both transactions are submitted together, there is no guarantee that the Transaction 1 will
execute before Transaction 2 or vice versa. Irrespective of the order, the result must be as if the
transactions take place serially one after the other.

Types of Transactions

Based on Application areas

• Non-distributed vs. distributed


• Compensating transactions
• Transactions Timing
• On-line vs. batch

Based on Actions

• Two-step
• Restricted
72 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
• Action model

Based on Structure

• Flat or simple transactions: It consists of a sequence of primitive operations executed


between a begin and end operations.
• Nested transactions: A transaction that contains other transactions.
• Workflow

What is a Schedule?

A Schedule is a process creating a single group of the multiple parallel transactions and
executing them one by one. It should preserve the order in which the instructions appear in each
transaction. If two transactions are executed at the same time, the result of one transaction may
affect the output of other.

Example

Initial Product Quantity is 10


Transaction 1: Update Product Quantity to 50
Transaction 2: Read Product Quantity

If Transaction 2 is executed before Transaction 1, outdated information about the product


quantity will be read. Hence, schedules are required.

Parallel execution in a database is inevitable. But, Parallel execution is permitted when there is
an equivalence relation amongst the simultaneously executing transactions. This equivalence is
of 3 Types.

RESULT EQUIVALENCE:

If two schedules display the same result after execution, it is called result equivalent schedule.
They may offer the same result for some value and different results for another set of values. For
example, one transaction updates the product quantity, while other updates customer details.

View Equivalence

View Equivalence occurs when the transaction in both the schedule performs a similar action.
Example, one transaction inserts product details in the product table, while another transaction
inserts product details in the archive table. The transaction is the same, but the tables are
different.

CONFLICT Equivalence

73 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.
In this case, two transactions update/view the same set of data. There is a conflict amongst
transaction as the order of execution will affect the output.

What is Serializability?

Serializability is the process of search for a concurrent schedule who output is equal to a serial
schedule where transaction are execute one after the other. Depending on the type of schedules,
there are two types of serializability:

• Conflict
• View

Summary:

• A transaction is a logical unit of processing in a DBMS which entails one or more


database access operation
• It is a transaction is a program unit whose execution may or may not change the contents
of a database.
• Not managing concurrent access may create issues like hardware failure and system
crashes.
• Active, Partially Committed, Committed, Failed & Terminate are important transaction
states.
• ACID stands for Atomicity, Consistency, Isolation, and Durability
• Three DBMS transactions types are Base on Application Areas, Action, & Structure.
• A Schedule is a process creating a single group of the multiple parallel transactions and
executing them one by one.
• Serializability is the process of search for a concurrent schedule who output is equal to a
serial schedule where transaction are execute one after the other.

74 PREPARED BY,
THOMAS O. OCHIENG
ICT DEPTMENT, SIT.

You might also like