Database Management Systems
(3:0:2)
Sub Code:BCS403
Text Book:
◼ Fundamentals of Database Systems
- Elmasri and Navathe, Addison Wesley, 7th Ed.
Reference books:
◼ Database Management Systems:
Raghu Ramakrishnan and Johannes Gehrke,
McGraw-Hill, 3rd Edition, 2014.
◼ Database System Concepts : Silberschatz,
Korth and Sudharshan, 7th Edition,
Mc-GrawHill, 2022.
Introduction to Database
◼ Data
❑ Data is raw fact or figures or entity.
❑ When activities in the organization takes place, the
effect of these activities need to be recorded which
is known as Data.
◼ Information
❑ Processed data is called information.
❑ The purpose of data processing is to generate the
information required for carrying out the activities.
Why do we need a database?
◼ Keep records of our:
❑ Clients
❑ Staff
❑ Volunteers
◼ To keep a record of activities
and interventions;
◼ Keep sales records;
◼ Develop reports;
◼ Perform research
◼ Longitudinal tracking
Database
◼ Database may be defined in simple terms as a
collection of data
◼ A database is a collection of related data.
◼ Databases are used every where.
◼ The database can be of any size and of varying
complexity.
◼ A database may be generated and maintained
manually or it may be computerized.
◼ Ex: Amazon.com – commercial db
2 terabytes (a terabyte is 1012 bytes worth of
storage) and is stored on 200 different
Why Study Databases?
◼ Databases are useful
❑ Many computing applications deal with large
amounts of information
❑ Database systems give a set of tools for storing,
searching and managing this information
◼ Databases in CS/IS
❑ Basic concepts and skills with database systems
are part of the skill set you will be assumed to
have as a CS/IS graduate
Database Management System
◼ A Database Management System (DBMS) is a
collection of program that enables user to create
and maintain a database.
◼ The term database is used as shorthand to
DBMS.
◼ Ex: Proprietary - MS Access, MS SQL Server,
Oracle, DB2 (IBM), Sybase.
Open Source: MySQL, PostgreSQL
Types of Databases
◼ Non-relational databases ◼ Relational databases
1. Place information in field
categories that we create so that 1. Here fields can be used in a
information is available for number of ways (and can be of
sorting and disseminating the variable length), provided that
way we need it. they are linked in tables.
2. The data in a non-relational
database, is limited to that 2. It is developed based on a
program and cannot be extracted database model that provides
and applied to a number of other for logical connections among
software programs, or other files (known as tables) by
database files within a school or including identifying data from
administrative system. one table in another table
3. The data can only be "copied
and pasted.“
Example: a spread sheet
Database: Applications
Databases are widely used. Here are some
representative applications:
1. Banking: for customer information, accounts,
loans, and banking transaction.
2. Airlines: For reservation and schedule
information. (first to use DB)
3. Universities: For student information, course
registrations, and grades.
4. Credit card transaction: For purchase on credit
cards and generation of monthly statements.
5. Telecommunication: for keeping records of calls
made, generating monthly bills, maintaining
balances on prepaid calling cards, ..
6. Sales:
7. Manufacturing:
8.Human resource: For information about
employees, salaries, payroll taxes, benefits and
for generation of paychecks.
Etc…
What the DBMS does ?
◼ Provides users with:
- Data definition language (DDL)
- Data manipulation language (DML)
- Data control language (DCL)
Often these are all the same language
◼ DBMS provides:
- Persistence
- Concurrency
- Integrity
- Security
- Data independence
Database Systems
◼ The database and DBMS software together called
as a database system.
◼ A database system consists of
- Data (the database) - Software
- Hardware - Users
◼ Database systems allow users to
- Store
- Update
- Retrieve
- Organise
- Protect their data.
Selecting a database system: Need Analysis
The needs analysis process will be specific to your organization
but, at a minimum, should answer the following questions:
◼ How many records we will warehouse and for how long?
◼ Who will be using the database and what tasks will they
perform?
◼ How often will the data be modified? Who will make these
modifications?
◼ Who will be providing IT support for the database?
◼ What hardware is available? Is there a budget for purchasing
additional hardware?
◼ Who will be responsible for maintaining the data?
◼ Will data access be offered over the Internet? If so, what level
of access should be supported?
Chapter 1
Introduction
Basic Definitions
Database: A collection of related data.
Data: Known facts that can be recorded and have an
implicit meaning.
Mini-world: Some part of the real world about which
data is stored in a database. For example, student
grades and transcripts at a university.
Database Management System (DBMS): A software
package/ system to facilitate the creation and
maintenance of a computerized database.
Database System: The DBMS software together with
the data itself. Sometimes, the applications are also
included.
Typical DBMS Functionality
◼ Define a database : in terms of data types,
structures and constraints.
◼ Construct or Load the Database on a
secondary storage medium.
◼ Manipulating the database : querying,
generating reports, insertions, deletions and
modifications to its content
◼ Concurrent Processing and Sharing by a set
of users and programs – yet, keeping all data
valid and consistent
◼ Other features:
❑ Protection or Security measures to prevent
unauthorized access
❑ Presentation and Visualization of data
Example of a Database
◼ Mini-world for the example: Part of a
UNIVERSITY environment.
◼ Some mini-world entities:
❑ STUDENTs
❑ COURSEs
❑ SECTIONs (of COURSEs)
❑ (academic) DEPARTMENTs
❑ INSTRUCTORs
◼ Note: The above could be expressed in the
ENTITY-RELATIONSHIP data model.
Fig 1.2: Sample database
◼ Some mini-world relationships:
❑ SECTIONs are of specific COURSEs
❑ STUDENTs take SECTIONs
❑ COURSEs have prerequisite COURSEs
❑ INSTRUCTORs teach SECTIONs
❑ COURSEs are offered by DEPARTMENTs
❑ STUDENTs major in DEPARTMENTs
Main Characteristics of the Database Approach
◼ Self-describing nature of a database system
◼ Insulation between programs and data
◼ Support of multiple views of the data
◼ Sharing of data and multiuser transaction
processing
Main Characteristics of the Database....
1. Self-describing nature of a database system:
◼ The database system contains the database and a
complete definition or description of the database
structure and constraints. This definition is stored in
the DBMS catalog,
◼ The information stored in the catalog is called meta-
data.
◼ This allows the DBMS software to work with different
databases. (ex: university, banking, company etc.,)
◼ The catalog is used by the DBMS software and also
by database users who need information about the
database structure.
◼ In traditional file processing, data definition is
typically part of the application programs
themselves.
- Programs are constrained to work with only
one specific database, whose structure is
declared in the application programs.
◼ Whereas file-processing software can access
only specific databases, DBMS software can
access diverse databases by extracting the
database definitions from the catalog and using
these definitions.
2. Insulation between programs and data:
◼ In traditional file processing, the structure of data file
is embedded in the application programmes, so any
changes to the structure of a file may require
changing all programs that access that file.
◼ By contrast, DBMS access programs do not require
such changes in most cases. The structure of data
file is stored in the DBMS catalog separately from
the access programmes.
◼ DBMS allows changing data storage structures and
operations without having to change the DBMS
access programs. This property is called program-
data independence.
◼ For example, a file access program may be
written in such a way that it can access only
STUDENT records of the structure shown in
Figure 1.4
◼ Program-operation independence:
- In some types of db sys, users can define
operations on data as part of the db definitions.
- An operation is specified in two parts:
- Interface – includes operation name and data
types of its arguments(or parameters)
- Implementation – a implementation of operation
specified separately
- can be changed without
affecting the interface
◼ User application programs can operate on the data
by invoking these operations through their names
and arguments, regardless of how the operations
are implemented. This is termed as program-
operation independence.
◼ The characteristic that allows program-data
independence and program-operation
independence is called data abstraction.
◼ Data Abstraction: A data model is used to hide
storage details and present the users with a
conceptual view of the database.
Ex: Fig 1.2, Fig 1.3, Fig 1.4
3. Support of multiple views of the data:
◼ Each user may see a different view of the
database, which describes only the data of
interest to that user.
◼ View-May be a subset of the database or it may
contain virtual data that is derived from the
database files but is not explicitly stored.
1. one user of the database of may be interested only in accessing and printing
the transcript of each student.
2. second user, who is interested only in checking that students have taken all the
prerequisites of each course for which they register, may require the view
4. Sharing of data and multiuser transaction
processing:
- Allowing a set of concurrent users to retrieve
and to update the database.
- Concurrency control within the DBMS
guarantees that each transaction is correctly
executed or completely aborted.
- OLTP (Online Transaction Processing) is a
major part of database applications.
ex: allotment of seats in reservation sys.
◼ A Fundamental role of multiuser DBMS s/w is to
ensure that concurrent transaction operate
correctly and efficiently.
◼ A transaction is an executing program or process
that includes one or more database accesses, such
as reading or updating of database records.
◼ Each transaction is supposed to execute a logically
correct database access if executed in its entirety
without interference from other transactions.
◼ The DBMS must enforce several transaction
properties. The isolation property ensures that each
transaction appears to execute in isolation from
other transactions, even though hundreds of
transactions may be executing concurrently. The
atomicity property ensures that either all the
database operations in a transaction are executed
or none are.
Characteristics of DBMS summary
◼ Self-describing nature of a database system
❑ Catalog, metadata
◼ Insulation between programs and data and data
abstraction
❑ Program-data independence
❑ Program-operation independence
❑ Data abstraction
◼ Support of multiple views of the data
❑ Views
◼ Sharing of data and multi-user transaction processing
❑ Concurrency control software
❑ OLTP-Online transaction processing
❑ Transaction – process that includes one or more db
accesses
Advantages of Using the Database Approach
◼ 1. Controlling Redundancy:
- Redundancy - Multiple entry of same data
- Solve the problems of
◼ First, there is the need to perform a single logical update
multiple times: once for each file. This leads to duplication
of effort.
◼ Second, storage space is wasted when the same data is
stored repeatedly, and this problem may be serious for
large databases.
◼ Third, files that represent the same data may become
inconsistent because the updates are applied
independently by each user group.
◼ 2. Restricting unauthorized Access:
- Users or user groups are given a/c nos.
protected by passwords.
- DBMS should provide a security &
authorization subsystem for DBA to create A/Cs
and specify A/C restriction.
◼ 3. Providing persistent storage for program
objects:
❑ Reason for Object-oriented database system.
- Programming languages have complex data
structures, such as record types in Pascal,
class definition in C++ or Java.
❑ The values of program variables are discarded
once a program terminates, unless the
programmer explicitly stores them in permanent
files.
❑ Often involves converting these complex
structures into a format suitable for file storage.
- OOdb sys are compatible with programming
languages such as C++ & Java.
- DBMS s/w automatically performs any necessary
conversions.
- Hence, a complex object in C++ can be stored
permanently in an OODBMS.
- Traditional DB sys often suffered from the
impedance-mismatch problem.
since the data structure provided by the DBMS
were incompatible with the programming language’s
data structures.
- OOdb sys typically offer data structure compatibility
with one or more ObjOrienProgramming language
4. Providing Storage structures for efficient query
processing
◼ Db sys must provide capabilities for efficiently
executing queries and updates.
◼ Must provide data structures to speed up disk
search for the desired records.
❑ Indexes – based on tree data structure
❑ Buffering module: maintains parts of the db in main
memory buffers.
❑ The Query processing and optimization module of
the DBMS is responsible for choosing an efficient
query execution plan for each query based on the
existing storage structure.
5. Providing backup and recovery services
◼ Must provide facilities for recovering from hardware and
software failures.
◼ The backup and recovery subsystem of the DBMS is
responsible for recovery.
6. Providing multiple user interfaces
• DBMS should provide a variety of user interface, because
many types of users use a db.
• These include query language for casual users,
programming language interfaces for application
programmers, forms & command codes for parametric
users, etc.,
• Menu-based+ form style=GUI, Also Web GUI
7. Representing complex relationships among data
- A db may include numerous varieties of data that are
interrelated in many ways.
- DBMS must have the capability to represent a variety of
complex relationships among the data, to define new
relationships as they arise, and to retrieve and update
related data easily and efficiently. (Ex: Fig 1)
8. Enforcing integrity constraints
- DBMS should provide capabilities for defining and enforcing
constraints.
Ex: specifying a data type for each data item, constraint
specifying uniqueness on data item values
- It is the responsibility of db designers to identify integrity
constraints during the db design.
Fig 1: Sample database
9. Permitting Inferencing and Actions using rules:
❑ Some db syss provide capabilities for defining deduction
rules for inferecning new information from the stored db
facts. Such sys are called deductive db sys.
❑ Ex: Eligibility criteria
❑ Triggers: Is a form of a rule activated by updates to the
table, which results in performing some additional
operations to some other tables sending messages and
so on.
❑ Stored procedures: Procedures used to enforce rules.
They become a part of the overall db definition & are
invoked appropriately when certain conditions are met.
❑ Active db sys: Provides more powerful functionalities.
Provides active rules that can automatically initiate
actions when certain events and conditions occur.
Query to compute Everest Books sales information
for the first two days of April 2004
◼ SELECT OrderInfo.ISBN, Title,
SUM(OrderInfo.Qty) AS Quantity,
SUM(OrderInfo.Qty*OrderInfo.Price) AS Sales
FROM OrderInfo, Orders, Books
WHERE Orderinfo.ISBN=Books.ISBN and
Orderinfo.OrderId=Orders.OrderId AND
ShipDate > = ‘2004-04-01’ and ShipDate <= ‘2004-
04-02’ GROUP BY OrderInfo.ISBN, Title;
Stored Procedure Definition
CREATE PROCEDURE Sales(S Date, E Date)
BEGIN ATOMIC
SELECT OrderInfo.ISBN, Title,
SUM(OrderInfo.Qty) AS Quantity,
SUM(OrderInfo.Qty*OrderInfo.Price) AS Sales
FROM OrderInfo, Orders, Books
WHERE OrderInfo.ISBN = Books.ISBN AND
OrderInfo.OrderId = Orders.OrderId AND
Orders.ShipDate >= S AND
Orders.ShipDate <= E
GROUP BY OrderInfo.ISBN, Title;
END;
◼ To get the sales information for the first two days
of April, we can now simply call procedure Sales
as follows:
◼ CALL Sales(’2004-04-01’,’2004-04-02’);
A Brief History of Database Applications
◼ We now give a brief historical overview of the
applications that use DBMSs and how these applications
provided the impetus for new types of database systems.
◼ Early Database Applications Using Hierarchical and
Network Systems
◼ Providing Data Abstraction and Application
Flexibility with Relational Databases
◼ Object-Oriented Applications and the Need for More
Complex Databases
◼ Interchanging Data on the Web for E-Commerce
Using XML
◼ Extending Database Capabilities for New
Applications
◼ Databases versus Information Retrieval
When not to use DBMS:
◼ Simple, well-defined db applications that are not
expected to change
◼ No multiple-user access to data
◼ Initial investment is less on hardware, software
and training (or db is small)
Chapter -2
Database System Concepts and Architecture
Definition of schema, data model and instances
◼ Data abstraction : Refers to the suppression of
details of data organization and storage and the
highlighting of the essential features for an
improved understanding of data.
◼ Data Model:
❑ A set of concepts to describe
◼ the structure of a database,
◼ the operations for manipulating these structures,
◼ and certain constraints that the database should
obey.
◼ Types of Data Models: Categorize according to
the types of concepts they use to describe the
database structure.
1. High Level or Conceptual data models.
2. Low Level or Physical data models.
3. Representational or Implementation data models.
1. High-level or conceptual data model:
◼ Provide concepts that are close to the way
many users perceive data
◼ Conceptual data models use concepts such as
entities, attributes and relationships
❑ entity represents a real-world object or concept.
Ex: an employee or a project
❑ attribute represents some property that
describes an entity Ex: employee’s name or
salary.
❑ relationship among two or more entities
represents an association among the entities,
❑ Ex: employee works_on Project
◼ EX: Entity-Relationship model
2. Low-level or physical data models:
◼ Provide concepts that describe the details of how
data is stored in the computer by representing
information such as record formats, record
orderings and access paths.
◼ Access Path is a structure that makes the
search for particular database records efficient.
Ex: Index
◼ Low level data model is only for Computer
Specialists not for end-user.
Primary
Index
3. Representational (or implementation) data
model:
◼ It is between High level & Low level data model
◼ Hide many details of data storage on disk but can
be implemented on a computer system directly.
◼ Are the models used most frequently in
traditional commercial DBMSs.
Includes widely used relational data
model, legacy data models—the network and
hierarchical models.
◼ It represent data by using record structures and
hence sometimes called record-based data
models.
The different types of Representational or
implementation data models are:
1. Relational data models.
2. Hierarchical data models.
3. Network data models
4. Object-oriented data Models.
XML data Models.
1. Relational Model:
◼ The Relational Model uses a collection of
tables to store data and the relationship
among those data.
◼ Data is extracted by selecting columns and
rows that satisfy user-specified conditions.
◼ Each table have multiple column and each
column has a unique name .
Consumer_Table
Advantages:
1.The main advantage of this model is its ability to
represent data in a simplified format.
2.The process of manipulating record is simplified
with the use of certain key attributes used to
retrieve data.
3.Representation of different types of relationship
is possible with this model.
2. Hierarchical Model:
◼ A hierarchical data model is a data model which
the data is organized into a tree like structure.
◼ The structure allows repeating information using
parent/child relationships: each parent can
have many children but each child only has one
parent.
◼ All attributes of a specific record are listed under
an entity type.
Hierarchical Model:
◼ Advantages:
1. The representation of records is done using an ordered
tree, which is natural method of implementation of one–
to-many relationships.
2. Proper ordering of the tree results in easier and faster
retrieval of records.
3. Network Model:
◼ The data in the network model are represented
by collection of records and relationships
among data are represented by links, which
can be viewed as pointers.
Advantages:
1. Representation of relationship between entities
is implemented using pointers which allows the
representation of arbitrary relationship
2. Data manipulation can be done easily with this
model.
4. Object oriented data model:
◼ Introduced with growing popularity of the object
oriented programming language C++.
◼ The object db paradigm allows users to structure,
retrieve and update data in terms of objects in the
application domain.
◼ Is the dream of the programmer using C++, Java
or some other object-oriented programming
language.
◼ A standard for object databases called ODMG
object model is proposed.
XML model:
◼ The Extensible Markup Language (XML) is a
text markup language designed in 1996.
◼ For specifying the syntax of data and electronic
documents.
◼ Is particularly useful for describing semi-
structured data.
◼ Used in a wide variety of domains such as
ecommerce, protocols and exchange data and
so on.
Other languages and Protocols related to XML
◼ Extensible Stylesheet Language(XSL)- can be
used to define how a document should be
rendered for display by a web browser.
◼ Extensible Stylesheet Language for
Transformation (XSLT)-used to transform one
structure into a different structure.
◼ Web Service Description Language (WSDL)-
allows for the description web service in XML.
◼ Simple Object Access Protocol (SOAP)- is a
platform independent and programming
language independent protocol for messaging
and remote procedure calls.
Schemas
◼ Database Schema
❑ The description of a database.
❑ Includes descriptions of the database structure,
data types, and the constraints on the database.
◼ Schema Diagram
❑ An illustrative display of (most aspects of) a
database schema.
◼ Schema Construct
❑ A component of the schema or an object within the
schema, e.g., STUDENT, COURSE.
Example of a Database Schema
Schemas contd…
◼ Database State
❑ The actual data stored in a database at a
particular moment in time.
❑ This includes the collection of all the data in the
database.
❑ Also called database instance (or occurrence
or snapshot).
◼ The term instance is also applied to
individual database components, e.g. record
instance, table instance, entity instance
Example of a database state
◼ Distinction between database schema &
database state
❑ The database schema changes very infrequently.
❑ The database state changes every time the
database is updated.
• The DBMS is partly responsible for ensuring that
every state of the database is a valid state—that
is, a state that satisfies the structure and
constraints specified in the schema.
• Hence, specifying a correct schema to the DBMS
is extremely important and the schema must be
designed with utmost care.
◼ The DBMS stores the descriptions of the
schema constructs and constraints—also called
the meta-data—in the DBMS catalog so that
DBMS software can refer to the schema
whenever it needs to.
◼ Schema is also called intension.
◼ State is also called extension.
Three-Schema Architecture
◼ Proposed to support DBMS characteristics of:
❑ Program-data independence.
❑ Support of multiple views of the data.
❑ Use of catalog to store the db description
◼ Its goal is to separate the user applications and
the physical database.
◼ Not explicitly used in commercial DBMS products,
but has been useful in explaining database
system organization
The Three-schema architecture
◼ Defines DBMS schemas at three levels:
❑ Internal schema at the internal level to
describe physical storage structures.
◼ Lowest level of abstraction.
◼ Typically uses a physical data model.
◼ Describes the complete details of data
storage and access paths for the database.
❑ Conceptual schema at the conceptual level
to describe the structure and constraints for
the whole database for a community of users.
◼ Hides the details of physical storage
structures. Concentrates on describing
entities, data types, relationships, user
operations and constraints.
◼ Uses a implementation/Representational
data model to describe conceptual schema.
❑ External schemas at the external level (or
view level) to describe the various user views.
◼ Each External schemas describes the part
of the db that a particular user group is
interested in and hides the rest of the db
from that user group.
◼ Uses representational model to implement
external schema.
◼ The three schema architecture is a convenient
tool with which the user can visualize the
schema levels in a db system.
◼ Most DBMSs do not separate the three levels
completely and explicitly, but support the three-
schema architecture to some extent.
◼ In most DBMSs external schemas are specified in
the same data model that describes the conceptual–
level information. Ex: Oracle-SQL
◼ Some DBMSs allow different data model for external
and Conceptual level
◼ Ex: Universal Data Base –IBMs DBMS
❑ Uses Relational model for Conceptual schema
❑ May use Object-oriented model for external
schema
Mappings: The processes of transforming
requests and results between levels is called
mappings.
❑ Programs refer to an external schema, and are
mapped by the DBMS to the internal schema for
execution.
❑ Data extracted from the internal DBMS level is
reformatted to match the user’s external view (e.g.
formatting the results of an SQL query for display
in a Web page)
Data Independence
◼ Data independence can be defined as the
capacity to change the schema at one level
without changing the schema at next higher
level.
◼ There are two types of data Independence.
They are:
1. Logical data independence.
2. Physical data independence.
The Three-schema architecture
Logical data independence:
◼ Is the capacity to change the conceptual schema
without having to change the external schema or
application programs.
◼ Only the view definition and mapping need to be
changed in a DBMS that support logical data
independence.
◼ After the conceptual schema undergoes a
logical reorganization, application programs that
reference the external schema constructs must
work as before.
1.2
For example, the
external schema of Figure 1.5(a) should
not be affected by changing the
GRADE_REPORT file (or record type)
shown in Figure 1.2 into the one
shown in Figure 1.6(a).
1.6a
1.5a
Physical data independence:
◼ Is the capacity to change the internal schema
without changing the conceptual schema.
◼ Hence, the external schemas need not be
changed as well.
◼ Modification at the physical level are
occasionally necessary to improve
performance.
◼ Ex: By creating additional access structures—to
improve the performance of retrieval or update.
◼ If the same data as before remains in the
database, we should not have to change the
conceptual schema.
◼ For example, providing an access path to
improve retrieval speed of section records
(Figure 1.2) by semester and year should not
require a query such as list all sections offered in
fall 2008 to be changed, although the query
would be executed more efficiently by the DBMS
by utilizing the new access path.
◼ Logical data independence is more difficult
to achieve than physical data independence,
because it allows structural and constraint
changes without affecting application programs
◼ When a schema at a lower level is changed,
only the mappings between this schema and
higher-level schemas need to be changed in a
DBMS that fully supports data independence.
DBMS Languages
◼ Types of DBMS languages:
❑ Data Definition Language (DDL)
❑ Data Manipulation Language (DML)
❑ Data Control Language (DCL)
Data Definition Language (DDL)
◼ Used by the DBA and database designers to specify
the conceptual schema of a database.
◼ In many DBMSs, the DDL is also used to define
internal and external schemas (views).
◼ In some DBMSs, separate storage definition
language (SDL) and view definition language
(VDL) are used to define internal and external
schemas.
◼ Ex: Create, Alter, Drop, Truncate
Data Manipulation Language (DML)
◼ Used to specify database retrievals and updates.
◼ DML commands (data sublanguage) can be
embedded in a general-purpose programming
language (host language), such as COBOL, C,
C++, or Java.
◼ Alternatively, stand-alone DML commands can be
applied directly (called a query language).
◼ Ex: Insert, Delete, Select, Update
Data Control Language (DCL)
◼ The DCL language is used for controlling the
access to the table and hence securing the
database.
◼ DCL is used to provide certain privileges to a
particular user. Privileges are rights to be
allocated.
◼ The various privileges that can be granted or
revoked are: Select, Insert, Delete, Update,
References, Execute All.
Some of the DCL commands
◼ COMMIT - save work done
◼ SAVEPOINT - identify a point in a transaction to
which you can later roll back
◼ ROLLBACK - restore database to original since
the last COMMIT
◼ SET TRANSACTION - Change transaction
options like what rollback segment to use
DBMS Interfaces
User friendly interfaces provided by DBMS may include
the following:
Menu-based, for web clients or browsing :
◼ Present the user with list of options (called menus)
that lead the user through the formulations of a
request.
◼ No need to memorize the specific commands &
syntax of a query language.
◼ Pull-down menus are a very popular technique in
web based user interfaces.
◼ Used in browsing interfaces, which allow a user to
look through the content of db in an exploratory &
unstructured manner.
Form-based interfaces:
◼ Displays a form to each user.
◼ User can fill out all of the form entries to insert
data.
◼ Usually designed and programmed for naive
users as interfaces to canned transactions.
◼ Many DBMSs have form specification
languages, which are special languages that
help programmers specify such forms.
❑ Ex: SQL forms, Oracle forms, etc.,
Graphical User Interfaces(GUI):
◼ Displays a schema to the user in diagrammatic
form. Then user can specify a query by
manipulating the diagram.
◼ In many cases GUI utilize both menu & forms.
◼ Most GUIs use a pointing devices such as a
mouse, to pick certain parts of the displayed
schema diagram.
Natural language interface:
◼ Accept request written in English or some other
language and attempt to understand them.
◼ Usually has its own schema, which is similar to
the db conceptual schema, as well as a
dictionary of important words.
◼ Refers to the words in its schema, as well as to
the set of standard words in its dictionary, to
interpret the request.
◼ The capabilities of natural language interfaces
have not advanced rapidly.
(Ex: Search engine that accept strings of natural
language)
Speech input & output:
◼ Uses speech as an input query & speech as an
answer to a question or result of a request.
Ex: enquires for telephone directory, flight
arrival/departure, back account information
◼ Speech i/p is detected using a library of
predefined words and used to set up the
parameters that are supplied to the queries. For
o/p, a similar conversion from text or numbers
into speech takes place.
Interfaces for parametric users:
◼ System analysts and programmers design and
implement a special interfaces for each class of
naive users. Ex: bank tellers
◼ Usually a small set of abbreviated commands is
included, with the goal of minimizing the no. of
keystrokes required for each request.
◼ Ex: Function keys in a terminal can be
programmed to initiate various commands.
Interfaces for the DBA:
◼ Most database systems contain privileged
commands that can be used only by DBA’s staff.
Ex: Commands for creating accounts,
setting system parameters,
granting a/c authorization,
changing a schema and
reorganizing the storage structure of database.
The Database System Environment
◼ DBMS Component Modules
Entity types, Entity sets, Attributes and Keys
• E-R modeling: A graphical technique for
understanding and organizing the data
independent of the actual database
implementation.
• Entities and Attributes:
Entity: Any thing that may have an independent
existence and about which we intend to collect
data.
Ex: particular person, car, house or employee.
◼ Attributes: Properties / characteristics that
describe entities
◼ The set of possible values for an attribute is
called the domain of the attribute.
Ex:- The domain of attribute marital status is
just the four values: single, married, divorced,
widowed
– for month - 12 values ranging from Jan to Dec
◼ Key attribute: The attribute (or combination of
attributes) that is unique for every entity instance
Ex:- the account number of an account, the
employee id of an employee etc.
◼ Key attribute of an entity type. – An attribute
whose value are distinct for each individual
entity in the entity set
Ex: Composite attribute
Types of Attributes
1. Simple Vs Composite attribute
Simple attribute: cannot be divided into
simpler components
E.g:
Composite attribute: can be split into
components
E.g: Address, Name
2. Single Vs Multi-valued Attributes
Single valued : can take on only a single value
for each entity instance
Ex: Age
Multi-valued: can take many values
Ex: Color
3. Stored Vs Derived attribute
Stored Attribute: Attribute that need to be
stored permanently.
Ex: name of an employee, Birth date
Derived Attribute: Attribute that can be
calculated based on other attributes
Ex: Age
◼ NULL values
❑ Used when particular entity may not have an
◼ Applicable value
◼ Do not know the value
◼ Value is missing
◼ Complex Attributes
❑ Arbitrary Nesting of Composite and Multivalued
attributes
❑ ( ) , { }
Ex: If a person can have more than one residence and
each residence can have a single address and multiple
phones, an attribute Address_phone for a person can be
specified as
Address_phone ({Phone (Area_code, Phone_number)},
Address (Street_address (Number, Street,
Apartment_number), City, State, Zip) ) }
Entity Types, Entity Sets, Keys and Value sets
◼ Entity type: a collection of entities that have the
same attributes.
❑ Ex: employee, student, etc.,
◼ Entity set: Is a collection of entities of a same
type that share the same properties or attributes.
❑ The entity set is usually referred to using the same
name as the entity type.
❑ Ex: Employee – refers to both entity type as well as
the current set of all employee entities in the db.
Two entity types, EMPLOYEE and COMPANY, and
some member entities of each.
Initial Conceptual Design of the Company
Database
◼ Based on the requirements, we can identify four
initial entity types in the COMPANY database:
❑ DEPARTMENT
❑ PROJECT
❑ EMPLOYEE
❑ DEPENDENT
◼ Their initial design is shown on the following
slide
◼ The initial attributes shown are derived from the
requirements description
Example COMPANY Database
◼ We need to create a database schema design
based on the following (simplified) requirements
of the COMPANY Database:
❑ The company is organized into DEPARTMENTs.
Each department has a name, number and an
employee who manages the department. We keep
track of the start date of the department manager. A
department may have several locations.
❑ Each department controls a number of PROJECTs.
Each project has a unique name, unique number
and is located at a single location.
◼ We store each EMPLOYEE’s social security
number, address, salary, sex, and birthdate.
❑ Each employee works for one department but
may work on several projects.
❑ We keep track of the number of hours per week
that an employee currently works on each project.
❑ We also keep track of the direct supervisor of
each employee.
❑ Each employee may have a number of
DEPENDENTs.
◼ For each dependent, we keep track of their
name, sex, birthdate, and relationship to the
employee.
Relationship, Relationship Types, Relationship sets,
Roles and Structural Constraints.
◼ Relationship : Association between several
entities
ex: Employee – worksfor – Department
◼ Relationship type: A relationship type R among
n entity types E1, E2,…..En defines a set of
association among entities from these entity
types.
◼ Relationship set: is a set of relationships of the
same type
◼ Relationship instance: Associates n individual entities
and each entity in relationship is member of entity type.
❑ Ex: If works_for is the relationship between the
Employee entity and the Department entity, then ‘Ram
works for CS Department’, ‘Shyam works for electrical
Department’ etc., are relationship instances of the
relationship works_for.
◼ Mathematically, the relationship set R is a set of
relationship instances ri, where each ri associates n
individual entities (e1, e2,… en) and each entity ej in ri is
a member of enity type Ej, 1 ≤ j ≤ n.
◼ Each of the individual entities e1, e2, …en is said to
participate in the relationship instance ri = (e1, e2, …
en).
Relationship Degree
◼ Degree of a Relationship
- Is the number of entity types involved in a
relationship.
◼ Types
• One - Unary
• Two - Binary
• Three - Ternary
Ex : employee manager-of employee is unary
employee works-for department is binary
customer purchase item, shop keeper is a
ternary relationship
Role name
◼ A role name specifies what role an entity type
plays in a specific relationship
◼ Role names are sometimes used in ER-
diagrams to clarify the roles of the participating
entity types.
Recursive Relationship Type
◼ In a relationship where same entity type
participate more than once in a relationship type
in different roles
Ex: the SUPERVISION relationship
◼ EMPLOYEE participates twice in two distinct
roles:
❑ supervisor (or boss) role
❑ supervisee (or subordinate) role
Constraints on Binary Relationship Types
◼ Relationship types usually have certain
constraints that limit the possible combinations
of entities that may participate in the
corresponding relationship set.
◼ Two main types:
❑ Cardinality ratio
❑ Participation
Cardinality Ratios for Binary Relationships
◼ Specifies the maximum number of relationship
instances that an entity can participate in.
◼ Possible Cardinality ratios for binary relationship
types are :
– one-to-one (1:1)
– one-to-many (1:N)
– many-to-many (M:N)
Ex: employee head-of department (1:1)
student enrolls course (m:n)
lecturer offers course (1:n) assuming a
course is taught by a single lecturer
Hospital
Manageme
nt System
Participation Constraints and Existence Dependence
◼ Specifies whether the existence of an entity
depends on its being related to another entity via
the relationship type.
◼ Specifies the minimum number of relationship
instances that each entity can participate in, and
is some times called the minimum cardinality
constraints.
◼ Two types:
❑ Total participation
❑ Partial participation
Total Participation :
◼ Every entity instance must be connected through the
relationship to another instance of the other participating
entity types.
◼ Here the participation of Employee in works_for is called
total participation.
❑ Means, every entity in the total set of employee
entities must be related to a department entity via
Works_for.
◼ Total participation is also called as existence
dependency
Partial participation
◼ All instances need not participate.
Ex: Employee Head-of Department
Employee: partial
Department: total
Attributes for relationship types
◼ Also a relationship type can have attributes.
E.g. in the case where the weekly number of
hours an employee works on a project should be
kept, that can be represented for each instance
of the relation “works-on”.
◼ If the relation is a 1:1 or 1:N relation, the
attribute can be stored at one of the participating
entities.
◼ When the relation is of the type M:N one must
store the attributes with the instance of the
relation.
Regular Entity type
◼ Regular Entity: Entity that has its own key
attribute.
◼ E.g.: Employee, student ,customer, policy holder
etc.
◼ Also called as Strong Entity Type.
Weak entity type:
◼ Entity that depends on other entity for its
existence and doesn’t have key attribute of its
own.
◼ Ex: DEPENDENTS of EMPLOYEE
◼ The relationship type that relates a weak entity
type to its owner is called identifying
relationship of the weak entity type.
◼ Has a partial key, which is the set of attributes
that can uniquely identify weak entities that are
related to the same owner entity.
Relationship
•It has a label that explains the relationship.
•Usually the convention is to read the ER diagram from top
to bottom and from left to right.
•The relationship name is so chosen as to make sense when
read from left to right.
•The relationship above is read as student enrolls-in course
Case Study – ER Model For a College DB
Assumptions:
◼ A college contains many departments
◼ Each department can offer any number of courses
◼ Many instructors can work in a department
◼ An instructor can work only in one department
◼ For each department there is a Head
◼ An instructor can be head of only one department
◼ Each instructor can take any number of courses
◼ A course can be taken by only one instructor
◼ A student can enroll for any number of courses
◼ Each course can have any number of students
Steps in ER Modeling
◼ Identify the Entities
◼ Find relationships
◼ Identify the key attributes for every Entity
◼ Identify other relevant attributes
◼ Draw complete E-R diagram with all attributes including
Primary Key and Cardinality Ratios
◼ Review your results with your Business users
◼ Step 1: Identify the Entities
• DEPARTMENT
• STUDENT
• COURSE
• INSTRUCTOR
◼ Step 2: Find the relationships
• One course is enrolled by multiple students and one student enrolls
for multiple courses, hence the cardinality between course and
student is Many to Many.
• The department offers many courses and each course belongs to only
one department, hence the cardinality between department and
course is One to Many.
• One department has multiple instructors and one instructor belongs to
one and only one department , hence the cardinality between
department and instructor is one to Many.
• Each department there is a “Head of department” and one instructor is
“Head of department “,hence the cardinality is one to one .
• One course is taught by only one instructor, but the instructor teaches
many courses, hence the cardinality between course and instructor
is many to one.
◼ Step 3: Identify the key attributes
• Deptname is the key attribute for the Entity
“Department”, as it identifies the Department uniquely.
• Course# (CourseId) is the key attribute for “Course”
Entity.
• Student# (Student Number) is the key attribute for
“Student” Entity.
• Instructor Name is the key attribute for “Instructor” Entity.
◼ Step 4: Identify other relevant attributes
• For the department entity, the relevant attribute is
location
• For course entity, course name, duration, prerequisite
• For instructor entity, room#, telephone#
• For student entity, student name, date of birth
Step 5: Draw complete E-R diagram with all attributes including Primary K
A banking scenario
◼ Banks have customers.
◼ Customers are identified by name, custid, phone number
and address.
◼ Customers can have one or more accounts
◼ Accounts are identified by an account number, account
type (savings, current) and a balance.
◼ Customers can avail loans.
◼ Loans are identified by loan id, loan type (car, home,
personal) and an amount.
◼ Banks are identified by a name, code and the address of
the main office.
◼ Banks have branches.
◼ Branches are identified by a branch number, branch
name and an address.
◼ Accounts and loans are related to the banks’ branches.
◼ Step 1: Identify the Entities
• BANK
• BRANCH
• LOAN
• ACCOUNT
• CUSTOMER
◼ Step 2: Find the relationships
• One Bank has many branches and each branch belongs to only one
bank, hence the cardinality between Bank and Branch is One to Many.
• One Branch offers many loans and each loan is associated with one
branch, hence the cardinality between Branch and Loan is One to Many.
• One Branch maintains multiple accounts and each account is
associated to one and only one Branch, hence the cardinality between
Branch and Account is One to Many
• One Loan can be availed by multiple customers, and each Customer
can avail multiple loans, hence the cardinality between Loan and
Customer is Many to Many.
• One Customer can hold multiple accounts, and each Account can be
held by multiple Customers, hence the cardinality between Customer
and Account is Many to Many
◼ Solution Step 2: Identify relationships between entities
• Bank has Branch
• Branch maintains accounts
• Branch offers loans
• Account is held by customer
• Loan is availed by customer
◼ Solution Step 2: Analyze cardinality of relationships
•Bank has Branch : A bank has many branches->1:N
•Branch maintains accounts: One branch maintains
many accounts-> 1:N
•Branch offers loans : One branch offers many loans -
> 1:N
•Account is held by customer -> M:N
•Loan is availed by customer -> M:N
◼ Step 3: Identify the key attributes
• BankCode (Bank Code) is the key attribute for the Entity “Bank”, as
it identifies the bank uniquely.
• Branch# (Branch Number) is the key attribute for “Branch” Entity.
• Customer# (Customer Number) is the key attribute for “Customer”
Entity.
• Loan# (Loan Number) is the key attribute for “Loan” Entity.
• Account No (Account Number) is the key attribute for “Account”
Entity.
◼ Step 4: Identify other relevant attributes
• For the “Bank” Entity, the relevant attributes other than
“BankCode” would be “Name” and “Address”.
• For the “Branch” Entity, the relevant attributes other than “Branch#”
would be “Name” and “Address”.
• For the “Loan” Entity, the relevant attribute other than “Loan#”
would be “Loan Type”.
• For the “Account” Entity, the relevant attribute other than “Account
No” would be “Account Type”.
• For the “Customer” Entity, the relevant attributes other than
“Customer#” would be “Name”, “Telephone#” and “Address”.
Merits and Demerits of ER Modeling:
◼ Merits
• Easy to understand. Represented in Business Users
Language.
• Can be understood by non-technical specialist.
• Intuitive and helps in Physical Database creation.
• Can be generalized and specialized based on needs.
• Can help in database design.
• Gives a higher level description of the system.
◼ Demerits
• Physical design derived from E-R Model may have
some amount of ambiguities or inconsistency.
• Sometime diagrams may lead to misinterpretations
◼ Exercise 2.7 The Prescriptions-R-X chain of
pharmacies has ordered to give you a free
lifetime supply of medicines if you design its
database. Given the rising cost of health care,
you agree. Here's the information that you
gather:
◼ Patients are identified by an SSN, and their names,
addresses, and ages must be recorded.
◼ Doctors are identified by an SSN. For each doctor,
the name, specialty, and years of experience must
be recorded.
◼ Each pharmaceutical company is identified by name
and has a phone number.
◼ For each drug, the trade name and formula must be
recorded.
◼ Each drug is sold by a given pharmaceutical company, and
the trade name identifies a drug uniquely from among the
products of that company. If a pharmaceutical company is
deleted, you need not keep track of its products any longer.
◼ Each pharmacy has a name, address, and phone number.
◼ Every patient has a primary physician. Every doctor has at
least one patient.
◼ Each pharmacy sells several drugs and has a price for
each. A drug could be sold at several pharmacies, and the
price could vary from one pharmacy to another.
◼ Doctors prescribe drugs for patients. A doctor could
prescribe one or more drugs for several patients, and a
patient could obtain prescriptions from several doctors.
◼ Each prescription has a date and a quantity associated
with it. You can assume that if a doctor prescribes the
same drug for the same patient more than once, only the
last such prescription needs to be stored.
◼ Pharmaceutical companies have long-term contracts with
pharmacies. A pharmaceutical company can contract with
several pharmacies, and a pharmacy can contract with
several pharmaceutical companies. For each contract,
you have to store a start date, an end date, and the text of
the contract.
◼ Pharmacies appoint a supervisor for each contract. There
must always be a supervisor for each contract, but the
contract supervisor can change over the lifetime of the
contract.
◼ Draw an ER diagram that captures the above
information. Identify any constraints that are not
captured by the ER diagram.
◼ Exercise 2.8 : Although you always wanted to be an artist, you ended
up being an expert on databases. So now you set up a database
company, ArtBase, that builds a product for art galleries. The core of this
product is a database with a schema that captures all the information
that galleries need to maintain.
◼ Galleries keep information about artists, their names (which are unique),
birthplaces, age, and style of art.
◼ For each piece of artwork, the artist, the year it was made, its unique
title, its type of art (e.g., painting, lithograph, sculpture, photograph), and
its price must be stored. Pieces of artwork are also classfied into groups
of various kinds, for example, portraits, still lifes, works by Picasso, or
works of the 19th century; a given piece may belong to more than one
group.
◼ Each group is identified by a name (like those above) that describes the
group.
◼ Finally, galleries keep information about customers. For each customer,
galleries keep their unique name, address, total amount of dollars they
have spent in the gallery (very important!), and the artists and groups of
art that each customer tends to like.
◼ Draw the ER diagram for the database.
Proper Naming of Schema Constructs
◼ When designing a database schema, the choice
of names for entity types, attributes, relationship
types, and (particularly) roles is not always
straightforward.
◼ One should choose names that convey, as much
as possible, the meanings attached to the
different constructs in the schema.
◼ We choose to use singular names for entity
types, rather than plural ones, because the entity
type name applies to each individual entity
belonging to that entity type.
◼ In our ER diagrams, we will use the convention
that entity type and relationship type names are
uppercase letters, attribute names have their
initial letter capitalized, and role names are
lowercase letters.
◼ As a general practice, given a narrative
description of the database requirements, the
nouns appearing in the narrative tend to give
rise to entity type names, and the verbs tend to
indicate names of relationship types.
◼ Attribute names generally arise from additional
nouns that describe the nouns corresponding to
entity types.
◼ Another naming consideration involves choosing
binary relationship names to make the ER
diagram of the schema readable from left to right
and from top to bottom.
play
Match Player
ed
Team Pla
_playe Pla yed
d yer _in
_of
Team Ground