DBMS
DBMS
DBMS
MODULE 1
Introduction: Database System applications, Purpose of database systems. View of Data, Database
Languages, Database Design, Database and application architecture. Data Models: Hierarchical, model,
Network model, Entity Relationship model, Object Oriented data model, Relational model. Introduction
to relational model: Structure of relational database, Database schema, keys, relational algebra and
calculas.
What is Data?
Data is a collection of a distinct small unit of information. It can be used in a variety of forms like text,
numbers, media, bytes, etc. it can be stored in pieces of paper or electronic memory, etc.
What is Database?
A database is an organized collection of data, so that it can be easily accessed and managed.
For example: The college Database organizes the data about the admin, staff, students and faculty etc.
o DBMS provides the interface to perform the various operations like creation, deletion, modification, etc.
o DBMS allows the user to create their databases as per their requirement.
o DBMS accepts the request from the application and provides specific data through the operating system.
o DBMS contains the group of programs which acts according to the user instruction.
o It provides security to the database.
Advantage of DBMS
Controls redundancy
It stores all the data in a single database file, so it can control data redundancy.
Data sharing
Backup
It provides Backup and recovery subsystem. This recovery system creates automatic data
from system failure and restores data if required.
Library Management System − Now-a-days it’s become easy in the Library to track each
book and maintain it because of the database. This happens because there are thousands of
books in the library. It is very difficult to keep a record of all books in a copy or register.
Now DBMS used to maintain all the information related to book issue dates, name of the
book, author and availability of the book.
Banking − Banking is one of the main applications of databases. We all know there will
be a thousand transactions through banks daily and we are doing this without going to the
bank. This is all possible just because of DBMS that manages all the bank transactions.
Universities and colleges − Now-a-days examinations are done online. So, the universities
and colleges are maintaining DBMS to store Student’s registrations details, results, courses
and grade all the information in the database. For example, telecommunications. Without
DBMS there is no telecommunication company. DBMS is most useful to these companies
to store the call details and monthly postpaid bills.
.
Finance − Now-a-days there are lots of things to do with finance like storing sales, holding
information and finance statement management etc. these all can be done with database
systems.
Military − In military areas the DBMS is playing a vital role. Military keeps records of
soldiers and it has so many files that should be kept secure and safe. DBMS provides a high
security to military information.
Online Shopping − Now-a-days we all do Online shopping without wasting the time by
going shopping with the help of DBMS. The products are added and sold only with the help
of DBMS like Purchase information, invoice bills and payment.
Manufacturing − Manufacturing companies make products and sell them on a daily basis.
To keep records of all those details DBMS is used.
Airline Reservation system − Just like the railway reservation system, airlines also need
DBMS to keep records of flights arrival, departure and delay status.
Purpose of Database systems
The purpose of database systems is to make the database user-friendly and do easy operations. Users can
easily insert, update, and delete. Actually, the main purpose is to have more control of the data.
If there are multiple copies of the same data, it just avoids it. It just maintains data in a single repository. Also, the purpose
of database systems is to make the database consistent.
A database system can easily manage to access data. Through different queries, it can access data from the database.
Data isolation:
Atomicity of updates:
In case of power failure, the database might lose data. So, this feature will automatically prevent data loss.
Concurrent access:
Users can have multiple access to the database at the same time.
Security problems:
Database systems will make the restricted access. So, the data will not be vulnerable.
It can support multiple views of data to give the required view as their needs. Only database admins can have a complete
view of the database. We cannot allow the end-users to have a view of developers.
View of Data
View of data in DBMS narrate how the data is visualized at each level of data abstraction.
Data abstraction allow developers to keep complex data structures away from the users. The developers
achieve this by hiding the complex data structures through levels of abstraction.
The data independence means while changing the data schema at one level of the database must not
modify the data schema at the next level.
Data Abstraction
Data abstraction is hiding the complex data structure in order to simplify the user’s interface of the
system. It is done because many of the users interacting with the database system are not that much
computer trained to understand the complex data structures of the database system.
To achieve data abstraction, we will discuss a Three-Schema architecture which abstracts the database
at three levels discussed below:
Three-Schema Architecture:
The main objective of this architecture is to have an effective separation between the user interface and
the physical database. So, the user never has to be concerned regarding the internal storage of the
database and it has a simplified interaction with the database system.
The physical or the internal level schema describes how the data is stored in the hardware. It also
describes how the data can be accessed. The physical level shows the data abstraction at the lowest level
and it has complex data structures. Only the database administrator operates at this level.
It is a level above the physical level. Here, the data is stored in the form of the entity set, entities,
their data types, the relationship among the entity sets, user operations performed to retrieve or
modify the data and certain constraints on the data. Well adding constraints to the view of data adds the
security. As users are restricted to access some particular parts of the database.
It is the developer and database administrator who operates at the logical or the conceptual level.
3. View Level/ User level/ External level
We have to create a database of a college. The entity used sets are Student, Lecturer, Department, Course and
so on…
Now, the entity sets Student, Lecturer, Department, Course will be stored in the storage as the consecutive
blocks of the memory location. This is the physical or internal level and is hidden from the programmers
but the database administrator is it aware of it.
At the logical level, the programmers define the entity sets and relationship among these entity sets using a
programming language like SQL. So, the programmers work at the logical level and even the database
administrator also operates at this level.
At the view level, the users have the set of applications which they use to retrieve the data they are
interested in.
Data Independence
Data independence defines the extent to which the data schema can be changed at one level without
modifying the data schema at the next level. Data independence can be classified as shown below:
Logical data independence describes the degree up to which the logical or conceptual schema can be
changed without modifying the external schema. The changes to data schema at the logical level are made
either to enlarge or reduce the database by adding or deleting more entities, entity sets, or changing the
constraints on data.
Physical data independence defines the extent up to which the data schema can be changed at the physical
or internal level without modifying the data schema at logical and view level.
Instances and Schemas
What is an instance?
We can define an instance as the information stored in the database at a particular point of time.
As we discussed above the database comprises of several entity sets and the relationship between them.
Now, the data in the database keeps on changing with time. As we keep inserting or deleting the data to
and from the database.
Now, at a particular time if we retrieve any information from the database then that corresponds to an
instance.
What is schema?
The definition of a database comprises of the description of what data it would contain what would be
the relationship between the data. This definition is the database schema.
Database languages
Database languages, also known as query languages or data query languages, are a classification of
programming languages that developers use to define and access databases, which are collections of
organized data that users can access electronically
4 categories of database languages
Here are four types of database languages and their uses:
1. Data definition language (DDL)
Data definition language (DDL) creates the framework of the database by specifying the database schema,
which is the structure that represents the organization of data. Its common uses include the creation and
alteration of tables, files, indexes and columns within the database. This language also allows users to
rename or drop the existing database or its components. Here's a list of DDL statements:
CREATE: Creates a new database or object, such as a table, index or column
ALTER: Changes the structure of the database or object
DROP: Deletes the database or existing objects
RENAME: Renames the database or existing objects
2. Data manipulation language (DML)
Data manipulation language (DML) provides operations that handle user requests, offering a way to access
and manipulate the data that users store within a database. Its common functions include inserting,
updating and retrieving data from the database. Here's a list of DML statements:
INSERT: Adds new data to the existing database table
UPDATE: Changes or updates values in the table
DELETE: Removes records or rows from the table
SELECT: Retrieves data from the table or multiple tables
3. Data Control language (DCL)
Data control language (DCL) controls access to the data that users store within a database. Essentially,
this language controls the rights and permissions of the database system. It allows users to grant or revoke
privileges to the database. Here's a list of DCL statements:
GRANT: Gives a user access to the database
REVOKE: Removes a user's access to the database
4. Transaction control language (TCL)
Transaction control language (TCL) manages the transactions within a database. Transactions group a set
of related tasks into a single, executable task. All the tasks must succeed in order for the transaction to
work. Here's a list of TCL statements:
COMMIT: Carries out a transaction
ROLLBACK: Restores a transaction if any tasks fail to execute
Database Design:
Database design is the process of defining the structure, organization, and relationships of
data within a database system. The goal of database design is to ensure the database is efficient, scalable,
maintainable, and supports the intended applications. A well-designed database reduces redundancy,
ensures data integrity, and supports efficient query processing.
1- Tier Architecture
o In this architecture, the database is directly available
to the user. It means the user can directly sit on the
DBMS and uses it.
o Any changes done here will directly be done on the database itself. It doesn't provide a handy
tool for end users.
o The 1-Tier architecture is used for development of the local application, where programmers can
directly communicate with the database for the quick response.
2- Tier Architecture
o The 2-Tier architecture is same as basic client-server. In the two-tier architecture, applications on
the client end can directly communicate with the database at the server side. For this interaction,
API's like: ODBC, JDBC are used.
o The user interfaces and application programs are run on the
client-side.
o The server side is responsible to provide the functionalities
like: query processing and transaction management.
o To communicate with the DBMS, client-side application
establishes a connection with the server side.
3- Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this architecture,
client can't directly communicate with the server.
o The application on the client-end interacts with an application
server which further communicates with the database system.
o End user has no idea about the existence of the database beyond the
application server. The database also has no idea about any other
user beyond the application.
o The 3-Tier architecture is used in case of large web application.
DATA MODELS
Data models in DBMS help to understand the design at the conceptual, physical, and logical levels as it
provides a clear picture of the data making it easier for developers to create a physical database.
Data models are used to describe how the data is stored, accessed, and updated in a DBMS.
It defines the data elements and the relationships between the data elements. Data Models are
used to show how data is stored, connected, accessed and updated in the database
management system. The most common types of data models are:-
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
1. Hierarchical Model
Hierarchical Model was the first DBMS model. This model organises the data in the
hierarchical tree structure. The hierarchy starts from the root which has root data and then
it expands in the form of a tree adding child node to the parent node. This model easily
represents some of the real-world relationships like food recipes, sitemap of a website
etc. Example: We can represent the relationship between the shoes present on a shopping
website in the following way:
Any change in the parent node is automatically reflected in the child node so, the
integrity of data is maintained.
Disadvantages of Hierarchical Model
As it does not support more than one parent of the child node so if we have some
complex relationship where a child node needs to have two parent node then that
can't be represented using this model.
2. Network Modcl
This model is an extension of the hierarchical model. It was the most popular model before
the relational model. This model is the same as the hierarchical model, the only difference is
that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that node student has two parents i.e. CSE
Department and Library. This was earlier not possible in the hierarchical model.
The data can be accessed faster as compared to the hierarchical model. This is
because the data is more related in the network model and there can be more than
one path to reach a particular node. So the data can be accessed in many ways.
As more and more relationships need to be handled the system might get complex.
So, a user must be having detailed knowledge of the model to work with the model.
Relationship: Relationship tells how two attributes are related. Example: Teacher
works for a department.
Example:
In the above diagram, the entities are Teacher and Department. The attributes
of Teacher entity are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The
attributes of entity Department entity are Dept_id, Dept_name. The two entities are connected
using the relationship. Here, each teacher works for a department.
Advantages of ER Model
Effective Communication Tool : This model is used widely by the database designers
for communicating their ideas.
Easy Conversion to any Model : This model maps well to the relational model and can
be easily converted relational model by converting the ER model to the table. This model
can also be converted to any other model like network model, hierarchical model etc.
Disadvatages of ER Model
4. Relational Modcl
Relational Model is the most widely used model. In this model, the data is maintained in the
form of a two-dimensional table. All the information is stored in the form of row and
columns. The basic structure of a relational model is tables. So, the tables are also
called relations in the relational model. Example: In this example, we have an Employee table.
Tuples : Each row in the table is called tuple. A row contains all the information
about any instance of the object. In the above example, each row has all the
information about any specific individual like the first row has information about
John.
Attribute or field: Attributes are the property which defines the table or relation. The
values of the attribute should be from the same domain. In the above example, we
have different attributes of the employee like Salary, Mobile_no, etc.
Advnatages of Relational Model
Simple: This model is more simple as compared to the network and hierarchical
model.
Scalable: This model can be easily scaled as we can add as many rows and columns we
want.
In the above example, we have two objects Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes like Name, Job_title of
the employee and the methods which will be performed by that object are stored as a single
object. The two objects are connected through a common attribute i.e the Department_id and the
communication between these two will be done with the help of this common id.
Introduction to Relational Model
The relational model represents how data is stored in Relational Databases. A relational database
stores data in the form of relations (tables). Consider a relation STUDENT with attributes
ROLL_NO, NAME, ADDRESS, PHONE, and AGE shown in.
STUDENT
IMPORTANT TERMINOLOGIES
Attribute: Attributes are the properties that define a relation. e.g.; ROLL_NO, NAME
Relation Schema: A relation schema represents the name of the relation with its attributes. e.g.;
STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, and AGE) is the relation schema for
STUDENT. If a schema has more than 1 relation, it is called Relational Schema.
Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples, one of
which is shown as:
Relation Instance: The set of tuples of a relation at a particular instance of time is called a
relation instance. Table shows the relation instance of STUDENT at a particular time. It can
change whenever there is an insertion, deletion, or update in the database.
Degree: The number of attributes in the relation is known as the degree of the relation.
The STUDENT relation defined above has degree 5.
NULL Values: The value which is not known or unavailable is called a NULL value. It is
represented by blank space. e.g.; PHONE of STUDENT having ROLL_NO 4 is NULL
DATABASE SCHEMA
A database schema is a blueprint or architecture of how our data will look. It doesn’t hold data
itself, but instead describes the shape of the data and how it might relate to other tables or models. An
entry in our database will be an instance of the database schema. It will contain all of the properties
described in the schema.
The size and complexity of your database schema depends on the size of your project. The visual style
of a database schema allows programmers to structure the database and its relationships properly
before jumping into the code. The process of planning a database design is called data modeling..
In many companies, database design and DBMS responsibilities usually fall to the role of the
Database Administrator (DBA). DBAs are responsible for ensuring that data analysts and database
users can easily access information. They work alongside management teams to plan and securely
manage an organization’s database.
There are two main database schema types that define different parts of the schema:
logical and physical.
1. Logical
A logical database schema represents how the data is organized in terms of tables. It also explains how
attributes from tables are linked together. Different schemas use a different syntax to define the logical
architecture and constraints.
2. Physical
The physical database schema represents how data is stored on disk storage. In other words, it is the
actual code that will be used to create the structure of your database.
Suppose you want to create a database to store information for your company’s accounting
department. This database’s schema could outline the structure of two simple tables:
A) Table1
Title: Users
Fields: ID, Full Name, Email, DOB, Dept
B) Table2
These schema tables can then be converted into SQL code by developers and database
administrators.
Let's start by creating a schema called bookstore and a table called genres:
The snippet of code above, besides defining a schema, The table has three columns:
For example, ID is used as a key in the Student table because it is unique for each student. In
the PERSON table, passport_number, license_number, SSN are keys since they are unique for
each person.
1. Candidate Key
2. Primary Key
3. Foreign Key
4. Super Key
5. Alternate Key
6. Composite Key
7. Unique Key
1. Candidate Key
The candidate keys refer to those attributes that identify rows uniquely in a table. In a table, we select
the primary key from a candidate key. Thus, a candidate key has similar properties as that of the primary
keys. In a table, there can be multiple candidate keys.
2. Primary Key
The primary key refers to a column or a set of columns of a table that helps us identify all the records
uniquely present in that table.
A table can consist of just one primary key. Also, this primary key cannot consist of the same values
reappearing/repeating for any of its rows. All the values of a primary key have to be different, and there
should be no repetitions.
3. Super Key
A super key refers to the set of all those keys that help us uniquely identify all the rows present in a
table. It means that all of these columns present in a table that can identify the columns of that table
uniquely act as the super keys.
A super key is a candidate key’s superset (candidate key has been explained below). We need to pick
the primary key of any table from the super key’s set so as to make it the table’s identity attribute.
4. Alternate Key
As we have stated above, any table can consist of multiple choices for the primary key. But, it can only
choose one. Thus, all those keys that did not become a primary key are known as alternate keys.
5. Foreign Key
We use a foreign key to establish relationships between two available tables. The foreign key would
require every value present in a column/set of columns to match the referential table’s primary key. A
foreign key helps us to maintain data as well as referential integrity.
Foreign keys are the column of the table used to point to the primary key of another table.
6. Composite Key
The composite key refers to a set of multiple attributes that help us uniquely identify every tuple present
in a table. The attributes present in a set may not be unique whenever we consider them separately.
Thus, when we take them all together, it will ensure total uniqueness.
7. Unique Key
A unique key refers to a column/a set of columns that identify every record uniquely in a table. All the
values in this key would have to be unique. Remember that a unique key is different from a primary key.
It is because it is only capable of having one null value. A primary key, on the other hand, cannot have a
null value.
Relational Algebra
Relational Algebra is a procedural query language, it is used to provide a single table / relation
as output of performing operations on more than one relations. Some of the basic relations
will be discussed here.
In our course of learning, we will use three relations (table) −
Table 1: course
Course_id Name
1 Computer science
2 Information Technology
3 mechanical
Table 2: students
Roll No. Name address age
1 Ram Delhi 18
2 Raju hyderabad 20
4 Faiz Delhi 22
5 Salman hyderabad 20
Table 3: Hostel
St. No. Name address age
1 Ram Delhi 18
2 Akash hyderabad 20
3 neha Jhansi 21
On this relations, we will perform some operation to make new relation based on operations
performed.
1) Selection operation (σ) − The selection operator denoted by sigma σ is used to select the
tuples of a relation based on some condition. Only those tuples that fall under certain conditions
are selected.
Syntax
σ(condition)(relation_name)
Example
Select the student with course id 1.
σ(course_id = 1)(student)
Result
Roll No. Name address age
4 Faiz Delhi 22
2) Projection operation (∏) The projection operator denoted by ∏ is used to select columns
from a specific reaction. Only specific columns are selected.
Syntax
∏(column1 , column2 , … , columnn)(relation_name)
Example
Let’s select all students's name and no who are in hostel.
∏( st. No. , name)(hostel)
Result
St. No. Name
1 Ram
2 Akash
3 neha
The row are always distinct in projection i.e. if their is any other student whose name is panjak
the other one is removed.
3) Cross Product(X) - Cross product is denoted using the X symbol and is used to find the value
of join of two variables. In cross product each tuple of relation1 is crossed with each tuple of
relation2. Which makes the output relation of the order nXm, where n is the number of tuples
in relation1 and m is the number of tuples in relation2.
Syntax
relation1 X relation2
Example
Let’s find cross product of course and hostel table.
student X course
4) Union (U) - The union of two relations relation1 and relation2 will gives the tuples that are
either in relation1 or in relation2 but tuples that are in both relation1 and relation2 are
considered only once.
Also both relations should be of the same domain for finding there union.
Syntax
relation1 U relation2
Example
Let’s find the union of student and hostel
student U hostel
1 Ram Delhi 18
2 Raju hyderabad 20
4 Faiz Delhi 22
5 Salman hyderabad 20
2 Akash hyderabad 20
3 neha Jhansi 21
5) Minus (-) operator - operator is denoted by - symbol. Relation1 - relation2 will result into a
relation in which the tuple in relation1 and not in relation2 are present. For calculating minus
too, the relations must be union compatible.
Syntax
relation1 - relation2
Example
Let’s find the operation student - hostel
student - hostel
2 Raju hyderabad 20
4 Faiz Delhi 22
5 Salman hyderabad 20
6) rename(ρ) − the rename operation denoted by the ρ is used to rename the given relation to
another name given.
Syntax
ρ(new_name , old_name)