DBMS Unit 1 Notes
DBMS Unit 1 Notes
STUCOR
DATA MODELS IN DBMS
A Data Model is a logical structure of Database. It describes the design of database to reflect entities, attributes,
relationship among data, constrains etc.
Types of Data Models:
Object based logical Models – Describe data at the conceptual and view levels.
1. E-R Model
An entity–relationship model (ER model) is a systematic way of describing and defining a business process. An
APP
ER model is typically implemented as a database. The main components of E-R model are: entity set and
relationship set.
3. Network Model – Network Model is same as hierarchical model except that it has graph-like structure rather
than a tree-based structure. Unlike hierarchical model, this model allows each record to have more than one
STUCOR
parent record.
Physical Data Models – These models describe data at the lowest level of abstraction.
Three Schema Architecture
The goal of the three schema architecture is to separate the user applications and the physical database. The schemas
can be defined at the following levels:
APP
1. The internal level – has an internal schema which describes the physical storage structure of the database.
Uses a physical data model and describes the complete details of data storage and access paths for the
database.
2. The conceptual level – has a conceptual schema which describes the structure of the database for users. It
hides the details of the physical storage structures, and concentrates on describing entities, data types,
relationships, user operations and constraints. Usually a representational data model is used to describe the
conceptual schema.
3. The External or View level – includes external schemas or user vies. Each external schema describes the
part of the database that a particular user group is interested in and hides the rest of the database from that
user group. Represented using the representational data model.
The three schema architecture is used to visualize the schema levels in a database. The three schemas are only
descriptions of data, the data only actually exists is at the physical level.
COMPONENTS OF DBMS
Database Users
STUCOR
Users are differentiated by the way they expect to interact with the system
• Application programmers
• Sophisticated users
• Naïve users
• Database Administrator
• Specialized users etc,.
APP
Application programmers:
Professionals who write application programs and using these application programs they interact with
the database system
Sophisticated users :
These user interact with the database system without writing programs, But they submit queries to
retrieve the information
Specialized users:
Who write specialized database applications to interact with the database system.
Naïve users:
Interacts with the database system by invoking some application programs that have been written
previously by application programmers
Eg : people accessing database over the web
Database Administrator:
Coordinates all the activities of the database system; the database administrator has a good understanding of
the enterprise’s information resources and needs.
Schema definition
Access method definition
Schema and physical organization modification
Granting user authority to access the database
Monitoring performance
Storage Manager
The Storage Manager include these following components/modules
Authorization Manager
Transaction Manager
File Manager
Buffer Manager
Storage manager is a program module that provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the system.
The storage manager is responsible to the following tasks:
interaction with the file manager
efficient storing, retrieving and updating of data
Authorization Manager
Checks whether the user is an authorized person or not
Test the satisfaction of integrity constraints
Transaction Manager
Responsible for concurrent transaction execution It ensures that the database remains in a consistent state
despite of the system failure
EVOLUTION OF RDBMS
STUCOR
Before the acceptance of Codd’s Relational Model, database management systems was just an ad hoc collection of
data designed to solve a particular type of problem, later extended to solve more basic purposes. This led to complex
systems, which were difficult to understand, install, maintain and use. These database systems were plagued with the
following problems:
• They required large budgets and staffs of people with special skills that were in short supply.
• Database administrators’ staff and application developers required prior preparation to access these database
APP
systems.
• End-user access to the data was rarely provided.
• These database systems did not support the implementation of business logic as a DBMS responsibility.
Hence, the objective of developing a relational model was to address each and every one of the shortcomings that
plagued those systems that existed at the end of the 1960s decade, and make DBMS products more widely appealing
to all kinds of users.
The existing relational database management systems offer powerful, yet simple solutions for a wide variety of
commercial and scientific application problems. Almost every industry uses relational systems to store, update and
retrieve data for operational, transaction, as well as decision support systems.
RELATIONAL DATABASE
A relational database is a database system in which the database is organized and accessed according to the
relationships between data items without the need for any consideration of physical orientation and relationship.
Relationships between data items are expressed by means of tables.
It is a tool, which can help you store, manage and disseminate information of various kinds. It is a collection of
objects, tables, queries, forms, reports, and macros, all stored in a computer program all of which are inter-related.
It is a method of structuring data in the form of records, so that relations between different entities and attributes can
be used for data access and transformation.
RELATIONAL DATABASE MANAGEMENT SYSTEM
A Relational Database Management System (RDBMS) is a system, which allows us to perceive data as tables (and
nothing but tables), and operators necessary to manipulate that data are at the user’s disposal.
Features of an RDBMS
The features of a relational database are as follows:
The ability to create multiple relations (tables) and enter data into them
An interactive query language
Retrieval of information stored in more than one table
Provides a Catalog or Dictionary, which itself consists of tables ( called system tables )
STUCOR
Key: An attribute or set of attributes whose values uniquely identify each entity in an entity set is called a key for that
entity set.
Super Key: If we add additional attributes to a key, the resulting combination would still uniquely identify an
instance of the entity set. Such augmented keys are called super keys.
Primary Key: It is a minimum super key.
It is a unique identifier for the table (a column or a column combination with the property that at any given time no two
APP
rows of the table contain the same value in that column or column combination).
Foreign Key: A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table.
In simpler words, the foreign key is defined in a second table, but it refers to the primary key in the first table.
Candidate Key: There may be two or more attributes or combinations of attributes that uniquely identify an
instance of an entity set. These attributes or combinations of attributes are called candidate keys.
Secondary Key: A secondary key is an attribute or combination of attributes that may not be a candidate key, but that
classifies the entity set on a particular characteristic. Any key consisting of a single attribute is called a simple key,
while that consisting of a combination of attributes is called a composite key.
Referential Integrity
Referential Integrity can be defined as an integrity constraint that specifies that the value (or existence) of an attribute in
one relation depend on the value (or existence) of an attribute in the same or another relation. Referential integrity in
a relational database is consistency between coupled tables. It is usually enforced by the combination of a primary
key and a foreign key. For referential integrity to hold, any field in a table that is declared a foreign key can contain
only values from a parent table's primary key field. For instance, deleting a record that contains a value referred to
by a foreign key in another table would break referential integrity.
Relational Model
Relational data model is the primary data model, which is used widely around the world for data storage and
processing. This model is simple and it has all the properties and capabilities required to process data with storage
efficiency.
Concepts
Tables − In relational data model, relations are saved in the format of Tables. This format stores the relation among entities.
A table has rows and columns, where rows represents records and columns represent the attributes.
Tuple − A single row of a table, which contains a single record for that relation is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents relation instance. Relation instances
do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and their names. Relation
key − Each row has one or more attributes, known as relation key, which can identify the row in the relation
(table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are
called Relational Integrity Constraints. There are three main integrity constraints −
Key constraints
Domain constraints
Referential integrity constraints
Key Constraints
There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely. This
minimal subset of attributes is called keyfor that relation. If there are more than one such minimal subsets, these are
called candidate keys.
Key constraints force that −
STUCOR
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input and yields instances
of relations as output. It uses operators to perform queries. An operator can be either unary or binary. They accept
relations as their input and yield relations as their output. Relational algebra is performed recursively on a relation
and intermediate results are also considered relations.
The fundamental operations of relational algebra are as follows −
APP
Select
Project
Union
Set different
Cartesian product
Rename
We will discuss all these operations in the following sections.
Select Operation (σ)
It selects tuples that satisfy the given predicate from a relation.
Notation − σp(r)
Where σ stands for selection predicate and r stands for relation. p is prepositional logic formula which may use
connectors like and, or, and not. These terms may use relational operators like − =, ≠, ≥, < , >, ≤.
For example −
σsubject = "database"(Books)
Output − Selects tuples from books where subject is 'database'.
σsubject = "database" and price = "450"(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is 450.
σsubject = "database" and price = "450" or year > "2010"(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is 450 or those books published after
2010.
Project Operation (∏)
It projects column(s) that satisfy a given predicate.
Notation − ∏A1, A2, An (r)
Where A1, A2 , An are attribute names of relation r.
Duplicate rows are automatically eliminated, as relation is a set.
For example −
∏subject, author (Books)
Selects and projects columns named as subject and author from the relation Books.
Union Operation (∪)
It performs binary union between two given relations and is defined as −
r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r U s
Where r and s are either database relations or relation result set (temporary relation).
For a union operation to be valid, the following conditions must hold −
r, and s must have the same number of attributes.
Attribute domains must be compatible.
s = { q t | q ∈ r and t ∈ s}
Where r and s are relations and their output will be defined as − r Χ
STUCOR
σauthor = 'tutorialspoint'(Books Χ Articles)
Output − Yields a relation, which shows all the books and articles written by tutorialspoint.
Rename Operation (ρ)
The results of relational algebra are also relations but without any name. The rename operation allows us to rename the
APP
output relation. 'rename' operation is denoted with small Greek letter rho ρ.
Notation − ρ x (E)
Where the result of expression E is saved with name of x.
Additional operations are −
Set intersection
Assignment
Natural join
SQL FUNDAMENTALS:
SQL is a standard computer language for accessing and manipulating databases.
What is SQL?
SQL stands for Structured Query Language
SQL allows you to access a database
SQL is an ANSI standard computer language
SQL can execute queries against a database
SQL can retrieve data from a database
SQL can insert new records in a database
SQL can delete records from a database
SQL can update records in a database
SQL is easy to learn
STUCOR
SQL (Structured Query Language) is a syntax for executing queries. But the SQL language also includes a syntax to
update, insert, and delete records.
These query and update commands together form the Data Manipulation Language (DML) part of SQL:
SELECT - extracts data from a database table
UPDATE - updates data in a database table
DELETE - deletes data from a database table
APP
INSERT INTO - inserts new data into a database table
STUCOR
Semicolon is the standard way to separate each SQL statement in database systems that allow more than one SQL
statement to be executed in the same call to the server.
Some SQL tutorials end each SQL statement with a semicolon. Is this necessary? We are using MS Access and SQL
Server 2000 and we do not have to put a semicolon after each SQL statement, but some database programs force you
to use it.
APP
The SELECT DISTINCT Statement
The DISTINCT keyword is used to return only distinct (different) values.
The SELECT statement returns information from table columns. But what if we only want to select distinct
elements?
With SQL, all we need to do is to add a DISTINCT keyword to the SELECT statement:
Syntax
SELECT DISTINCT column_name(s)
FROM table_name
"Orders" table
Company OrderNumber
Sega 3412
W3Schools 2312
Trio 4678
W3Schools 6798
Result
Company
Sega
W3Schools
Trio
W3Schools
Note that "W3Schools" is listed twice in the result-set.
To select only DIFFERENT values from the column named "Company" we use a SELECT DISTINCT statement
like this:
SELECT DISTINCT Company FROM Orders
Result:
Company
Sega
W3Schools
Trio
Now "W3Schools" is listed only once in the result-set.
STUCOR
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
APP
Note: In some versions of SQL the <> operator may be written as !=
Using Quotes
Note that we have used single quotes around the conditional values in the examples.
SQL uses single quotes around text values (most database systems will also accept double quotes). Numeric values
should not be enclosed in quotes.
For text values:
This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove
For numeric values:
This is correct:
SELECT * FROM Persons WHERE Year>1965
This is wrong:
SELECT * FROM Persons WHERE Year>'1965'
Using LIKE
The following SQL statement will return persons with first names that start with an 'O':
SELECT * FROM Persons
WHERE FirstName LIKE 'O%'
The following SQL statement will return persons with first names that end with an 'a':
SELECT * FROM Persons
WHERE FirstName LIKE '%a'
The following SQL statement will return persons with first names that contain the pattern 'la':
SELECT * FROM Persons
STUCOR
WHERE FirstName LIKE '%la%'
The INSERT INTO Statement
The INSERT INTO statement is used to insert new rows into a table.
Syntax
INSERT INTO table_name
VALUES (value1, value2,. . .)
APP
You can also specify the columns for which you want to insert data:
INSERT INTO table_name (column1, column2,...)
VALUES (value1, value2,. . .)
STUCOR
Result:
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Storgt 67
APP
We want to change the address and add the name of the city:
UPDATE Person
SET Address = 'Stien 12', City = 'Stavanger'
WHERE LastName = 'Rasmussen'
Result:
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger
The DELETE Statement
The DELETE statement is used to delete rows in a table.
Syntax
DELETE FROM table_name
WHERE column_name = some_value
Person:
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger
Delete
Drop
Delete a Row
"Nina Rasmussen" is going to be deleted:
DELETE FROM Person WHERE LastName = 'Rasmussen'
Result
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
STUCOR
ORDER BY Company ASC (asending)
Result:
Company OrderNumber
ABC Shop 5678
Sega 3412
APP
W3Schools 6798
W3Schools 2312
Example
To display the company names in alphabetical order AND the OrderNumber in numerical order:
SELECT Company, OrderNumber FROM Orders
ORDER BY Company, OrderNumber
Result:
Company OrderNumber
ABC Shop 5678
Sega 3412
W3Schools 2312
W3Schools 6798
Aggregate functions
Aggregate functions operate against a collection of values, but return a single value.
Note: If used among many other expressions in the item list of a SELECT statement, the SELECT must have a
GROUP BY clause!!
"Persons" table (used in most examples)
Name Age
Hansen, Ola 34
Svendson, Tove 45
Pettersen, Kari 19
Aggregate functions in MS Access
Function Description
AVG(column) Returns the average value of a column
COUNT(column) Returns the number of rows (without a NULL value) of a column
COUNT(*) Returns the number of selected rows
FIRST(column) Returns the value of the first record in a specified field
LAST(column) Returns the value of the last record in a specified field
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
SUM(column) Returns the total sum of a column
VAR(column)
STUCOR
LAST(column) Returns the value of the last record in a specified field (not supported in
SQLServer2K)
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
APP
SUM(column) Returns the total sum of a column
VAR(column)
VARP(column)
Scalar functions
Scalar functions operate against a single value, and return a single value based on the input value.
Useful Scalar Functions in MS Access
Function Description
UCASE(c) Converts a field to upper case
LCASE(c) Converts a field to lower case
MID(c,start[,end]) Extract characters from a text field
LEN(c) Returns the length of a text field
INSTR(c,char) Returns the numeric position of a named character within a text field
LEFT(c,number_of_char) Return the left part of a text field requested
RIGHT(c,number_of_char) Return the right part of a text field requested
ROUND(c,decimals) Rounds a numeric field to the number of decimals specified
MOD(x,y) Returns the remainder of a division operation
GROUP BY Example
This "Sales" Table:
Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100
And This SQL:
Company
STUCOR
Returns this result:
W3Schools
IBM
SUM(Amount)
12600
4500
APP
HAVING…
HAVING... was added to SQL because the WHERE keyword could not be used against aggregate functions (like
SUM), and without HAVING... it would be impossible to test for result conditions.
The syntax for the HAVING function is:
SELECT column,SUM(column) FROM table
GROUP BY column
HAVING SUM(column) condition value
This "Sales" Table:
Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100
This SQL:
SELECT Company,SUM(Amount) FROM Sales
GROUP BY Company
HAVING SUM(Amount)>10000
Returns this result
Company SUM(Amount)
W3Schools 12600
EMBEDDED SQL
Embedded SQL is a method of inserting inline SQL statements or queries into the code of a programming language,
which is known as a host language. Because the host language cannot parse SQL, the inserted SQL is parsed by an
embedded SQL preprocessor.
Embedded SQL is a robust and convenient method of combining the computing power of a programming language
with SQL's specialized data management and manipulation capabilities.
Structure of embedded SQL
Structure of embedded SQL defines step by step process of establishing a connection with DB and executing the
code in the DB within the high level language.
Connection to DB
This is the first step while writing a query in high level languages. First connection to the DB that we are accessing
needs to be established. This can be done using the keyword CONNECT. But it has to precede with ‘EXEC SQL’ to
STUCOR
it from normal C variables. Hence we have to declare host variables within BEGIN DECLARE and END
DECLARE section. Again, these declare block should be enclosed within EXEC SQL and ‘;’.
EXEC SQL BEGIN DECLARE SECTION;
int STD_ID;
char STD_NAME [15];
APP
char ADDRESS[20];
EXEC SQL END DECLARE SECTION;
We can note here that variables are written inside begin and end block of the SQL, but they are declared using C
code. It does not use SQL code to declare the variables. Why? This is because they are host variables – variables of
C language. Hence we cannot use SQL syntax to declare them. Host language supports almost all the datatypes from
int, char, long, float, double, pointer, array, string, structures etc.
When host variables are used in a SQL query, it should be preceded by colon – ‘:’ to indicate that it is a host
variable. Hence when pre-compiler compiles SQL code, it substitutes the value of host variable and compiles.
EXEC SQL SELECT * FROM STUDENT WHERE STUDENT_ID =:STD_ID;
The following code is a simple embedded SQL program, written in C. The program illustrates many, but
not all, of the embedded SQL techniques. The program prompts the user for an order number, retrieves the
customer number, salesperson, and status of the order, and displays the retrieved information on the screen.
int main() {
EXEC SQL INCLUDE SQLCA;
EXEC SQL BEGIN DECLARE SECTION;
int OrderID; /* Employee ID (from user) */
int CustID; /* Retrieved customer ID */
char SalesPerson[10] /* Retrieved salesperson name */
char Status[6] /* Retrieved order status */
EXEC SQL END DECLARE SECTION;
query_error:
printf ("SQL error: %ld\n", sqlca->sqlcode);
exit();
bad_number:
STUCOR
printf ("Invalid order number.\n");
exit();
}
DYNAMIC SQL
The main disadvantage of embedded SQL is that it supports only static SQLs. If we need to build up queries at
run time, then we can use dynamic sql. That means if query changes according to user input, then it always better to
APP
use dynamic SQL. Like we said above, the query when user enters student name alone and when user enters both
student name and address, is different. If we use embedded SQL, one cannot implement this requirement in the code.
In such case dynamic SQL helps the user to develop query depending on the values entered by him, without making
him know which query is being executed. It can also be used when we do not know which SQL statements like
Insert, Delete update or select needs to be used, when number of host variables is unknown, or when datatypes of
host variables are unknown or when there is direct reference to DB objects like tables, views, indexes are required.
However this will make user requirement simple and easy but it may make query lengthier and complex. That
means depending upon user inputs, the query may grow or shrink making the code flexible enough to handle all the
possibilities. In embedded SQL, compiler knows the query in advance and pre-compiler compiles the SQL code
much before C compiles the code for execution. Hence embedded SQLs will be faster in execution. But in the case
of dynamic SQL, queries are created, compiled and executed only at the run time. This makes the dynamic SQL
little complex, and time consuming.
Since query needs to be prepared at run time, in addition to the structures discussed in embedded SQL, we have
three more clauses in dynamic SQL. These are mainly used to build the query and execute them at run time.
PREPARE
Since dynamic SQL builds a query at run time, as a first step we need to capture all the inputs from the user. It will
be stored in a string variable. Depending on the inputs received from the user, string variable is appended with inputs
and SQL keywords. These SQL like string statements are then converted into SQL query. This is done by using
PREPARE statement.
For example, below is the small snippet from dynamic SQL. Here sql_stmt is a character variable, which holds
inputs from the users along with SQL commands. But is cannot be considered as SQL query as it is still a sting
value. It needs to be converted into a proper SQL query which is done at the last line using PREPARE statement.
Here sql_query is also a string variable, but it holds the string as a SQL query.
EXECUTE
This statement is used to compile and execute the SQL statements prepared in DB.
EXEC SQL EXECUTE sql_query;
EXECUTE IMMEDIATE
This statement is used to prepare SQL statement as well as execute the SQL statements in DB. It performs the task
of PREPARE and EXECUTE in a single line.
EXEC SQL EXECUTE IMMEDIATE :sql_stmt;
Dynamic SQL will not have any SELECT queries and host variables. But it can be any other SQL statements like
insert, delete, update, grant etc. But when we use insert/ delete/ updates in this type, we cannot use host variables.
All the input values will be hardcoded. Hence the SQL statements can be directly executed using EXECUTE
IMMEDIATE rather than using PREPARE and then EXECUTE.
STUCOR
APP