Dbms Unit 1 2
Dbms Unit 1 2
UNIT -1
UNIT -1
The Worlds of Database Systems
What is Data?
The raw facts are called as data. The word ―raw‖ indicates that they have not been processed.
What is information?
What is Knowledge?
DATA/INFORMATION PROCESSING:
The process of converting the data (raw facts) into meaningful information is called as
data/information processing.
When When
Data Information Knowledge
Processed Processed
Note: In business processing knowledge is more useful to make decisions for any organization.
DATA INFORMATION
1.Raw facts 1.Processed data
2. It is in unorganized form 2. It is in organized form
3. Data doesn’t help in 3. Informationhelpsin
decision decision
making process making process
The earliest business computer systems were used to process business records and
produce information. They were generally faster and more accurate than equivalent manual
systems. These systems stored groups of records in separate files, and so they were called file
processing systems.
1. File system is a collection of data. Any management with the file system, user has
to write the procedures
2. File system gives the details of the data representation and Storage of data.
4. Concurrent access to the data in the file system has many problems like a Reading
the file while other deleting some information, updating some information
Since files and application programs are created by different programmers over a long
period of time, the files are likely to be having different formats and the programs may be
written in several programming languages. Moreover, the same piece of information may be
duplicated in several places. This redundancy leads to higher storage and access cost. In
addition, it may lead to data inconsistency.
Data Isolation:
Since data is scattered in various files, and files may be in different formats, it is
difficult to write new application programs to retrieve the appropriate data.
In order to improve the overall performance of the system and obtain a faster response
time, many systems allow multiple users to update the data simultaneously. In such an
environment, interaction of concurrent updates may result in inconsistent data.
Security Problems:
Not every user of the database system should be able to access all the data. For
example, in banking system, payroll personnel need only that part of the database that has
information about various bank employees. They do not need access to information about
customer accounts. It is difficult to enforce such security constraints.
Integrity Problems:
The data values stored in the database must satisfy certain types of consistency
constraints. For example, the balance of a bank account may never fall below a prescribed
amount. These constraints are enforced in the system by adding appropriate code in the various
application programs. When new constraints are added, it is difficult to change the programs
to enforce them. The problem is compounded when constraints involve several data items for
different files.
Atomicity Problem:
A computer system like any other mechanical or electrical device is subject to failure.
In many applications, it is crucial to ensure that once a failure has occurred and has been
detected, the data are restored to the consistent state existed prior to the failure
Example:
Consider part of a savings-bank enterprise that keeps information about all customers
and savings accounts. One way to keep the information on a computer is to store it in
operating system files. To allow users to manipulate the information, the system has a
number of application programs that manipulate the files, including:
Programmers wrote these application programs to meet the needs of the bank. New
application programs are added to the system as the need arises. For example, suppose that
the savings bank decides to offer checking accounts.
As a result, the bank creates new permanent files that contain information about all the
checking accounts maintained in the bank, and it may have to write new application programs
to deal with situations that do not arise in savings accounts, such as overdrafts. Thus, as time
goes by, the system acquires more files and more application programs. The system stores
permanent records in various files, and it needs different
Application programs to extract records from, and add records to, the appropriate
files. Before database management systems (DBMS) came along, organizations usually
stored information in such systems. Organizational information in a file-processing system
has a number of major disadvantages:
The address and telephone number of a particular customer may appear in a file that
consists of savings-account records and in a file that consists of checking-account records. This
redundancy leads to higher storage and access cost. In, it may lead to data inconsistency; that is,
the various copies of the same data may no longer agree. For example, a changed customer
address may be reflected in savings-account records but not elsewhere in the system.
Suppose that one of the bank officers needs to find out the names of all customers
who live within a particular postal-code area. The officer asks the data-processing department
to generate such a list. Because there is no application program to generate that. The bank
officer has now two choices: either obtain the list of all customers and extract the needed
information manually or ask a system programmer to write the necessary application
program. Both alternatives are obviously unsatisfactory.
3. Data Isolation:
Because data are scattered in various files and files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.
4. Integrity Problems:
The balance of a bank account may never fall below a prescribed amount (say, $25).
Developers enforce these constraints in the system by adding appropriate code in the various
application programs. However, when new constraints are added, it is difficult to change the
programs to enforce them. The problem is compounded when constraints involve several data
items from different files.
5. Atomicity Problems:
A computer system, like any other mechanical or electrical device, is subject to
failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the
consistent state that existed prior to the failure. Consider a program to transfer $50 from
account A to account B. If a system failure occurs during the execution of the program, it is
possible that the $50 was removed from account A but was not credited to account B,
resulting in an inconsistent database state. Clearly, it is essential to database consistency that
either both the credit and debit occur, or that neither occur. That is, the funds transfer must be
atomic—it must happen in its entirety or not at all. It is difficult to ensure atomicity in a
conventional file-processing system.
6. Concurrent-Access Anomalies:
For the sake of overall performance of the system and faster response, many systems
allow multiple users to update the data simultaneously. In such an environment, interaction of
concurrent updates may result in inconsistent data. Consider bank account A, containing
$500. If two customers withdraw funds (say $50 and $100 respectively) from account A at
about the same time, the result of the concurrent executions may leave the account in an
incorrect (or inconsistent) state. Suppose that the programs executing on behalf of each
withdrawal read the old balance, reduce that value by the amount being withdrawn, and write
the result back. If the two programs run concurrently, they may both read the value $500, and
write back $450 and $400, respectively. Depending on which one writes the value last, the
account may contain $450 or $400, rather than the correct value of $350. To guard against
this possibility, the system must maintain some form of supervision. But supervision is
difficult to provide because data may be accessed by many different application programs
that have not been coordinated previously.
7. Security Problems:
Not every user of the database system should be able to access all the data. For
example, in a banking system, payroll personnel need to see only that part of the database
that has information about the various bank employees. They do not need access to
information about customer accounts. But, since application programs are added to the
system in an ad hoc manner, enforcing such security constraints is difficult. These
difficulties, among others, prompted the development of database systems.
INTRODUCTION TO DATABASES:
Data processing tasks such as payroll were automated, with data stored on tapes.
Data could also be input from punched card decks, and output to printers.
Late 1960s and 1970s: The use of hard disks in the late 1960s changed the scenario for
data processing greatly, since hard disks allowed direct access to data.
With disks, network and hierarchical databases could be created that allowed data
structures such as lists and trees to be stored on disk. Programmers could construct and
manipulate these data structures.
With disks, network and hierarchical databases could be created that allowed data
structures such as lists and trees to be stored on disk. Programmers could construct and
manipulate these data structures.
In the 1980’s:
Initial commercial relational database systems, such as IBM DB2, Oracle, Ingress,
and DEC Rdb, played a major role in advancing techniques for efficient processing of
declarative queries.
In the early 1980s, relational databases had become competitive with network and
hierarchical database systems even in the area of performance.
The 1980s also saw much research on parallel and distributed databases, as well
as initial work on object-oriented databases.
Early 1990s:
Decision support and querying re-emerged as a major application area for databases.
Late 1990s:
The major event was the explosive growth of the World Wide Web.
Databases were deployed much more extensively than ever before. Database systems
now had to support very high transaction processing rates, as well as very high
reliability and 24 * 7 availability (availability 24 hours a day, 7 days a week, meaning
no downtime for scheduled maintenance activities).
The file management system also called as FMS in short is one in which all data is
stored on a single large file. The main disadvantage in this system is searching a record or
data takes a long time. This lead to the introduction of the concept, of indexing in this system.
Then also the FMS system had lot of drawbacks to name a few like updating or modifications
to the data cannot be handled easily, sorting the records took long time and so on. All these
drawbacks led to the introduction of the Hierarchical Database System.
The previous system FMS drawback of accessing records and sorting records which
took a long time was removed in this by the introduction of parent-child relationship between
records in database. The origin of the data is called the root from which several branches have
data at different levels and the last level is called the
leaf. The main drawback in this was if there is any modification or addition made to the structure
then the whole structure needed alteration which made the task a tedious one. In order to avoid
this next system took its origin which is called as the Network Database System.
In this the main concept of many-many relationships got introduced. But this also
followed the same technology of pointers to define relationships with a difference in this made
in the introduction if grouping of data items as sets.
In order to overcome all the drawbacks of the previous systems, the Relational
Database System got introduced in which data get organized as tables and each record forms a
row with many fields or attributes in it. Relationships between tables are also formed in this
system.
DATABASE:
(OR)
The following are the various kinds of applications/organizations uses databases for their
business processing activities in their day-to-day life. They are:
1. Banking: For customer information, accounts, and loans, and banking transactions.
2. Airlines: For reservations and schedule information. Airlines were among the first to use
4. Credit Card Transactions: For purchases on credit cards and generation of monthly
statements.
6. Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
8. Manufacturing: For management of supply chain and for tracking production of items in
factories, inventories of items in warehouses/stores, and orders for items.
9. Human resources: For information about employees, salaries, payroll taxes and benefits,
and for generation of paychecks.
11. Web: For access the Back accounts and to get the balance amount.
12. E –Commerce: For Buying a book or music CD and browse for things like watches,
mobiles from the Internet.
CHARACTERISTICS OF DATABASE:
The database approach has some very characteristic features which are discussed in detail
below:
Structured and Described Data:
Fundamental feature of the database approach is that the database system does not only
contain the data but also the complete definition and description of these data. These
descriptions are basically details about the extent, the structure, the type and the format of all
data and, additionally, the relationship between the data. This kind of stored data is called
metadata ("data about data").
Application software does not need any knowledge about the physical data storage like
encoding, format, storage place, etc. It only communicates with the management system of a
database (DBMS) via a standardized interface with the help of a standardized language like
SQL. The access to the data and the metadata is entirely done by the DBMS. In this way all
the applications can be totally separated from the data.
Data Integrity:
Data integrity is a byword for the quality and the reliability of the data of a database
system. In a broader sense data integrity includes also the protection of the database from
unauthorized access (confidentiality) and unauthorized changes. Data reflect facts of the real
world.
Transactions:
A transaction is a bundle of actions which are done within a database to bring it from
one consistent state to a new consistent state. In between the data are inevitable inconsistent. A
transaction is atomic what
means that it cannot be divided up any further. Within a transaction all or none of the actions
need to be carried out. Doing only a part of the actions would lead to an inconsistent database
state.
Data Persistence:
Data persistence means that in a DBMS all data is maintained as long as it is not
deleted explicitly. The life span of data needs to be determined directly or indirectly be the
user and must not be dependent on system features. Additionally data once stored in a
database must not be lost. Changes of a database which are done by a transaction are
persistent. When a transaction is finished even a system crash cannot put the data in danger
TYPES OF DATABASES:
1. Number of Users
2. Database Location
3. Expected type
4. Extent of use
Multiuser database:
Multi user database supports multiple users at the same time.
Workgroup database and enterprise databases are examples for multiuser database.
Workgroup database:
If the multiuser database supports relatively small number of users (fewer than 50)
within an organization is called as Workgroup database.
Enterprise database:
If the database is used by the entire organization and supports multiple users (more
than 50) across many departments is called as Enterprise database.
2. Based on Location:
According to the location of database the databases can be classified into following types.
They are:
a).CentralizedDatabase
b).Distributed Database
Centralized Database:
It is a database that is located, stored, and maintained in a single location. This location
is most often a central computer or database system, for example a desktop or server CPU, or
a mainframe computer. In most cases, a centralized database would be used by an
organization (e.g. a business company) or an institution (e.g. a university.)
Distributed Database:
A distributed database is a database in which storage devices are not all attached to a
common CPU. It may be stored in multiple computers located in the same physical location,
or may be dispersed over a network of interconnected computers.
The DBMS is a general purpose software system that facilitates the process of
defining constructing and manipulating databases for various applications.
Goals of DBMS:
The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient
Properties of DBMS:
1. A Database represents some aspect of the real world. Changes to the real world reflected in
the database.
Need of DBMS:
1. Before the advent of DBMS, organizations typically stored information using a ―File
Processing Systems‖.
Example of such systems is File Handling in High Level Languages like C, Basic and
COBOL etc., these systems have Major disadvantages to perform the Data Manipulation. So
to overcome those drawbacks now we are using the DBMS.
3. In addition to that the database system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access. If data are to be shared among
several users, the system must avoid possible anomalous results.
Data Independence:
If data is always accessed through the DBMS, the DBMS can enforce integrity
constraints on the data. For example, before inserting salary information for an employee, the
DBMS can check that the department budget is not exceeded. Also, the DBMS can enforce
access controls that govern what data is visible to different classes of users.
A database system allows several users to access the database concurrently. Answering
different questions from different users with the same (base) data is a central aspect of an
information system. Such concurrent use of data increases the economy of a system.
An example for concurrent use is the travel database of a bigger travel agency. The
employees of different branches can access the database concurrently and book journeys for
their clients. Each travel agent sees on his interface if there are still seats available for a
specific journey or if it is already fully booked.
A DBMS also protects data from failures such as power failures and crashes etc. by
the recovery schemes such as backup mechanisms and log files etc.
Data Administration:
When several users share the data, centralizing the administration of data can offer
significant improvements. Experienced professionals, who understand the nature of the data
being managed, and how different groups of users use it, can be responsible for organizing the
data representation to minimize redundancy and fine-tuning the storage of the data to make
retrieval efficient.
DBMS supports many important functions that are common to many applications
accessing data stored in the DBMS. This, in conjunction with the high-level interface to the
data, facilitates quick development of applications. Such applications are also likely to be
more robust than applications developed from scratch because many important tasks are
handled by the DBMS instead of being implemented by the application.
DISADVANTAGES OF DBMS:
Danger of a Overkill:
For small and simple applications for single users a database system is often not
advisable.
Complexity:
A database system creates additional complexity and requirements. The supply and
operation of a database management system with several users and databases is quite costly
and demanding.
Qualified Personnel:
Costs:
Through the use of a database system new costs are generated for the system itself but
also for additional hardware and the more complex handling of the system.
Lower Efficiency:
A database system is a multi-use software which is often less efficient than specialized
software which is produced and optimized exactly for one problem.
People who work with a database can be categorized as database users or database
administrators.
Database Users:
There are four different types of database-system users, differentiated by the way they
expect to interact with the system.
Naive users:
Naive users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously.
For example, a bank teller who needs to transfer $50 from account A to account B
invokes a program called transfer. This program asks the teller for the amount of money to be
transferred, the account from which the money is to be transferred, and the account to which
the money is to be transferred.
Application programmers:
Sophisticated users:
Sophisticated users interact with the system without writing programs. Instead, they
form their requests in a database query language. They submit each such query to a query
processor, whose function is to break down DML statements into instructions that the storage
manager understands. Analysts who submit queries to explore data in the database fall in this
category.
Specialized users:
Specialized users are sophisticated users who write specialized database applications
that do not fit into the traditional data-processing framework.
Database Administrator:
One of the main reasons for using DBMSs is to have central control of both the data
and the programs that access those data. A person who has such central control over the
system is called a database administrator (DBA).
Schema definition:
The DBA creates the original database schema by executing a set of data definition
statements in the DDL, Storage structure and access-method definition.
The DBA carries out changes to the schema and physical organization to reflect the
changing needs of the organization, or to alter the physical organization to improve
performance.
Routine maintenance:
1. Periodically backing up the database, either onto tapes or onto remote servers, to prevent
loss of data in case of disasters such as flooding.
2. Ensuring that enough free disk space is available for normal operations, and upgrading disk
space as required.
3. Monitoring jobs running on the database and ensuring that performance is not degraded by
very expensive tasks submitted by some users.
Hiding certain details of how the data are stored and maintained. A major purpose of
database system is to provide users with an ―Abstract View‖ of the data. In DBMS there are 3
levels of data abstraction. The goal of the abstraction in the DBMS is to separate the users
request and the physical storage of data in the database.
Levels of Abstraction:
Physical Level:
The lowest Level of Abstraction describes ―How‖ the data are actually stored.
The physical level describes complex low level data structures in detail.
Logical Level:
This level of data Abstraction describes ―What‖ data are to be stored in the database and
what relationships exist among those data.
View Level:
It is the highest level of data Abstracts that describes only part of entire database.
Different users require different types of data elements from each database.
The system may provide many views for the some database.
Schema:
The overall design of the database is called the ―Schema‖ or ―Meta Data‖. A database
schema corresponds to the programming language type definition. The value of a variable in
programming language corresponds to an ―Instance‖ of a database Schema.
The goal of this architecture is to separate the user applications and the physical
database. In this architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure
of the database. The internal schema uses a physical data model and describes the
complete details of data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical
storage structures and concentrates on describing entities, data types, relationships, user
operations, and constraints. A high-level data model or an implementation data model can
be used at this level.
3. The external or view level includes a number of external schemas or user views. Each
external schema describes the part of the database that a particular user group is interested
in and hides the rest of the database from that user group. A high-level data model or an
implementation data model can be used at this level.
DATA INDEPENDENCE:
The ability to modify a scheme definition in one level without affecting a scheme
definition in a higher level is called data independence.
The ability to modify the physical schema without causing application programs to be
rewritten
The ability to modify the conceptual schema without causing application programs to be
rewritten
Usually done when logical structure of database is altered
Logical data independence is harder to achieve as the application programs are usually
heavily dependent on the logical structure of the data.
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a database system can be
broadly divided into the storage manager and the query processor components.
The storage manager is important because databases typically require a large amount
of storage space. Some Big organizations Database ranges from Giga bytes to Tera bytes. So
the main memory of computers cannot store this much information, the information is stored
on disks. Data are moved between disk storage and main memory as needed.
The query processor also very important because it helps the database system simplify and
facilitate access to data. So quick processing of updates and queries is important. It is the job of
the database system to translate updates and queries written in a nonprocedural language,
StorageManager:
A storage manager is a program module that provides the interface between the low
level data stored in the database and the application programs and queries submitted to the
system. The storage manager is responsible for the interaction with the file manager. The
storage manager translates the various DML statements into low-level file-system commands.
Thus, the storage manager is responsible for storing, retrieving, and updating data in the
database.
Authorization and integrity manager which tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
File manager: which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer manager which is responsible for fetching data from disk storage into main
memory. Storage manager implements several data structures as part of the physical system
implementation. Data files are used to store the database itself. Data dictionary is used to
stores metadata about the structure of the database, in particular the schema of the database.
DDL interpreter: It interprets DDL statements and records the definitions in the data
dictionary.
DML compiler: It translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands.
Query evaluation engine: It executes low-level instructions generated by the DML compiler.
Application Architectures:
Most users of a database system today are not present at the site of the database system,
but connect to it through a network. We can therefore differentiate between client machines, on
which remote database users’ work, and server machines, on which the database system runs.
Database applications are usually partitioned into two or three parts. They are:
Two-Tier Architecture:
The application is partitioned into a component that resides at the client machine,
which invokes database system functionality at the server machine through query language
statements. Application program interface standards like ODBC and JDBC are used for
interaction between the client and the server.
Three-Tier Architecture:
The client machine acts as merely a front end and does not contain any direct database
calls. Instead, the client end communicates with an application server, usually through forms
interface. The application server in turn communicates with a database system to access data.
The business logic of the application, which says what actions to carry out under what
conditions, is embedded in the application server, instead of being distributed across multiple
clients. Three-tier applications are more appropriate for large applications, and for
applications that run on the World Wide Web.
DATABASE DESIGN:
The database design process can be divided into six steps. The ER Model is most
relevant to the first three steps. Next three steps are beyond the ER Model.
1. Requirements Analysis:
The very first step in designing a database application is to understand what data is to
be stored in the database, what applications must be built on top of it, and what operations are
most frequent and subject to performance requirements. The database designers collect
information of the organization and analyzer, the information to identify the user’s
requirements. The database designers must find out what the users want from the database.
Once the information is gathered in the requirements analysis step a conceptual database
design is developed and is used to develop a high level description of the data to be stored in
the database, along with the constraints that are known to hold over this data. This step is
often carried out using the ER model, or a similar high-level data model.
In this step convert the conceptual database design into a database schema (Logical
Database Design) in the data model of the chosen DBMS. We will only consider relational
DBMSs, and therefore, the task in the
logical design step is to convert an ER schema into a relational database schema. The result is
a conceptual schema, sometimes called the logical schema, in the relational data model.
The first three steps are more relevant to the ER Model. Once the logical scheme is
defined designer consider the physical level implementation and finally provide certain
security measures. The remaining three steps of database design are briefly described below:
4. Schema Refinement:
The fourth step in database design is to analyze the collection of relations in our
relational database schema to identify potential problems, and to refine it. In contrast to the
requirements analysis and conceptual design steps, which are essentially subjective, schema
refinement can be guided by some elegant and powerful theory.
In this step we must consider typical expected workloads that our database must
support and further refine the database design to ensure that it meets desired performance
criteria. This step may simply involve building indexes on some tables and clustering some
tables, or it may involve a substantial redesign of parts of the database schema obtained from
the earlier design steps.
6. Security Design:
The last step of database design is to include security features. This is required to
avoid unauthorized access to database practice after all the six steps. We required Tuning step
in which all the steps are interleaved and repeated until the design is satisfactory.
DBMS FUNCTIONS:
DBMS performs several important functions that guarantee the integrity and
consistency of the data in the database.
Those functions transparent to end users and can be accessed only through the use of
DBMS. They include:
Data Dictionary Management
Data Storage Management
Data transformation and Presentation
Security Management
Multiple Access Control
Backup and Recovery Management
Data Integrity Management
Database Access Languages
Databases Communication Interfaces
DBMS stores definitions of database elements and their relationship (Metadata) in the
data dictionary.
The DBMS uses the data dictionary to look up the required data component structures
and relationships.
Any change made in database structure is automatically recorded in the data dictionary.
Modern DBMS provides storage not only for data but also for related data entities.
Data Storage Management is also important for database ―performance tuning‖.
Performance tuning related to activities that make database more efficiently.
Security Management:
DBMS creates a security system that enforces the user security and data privacy.
Security rules determines which users can access the database, which data items each
user can access etc.
DBA and authenticated user logged to DBMS through username and password or
through Biometric authentication such as Finger print and face reorganization etc.
To provide data integrity and data consistency, DBMS uses sophisticated algorithms to
ensure that multiple users can access the database concurrently without compromising
the integrity of database.
DBMS provides backup and recovery to ensure data safety and integrity.
Recovery management deals with the recovery of database after failure such as bad
sector in the disk or power failure. Such capability is critical to preserve database
integrity.
DBMS provides and enforces integrity rules, thus minimizing data redundancy
and maximizing data consistency.
A query language is a non-procedural language i.e. it lets the user specify what
must be done without specifying how it is to be done.
Current DBMS’s are accepting end-user requests via different network environments.
For example, DBMS might provide access to database via Internet through the use of
web browsers such as Mozilla Firefox or Microsoft Internet Explorer.
What is Schema?
A database schema is the skeleton structure that represents the logical view of the
entire database. (or)
It defines how the data is organized and how the relations among them are associated.
EG:
STUDENT
What is Instance?
The data stored in the database at any given time is an instance of the database
Student
In the above table 1201, 1202, Venkat etc are said to be instance of student table.
UNIT - 2
UNIT-3
Introduction to SQL:
What is SQL?
1. SQL is Structured Query Language, which is a computer language for storing, manipulating and
retrieving data stored in relational database.
2. SQL is the standard language for Relation Database System. All relational database management
systems like MySQL, MS Access, and Oracle, Sybase, Informix, postgres and SQL Server use
SQL as standard database language.
Why SQL?
5. Allows users to define the data in database and manipulate that data.
6. Allows embedding within other languages using SQL modules, libraries & pre-compilers.
History:
10. 1970 -- Dr. E. F. "Ted" of IBM is known as the father of relational databases. He described a
relational model for databases.
12. 1978 -- IBM worked to develop Codd's ideas and released a product named System/R.
13. 1986 -- IBM developed the first prototype of relational database and standardized by ANSI. The
first relational database was released by Relational Software and its later becoming Oracle.
SQL Process:
14. When you are executing an SQL command for any RDBMS, the system determines the best way
to carry out your request and SQL engine figures out how to interpret the task.
15. There are various components included in the process. These components are Query Dispatcher,
Optimization Engines, Classic Query Engine and SQL Query Engine, etc. Classic query engine
handles all non-SQL queries, but SQL query engine won't handle logical files.
SQL Process:
SQL Commands:
The standard SQL commands to interact with relational databases are CREATE, SELECT,
INSERT, UPDATE, DELETE and DROP. These commands can be classified into groups based on
their nature. They are:
DDL Commands
DML Commands
DCL Commands
DRL/DQL Commands
TCL Commands
Command Description
CREATE Creates a new table, a view of a table, or other object in
database
Command Description
INSERT Creates a record
UPDATE Modifies records
DELETE Deletes records
Command Description
GRANT Gives a privilege to user
REVOKE Takes back privileges granted from user
Command Description
SELECT Retrieves certain records from one or more
tables
Command Description
commit Save work done
Save point Identify a point in a transaction to which we
can later roll back.
Roll backs Restore database to original since the last
Commit
What is Query?
• A query is a question.
• A query is formulated for a relation/table to retrieve some useful information from the table.
• This SELECT command is used to retrieve the data from the database.
• For retrieving the data every query must have SELECT clause, which specifies what columns to
be selected.
• And FROM clause, which specifies the table’s names. The WHERE clause, specifies the
selection condition.
• SELECT: The SELECT list is list of column names of tables named in the FROM list.
Column names can be prefixed by a range variable.
• FROM: The FROM list in the FROM clause is a list of table names. A Table name can be
followed by a range variable. A range variable is particularly useful when the same table name
appears more than once in the from-list.
• WHERE: The qualification in the WHERE clause is a Boolean combination (i.e., an expression
using the logical connectives AND, OR, and NOT) of conditions of the form expression op
expression, where op is one of the comparison operators {<, <=, =, <>, >=,>}.
• DISTINCT: The DISTINCT keyword is used to display the unique tuple or eliminated the
duplicate tuple.
DDL Commands:
Create
Alter
Truncate
Drop
CREATE:
• Creating a basic table involves naming the table and defining its columns and each column's data
type.
Syntax:
CREATE TABLE table name (column1 datatype (size), column2 datatype (size),
column3 datatype (size) ... columnN datatype (size), PRIMARY KEY (one or more
columns)); Example:
SQL> create table customers (id number (10) not null, name varchar2 (20) not null, age number
(5) not null, address char (25), salary decimal (8, 2), primary key (id));
ALTER:
• SQL ALTER TABLE command is used to add, delete or modify columns in an existing table
Syntax:
• The basic syntax of ALTER TABLE to add a new column in an existing table is as follows:
ii) The basic syntax of ALTER TABLE to DROP COLUMN in an existing table is as follows:
• The basic syntax of ALTER TABLE to change the DATA TYPE of a column in a table is
as follows:
• The basic syntax of ALTER TABLE to add a NOT NULL constraint to a column in a table is
as follows:
Ex:
• The basic syntax of ALTER TABLE to ADD PRIMARY KEY constraint to a table is as
follows:
Ex:
TRUNCATE:
• SQL TRUNCATE TABLE command is used to delete complete data from an existing table.
Syntax:
The basic syntax of TRUNCATE TABLE is as follows:
EX:
DROP:
SQL DROP TABLE statement is used to remove a table definition and all data, indexes,
triggers, constraints, and permission specifications for that table.
Syntax:
Basic syntax of DROP TABLE statement is as follows:
DML Commands:
• Insert
• Update
• Delete
INSERT:
SQL INSERT INTO Statement is used to add new rows of data to a table in the
database. There are two basic syntaxes of INSERT INTO statement as follows:
Syntax1:
• Here, column1, column2...columnN are the names of the columns in the table into which
you want to insert data.
EX:
insert into customers (id,name,age,address,salary) values (1, 'ramesh', 32, 'ahmedabad', 2000);
insert into customers (id,name,age,address,salary) values (2, 'khilan', 25, 'delhi', 1500.00 );
2 rows inserted.
Syntax2:
INSERT INTO TABLE_NAME VALUES (value1, value2, value3...valueN);
Ex:
UPDATE:
• We can use WHERE clause with UPDATE query to update selected rows, otherwise all the rows
would be affected.
Syntax:
• The basic syntax of UPDATE query with WHERE clause is as follows:
EX:
• UPDATE CUSTOMERS SET ADDRESS = 'Pune' WHERE ID = 6;
• UPDATE CUSTOMERS SET ADDRESS = 'Pune', SALARY = 1000.00;
DELETE:
SQL DELETE Query is used to delete the existing records from a table.
You can use WHERE clause with DELETE query to delete selected rows, otherwise all
the records would be deleted.
Syntax:
If you want to DELETE all the records from CUSTOMERS table, you do not need to use
WHERE clause and DELETE query would be as follows:
DRL/DQL Command:
SELECT:
SELECT Statement is used to fetch the data from a database table which returns data in the
form of result table. These result tables are called result-sets.
Syntax1:
The Following Syntax is used to retrieve specific attributes from the table is as follows:
Here, column1, column2...are the fields of a table whose values you want to fetch.
The Following Syntax is used to retrieve all the attributes from the table is as follows:
Distinct:
SQL DISTINCT keyword is used in conjunction with SELECT statement to eliminate all the
There may be a situation when you have multiple duplicate records in a table. While
fetching such records, it makes more sense to fetch only unique records instead of fetching
duplicate records.
Syntax:
Queries involving more than one relation (or) Full Relation Operations :
The following set operations are used to write a query to combine more than one relation.
They are:
Union
Intersect
Except
UNION:
SQL UNION clause/operator is used to combine the results of two or more SELECT
statements without returning any duplicate rows.
To use UNION, each SELECT must have the same number of columns selected, the same
number of column expressions, the same data type, and have them in the same order, but they
do not have to be the same length.
Syntax:
The basic syntax of UNION is as follows:
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
UNION
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
The UNION ALL operator is used to combine the results of two SELECT statements
including duplicate rows.
The same rules that apply to UNION apply to the UNION ALL operator.
Syntax:
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
UNION ALL
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
INTERSECT:
• The SQL INTERSECT clause/operator is used to combine two SELECT statements, but returns
rows only from the first SELECT statement that are identical to a row in the second SELECT
statement.
• This means INTERSECT returns only common rows returned by the two SELECT statements.
• Just as with the UNION operator, the same rules apply when using the INTERSECT operator.
MySQL does not support INTERSECT operator
Syntax:
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
INTERSECT
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
EXCEPT:
• The SQL EXCEPT clause/operator is used to combine two SELECT statements and returns rows
from the first SELECT statement that are not returned by the second SELECT statement.
• This means EXCEPT returns only rows, which are not available in second SELECT statement.
• Just as with the UNION operator, the same rules apply when using the EXCEPT operator.
Syntax:
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
EXCEPT
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
SQL Operators
Operators are used to specify conditions in an SQL statement and to serve as conjunctions for
multiple conditions in a statement.
1. Arithmetic operators
2. Comparison operators
3. Logical operators
- Subtraction - Subtracts right hand operand from left hand a - b will give -10
operand
Multiplication - Multiplies values on either side of
* the a * b will give 200
operator
/ Division - Divides left hand operand by right hand operand b / a will give 2
% Modulus - Divides left hand operand by right hand operand b % a will give 0
and returns remainder
= Checks if the values of two operands are equal or not, if (a = b) is not true.
yes then condition becomes true.
!= Checks if the values of two operands are equal or not, if (a != b) is true.
values are not equal then condition becomes true.
<> Checks if the values of two operands are equal or not, if (a <> b) is true.
values are not equal then condition becomes true.
> Checks if the value of left operand is greater than the value (a > b) is not true.
of right operand, if yes then condition becomes true.
< Checks if the value of left operand is less than the value of (a < b) is true.
right operand, if yes then condition becomes true.
>= Checks if the value of left operand is greater than or equal (a >= b) is not true
to the value of right operand, if yes then condition becomes
true.
<= Checks if the value of left operand is less than or equal (a <= b) is
to the value of right operand, if yes then
condition true.
becomes true.
!< Checks if the value of left operand is not less than the (a !< b) is
value of right operand, if yes then condition becomes false.
true.
!> Checks if the value of left operand is not greater than a !> b) is
the value of right operand, if yes then condition true.
becomes true.
• The following are example illustrate the relational operators usage on tables:
Ex:
Operator Description
The AND operator allows the existence of multiple conditions in an
AND SQL
statement's WHERE clause
NOT The NOT operator reverses the meaning of the logical operator with which it is
used. Eg: NOT EXISTS, NOT BETWEEN, NOT IN, etc. This is a
negatation operator
• SQL AND and OR operators are used to combine multiple conditions to narrow data in an
SQL statement. These two operators are called conjunctive operators.
• These operators provide a means to make multiple comparisons with different operators
in the same SQL statement.
AND Operator:
• The AND operator allows the existence of multiple conditions in an SQL statement's
WHERE clause.
Syntax:
• The basic syntax of AND operator with WHERE clause is as follows:
Ex:
SELECT * FROM CUSTOMERS WHERE AGE >= 25 AND SALARY >= 6500;
OR Operator:
• The OR operator is used to combine multiple conditions in an SQL statement's WHERE clause.
Syntax:
• The basic syntax of OR operator with WHERE clause is as follows:
Ex:
SELECT * FROM CUSTOMERS WHERE AGE >= 25 OR SALARY >= 6500;
NOT Operator:
• The NOT operator reverses the meaning of the logical operator with which it is used. Eg: NOT
EXISTS, NOT BETWEEN, NOT IN, etc. This is a negate operator.
Syntax:
SELECT column1, column2, … column FROM table_name WHERENOT [condition];
EX:
SELECT * FROM CUSTOMERS WHERE AGE IS NOT NULL;
Operator
Description
BETWE
EN The BETWEEN operator is used to search for values that are within a set
of values, given the minimum value and the maximum value.
EXIST
S The EXISTS operator is used to search for the presence of a row in
a specified table that meets certain criteria.
IN
The IN operator is used to compare a value to a list of literal values
LIKE that have been specified.
LIKE Operator:
SQL LIKE clause is used to compare a value to similar values using wildcard operators. There
are two wildcards used in conjunction with the LIKE operator:
Syntax:
or
or
or
Ex:
Statement
Description
WHERE SALARY LIKE '2_%_%'
Finds any values that start with s
WHERE SALARY LIKE '%r' Finds any values that have sad in any
position
WHERE SALARY LIKE '_2%3'
Finds any values that have 00 in the
second and third positions
BETWEEN Operator
The BETWEEN operator is used to select values within a range.
Syntax:
SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;
IN Operator:
Syntax
SELECT column_name(s)
FROM table_name
• SQL Joins clause is used to combine records from two or more tables in a database.
• A JOIN is a means for combining fields from two tables by using values common to each.
• Consider the following two tables, CUSTOMERS and ORDERS tables are as follows:
CUSTOMERS TABLE
| 2 | Khilan | 25 | Delhi |
1500.00 | | 3 | kaushik | 23 |
Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai |
6500.00 | | 5 | Hardik | 27 |
Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
ORDERS TABLE
| 3 | kaushik | 23 |
3000 | | 3 | kaushik
|23|1500||2|
Khilan | 25 | 1560
NOTE:
• Join is performed in the WHERE clause. Several operators can be used to join tables, such
as =, <, >, <>, <=, >=, !=, BETWEEN, LIKE, and NOT; they can all be used to join tables.
However, the most common operator is the equal symbol.
INNER JOIN:
The most frequently used and important of the joins is the INNER JOIN. They are also
referred to as an EQUIJOIN.
The INNER JOIN creates a new result table by combining column values of two tables
(table1 and table2) based upon the join-predicate.
The query compares each row of table1 with each row of table2 to find all pairs of rows
which satisfy the join-predicate.
When the join-predicate is satisfied, column values for each matched pair of rows of A
and B are combined into a result row.
Syntax:
The basic syntax of INNER JOIN is as follows:
OUTER JOIN:
The Outer join can be classified into 3 types. They are:
Left Outer Join\
Right Outer Join
Full Outer Join
• The SQL LEFT JOIN returns all rows from the left table, even if there are no matches
in the right table.
• This means that a left join returns all the values from the left table, plus matched values
from the right table or NULL in case of no matching join predicate.
Syntax:
RIGHT JOIN:
• The SQL RIGHT JOIN returns all rows from the right table, even if there are no matches
in the left table.
• This means that a right join returns all the values from the right table, plus matched
values from the left table or NULL in case of no matching join predicate.
Syntax:
• The basic syntax of RIGHT JOIN is as follows:
FULL JOIN:
• The SQL FULL JOIN combines the results of both left and right outer joins.
• The joined table will contain all records from both tables, and fill in NULLs for missing
matches on either side.
Syntax:
• The basic syntax of FULL JOIN is as follows:
Ex: SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS FULL JOIN
ORDERS ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
SELF JOIN:
• The SQL SELF JOIN is used to join a table to it as if the table were two tables,
temporarily renaming at least one table in the SQL statement.
Syntax:
• The basic syntax of SELF JOIN is as follows:
CARTESIAN JOIN:
• The CARTESIAN JOIN or CROSS JOIN returns the cartesian product of the sets of
records from the two or more joined tables.
• Thus, it equates to an inner join where the join-condition always evaluates to True or where
the join-condition is absent from the statement.
Syntax:
• The basic syntax of CROSS JOIN is as follows:
VIEWS IN SQL:
• A view is nothing more than a SQL statement that is stored in the database with an
associated name.
• A view can be created from one or many tables which depends on the written SQL
query to create a view.
• Views, which are kind of virtual tables, allow users to do the following:
• Structure data in a way that users or classes of users find natural or intuitive.
• Restrict access to the data such that a user can see and (sometimes) modify exactly
what they need and no more.
• Summarize data from various tables which can be used to generate reports.
Advantages of views:
• Views provide data security
• Different users can view same data from different perspective in different ways
at the same time.
Creating Views:
• Database views are created using the CREATE VIEW statement. Views can be created
from a single table, multiple tables, or another view.
• To create a view, a user must have the appropriate system privilege according to the
specific implementation.
Updating a View:
NOTE:
So if a view satisfies all the above mentioned rules then you can update a
view. Following is an example to update the age of Ramesh:
• Rows of data can be deleted from a view. The same rules that apply to the UPDATE and
INSERT commands apply to the DELETE command.
Dropping Views:
• Obviously, where you have a view, you need a way to drop the view if it is no longer needed.
GROUP BY:
SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange
identical data into groups.
The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes
the ORDER BY clause.
Syntax:
The GROUP BY clause must follow the conditions in the WHERE clause and must
precede the ORDER BY clause if one is used.
SELECT column1,
column2 FROM
table_name WHERE
[ conditions ]
GROUP BY column1,
column2 ORDER BY
column1, column2; Ex:
ORDER BY:
SQL ORDER BY clause is used to sort the data in ascending or descending order, based
on one or more columns.
Syntax:
The basic syntax of ORDER BY clause is as follows:
SELECT
column-list
FROM
table_name
[WHERE
condition]
HAVING Clause:
The HAVING clause enables you to specify conditions that filter which group results
appear in the final results.
The WHERE clause places conditions on the selected columns, whereas the HAVING
clause places conditions on groups created by the GROUP BY clause.
Syntax:
SELECT column1,
column2 FROM
table1, table2 WHERE
[ conditions ]
GROUP BY column1,
column2 HAVING
[ conditions ] ORDER BY
column1, column2; Ex:
select id, name, age, address, salary from customers group by age having count(age) >= 2;
Aggregate Functions:
SQL aggregate functions return a single value, calculated from values in a column.
AVG () Function
AVG () Syntax
COUNT () Function
COUNT aggregate function is used to count the number of rows in a database table.
COUNT () Syntax:
MAX () Function
The SQL MAX aggregate function allows us to select the highest (maximum) value for a
certain column.
MAX () Syntax:
SQL MIN function is used to find out the record with minimum value among a record set.
MIN () Syntax:
SUM function is used to find out the sum of a field in various records.
SUM () Syntax:
PRIMARY Key:
A primary key is a field in a table which uniquely identifies each row/record in a database
table.
1) Unique values
A table can have only one primary key, which may consist of single or multiple fields.
If a table has a primary key defined on any field(s), then you cannot have two records
having the same value of that field(s).
FOREIGN Key:
The relationship between 2 tables matches the Primary Key in one of the tables with a
Foreign Key in the second table.
A nested query is a query that has another query embedded within it; the embedded query
is called a sub query.
When writing a query, we sometimes need to express a condition that refers to a table that
must itself be computed.
A subquery typically appears within the WHERE clause of a query. Subqueries can
sometimes appear in the FROM clause or the HAVING clause.
Subqueries can be used with the SELECT, INSERT, UPDATE, and DELETE statements
along with the operators like =, <, >, >=, <=, IN, BETWEEN etc.
2. A subquery can have only one column in the SELECT clause, unless multiple columns
are in the main query for the subquery to compare its selected columns.
Subqueries are most frequently used with the SELECT statement. The basic syntax is as
follows:
SELECT column_name
[, column_name ] FROM table1
[, table2 ]
Ex: select *from customers where id in (select id from customers where salary >4500);
The selected data in the subquery can be modified with any of the character, date or
number functions.
Syntax
Ex:
insert into customers_bkp select * from customers where id in (select id from customers) ;
Either single or multiple columns in a table can be updated when using a subquery
with the UPDATE statement.
Syntax:
EX:
Transactions:
A transaction is a unit of program execution that accesses and possibly updates various data
items.
(or)
A transaction is an execution of a user program and is seen by the DBMS as a series or list of
actions i.e., the actions that can be executed by a transaction includes the reading and writing of
database.
Transaction Operations:
Access to the database is accomplished in a transaction by the following two operations,
read(X) : Performs the reading operation of data item X from the database. write(X) :
Performs the writing operation of data item X to the database.
Example:
Let T1 be a transaction that transfers $50 from account A to account B. This transaction
can be illustrated as follows,
T1 : read(A);
A:=A–50;
write(A);
read(B);
B:=B+50;
write(B);
Transaction Concept:
The concept of transaction is the foundation for concurrent execution of transaction in a DBMS
and recovery from system failure in a DBMS.
A user writes data access/updates programs in terms of the high-level query language supported
by the DBMS.
To understand how the DBMS handles such requests, with respect to concurrency control and
recovery, it is convenient to regard an execution of a user program or transaction, as a series of
reads and writes of database objects.
To read a database object, it is first brought in to main memory from disk and then its value is
copied into a program. This is done by read operation.
To write a database object, in-memory, copy of the object is first modified and then written to
disk. This is done by the write operation.