CHAP no.
1👍
2👍
3👍
9👍
6👍
7👍
4👍
8
Chap 1: Database System Architecture
CHAP 1
Definitions
DATA:
A fact that can be stored or recorded. Raw data
Information:
When data is processed, organized, structured or we can say meaningful data is called information.
Database:
It is the collection of logically related data.
System:
A program or tool to manage or manipulate data.
DBMS (Database management system):
It is the collection of inter-related or logically related data and set of programs to manage and
manipulate those data.
DBMS = Database + Set of programs
Ex. Oracle, MySQL, MongoDB
Metadata:
It is the data about data.
Data such as table name, column name, data type, authorized user, user access privileges for any
table is called the metadata for that table.
Data Dictionary:
It is an Information repository which contains metadata.
It is usually a part of catalog.
Data Warehouse:
It is an Information repository which stores data.
It is design to facilitate reporting and analysis.
Field:
Field is a character or group of character which has specific meaning.
Also called data item.
Ex. Cust_id, name, society, and city are the fields of customer data.
Record:
It is the collection of logically related fields.
Ex. (cust_id, cust_name, Address) it is the record for customer.
Advantages of DBMS over File system
File System:
It is an abstraction to store, retrieve, manage, update a set of files.
File system keeps the track on files and also manage them.
1. Minimal data Redundancy (Duplication)
DBMS aims to minimize the data by normalizing data.
2. Shared Data
All authorized user and application programmer can share data easily.
3. Data Consistency
Before and after operation data remains same
As there is no redundancy data remains consistent.
4. Data Access
It is easy to fetch any data from database by providing appropriate query.
5. Data Integrity
DBMS provides some integrity constraints to follow, so that improves the data integrity.
6. Data Security
DBMS prevents the unauthorized user by providing the ways to give the access of data, so
the data remain secure.
7. Concurrent Access
DBMS allows the concurrent access means that multiple users can have the access to a data.
Concurrent access can be allowed under some supervision.
8. Atomicity
The operation of the database is atomic, 0% or 100%.
Application of database
E-commerce (Amazon, Flip-cart)
OTTs (Netflix, Prime, Hotstar)
Social Media (Insta, Facebook, Snap, Twit)
Banking and Insurance
Universities & colleges / Schools
Library Management System
Airlines and railways
Human Resource Development
Gov. Orgs
Hospitals and medical System
Sales
Roles Of DBA
DBA:
Database Administrator
It is the person in the organization who controls the design and the use of database.
Roles
1. Schema (Logical Structure of the database) Definition:
DBA defines the logical schema of the database.
2. Storage Structure and Access Method Definition:
DBA decides that how data should be represented in the database.
Based on this storage structure of database and access method for data is defined.
3. Defining Security and Integrity Constraints
DBA decides various Security and Integrity Constraints.
4. Granting for authorization for data Access
DBA decides to give access to the particular data to which user
This prevents the unauthorized user
5. Liaison with user
DBA is responsible to provide necessary data to user.
6. Assisting Application Programmers
It assists the appl. Programmer to develop application program.
7. Monitoring Performance
DBA monitors the performance of the system
Checks physical as well as Logical schema
8. Backup And Recovery
The task od DBA is to Backing up the data on the Storage devices such as DVD, CD or remote
servers.
Data Abstraction
Database systems are made up of very complex data structures.
To ease the user interaction with database, the developers hide some irrelevant details from the
user.
This process of hiding the irrelevant data is called data Abstraction.
Mapping and Data Independence
Process of transforming requests and results between the three levels is called mapping.
TYPES OF DATA INDEPENDENCE:
Physical DATA Independence:
It is the ability to modify the physical schema without requiring any change in logical(Conceptual)
Schema.
Modifications at internal level is occasionally necessary to improve performance.
Logical DATA Independence
It is the ability to modify the conceptual schema without requiring change in application programs.
It is necessary when logical structure of database is changed.
Types of Database Users
1. Naive Users
Zero Knowledge of Database
Interacts via sophisticated software tools
EX. Clerk in bank
2. Application Programmers
Programmers who write the code in java, php, .NET etc.
Ex Software developers.
3. Sophisticated User
Interacts with database system.
Use query tools like SQL.
Ex Analyst.
4. Specialized User
Writes specialized database application program.
Use administration tools.
Ex DBA
Chap 9: SQL Concepts
SQL is a domain specific language- use in particular area
It is a declarative language
Types Of Languages
DDL (DATA DEFINITION LANGUAGE) – It deals with the schema
It is Normally used by DBA and database designers.
Create
Alter
It deals with the column and its data types.
Use of alter commands
Add column – Alter table table_name ADD column1 datatype(size) ;
Remove column - Alter table table_name Drop Column ColumnName;
Modify data type - Alter table table_name ALTER Column datatype(size) ;
Modify data type length - as above
Add constraints - Alter table table_name ADD Primary key (Column_Name);
Remove constraints -
Rename column / table - Alter table table_name Rename to New_name;
Drop
Truncate – To remove all record from the table
Rename
DML (DATA MANIPULATION LANGUAGE)
Select
Insert
Update – It deals with the data in the table
Delete
DCL (DATA CONTROL LANGUAGE)
Grant
Revoke
TCL (TRANSACTION CONTROL LANGUAGE)
Commit
Roll-Back
Save-point
Constraints
1 Constraints are the rules that restricts the value that is present in the database.
Classification
Foreign Key - a function should take reference from other table or own itself when data is required.
SYNTAX –FOREIGN KEY ColumnName REFERENCES TableName (ColumnName)
Primary Key – It is the unique + not NULL
SYNTAX – ColumnName Datatype(size) PRIMARY KEY
Check – EX. to check the age is 18+ or not
SYNTAX – ColumnName Datatype(size) Check(condition)
Unique – Every row of that particular column or database must be unique on which this constraint is
applied
SYNTAX – ColumnName Datatype(size) UNIQUE
Default – we can set the default value of particular column is user not declared that item.
Not Null - Every row of that particular column or database must Not be NULL on which this
constraint is applied
SYNTAX – ColumnName Datatype(size) NOT NULL
Queries
Q1. Write a query to display emp name who is taking max salary.
= Select emp_name, salary from emp where salary = (select max(salary) from emp);
Q2. Write a query to display emp name who is taking second highest salary.
= SELECT cname from deposit where amount =(SELECT max(amount) as '2nd highest' from deposit
where amount!=(SELECT max(amount) from deposit))
Q3. Grp BY Clause
SELECT NAGPUR,count(NAGPUR) from branch GROUP BY NAGPUR
SELECT NAGPUR,count(NAGPUR) from branch GROUP BY NAGPUR HAVING count(NAGPUR)<3
select empno,deptno,role from emp WHERE deptno IN(SELECT deptno from emp GROUP BY deptno
HAVING COUNT(*)<3)
Details of emp whose address is either Delhi, chd or Pune
Select * from emp where address IN(‘Delhi’ , ‘chd ‘, ‘pune’)
EXISTS : It always returns true or false value
It takes the outer query and checks that the value exists or not in the inner query.
Aggregate Function
SUM
MAX
MIN
AVG
COUNT
Transaction control Commands
TCL is used to control transactional processing in database.
1. COMMIT
Commit command is used to permanently save any transaction into the database.
When we use DML commands like Insert, Update then the changes are not permanent, they can be
rolled back.
To avoid that, Commit is used to mark the permanent changes into the database.
2. ROLLBACK
The Rollback command is the TCL command used to Undo the changes that not been saved into the
database.
U can use the rollback command to Erase all the data modifications made from the start of thr
transaction or to the save-point.
3. SAVEPOINT
It is the Point in the transaction when you can roll back the transaction back to a certain point
without rolling back the entire transaction.
DCL (DATA CONTROL LANGUAGE)
DCL command are used to implement database security in a multiple user database environment.
They are used to give or take the permissions from any user on any object
1. GRANT
Grant command is used to provide access privileges or rights on the database objects to the user.
SYNTAX : Grant Privilege_Name ON Object_name TO {User_Name|PUBLIC} [WITH GRANT OPTION].
2. REVOKE
REVOKE is the DCL command used to take back the access privileges from the user.
SYNTAX: REVOKE Privilege_Name ON Object_name From{USER_name | PUBLIC}
VIEW
Virtual table
Data associated with the view that is not stored physically in the database.
It is stored query
SYNTAX: CREATE VIEW View_name AS Select Column1, column2, … From table Table_name
Simple View
When we create a view on a single table.
In simple view we can Insert, Update, Delete and that changes made into the base table
Insert operation can be performed only if primary key is present and all are not null fields.
Complex View
When we create a table on more the one table.
We can only update the data, no insertion operation can be performed.
It Only Contains: Join, Group By, order By clause…
Materialized View
Stores data physically and updates periodically
It is used to increase the application performance.
They can be updated on a regular basis through triggers or by using the ON COMMIT REFRESH
option.
Chap 2: DATA MODELS
Definitions:
Entity:
Entity is the real thing or object or person in the real world that is distinguishes from all other
objects.
Ex book, college, student etc.
Entity sets:
It is a set of entities of same type of properties.
Ex set of students in the college.
Relationship:
It is the connection between several entities.
The relationship between 2 entities is called binary relationship.
Relationship Set:
It is the set of relationship of same type.
Attributes:
Attributes are the properties hold by each member of the entity set.
Types
1. Simple Attribute: It cannot be divided into sub parts. Ex enrollment no.
2. Composite Attribute: it can be divided into sub-parts. Ex names (First, middle, last)
3. Single valued attribute: It has single data value. Ex enrollment, Bdate
4. Multi-valued attribute: It has multiple value. Ex Mo.no (May have multiple numbers).
5. Stored attribute: Its value is stored manually in database. Ex Birthdate
6. Derived attribute: Its value is derived from other attributes. Ex age
7. Descriptive attributes: Descriptive attributes are the attributes are the attributes that describes
the entity.
Recursive relationship set
The same entity set participates in the relationship set more than once then it is called recursive
relationship set.
Degree of Relationship
The number of entity types participates in the relation.
Three types of relationship: unary, binary and ternary.
1. Unary: When both the participant entities in the relationship are same.
2. Binary: It is the relationship between two different entities.
3. Ternary: It is the relationship between three different entities.
Mapping Cardinality (Cardinality Constraints)
It represents the number of entities of another entity set which are connected to an entity using a
relationship set.
It is most useful in describing binary relationship sets.
For binary relationship set:
1. One to one
2. One to Many
3. Many to One
4. Many to Many
One to one:
An entity from both should associated with only one entity.
One to Many:
An entity in A is associated with any number of entities in B but An entity in B is associated with only one
entity in A
Many to One:
An Entity in A is associated with only one entity in B but an entity in B can associate with any number
entities in A.
Many to Many:
An entity from either A and B can associate with any number of entities from another.
ER MODEL
ER model is the graphical representation of entities and their relationship between each other with their
attributes.
Database Model:
It is the type of data model that defines the logical structure of the DBMS.
It determines that how data can be stored, accessed, retrieve and update in DBMS.
1. Hierarchical Model- It organizes the data into the Tree Like Structure.
Start from the root data and expands like a tree, adding child nodes to the parent nodes.
2. Network Model- It is an extension of hierarchical model that allows the many to many
relation.ie. Multiple parents.
3. ER Model – As Above.
4. Relational Model
In this model the data is organized in a 2D Tables and the relationship is maintained by storing a
common attribute.
5. Object Oriented Database Model:
It considers each object in the world as objects and isolates it from each other.
Specialization and Generalization
Chap 3: Relational Query Languages
Relational Query language
Keys
1. Super Key
It is the set of one or more attribute that allows us to identify each tuple uniquely in a relation
2. Candidate Key
It is super Key for which do not have any proper subset.
Every candidate key is a super key but not every super key is a candidate key.
3. Primary Key
A primary Key is a Candidate key that is chosen by the database designer to identify tuples uniquely
from the relation.
4. Alternate key
An Alternate Key is a Candidate key that is not chosen by the database designer to identify tuples
uniquely from the relation.
5. Foreign Key
It is the set of one or more attribute whose values are derived from the primary key attribute of
another relation.
Relational Algebra
It is the language for expressing relational database queries.
It is a procedural query language
Refer Notes for relational algebra
Open source and commercial DBMS
For relational algebra and functional dependencies refer assignment 2 and 3
Chap 6: Storage Strategies
What is Index?
Index are a special lookup table that is used to speed up the retrieval of data from the database.
The database search engine uses Index to speed up the data retrieval operation on a database table.
An Index is very similar to the index in the back of the book.
The users can’t see the index, they are just use to speed up search and query operations.
Updating the table with indexes is more time consuming.
It is the way to Optimize the performance.
It is a database technique which is used to quickly locate and access the data in a database.
SYNTAX:
CREATE INDEX index_name ON table_name(col1,col2….);
STRUCTURE OF INDEX:
Search-Key Pointer
Indexes are made up of 2 cols.
First column is the Search key which contains the copy of primary key or candidate key of that table.
These values are stored in sorted order.
Second col. Is the pointer which contains the address of the disk block where the key value can be found.
Types of attributes:
1. Access types: This refers to the type of access ex. Value based search, Range access etc.
2. Access Time: It refers to the time needed to find a particular data element.
3. Insertion Time: It refers to the time needed to find appropriate space and insert new data.
4. Deletion Time: It refers to the time taken to find the item and delete as well as update the Index
structure.
5. Space Overhead: It refers to the additional space required by the index.
Type of Indexing:
1. Primary Indexing:
Dense Indexing
Sparse Indexing
2. Secondary indexing
3. Clustering indexing
Primary indexing:
If the index is created on the primary key of the table, then it is called Primary indexing.
As Primary keys are stored in sorted order , so the searching operation is quite efficient.
The Primary Index can be classified into 2 types ‘
1. Dense Index
2. Sparse index
1. Dense index
In the dense index there is an index record for every search key in the database.
Faster but takes more space.
No. of records in index = no. of records in main table.
2. Sparse Index
In sparse index the index record is not created for every search key.
The index record appears only for few data items.
It requires less space, less maintenance overhead but slower than dense index.
To search a record in sparse index we search for a value that is <= the index value for which we are
looking.
Linear search is performed to retrieve the data.
Secondary Index
In this method a huge range for the columns is selected initially so mapping size of first level becomes
small.
Then each range is further divided into smaller ranges.
The mapping of first level is stored in primary memory, so fetching faster.
The mapping of second level and actual data is stored in secondary memory.
Clustering Index
Sometimes the non-primary key may not be unique for each record.
In this case we will grp 2 or more cols to get the unique value. It is called Clustering Index.
The similar characteristics are grouped.
B-TREES
It is a data structure that stores the data in sorted order.
Stores data in such a way that key are in ascending order.
Each of these keys has 2 reference nodes to another 2 child.
The left child is smaller than current key and right child is greater than current key.
It uses Sequential search to search for any element.
HASHING
For huge database it is next to impossible To search all the index values
Hashing is a technique To directly search the element without using indexing structure.
Hashing uses hash function as a parameter to generate the address of a data record.
DATA BUCKET: It is a memory loc. Where the data records are stored.
HASH Function: It is a mapping function . It generally uses primary key to generate the Hash Index.
Types Of Hashing:
1. Static Hashing
2. Dynamic Hashing
Static hashing:
In static hashing the resultant data bucket will always remain constant.
Therefore the no. of data bucket in memory will always remain constant.
The drawback is that it does not expand or shrink as the size of data base increase or decrease.
Dynamic Hashing:
In Dynamic hashing the data bucket grows and shrink according to the records increases or decreases.
It is also known as extended hashing.
The hash function is made to produce large number of values.
It increases the bit size as the records increase.
Ex 0 and 1 00,01,10,11
Chap 8: Database Security
Data security:
It is the protection of data from unauthorized user.
Only the authorized user can allow to access the data.
Most of the users have the access to the part of database i.e. the data which is related to them or
their department.
Mostly DBA or head of dept have the access to all data.
They generate the users account and hives the rights to access it.
The database access is controlled by DBA.
Difference between security and integrity.
DATA ENCRYPTION
Encryption is a technique to encode the data so that only authorized user can understand it.
It converts the readable table into unreadable table by using some techniques so that no other person
can read it.
Tye sender encrypts the data i.e. he converts the plain text to cipher text
The after encryption to cipher text it send to user.
Cipher text is encrypted or unreadable data.
The key is send separately to the user.
Then user uses the key to decrypt the data.
This technique is used to protect data from data theft.
Types to data Encryption
1.Symmetric key encryption / private key encryption
2. Asymmetric key encryption / public key encryption
1. Symmetric key:
The encryption and decryption key is same.
Faster
2. Asymmetric key
The encryption and decryption key are different.
Slower due to high computational.
Access control methods:
1. Discretionary access control
2. Mandatory access control
1. Discretionary access control
In this method the user is given rights (privileges) that leads to access over the data items or table or
view.
Single user have access over multiple data or multiple user can have access over the single data.
A. GRANT
GRANT privilege ON object to user[GRANT OPTION]
B. REVOKE
REVOKE privilege ON object FROM user
2. Mandatory Access control
In this method no individual user can have access rights.
All users are classified into different categories. Each user is assigned to a clearance level.
The user can have access to that data only if they have recommended clearance level.
TS (Top secret) > S(secret) > C(Confidential) > U(Unclassified)
RBAC (Role based access control)
It restricts the access based on the person’s role in the organization
Employees are only allowed to access the information necessary to effectively perform
their job duties.
It can based on several factors such as authority, responsibility and job competence.
Authorization VS Authentication