UNIT-V:
Concurrency Control: Lock Based Protocols, Timestamp – Based
Protocols Validation Based Protocols, Deadlock Handling.
Recovery System: Failure Classification, Storage Structure, Recovery and
Atomicity, Log Based Recovery
NoSQL: Introduction to NOSQL, NoSQL Vs RDBMS, Categories of
NoSQL Databases, Case studies: HBase, Firebase, MongoDB, Cloud
DB.
Concurrency Control
Concurrency Control is the management procedure that is required for controlling concurrent
execution of the operations that take place on a database.
Concurrent Execution in DBMS: In a multi-user system, multiple users can access and use the
same database at one time, which is known as the concurrent execution of the database. It means
that the same database is executed simultaneously on a multi-user system by different users.
Problems with concurrent execution:
o Lost Update Problem (W-W Conflict)
o Dirty Read Problem (W-R Conflict)
o Unrepeatable Read Problem (R-W Conflict)
https://www.tpointtech.com/dbms-concurrency-control
Concurrency Control is the working concept that is required for controlling and managing the
concurrent execution of database operations and thus avoiding the inconsistencies in the
database. Thus, for maintaining the concurrency of the database, we have the concurrency
control protocols.
Concurrency Control Protocols
The concurrency control protocols ensure the atomicity, consistency, isolation, durability and
serializability of the concurrent execution of the database transactions. Therefore, these protocols
are categorized as:
Lock Based Concurrency Control Protocol
Time Stamp Concurrency Control Protocol
Validation Based Concurrency Control Protocol
Lock Based Protocol
Validation Based Protocol
The validation-based protocol is also known as the optimistic concurrency control technique. In
this technique, it is assumed that all data items can be updated successfully at the end of the
transaction. All updates are first applied to local copies of the data items kept for the
transaction. At the end, the validation phase checks whether any of the transaction's updates
violate serializability. If serializability is not violated, the commit operation is performed and
the database is updated from the local copies; otherwise, the transaction is rolled back and restarted.
Optimistic scheduling methods are called optimistic because they assume there will be little
interference and hence no need to check during transaction execution.
1. Read phase: In this phase, the transaction T is read and executed. It reads the values of
   the various data items and stores them in temporary local variables. All write operations
   are performed on the temporary variables, without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the
   actual data to check whether serializability is violated.
3. Write phase: If the transaction passes validation, the temporary results are written to
   the database; otherwise the transaction is rolled back.
The validation phase examines the reads and writes of transactions that may overlap. So each
transaction is assigned the following timestamps:
Start(Ti): It contains the time when Ti started its execution.
Validation (Ti): It contains the time when Ti finishes its read phase and starts its validation
phase.
Finish(Ti): It contains the time when Ti finishes its write phase.
o This protocol determines the timestamp of a transaction for serialization using the
  timestamp of the validation phase, as it is the phase that actually decides whether the
  transaction will commit or roll back.
o Hence TS(T) = Validation(T).
o Serializability is determined during the validation process; it cannot be decided in
  advance.
o While executing the transactions, this protocol ensures a greater degree of concurrency and
  fewer conflicts.
o Thus it results in transactions with fewer rollbacks.
A transaction T can complete its validation phase successfully if, for each older transaction,
at least one of the following conditions is satisfied:
o All older transactions, i.e. the transactions with smaller timestamps, must have completed
  before the requesting transaction started.
o If transaction T starts before an earlier one finishes, then T should not read the data
  items written by the earlier transaction. This rule guarantees that the writes of earlier
  transactions are not read by transaction T.
o If transaction T starts before an earlier one finishes, then the earlier transaction should
  complete its write phase before T enters its validation phase. This rule guarantees that
  writes are done serially, ensuring there is no conflict.
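The read–validate–write flow above can be sketched in Python. This is a minimal illustration, not the API of any real DBMS: the `Txn` class, its field names, and the `validate` helper are assumptions chosen to mirror the timestamps Start(Ti), Validation(Ti), Finish(Ti) and the first two validation rules from the text.

```python
# Minimal sketch of validation-based (optimistic) concurrency control.
# All names here are illustrative.

class Txn:
    def __init__(self, start, validation, finish, read_set, write_set):
        self.start = start            # Start(Ti)
        self.validation = validation  # Validation(Ti); TS(Ti) = Validation(Ti)
        self.finish = finish          # Finish(Ti); None if write phase not done
        self.read_set = set(read_set)
        self.write_set = set(write_set)

def validate(t, committed):
    """Return True if t passes validation against every older committed
    transaction, using the first two rules from the text."""
    for tj in committed:
        if tj.validation >= t.validation:
            continue  # only older transactions are checked
        # Rule 1: tj finished before t started -- no overlap at all.
        if tj.finish is not None and tj.finish < t.start:
            continue
        # Rule 2: tj finished its write phase before t's validation phase,
        # and t read nothing that tj wrote.
        if (tj.finish is not None and tj.finish < t.validation
                and not (tj.write_set & t.read_set)):
            continue
        return False  # an overlap that may violate serializability
    return True

t1 = Txn(start=1, validation=4, finish=6, read_set={"A"}, write_set={"A"})
t2 = Txn(start=5, validation=8, finish=None, read_set={"B"}, write_set={"B"})
print(validate(t2, [t1]))  # disjoint sets, t1 wrote before t2 validates -> True
```

If `t2` had read item "A" that `t1` wrote, rule 2 would fail and `t2` would be rolled back and restarted, matching the write-phase description above.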
The main idea for using validation based protocol is:
o Minimum overhead: All the data items are updated at the end of the transaction, so minimal
  overhead is incurred during its execution.
o No cascading rollback: Since a rollback involves only the local copies of the data and not
  the database itself, there are no cascading rollbacks.
o No locking required: This technique allows greater concurrency than traditional protocols
  since no locking is required.
o Efficient: This technique is very efficient when conflicts are rare; occasional conflicts
  result only in transaction rollbacks.
The main disadvantage of using validation-based protocols is:
o If the rolled-back transaction is very long, then valuable processing time is lost.
Thomas Write Rule
Thomas Write Rule provides the guarantee of serializability order for the protocol. It improves
the Basic Timestamp Ordering Algorithm.
The basic Thomas write rules are as follows:
o If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the operation is
  rejected.
o If TS(T) < W_TS(X), then the write_item(X) operation of T is not executed (the obsolete
  write is ignored) and processing continues.
o If neither condition 1 nor condition 2 holds, then the WRITE operation is executed by
  transaction T and W_TS(X) is set to TS(T).
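The three rules above can be sketched as one decision function. The function name `thomas_write` and its arguments are hypothetical, chosen only to mirror TS(T), R_TS(X), and W_TS(X) from the text.

```python
# Illustrative sketch of the Thomas write rule for a write on item X by
# transaction T; read_ts and write_ts model R_TS(X) and W_TS(X).

def thomas_write(ts, read_ts, write_ts):
    """Return a (decision, new_write_ts) pair."""
    if ts < read_ts:
        return "abort", write_ts   # a younger txn already read X: reject
    if ts < write_ts:
        return "ignore", write_ts  # obsolete write: skip it, keep going
    return "write", ts             # perform the write; W_TS(X) = TS(T)

print(thomas_write(3, read_ts=2, write_ts=5))  # ('ignore', 5)
print(thomas_write(6, read_ts=2, write_ts=5))  # ('write', 6)
```

The "ignore" branch is exactly what distinguishes the Thomas write rule from basic timestamp ordering, which would abort the transaction instead.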
If we use the Thomas write rule, then some serializable schedules can be permitted that are not
conflict serializable, as illustrated by the schedule in the figure:
Figure: A Serializable Schedule that is not Conflict Serializable
In the above figure, T1's read precedes T1's write of the same data item. This schedule is not
conflict serializable.
The Thomas write rule ensures that T2's write is never seen by any transaction. If we delete the
write operation of transaction T2, a conflict serializable schedule is obtained, as shown in the
figure below.
Figure: A Conflict Serializable Schedule
https://www.tpointtech.com/deadlock-in-dbms
Deadlock Handling:
A deadlock is a condition where two or more transactions wait indefinitely for one another to
give up locks. Deadlock is one of the most feared complications in a DBMS, as no task ever
finishes and all remain in a waiting state forever.
For example: In the student table, transaction T1 holds a lock on some rows and needs to update
some rows in the grade table. Simultaneously, transaction T2 holds locks on some rows in the
grade table and needs to update the rows in the Student table held by Transaction T1.
Now the main problem arises: transaction T1 waits for T2 to release its lock, while T2 waits for
T1 to release its lock. All activity comes to a halt and remains at a standstill until the DBMS
detects the deadlock and aborts one of the transactions.
Below is a list of conditions necessary for a deadlock to occur:
o Circular Waiting: Two or more transactions wait indefinitely for each other to release the
  locks they hold.
o Partial Allocation: A transaction acquires some of the required data items but not all of
  them, as the rest may be exclusively locked by other transactions.
o Non-Preemptive Scheduling: A data item can be released only voluntarily by the transaction
  holding it; it cannot be taken away (preempted) by another transaction.
o Mutual Exclusion: A data item can be locked exclusively by only one transaction at a time.
To avoid a deadlock, at least one of the above-mentioned necessary conditions must not occur.
Deadlock Avoidance
o When a database can get stuck in a deadlock state, it is better to avoid the deadlock than
  to abort and restart transactions, which wastes both time and resources.
o A deadlock avoidance mechanism is used to detect any deadlock situation in advance. A method
  like the wait-for graph is used for detecting deadlocks, but it is suitable only for smaller
  databases. For larger databases, the deadlock prevention method can be used.
Deadlock Detection
In a database, when a transaction waits indefinitely to obtain a lock, the DBMS should detect
whether the transaction is involved in a deadlock or not. The lock manager maintains a wait-for
graph to detect deadlock cycles in the database.
Wait for Graph
o This is a suitable method for deadlock detection. In this method, a graph is created based
  on the transactions and their locks. If the graph contains a cycle (closed loop), then there
  is a deadlock.
o The wait-for graph is maintained by the system: an edge is added for every transaction that
  is waiting for data held by another. The system keeps checking the graph for cycles.
The wait-for graph for the above scenario is shown below:
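Since a wait-for graph signals deadlock exactly when it contains a cycle, detection can be sketched as a depth-first search. The `has_cycle` function and the edge-list representation are illustrative assumptions, not part of any DBMS.

```python
# Sketch: deadlock detection on a wait-for graph. An edge (Ti, Tj) means
# "Ti waits for a lock held by Tj"; a cycle means deadlock.

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {}

    def dfs(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            c = color.get(nxt, WHITE)
            if c == GRAY:                 # back edge: cycle found
                return True
            if c == WHITE and dfs(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and dfs(n) for n in graph)

# T1 waits for T2 and T2 waits for T1 -> deadlock
print(has_cycle([("T1", "T2"), ("T2", "T1")]))   # True
print(has_cycle([("T1", "T2"), ("T2", "T3")]))   # False
```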
Deadlock Prevention
o The deadlock prevention method is suitable for large databases. If the resources are
  allocated in such a way that a deadlock never occurs, then deadlock is prevented.
o The DBMS analyzes the operations of a transaction to check whether they can create a
  deadlock situation. If they can, that transaction is never allowed to execute.
Each transaction has a unique identifier called a timestamp, usually assigned when the
transaction starts. For example, if transaction T1 starts before transaction T2, then the
timestamp of T1 will be smaller than the timestamp of T2. The timestamp decides whether a
transaction should wait or abort and roll back. Aborted transactions retain their timestamp
values and hence their seniority.
The following deadlock prevention schemes using timestamps have been proposed.
o Wait-Die scheme
o Wound-Wait scheme
The significant disadvantage of both of these techniques is that some transactions are aborted and
restarted unnecessarily even though those transactions never actually cause a deadlock.
Wait-Die scheme
In this scheme, if a transaction requests a resource that is already held with a conflicting
lock by another transaction, the DBMS checks the timestamps of both transactions and allows only
the older transaction to wait until the resource becomes available.
Let's assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of any
transaction T. If Tj holds a lock on some resource and Ti requests that resource, the DBMS
performs the following actions:
1. Check if TS(Ti) < TS(Tj) - Ti is the older transaction and Tj holds the resource, so Ti is
   allowed to wait until the data item is available. That is, if an older transaction is
   waiting for a resource locked by a younger transaction, the older transaction waits.
2. Check if TS(Ti) > TS(Tj) - Ti is the younger transaction and Tj holds the resource, so Ti
   is killed (it "dies") and restarted later, after a random delay but with the same
   timestamp.
If T1 requests a data item locked by transaction T2 and t(T1) < t(T2), then T1 waits until T2
completes and all the locks acquired by T2 are released. On the other hand, if transaction T3
requests a data item locked by T2 and t(T3) > t(T2), then T3 has to abort and roll back, i.e. it
dies.
Wait-Die scheme:
o It is based on a non-preemptive technique.
o An older transaction must wait for a younger one to release its data items.
o The number of aborts and rollbacks is higher.
o A transaction T1 aborts and rolls back because transaction T2 holds the requested data item;
  T1 can reissue the same sequence of requests when it is restarted.

Wound-Wait scheme:
o It is based on a preemptive technique.
o An older transaction does not wait for younger transactions.
o The number of aborted and rolled-back transactions is lower.
o The older a transaction gets, the greater its probability of obtaining a data item: an older
  transaction forces the abortion of any younger transaction holding a data item it needs,
  but can itself be aborted only by a transaction older than itself.
Deadlock Detection and Recovery
Recovery from Deadlock
If the wait-for graph used for deadlock detection contains cycles, a deadlock exists, and those
cycles must be removed to recover from it. The most widely used technique for recovering from a
deadlock is to roll back one or more transactions until the system no longer displays a deadlock
condition.
The selection of the transactions to be rolled back is based on the following deliberations:
Selection of victim: Many transactions may be involved in a deadlock (deadlocked transactions).
To recover from the deadlock, some of the transactions causing it must be rolled back. The
transaction that is rolled back is called the victim transaction, and the mechanism is known as
victim selection.
The transactions selected for rollback should be ones that have just started or have not made
many changes. Avoid selecting transactions that have made many updates or have been running for
a long time.
Rollback: Once the transaction to be rolled back has been selected, we must determine how far it
should be rolled back. The simplest solution is total rollback: abort the transaction and
restart it. However, the transaction should be rolled back only as far as required to break the
deadlock, which requires maintaining additional information about the state of the currently
executing transactions.
Starvation: While recovering from a deadlock, we must ensure that the same transaction is not
selected as the victim again and again; otherwise, that transaction will never complete. To
avoid starvation, a transaction should be picked as a victim only a finite number of times.
Previous Year Questions
The Two-Phase Locking (2PL) Protocol is an essential concept in database management systems
used to maintain data consistency and ensure smooth operation when multiple transactions are
happening simultaneously. It helps to prevent issues like data conflicts where two or more
transactions try to access or modify the same data at the same time, potentially causing errors.
Two-Phase Locking is widely used to ensure serializability, meaning transactions occur in a
sequence that maintains data accuracy. This article will explore the workings of the 2PL
protocol, its types, its advantages, and its role in maintaining a reliable database system.
Two Phase Locking
The Two-Phase Locking (2PL) Protocol is a key technique used in database management
systems to manage how multiple transactions access and modify data at the same time. When
many users or processes interact with a database, it’s important to ensure that data remains
consistent and error-free. Without proper management, issues like data conflicts or corruption
can occur if two transactions try to use the same data simultaneously.
The Two-Phase Locking Protocol resolves this issue by defining clear rules for managing data
locks. It divides a transaction into two phases:
1. Growing Phase: In this step, the transaction gathers all the locks it needs to access the
required data. During this phase, it cannot release any locks.
2. Shrinking Phase: Once a transaction starts releasing locks, it cannot acquire any new
ones. This ensures that no other transaction interferes with the ongoing process.
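The two phases can be sketched as a rule enforced on a single transaction: once any lock is released, no new lock may be acquired. `TwoPhaseTxn` is a hypothetical class for illustration, not part of any DBMS API.

```python
# Minimal sketch of the transaction-side 2PL rule.

class TwoPhaseTxn:
    def __init__(self):
        self.locks = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.locks.add(item)     # growing phase: acquire

    def unlock(self, item):
        self.shrinking = True    # entering the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # lock point passed; shrinking phase begins
try:
    t.lock("C")            # illegal under 2PL
except RuntimeError as e:
    print(e)               # 2PL violation: lock after first unlock
```

Real lock managers enforce this per transaction while also arbitrating between transactions; this sketch shows only the phase rule itself.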
Types of Lock
Shared Lock (S): A shared lock, also called a read-only lock, allows multiple transactions to
read the same data item at the same time. However, transactions holding this lock cannot modify
the data. A shared lock is requested using the lock-S instruction.
Exclusive Lock (X): An Exclusive Lock allows a transaction to both read and modify a data
item. This lock is exclusive, meaning no other transaction can access the same data item while
this lock is held. An exclusive lock is requested using the lock-X instruction.
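The interaction between the two lock modes can be sketched as a compatibility check: a new S lock is compatible only with held S locks, while an X lock is compatible with nothing. The `COMPATIBLE` table and `can_grant` helper are illustrative names, not from any real lock manager.

```python
# Illustrative shared/exclusive lock compatibility matrix.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested, held_modes):
    """Grant the requested mode only if it is compatible with every
    lock currently held on the data item."""
    return all(COMPATIBLE[(requested, h)] for h in held_modes)

print(can_grant("S", ["S", "S"]))  # True: many readers may coexist
print(can_grant("X", ["S"]))       # False: a writer must wait for readers
```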
Lock Conversions
In the Two-Phase Locking Protocol, lock conversion means changing the type of lock on data
while a transaction is happening. This process is carefully controlled to maintain consistency in
the database.
1. Upgrading a Lock: This means changing a shared lock (S) to an exclusive lock (X). For
example, if a transaction initially only needs to read data (S) but later decides it needs to
update the same data, it can request an upgrade to an exclusive lock (X). However, this
can only happen during the Growing Phase, where the transaction is still acquiring locks.
Example: A transaction reads a value (S lock) but then realizes it needs to modify
the value. It upgrades to an X lock during the Growing Phase.
2. Downgrading a Lock: This means changing an exclusive lock (X) to a shared lock (S).
For instance, if a transaction initially planned to modify data (X lock) but later decides it
only needs to read it, it can downgrade the lock. However, this must happen during the
Shrinking Phase, where the transaction is releasing locks.
Example: A transaction modifies a value (X lock) but later only needs to read the
value, so it downgrades to an S lock during the Shrinking Phase.
These rules ensure that the Two-Phase Locking Protocol maintains consistency and avoids
conflicts between transactions. By limiting when upgrades and downgrades can occur, the
system prevents situations where multiple transactions interfere with each other’s operations.
Let’s see a transaction implementing 2-PL.
Step  T1         T2
1     lock-S(A)
2                lock-S(A)
3     lock-X(B)
4     ……….       ……….
5     unlock(A)
6                lock-X(C)
7     unlock(B)
8                unlock(A)
9                unlock(C)
10    ……….       ……….
This is a basic outline of a transaction that demonstrates how locking and unlocking work in the
Two-Phase Locking Protocol (2PL).
Transaction T1
The growing Phase is from steps 1-3
The shrinking Phase is from steps 5-7
Lock Point at 3
Transaction T2
The growing Phase is from steps 2-6
The shrinking Phase is from steps 8-9
Lock Point at 6
Lock Point
The lock point in a transaction is the moment when the transaction finishes acquiring all the
locks it needs. After this point, no new locks can be added, and the transaction starts releasing
locks. It’s a key step in the Two-Phase Locking Protocol to ensure the rules of growing and
shrinking phases are followed.
Example of 2PL
Imagine a library system where multiple users can borrow or return books. Each action (like
borrowing or returning) is treated as a transaction. Here’s how the Two-Phase Locking Protocol
(2PL) works, including the lock point:
User A wants to:
1. Check the availability of Book X.
2. Borrow Book X if it’s available.
3. Update the library’s record.
Growing Phase (Locks are Acquired):
1. User A locks Book X with a shared lock (S) to check its availability.
2. After confirming the book is available, User A upgrades the lock to an exclusive lock (X)
to borrow it.
3. User A locks the library’s record to update the borrowing details.
Lock Point: Once User A has acquired all the necessary locks (on Book X and the library
record), the transaction reaches the lock point. No more locks can be acquired after this.
Shrinking Phase (Locks are Released):
1. User A updates the record and releases the lock on the library’s record.
2. User A finishes borrowing and releases the exclusive lock on Book X.
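The library walkthrough above can be sketched end to end with a toy lock list. The helpers `lock`, `upgrade`, and `unlock`, and the item names "BookX" and "Record", are hypothetical; the sketch only demonstrates that the upgrade happens in the growing phase and all releases happen in the shrinking phase.

```python
# Illustrative run of the library example under 2PL.

held = []          # (item, mode) pairs, in acquisition order
shrinking = False  # becomes True at the first unlock

def lock(item, mode):
    assert not shrinking, "2PL: cannot lock in the shrinking phase"
    held.append((item, mode))

def upgrade(item):
    # Upgrading S -> X is allowed only while still in the growing phase.
    assert not shrinking, "2PL: upgrade only in the growing phase"
    held[held.index((item, "S"))] = (item, "X")

def unlock(item):
    global shrinking
    shrinking = True
    held[:] = [(i, m) for i, m in held if i != item]

lock("BookX", "S")   # 1. check availability (shared lock)
upgrade("BookX")     # 2. borrow: upgrade to exclusive
lock("Record", "X")  # 3. lock the library record -- this is the lock point
unlock("Record")     # shrinking phase: release the record
unlock("BookX")      # release the book
print(held)          # []
```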