cb3401 Unit 3
write(X), which transfers the value in the variable X in the main-memory buffer of the transaction that
executed the write to the data item X in the database.
Example:
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be
defined as:
Ti : read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Let us now consider each of the ACID properties.
• Consistency: The consistency requirement here is that the sum of A and B be unchanged by the
execution of the transaction. This task may be facilitated by automatic testing of integrity constraints.
• Atomicity: The basic idea behind ensuring atomicity is this: The database system keeps track (on disk) of
the old values of any data on which a transaction performs a write. This information is written to a file
called the log. If the transaction does not complete its execution, the database
system restores the old values from the log to make it appear as though the transaction never executed.
Suppose that, just before the execution of transaction Ti, the values of accounts A and B are
$1000 and $2000, respectively. Suppose that the failure happened after the write(A) operation but before
the write(B) operation. In this case, the values of accounts A and B reflected in the database are $950 and
$2000.
The system destroyed $50 as a result of this failure. In particular, we note that the sum A + B
is no longer preserved.
Ensuring atomicity is the responsibility of the database system; specifically, it is handled by a
component of the database called the recovery system.
• Durability: The durability property guarantees that, once a transaction completes successfully, all the
updates that it carried out on the database persist, even if there is a system failure after the transaction
completes execution.
We can guarantee durability by ensuring that either:
1. The updates carried out by the transaction have been written to disk before the
transaction completes.
2. Information about the updates carried out by the transaction and written to disk is
sufficient to enable the database to reconstruct the updates when the database system is restarted after the
failure.
The recovery system of the database is responsible for ensuring durability.
• Isolation: If several transactions are executed concurrently, their operations may interleave in some
undesirable way, resulting in an inconsistent state.
For example, the database is temporarily inconsistent while the transaction to transfer funds from
A to B is executing, with the deducted total written to A and the increased total yet to be written to B. If a
second concurrently running transaction reads A and B at this intermediate point and computes A+B, it
will observe an inconsistent value. Furthermore, if this second transaction then performs updates on A and
B based on the inconsistent values that it read, the database may be left in an inconsistent
state even after both transactions have completed.
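To make this interleaving concrete, the following Python sketch (an illustration, not from the source; it reuses the $1000 and $2000 balances of the earlier atomicity example) replays one bad schedule step by step and prints the inconsistent sum seen by the second transaction.

# A minimal sketch of the isolation anomaly described above.
# The balances and the interleaving order are illustrative assumptions.
db = {"A": 1000, "B": 2000}

# T1 transfers $50 from A to B, but is interrupted between its two writes.
a = db["A"]               # T1: read(A)
db["A"] = a - 50          # T1: write(A) -- the deduction is now visible

# T2 runs at this intermediate point and computes A + B.
print(db["A"] + db["B"])  # 2950 instead of the consistent 3000

# T1 resumes and finishes only afterwards.
b = db["B"]               # T1: read(B)
db["B"] = b + 50          # T1: write(B)
print(db["A"] + db["B"])  # 3000 again once T1 completes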
A way to avoid the problem of concurrently executing transactions is to execute transactions
serially—that is, one after the other.
Ensuring the isolation property is the responsibility of a component of the database system called
the concurrency-control system.
States of Transaction
A transaction passes through the states active, partially committed, failed, aborted, and committed; the state diagram corresponding to a transaction is shown in Figure.
3.3 SCHEDULES:
A schedule is defined as a sequence of instructions that specifies the chronological order in which
instructions of concurrent transactions are executed.
A schedule is serializable if it is equivalent to a serial schedule.
A schedule in which the operations of each transaction are executed consecutively, without any
interference from other transactions, is called a serial schedule.
Types of serializability are
1. Conflict Serializability
2. View Serializability
3.4 SERIALIZABILITY:
Conflict Serializability:
Instructions Ii and Ij, of transactions Ti and Tj respectively, conflict if and only if there exists some
item Q accessed by both Ii and Ij, and at least one of these instructions wrote Q.
1. Ii = read(Q), Ij = read(Q). Ii and Ij do not conflict.
2. Ii = read(Q), Ij = write(Q). They conflict.
3. Ii = write(Q), Ij = read(Q). They conflict.
4. Ii = write(Q), Ij = write(Q). They conflict.
If Ii and Ij are consecutive in a schedule and they do not conflict, their results would remain the
same even if they had been interchanged in the schedule.
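As a small illustration, the four cases above can be folded into a single predicate. The encoding of an operation as a (transaction, kind, item) tuple is an assumption made for this sketch only.

# Minimal sketch: two operations conflict iff they belong to different
# transactions, access the same data item, and at least one of them is a write.
def conflicts(op_i, op_j):
    txn_i, kind_i, item_i = op_i    # e.g. ("T1", "write", "A")
    txn_j, kind_j, item_j = op_j
    return (txn_i != txn_j
            and item_i == item_j
            and "write" in (kind_i, kind_j))

print(conflicts(("T1", "read", "Q"), ("T2", "read", "Q")))    # False: two reads
print(conflicts(("T1", "read", "Q"), ("T2", "write", "Q")))   # True
print(conflicts(("T1", "write", "A"), ("T2", "read", "B")))   # False: different items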
Consider the following schedule 3.
The write(A) of T1 conflicts with the read(A) of T2. However, the write(A) of T2 does not conflict with the read(B)
of T1, because the two instructions access different data items.
Since they do not conflict, we can swap the write(A) and read(B) instructions to generate a new schedule 5.
Regardless of the initial system state, schedules 3 and 5 generate the same result.
Swap the write(B) instruction of T1 with the write(A) instruction of T2.
Swap the write(B) instruction of T1 with the read(A) instruction of T2.
The final result of these swaps is shown below; it is a serial schedule.
Consider schedule 7. It consists of two transactions T3 and T4. This schedule is not conflict
serializable, since it is not equivalent to either the serial schedule <T3, T4> or the serial schedule <T4, T3>.
View Serializability:
A schedule S is view serializable if it is view equivalent to a serial schedule.
Let S and S′ be two schedules with the same set of transactions. S and S′ are view equivalent if
the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then
transaction Ti must, in schedule S′, also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and that value
was produced by transaction Tj (if any), then transaction Ti must, in schedule S′, also read the value of Q
that was produced by transaction Tj.
3. For each data item Q, the transaction (if any) that performs the final write(Q)
operation in schedule S must perform the final write(Q) operation in schedule S′.
Every conflict serializable schedule is also view serializable.
The following schedule is view serializable but not conflict serializable.
In the above schedule, transactions T4 and T5 perform write(Q) operations without having
performed a read(Q) operation. Writes of this sort are called blind writes. A view serializable schedule with
blind writes is not conflict serializable.
The precedence graphs for schedule 1 and schedule 2 are shown in the figure given below.
The precedence graph for schedule 1 contains a single edge T1→T2, since all the instructions of T1
are executed before the first instruction of T2 is executed.
To test conflict serializability, construct a precedence graph for the given schedule. If the graph
contains a cycle, the schedule is not conflict serializable. If the graph contains no cycle, the schedule is
conflict serializable.
Schedule 1 and schedule 2 are conflict serializable, as the precedence graph for both schedules does
not contain any cycle, while schedule 4 is not conflict serializable, as its precedence graph contains a cycle.
A serializability order of the transactions can be obtained through topological sorting, which
determines a linear order consistent with the partial order of the precedence graph.
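A minimal sketch of this test, assuming the same (transaction, kind, item) encoding as the earlier sketch: build the precedence graph from the conflicting operations of a schedule, report conflict serializability if the graph is acyclic, and recover a serial order by topological sorting. The sample schedule is illustrative, not one of the numbered schedules in the figures.

from graphlib import TopologicalSorter, CycleError   # Python 3.9+

def precedence_graph(schedule):
    # Edge Ti -> Tj whenever an operation of Ti conflicts with a later
    # operation of Tj (same item, at least one of the two is a write).
    edges = {}
    for i, (ti, ki, qi) in enumerate(schedule):
        for tj, kj, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "write" in (ki, kj):
                edges.setdefault(ti, set()).add(tj)
    return edges

def conflict_serializable(schedule):
    graph = precedence_graph(schedule)
    preds = {}                       # TopologicalSorter expects predecessor sets
    for src, dests in graph.items():
        preds.setdefault(src, set())
        for dst in dests:
            preds.setdefault(dst, set()).add(src)
    try:
        return True, list(TopologicalSorter(preds).static_order())
    except CycleError:
        return False, []

# T1 transfers between A and B; T2 reads and rewrites each item afterwards.
s = [("T1", "read", "A"), ("T1", "write", "A"),
     ("T2", "read", "A"), ("T2", "write", "A"),
     ("T1", "read", "B"), ("T1", "write", "B"),
     ("T2", "read", "B"), ("T2", "write", "B")]
print(conflict_serializable(s))      # (True, ['T1', 'T2'])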
(a) Test for Conflict Serializability
To test conflict serializability, construct a precedence graph for the given schedule. If the graph contains
a cycle, the schedule is not conflict serializable. If the graph contains no cycle, then the schedule is conflict
serializable.
Schedule 1 and schedule 2 are conflict serializable, as the precedence graph for both schedules does not
contain any cycle. However, schedule 9 is not conflict serializable, as its precedence graph contains a
cycle.
Example: Consider the schedule given in Fig. 5.14. Determine whether the schedule is conflict
serializable or not.
Solution:
The precedence graph for the given schedule is shown in the figure.
Locks:
The two modes of locks are:
1. Shared. If a transaction Ti has obtained a shared-mode lock (denoted by S) on item Q, then Ti can read,
but cannot write, Q.
2. Exclusive. If a transaction Ti has obtained an exclusive-mode lock (denoted by X) on
item Q, then Ti can both read and write Q.
We require that every transaction request a lock in an appropriate mode on data item Q,
depending on the types of operations that it will perform on Q.
The transaction makes the request to the concurrency-control manager.
The transaction can proceed with the operation only after the concurrency-control manager
grants the lock to the transaction.
The use of these two lock modes allows multiple transactions to read a data item but limits write
access to just one transaction at a time.
To state this more generally, given a set of lock modes, we can define a compatibility function on
them as follows:
Let A and B represent arbitrary lock modes. Suppose that a transaction Ti requests a lock of mode
A on item Q on which transaction Tj (Ti ≠ Tj) currently holds a lock of mode B.
If transaction Ti can be granted a lock on Q immediately, in spite of the presence of the mode B
lock, then we say mode A is compatible with mode B. Such a function can be represented conveniently by a matrix.
An element comp(A, B) of the matrix has the value true if and only if mode A is compatible with
mode B.
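Since the compatibility matrix itself appears only as a figure, the following sketch writes it out for the two modes discussed here (shared S and exclusive X); only the S/S combination is compatible.

# comp(A, B) is True iff a lock of mode A can be granted while another
# transaction already holds a lock of mode B on the same item.
COMPAT = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def comp(requested_mode, held_mode):
    return COMPAT[(requested_mode, held_mode)]

print(comp("S", "S"))   # True: any number of readers may share an item
print(comp("X", "S"))   # False: a writer must wait for the readers to finish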
A transaction requests a shared lock on data item Q by executing the lock-S(Q) instruction.
Similarly, a transaction requests an exclusive lock through the lock-X(Q) instruction. A transaction
can unlock a data item Q by the unlock(Q) instruction.
To access a data item, transaction Ti must first lock that item. If the data item is already locked by
another transaction in an incompatible mode, the concurrency control manager will not grant the lock until
all incompatible locks held by other transactions have been released.
Thus, Ti is made to wait until all incompatible locks held by other transactions have been released.
Transaction Ti may unlock a data item that it had locked at some earlier point.
Example:
Let A and B be two accounts that are accessed by transactions T1 and T2. Transaction T1
transfers $50 from account B to account A.
Transaction T2 displays the total amount of money in accounts A and B—that is, the sum A + B.
Suppose that the values of accounts A and B are $100 and $200, respectively.
If these two transactions are executed serially, either in the order T1, T2 or the order T2, T1, then
transaction T2 will display the value $300.
If, however, these transactions are executed concurrently, then schedule 1 is possible. In this case,
transaction T2 displays $250, which is incorrect.
The reason for this mistake is that the transaction T1 unlocked data item B too early, as a result
of which T2 saw an inconsistent state.
Advantages:
The two-phase locking protocol ensures conflict serializability.
Consider any transaction. The point in the schedule where the transaction has obtained its final lock
is called the lock point of the transaction. The transactions can then be ordered according to their lock
points, and this ordering is a serializability ordering for the transactions.
Disadvantages
Consider schedule 2: two-phase locking does not prevent one transaction from reading uncommitted data written by another transaction that has already released its lock, so the failure of that transaction can force a cascading rollback of the others.
Cascading rollbacks can be avoided by a modification of two-phase locking called the strict two-phase locking protocol.
Types:
This protocol requires that locking be two phase and that all exclusive-mode locks taken by a
transaction be held until the transaction commits. This requirement ensures that any data written by an
uncommitted transaction remains locked in exclusive mode until that transaction commits, preventing any
other transaction from reading the data.
The two-phase locking protocol also allows lock conversions: there is a mechanism for upgrading a shared
lock to an exclusive lock and downgrading an exclusive lock to a shared lock. We denote the conversion from
shared to exclusive mode by upgrade, and from exclusive to shared mode by downgrade. Lock conversion
cannot be done arbitrarily: upgrading can take place only in the growing phase, whereas downgrading can
take place only in the shrinking phase.
Strict two-phase locking and rigorous two-phase locking (with lock conversions) are used extensively in
commercial database systems.
A simple but widely used scheme automatically generates the appropriate lock and unlock
instructions for a transaction, on the basis of read and write requests from the transaction:
When a transaction Ti issues a read(Q) operation, the system issues a lock-S(Q) instruction
followed by the read(Q) instruction.
When Ti issues a write(Q) operation, the system checks to see whether Ti already holds a shared
lock on Q. If it does, then the system issues an upgrade(Q) instruction, followed by the write(Q)
instruction. Otherwise, the system issues a lock-X(Q) instruction, followed by the write(Q) instruction.
All locks obtained by a transaction are unlocked after that transaction commits or aborts.
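A minimal sketch of this automatic scheme, assuming a toy single-threaded lock table rather than a real concurrency-control manager: read(Q) issues lock-S(Q), write(Q) upgrades an existing shared lock or issues lock-X(Q), and every lock is released only at commit or abort. Blocking is not modelled; an incompatible request simply raises, standing in for "wait".

class LockManager:
    # Toy lock table: item -> (mode, set of holding transactions).
    def __init__(self):
        self.locks = {}

    def lock_S(self, txn, item):                   # shared lock
        mode, holders = self.locks.get(item, ("S", set()))
        if mode == "X" and holders - {txn}:
            raise RuntimeError(f"{txn} must wait for the X lock on {item}")
        holders.add(txn)
        self.locks[item] = (mode if mode == "X" else "S", holders)

    def lock_X(self, txn, item):                   # exclusive lock / upgrade
        mode, holders = self.locks.get(item, ("X", set()))
        if holders - {txn}:
            raise RuntimeError(f"{txn} must wait for the lock on {item}")
        self.locks[item] = ("X", {txn})

    def release_all(self, txn):                    # called at commit or abort
        for item, (mode, holders) in list(self.locks.items()):
            holders.discard(txn)
            if not holders:
                del self.locks[item]

def read(lm, txn, item):
    lm.lock_S(txn, item)      # read(Q)  => lock-S(Q); read(Q)

def write(lm, txn, item):
    lm.lock_X(txn, item)      # write(Q) => upgrade(Q) or lock-X(Q); write(Q)

lm = LockManager()
read(lm, "T1", "A")
write(lm, "T1", "A")          # T1 is the only holder, so the upgrade succeeds
lm.release_all("T1")          # all locks dropped when T1 commits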
3.8 DEADLOCK:
Deadlock refers to a particular situation where two or more processes are each waiting for another
to release a resource, or more than two processes are waiting for resources in a circular chain. Deadlock is
a common problem in multiprocessing where many processes share a specific type of mutually exclusive
resource.
Some computers, usually those intended for the time-sharing and/or real-time markets, are often
equipped with a hardware lock, or hard lock, which guarantees exclusive access to processes, forcing
serialization.
Deadlocks are particularly disconcerting because there is no general solution to avoid them.
Livelock:
Livelock is a special case of resource starvation. A livelock is similar to a deadlock, except that the
states of the processes involved constantly change with regard to one another while never progressing.
The general definition only states that a specific process is not progressing.
For example, the system keeps selecting the same transaction for rollback causing the transaction
to never finish executing.
Another livelock situation can come about when the system is deciding which transaction gets a
lock and which waits in a conflict situation.
Deadlock Prevention:
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the
operations that transactions are about to execute and analyzes whether they can create a deadlock.
If it finds that a deadlock might occur, the offending transaction is never allowed to execute.
There are deadlock prevention schemes that use the timestamp ordering of transactions to
pre-decide how a potential deadlock situation is resolved.
Wait-Die Scheme:
In this scheme, if a transaction requests a lock on a resource (data item) that is already held with a
conflicting lock by some other transaction, one of two possibilities may occur:
▪ If TS(Ti) < TS(Tj), that is, Ti, which is requesting a conflicting lock, is older than Tj, then Ti is allowed to wait
until the data item is available.
▪ If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti dies. Ti is restarted later with a random delay but
with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
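A minimal sketch of the wait-die decision, assuming timestamps are plain integers where a smaller value means an older transaction:

# Wait-die: on a lock conflict, an older requester waits; a younger requester
# "dies", i.e. is rolled back and restarted later with the SAME timestamp.
def wait_die(ts_requester, ts_holder):
    if ts_requester < ts_holder:    # the requesting transaction is older
        return "wait"
    return "die"                    # the requesting transaction is younger

print(wait_die(ts_requester=5, ts_holder=9))   # 'wait'  (older Ti waits)
print(wait_die(ts_requester=9, ts_holder=5))   # 'die'   (younger Ti is rolled back)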
Wound-Wait Scheme:
In this scheme, if a transaction requests a lock on a resource (data item) that is already held with a
conflicting lock by some other transaction, one of two possibilities may occur:
▪ If TS(Ti) < TS(Tj), that is, Ti, which is requesting a conflicting lock, is older than Tj, then Ti forces Tj to be
rolled back; that is, Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
▪ If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti is forced to wait until the resource is available.
This scheme allows the younger transaction to wait, but when an older transaction requests an item
held by a younger one, the older transaction forces the younger one to abort and release the item. In both
cases, the transaction that enters the system later is aborted.
Deadlock Avoidance:
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms
can be used to detect any deadlock situation in advance.
Methods like the wait-for graph are suitable for systems in which transactions are lightweight and hold
fewer instances of resources; in a heavily loaded system, deadlock prevention techniques may work better.
Wait-for Graph:
This is a simple method available to track whether any deadlock situation may arise. For each transaction
entering the system, a node is created.
When transaction Ti requests a lock on an item, say X, which is held by some other transaction Tj,
a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is dropped and Ti
locks the data item.
The system maintains this wait-for graph for every transaction waiting for data items held by
others. The system keeps checking whether there is any cycle in the graph.
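A minimal sketch of such a wait-for graph with cycle checking; the bookkeeping and the depth-first search are one reasonable implementation assumed for illustration, not the algorithm of any particular DBMS.

waits_for = {}                   # waits_for[Ti] = set of transactions Ti waits on

def add_wait(ti, tj):            # Ti requests an item currently held by Tj
    waits_for.setdefault(ti, set()).add(tj)

def remove_wait(ti, tj):         # Tj released the item; Ti no longer waits on Tj
    waits_for.get(ti, set()).discard(tj)

def has_deadlock():
    # Depth-first search: a back edge in the wait-for graph means a cycle.
    visiting, done = set(), set()
    def dfs(node):
        visiting.add(node)
        for nxt in waits_for.get(node, set()):
            if nxt in visiting:
                return True
            if nxt not in done and dfs(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False
    return any(dfs(t) for t in list(waits_for) if t not in done)

add_wait("T1", "T2")             # T1 waits for an item held by T2
add_wait("T2", "T1")             # T2 waits for an item held by T1
print(has_deadlock())            # True: the cycle T1 -> T2 -> T1 is a deadlock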
Two approaches can be used. The first is not to allow any request for an item that is already locked
by some other transaction; this is not always feasible and may cause starvation, where a transaction waits
indefinitely for a data item and can never acquire it. The second option is to roll back one of the
transactions.
It is not feasible to always roll back the younger transaction, as it may be more important than the older
one.
With the help of a suitable algorithm, a transaction is chosen to be aborted; this transaction is called the
victim, and the process is known as victim selection.
The steps of the two-phase commit (2PC) protocol, as carried out by the coordinator, are as follows:
Phase 1
(1) Write a begin_commit record to the log file and force-write it to stable storage.
Send a PREPARE message to all participants.
Wait for participants to respond within a timeout period.
Phase 2
(2) If a participant returns an ABORT vote,
Write an abort record to the log file and force-write it to stable storage.
Send a GLOBAL_ABORT message to all participants.
Wait for participants to acknowledge within a timeout period.
(3) If all participants return READY_COMMIT votes,
Write a commit record to the log file and force-write it to stable storage.
Send a GLOBAL_COMMIT message to all participants.
Wait for participants to acknowledge within a timeout period.
(4) Once all acknowledgements have been received,
Write an end_transaction message to the log file.
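A minimal sketch of the coordinator's side of this two-phase commit sequence; the participant interface (prepare, global_commit, global_abort), the vote strings, and the list-based log are illustrative assumptions, and timeouts are not modelled.

def two_phase_commit(participants, log):
    # Phase 1
    log.append("begin_commit")                    # force-write to stable storage
    votes = [p.prepare() for p in participants]   # send PREPARE, collect votes

    # Phase 2
    if any(v == "ABORT" for v in votes):
        log.append("abort")                       # force-write the abort record
        for p in participants:
            p.global_abort()                      # send GLOBAL_ABORT, await ack
        outcome = "aborted"
    else:                                         # every vote was READY_COMMIT
        log.append("commit")                      # force-write the commit record
        for p in participants:
            p.global_commit()                     # send GLOBAL_COMMIT, await ack
        outcome = "committed"

    log.append("end_transaction")                 # all acknowledgements received
    return outcome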
The following block contains the series of operations: a savepoint is created and then one row is deleted, three times in succession (the DELETE statements themselves are elided here).
SQL> SAVEPOINT SP1;
Savepoint created.
SQL> DELETE ...;
1 row deleted.
SQL> SAVEPOINT SP2;
Savepoint created.
SQL> DELETE ...;
1 row deleted.
SQL> SAVEPOINT SP3;
Savepoint created.
SQL> DELETE ...;
1 row deleted.
Now that the three deletions have taken place, let us assume that you have changed your mind and decided
to ROLLBACK to the SAVEPOINT that you identified as SP2. Because SP2 was created after the first
deletion, the last two deletions are undone −
SQL> ROLLBACK TO SP2;
Rollback complete.
Notice that only the first deletion took place, since you rolled back to SP2.
Dirty Reads return different results within a single transaction when an SQL operation reads an uncommitted
or modified record created by another transaction. Dirty reads increase concurrency, but reduce
consistency.
Non-Repeatable Reads return different results within a single transaction when an SQL operation reads
the same row in a table twice. Non-repeatable reads can occur when another transaction modifies and
commits a change to the row between the transaction's reads. Avoiding non-repeatable reads increases
consistency, but reduces concurrency.
Phantoms return different results within a single transaction when an SQL operation retrieves a range of
data values twice. Phantoms can occur if another transaction inserts a new record and commits the
insertion between executions of the range retrieval.
Read Uncommitted – Read Uncommitted is the lowest isolation level. At this level, one transaction may
read not-yet-committed changes made by other transactions, thereby allowing dirty reads. At this level,
transactions are not isolated from each other.
Repeatable Read – This is a more restrictive isolation level. The transaction holds read locks on all
rows it references and write locks on all rows it inserts, updates, or deletes. Since other transactions cannot
read, update, or delete these rows, it avoids non-repeatable reads.
Serializable – This is the highest isolation level. Execution at this level is guaranteed to be serializable:
it is defined to be an execution of operations in which concurrently executing transactions appear to be
executing serially.
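A session typically requests one of these levels with the standard SQL SET TRANSACTION statement. The sketch below assumes a PostgreSQL database reached through the psycopg2 driver, with hypothetical connection details and a hypothetical accounts table.

import psycopg2                                         # assumed driver; any DB-API driver is similar

conn = psycopg2.connect(dbname="testdb", user="app")   # hypothetical credentials
cur = conn.cursor()

# Standard SQL: the level applies to the transaction being started.
cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
cur.execute("SELECT SUM(balance) FROM accounts")        # hypothetical table
print(cur.fetchone())

conn.commit()
conn.close()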
3.12 SQL FACILITIES FOR CONCURRENCY AND RECOVERY:
Crash Recovery
A DBMS is a highly complex system with hundreds of transactions being executed every second.
The durability and robustness of a DBMS depend on its complex architecture and its underlying
hardware and system software. If it fails or crashes in the middle of transactions, the system is expected to
follow some sort of algorithm or technique to recover lost data.
Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as follows
−
a) Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it can’t go
any further. This is called transaction failure where only a few transactions or processes are hurt.
Logical errors − Where a transaction cannot complete because it has some code error or any internal error
condition.
System errors − Where the database system itself terminates an active transaction because the DBMS is
not able to execute it, or it has to stop because of some system condition. For example, in case of deadlock
or resource unavailability, the system aborts an active transaction.
b) System crash
There are problems − external to the system − that may cause the system to stop abruptly and
crash. For example, an interruption in the power supply may cause the underlying hardware or software
to fail.
c) Disk failure
In the early days of technology evolution, it was a common problem that hard-disk drives or storage
drives failed frequently. Disk failures include the formation of bad sectors, unreachability of the disk,
a disk head crash, or any other failure that destroys all or part of the disk storage.
Storage Structure
Volatile storage − As the name suggests, volatile storage cannot survive system crashes. Volatile
storage devices are placed very close to the CPU; normally they are embedded on the chipset itself.
Main memory and cache memory are examples of volatile storage. They are fast but can store
only a small amount of information.
Non-volatile storage − These memories are made to survive system crashes. They are huge in data storage
capacity, but slower in accessibility. Examples may include hard-disks, magnetic tapes, flash memory, and
non-volatile (battery backed up) RAM.
When the system recovers from a failure, it should check the states of all the transactions that were being executed.
A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the
transaction in this case.
It should check whether the transaction can be completed now or it needs to be rolled back.
There are two types of techniques that can help a DBMS recover while maintaining the
atomicity of a transaction −
Maintaining the logs of each transaction, and writing them onto some stable storage before actually
modifying the database.
Maintaining shadow paging, where the changes are done on a volatile memory, and later, the actual
database is updated.
Log-based Recovery
A log is a sequence of records that maintains a record of the actions performed by a transaction. It
is important that the log records are written prior to the actual modification and stored on a stable storage
medium, which is failsafe.
Deferred database modification − All logs are written on to the stable storage and the database is
updated when a transaction commits.
Immediate database modification − Each log record is immediately followed by the actual database
modification; that is, the database is modified right after every operation, even before the transaction commits.
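A minimal sketch of the log records involved under immediate modification, using tuples as records for the transfer transaction; the exact record layout and the balances are assumptions made for illustration.

# Write-ahead log for the transfer transaction Ti, written BEFORE the database
# itself is touched. Each update record carries the old and the new value so
# that the change can later be undone or redone.
log = [
    ("Ti", "start"),
    ("Ti", "A", 1000, 950),       # <Ti, item, old value, new value>
    ("Ti", "B", 2000, 2050),
    ("Ti", "commit"),
]

def redo(db, log):
    # Reapply the new values written by committed transactions (used after a crash).
    committed = {t for t, *rest in log if rest and rest[0] == "commit"}
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            txn, item, old, new = rec
            db[item] = new

db = {"A": 1000, "B": 2000}
redo(db, log)
print(db)                          # {'A': 950, 'B': 2050}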
Recovery with Concurrent Transactions
When more than one transaction is being executed in parallel, the logs are interleaved. At the time
of recovery, it becomes hard for the recovery system to backtrack through all the logs and then start
recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
1. Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill up all the memory
space available in the system. As time passes, the log file may grow too big to be handled at all.
A checkpoint is a mechanism by which all the previous logs are removed from the system and stored
permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent
state and all the transactions were committed.
2. Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the
following manner −
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it puts the
transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort record, it puts the transaction
in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. All the transactions in
the redo-list are redone, and their logs are saved.
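A minimal sketch of this classification, using the same tuple log records as the earlier recovery sketch; abort records and checkpoints are left out for brevity.

def classify(log):
    # Split the transactions seen in the log into a redo-list and an undo-list.
    started, committed = set(), set()
    for txn, action, *rest in log:
        if action == "start":
            started.add(txn)
        elif action == "commit":
            committed.add(txn)
    redo_list = committed              # has both <Tn, Start> and <Tn, Commit>
    undo_list = started - committed    # started but never committed
    return redo_list, undo_list

log = [("T1", "start"), ("T1", "A", 1000, 950), ("T1", "commit"),
       ("T2", "start"), ("T2", "B", 2000, 2050)]    # T2 never committed
print(classify(log))                   # ({'T1'}, {'T2'})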
*******************