Transaction Management
&
Concurrency Control
Transactions
A transaction is a collection of operations that form a single logical unit of work.
e.g. Someone wants to transfer INR 500 from one bank account (maintained by
variable A) to another bank account (maintained by variable B)
T1
r(A) // I/O burst operation
A=A-500; // CPU burst operation LOGICALLY
w(A) // I/O burst operation A SINGLE
r(B) // I/O burst operation OPERATION
B=B+500; // CPU burst operation
w(B) // I/O burst operation
At any instance of moment, there can be many such transactions T1, T2,
T3, …, Tn submitted by same or many users.
If these transactions are executed sequentially:
(a) waiting time will be more for some transactions than expected
(b) Many CPU cycles will be idle when I/O operations are going on.
So, it is advisable to interlace the concurrent transactions for better
CPU utilization.
Users submit transactions, and can think of each transaction as
executing by itself.
Concurrency is achieved by the DBMS, which interleaves actions
(reads/writes of DB objects) of various transactions.
However simultaneously executing transactions T1, T2, T3, …, Tn may
act on same variables residing in database, and therefore their order of
execution (which adheres to business logic) must be satisfied by such
concurrent execution.
Hence it is termed as ‘controlled concurrency’.
Let us assume T1, T2, T3, T4, T5, T6 are submitted to DBMS
simultaneously for execution, with constraint that it should execute in
the following order: T2, T1, T5, T6, T3, T4. Hence, these transactions may
be executed concurrently if possible, but should yield the same result
as it would have been generated had they been executed sequentially
in the given order.
Properties of transaction
Atomicity: all or nothing
Consistency: takes DB from one consistent state to another, without necessarily
preserving interim consistency.
Isolation: Concealed from each other. The changes of a variable by one transaction
are not realizable by another transaction until written back.
Durability: Once a transaction commits, its updates survive in the database.
Transactions
A user’s program may carry out many operations on the data
retrieved from the database, but the DBMS is only concerned about
what data is read/written from/to the database.
A transaction is the DBMS’s abstract view of a user program: a
sequence of reads and writes.
Lifecycle of a transaction
Lifecycle of a transaction
Concurrency of transactions
T1 T2
Let us assume initial
(to withdraw INR 50 (to withdraw INR
value of A is 1000 before
from account A) 100
submission of both
from account A) transactions.
r(A)
r(A) Desired answer after T1
A=A-50; and T2 executed:
A=A-100; A=850
w(A)
w(A) After execution of T1 and
T2 (be in any serial
order), the answer
should be what is
produced by this non-
concurrent execution!!!
Anomalies with Interleaved Execution
Reading Uncommitted Data (WR Conflicts: dirty read problem):
T1: R(A), W(A), R(B), W(B), Fail, Rollback
T2: R(A), W(A), C
Unrepeatable Reads or non-repeatable read problem (RW Conflicts):
T1: R(A), R(A), W(A), C
T2: R(A), W(A), C
Overwriting Uncommitted Data (WW Conflicts: lost update problem):
T1: R(A), W(A), W(B), C
T2: R(A), W(A), W(B), C
Anomalies with Interleaved Execution
Reading Uncommitted Data (WR Conflicts: dirty read problem):
T1: R(A), W(A), R(B), W(B), Fail, Rollback
T2: R(A), W(A), C
Reading uncommitted data
Dirty Read: Example
T1 T2
r(A) reads the data
(to withdraw INR 50 (to credit 6% interest
modified by T1 through
from account A) to account A)
w(A) but T1 has not
committed yet (value of
r(A) A read by T2 is not yet
A=A-50; saved permanently by
T1).
w(A) r(A)
X=A*0.06;
A=A+X; As T1 fails, all
operations rollback and
w(A)
value of A is not
updated in database, but
r(B) T2 reads and works on
Failure B=B+50; it!!
w(B)
Anomalies with Interleaved Execution
Unrepeatable Reads or non-repeatable read problem (RW Conflicts):
T1: R(A), R(A), W(A), C
T2: R(A), W(A), C
When a transaction reads a data item twice and meanwhile another
transaction operates on the same data, leading to reading two different
values of same data item by the first transaction.
Unrepeatable read: Example
T1 T2
Two subsequent reads of
(to withdraw INR 50 (to withdraw INR
item A in T1 finds two
from account A) 100
different values!!!
from account A)
r(A)
r(A)
A=A-100;
w(A)
r(A)
A=A-50;
w(A)
Anomalies with Interleaved Execution
Overwriting Uncommitted Data (WW Conflicts: lost update problem):
T1: R(A), W(A), W(B), C
T2: R(A), W(A), W(B), C
The second WRITE overwrites the first WRITE.
If there are two write operation on same variable from two different
transactions, and there is no read operation in between, it will cause
loss of result of first WRITE operation, termed as lost update.
Lost update : Example
T1 T2
The effect of value
update of A by w(A) in
r(A) T1 is lost due to w(A) in
A=A-50; T2!!!
r(A)
X=A*0.06;
A=A+X;
w(A)
w(A)
r(B)
B=B+50;
w(B)
Incorrect summary problem is discussed in some books, but that is a kind
of the above discussed problems.
Scheduling Transactions
Serial schedule: Schedule that does not interleave the actions of different transactions.
Equivalent schedules: For any database state, the effect (on the set of objects in the
database) of executing the first schedule is identical to the effect of executing the
second schedule.
Non-serial schedule: Schedule that interleaves the actions of different transactions.
Serializable schedule: A non-serial schedule that is equivalent to some serial
execution of the transactions.
(Note: If each transaction preserves consistency, every serializable schedule
preserves consistency.)
Scheduling Transactions
A schedule is a defined order of execution of the transactions.
Let us assume k concurrent transactions. T1 has n1 operations, T1 has n1
operations, …, Tk has nk operations.
Number of possible ways of serial execution (i.e. number of serial
schedules) is: k!
One or few of these serial executions are actually intended by business
logic.
Scheduling Transactions
Let us assume k concurrent transactions. T1 has n1 operations, T2 has n2
operations, …, Tk has nk operations.
Number of possible ways of interlaced execution (i.e. number of non-
serial schedules) is:
(n1 + n2 +…+ nk)!
- k!
n1! n2! … nk!
Few of these interlaced executions will actually produce the same
output as required by the business logic (equivalent to some decided
serial execution).
Types of Schedules
Serial schedule
Complete schedule
Recoverable and non-recoverable
schedule
Cascading and Cascadeless
schedule
Strict schedule
Equivalent and non-equivalent
schedule
Serial Schedules
T1 T2
Serial schedule means no
r(A)
interleaving.
A=A-50;
Ensures to take database to
w(A)
consistent state.
r(A)
A=A+500;
w(A)
Non-serial Schedules
T1 T2
Non-serial schedule means there
r(A)
will be interleaving.
A=A-50;
May take database to consistent
r(A)
state.
A=A+500;
May generate unwanted result. w(A)
w(A)
Complete Schedules
T1 T2
begin begin
Commit or Abort statement is
r(A)
used at the end of the transaction
to permanently record the values A=A-50;
of the data items in database. r(A)
A=A+500;
Unless particularly needed, we
w(A)
usually do not explicitly write it
abort
in our notation, but we should!
w(A)
commit
Recoverable Schedules
T2 has read a data written interim by T1 T2
T1. begin begin
T2 should be rolled back due to rolling r(A)
back of T1, as T2 has read a data A=A-50;
written by T1. w(A)
But T2 cannot be rolled back after T1 r(A)
fails at it has already committed. A=A+500;
w(A)
This schedule is not recoverable as one
participating transaction cannot be r(B)
rolled back if one of the transactions B=B+50;
Failure
abort. w(B)
If the transaction reading data from
another transaction commits ONLY Commit Commit
after the commit of the transaction
from which it reads data, the schedule
can be recoverable.
Recoverable Schedules
T1 T2 T1 T2
begin begin begin begin
r(A) r(A)
A=A-50; A=A-50;
w(A) w(A)
r(A) r(A)
A=A+500; A=A+500;
w(A) w(A)
Commit r(B)
r(B) B=B+50;
Failure
B=B+50; w(B)
Failure Commit
w(B)
Commit Commit
Non-Recoverable Recoverable
Cascadeless Schedules
A bit stricter condition! T1 T2 T3
begin begin begin
One transaction reads only those data
ALREADY committed by another r(A)
transaction. w(A)
Commit
No need of cascading of rollback in this
case. r(A)
w(A)
Cascadeless -> Recoverable
Commit
r(A)
w(A)
Commit
Strict Schedules
A further bit stricter condition! T1 T2 T3
begin begin begin
One transaction reads or writes only
those data ALREADY committed by r(A)
another transaction. w(A)
Commit
No need of cascading of rollback in this
case too! r(A)
w(A)
Strict -> Cascadeless
Commit
r(A)
w(A)
Commit
Non-Recoverable Recoverable Cascadeless Strict
T1 T2 T1 T2 T1 T2 T1 T2
begin begin begin begin begin begin begin begin
r(A) r(A) r(A) r(A)
r(A) w(A) w(A) w(A)
w(A) r(A) Commit Commit
Commit w(A) r(A) r(A)
w(A) w(A) w(A)
Commit Commit Commit Commit
Commit
Equivalent Schedules
Two schedules can be compared and they can be said
‘equivalent’ if they follow certain rules.
Result equivalent
Conflict equivalent
Result Equivalent Schedules
Two schedules are said to be result Initially A = 100
equivalent if they produce same S1 S2
database state for a given database state r(A) r(A)
as input.
A=A+10; A=A*1.1;
w(A) w(A)
The two schedules are result equivalent
if the initial state is A=100.
Result Equivalent Schedules
Initially X = 2, Y=5
T1 T2 T1 T2
S1 S2
r(X) r(X)
X=X+5; X=X*3;
w(X) w(X)
r(X) r(X)
X=X*3; X=X+5;
w(X) w(X)
r(Y) r(Y)
Y=Y+5; Y=Y+5;
w(Y) w(Y)
Finally X = 21, Y=10 Finally X = 11, Y=10
S1 and S2 are not result equivalent with respect to the given input state.
Conflicts
Conflicting actions are any combination of
the following: Ti Tj
r(A) by Ti and w(A) by Tj r(A) r(A)
w(A) by Ti and r(A) by Tj A=A+10; A=A*1.1;
w(A) by Ti and w(A) by Tj w(A) w(A)
Non-conflicting actions are:
r(A) by Ti and r(A) by Tj
Operation on two different data items
Conflict Equivalent Schedules
For two schedules S1 and S2 to be conflict equivalent,
they should have same conflicts and in same order.
Conflict Equivalent Schedules
T1 T2 T1 T2
S1 S2
r(A) r(A)
w(A) w(A)
r(A) r(A)
w(A) r(B)
r(B) w(A)
w(B) w(B)
Are S1 and S2 conflict equivalent?
Conflict Equivalent Schedules
T1 T2
S1
r(A)
Conflicts from S1 are:
w(A)
r(A)|T1 - w(A)|T2
r(A)
w(A)|T1 - r(A)|T2
w(A)
w(A)|T1 - w(A)|T2
r(B)
w(B)
Conflict Equivalent Schedules
T1 T2
S2
r(A)
Conflicts from S2 are:
w(A)
r(A)|T1 - w(A)|T2
r(A)
w(A)|T1 - r(A)|T2
r(B)
w(A)|T1 - w(A)|T2
w(A)
w(B)
Conflict Equivalent Schedules
Conflicts from S1 are: Conflicts from S2 are:
r(A)|T1 - w(A)|T2 r(A)|T1 - w(A)|T2
w(A)|T1 - r(A)|T2 w(A)|T1 - r(A)|T2
w(A)|T1 - w(A)|T2 w(A)|T1 - w(A)|T2
As conflicts in S1 and S2 are same and in same order, they are conflict
equivalent.
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ?
S1: R1(A), R2(B),W1(A), W2(B)
S2: R2(B), R1(A), W2(B), W1(A)
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ? YES
S1: R1(A), R2(B),W1(A), W2(B)
S2: R2(B), R1(A), W2(B), W1(A)
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ?
S1: R1(A), W1(A), R2(B), W2(B), R1(B)
S2: R1(A), W1(A) , R1(B), R2(B), W2(B)
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ? NO
S1: R1(A), W1(A), R2(B), W2(B), R1(B)
S2: R1(A), W1(A) , R1(B), R2(B), W2(B)
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ?
S1: R2(A), R1(A), W2(B), W1(B)
S2: R1(A), R1(A) , W2(B), W1(B)
Conflict Equivalent Schedules
Are S1 and S2 conflict equivalent?
Transaction numbers are written as subscripts
c
Is S1 = S2 ? YES
S1: R2(A), R1(A), W2(B), W1(B)
S2: R1(A), R1(A) , W2(B), W1(B)
“Conflict equivalence” implies “View /
Result Equivalence”
Conflict Serializable Schedules
If a non-serial schedule S1 is conflict equivalent to a serial schedule
S2, then S1 is termed as conflict serializable.
Conflict Serializable Schedules
To summarize what we learnt till now,
Two schedules are conflict equivalent if:
Involve the same actions of the same transactions
Every pair of conflicting actions is ordered the
same way
Schedule S is conflict serializable if S is
conflict equivalent to some serial schedule
Testing conflict serializability
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
Step 1:
Start drawing a graph, which is called as precedence
graph. Create a node Ti for each participating transaction
in the schedule S under test.
T1 T2
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
Step 2:
For each case where Tj executes R(X) after Ti executing
W(X), we draw an edge from Ti to Tj.
A,B
T1 T2
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
Step 3:
For each case where Tj executes W(X) after Ti executing
R(X), we draw an edge from Ti to Tj.
A,B
T1 T2
A,B
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
Step 4:
For each case where Tj executes W(X) after Ti executing
W(X), we draw an edge from Ti to Tj.
A,B
T1 T2
A,B
T1: R(A), W(A), R(B), W(B),
T2: R(A), W(A), R(B), W(B), W(B)
Step 5:
Cycle in precedence graph -> NOT conflict serializable
NO cycle in precedence graph -> conflict serializable
T1 T2 NOT conflict serializable
•Consider the following two statements about database transaction schedules:
•I. Strict two-phase locking protocol generates conflict serializable schedules
that are also recoverable.
•II. Timestamp-ordering concurrency control protocol with Thomas’ Write Rule
can generate view serializable schedules that are not conflict serializable.
Which of the above statements is/are TRUE?
A. I only
B. II only
C. Both I and II
D. Neither I nor II
GATE 2019
•Consider the following two statements about database transaction schedules:
•I. Strict two-phase locking protocol generates conflict serializable schedules
that are also recoverable.
•II. Timestamp-ordering concurrency control protocol with Thomas’ Write Rule
can generate view serializable schedules that are not conflict serializable.
Which of the above statements is/are TRUE?
A. I only
Explanation:
Following Strict 2-PL ensures that our
B. II only schedule is: serializablity, Recoverable,
and Cascadeless. So statement (I) is true.
Thomas Write Rule for concurrency control
C. Both I and II does not enforce Conflict Serializablity, so
statement (II) is also true. Option (C) is
correct.
D. Neither I nor II
GATE 2019
Consider a schedule of transactions T1 and T2:
Here, RX stands for “Read(X)” and WX stands for “Write(X)”. Which one of
the following schedules is conflict equivalent to the above schedule ?
A.
B.
C.
D.
GATE 2020
Consider a schedule of transactions T1 and T2:
Here, RX stands for “Read(X)” and WX stands for “Write(X)”. Which one of
the following schedules is conflict equivalent to the above schedule ?
A.
B.
C.
D.
GATE 2020
Consider the following two schedules
Which among the following is correct option?
A. Both S1 and S2 are Conflict Serializability Schedule.
B. Only S1 is Conflict Serialiazble.
C. Only S2 is Conflict Serializable.
D. Neither S1 and nor S2 is Conflict Serializable.
GATE 2021
Consider the following two schedules
Which among the following is correct option?
A. Both S1 and S2 are Conflict Serializability Schedule.
B. Only S1 is Conflict Serialiazble.
C. Only S2 is Conflict Serializable.
D. Neither S1 and nor S2 is Conflict Serializable.
GATE 2021
Let S be the following schedule of operations of three transactions T1, T2
and T3 in a relational database system:
R2(Y),R1(X),R3(Z),R1(Y)W1(X),R2(Z),W2(Y),R3(X),W3(Z)
Consider the statements P and Q below:
P: S is conflict-serializable.
Q: If T3 commits before T1 finishes, then S is recoverable.
Which among the following is correct option?
A. Both P and Q are true
B. P is true and Q is false
C. P is false and Q is true
D. Both P and Q are false
GATE 2021
Let S be the following schedule of operations of three transactions T1, T2
and T3 in a relational database system:
R2(Y),R1(X),R3(Z),R1(Y)W1(X),R2(Z),W2(Y),R3(X),W3(Z)
Consider the statements P and Q below:
P: S is conflict-serializable.
Q: If T3 commits before T1 finishes, then S is recoverable.
Which among the following is correct option?
A. Both P and Q are true
B. P is true and Q is false
C. P is false and Q is true
D. Both P and Q are false
GATE 2021