Unit-5: Transactions
Unit-5: Transactions
Unit-5
Transactions
✓ Outline
Looping
• What is transaction?
• ACID property
• Transaction State Diagram \ State Transition
Diagram
• Schedule
• Two phase commit protocol
• Database recovery
• Concurrency
• Deadlock
What is transaction?
Section – 1
What is transaction?
• A transaction is a sequence of operations performed as a single logical unit of
work.
Want to transfer Rs. 50 from Account-A to
• A transaction is a logical unit of work that contains one or more SQL
Account-B
statements.
• Example of transaction: Works as a single
logical unit
read (A)
A = A – 50
Transaction write (A) Operations
read (B)
B = B + 50
write (B)
ACID property
Section – 2
ACID properties of transaction
• Atomicity (Either transaction execute 0% or 100%)
• Consistency (Database must remain in a consistent state after any
transaction)
• Isolation (Intermediate transaction results must be hidden from other
concurrently executed transactions)
• Durability (Once a transaction completed successfully, the changes it has
made into the database should be permanent)
ACID properties of transaction (Atomicity)
• This property states that a transaction must be
treated as an atomic unit, that is, either all of its 0%
operations are executed or none.
read (A)
• Either transaction execute 0% or 100%.
A = A – 50
• For example, consider a transaction to transfer Rs. 50 write (A)
from account A to account B. read (B)
FAIL
TheThis
Discoveris that
state the initial
afternormal state.
executionhas
the transaction canbeen
no longer
rolledproceed.
back and the
Thea transaction
Once
database has been stays
transaction in be
cannot
restoredthis state while
tocompleted,
its state it istochanges
any
prior executing.
that
the start of
it made
the must be undone rolling it back.
transaction.
Transaction State Diagram \ State Transition Diagram
• Active
• This is the initial state.
• The transaction stays in this state while it is executing.
• Partial Committed
• When a transaction executes its final operation/ instruction, it is said to be in a partially committed state.
• Failed
• Discover that normal execution can no longer proceed.
• Once a transaction cannot be completed, any changes that it made must be undone rolling it back.
• Committed
• The transaction enters in this state after successful completion of the transaction (after committing
transaction).
• We cannot abort or rollback a committed transaction.
• Aborted
• The state after the transaction has been rolled back and the database has been restored to its state prior
to the start of the transaction.
Schedule
Section – 4
What is schedule?
• A schedule is a process of grouping the transactions into one and
executing them in a predefined order.
• A schedule is the chronological (sequential) order in which
instructions are executed in a system.
• A schedule is required in a database because when some
transactions execute in parallel, they may affect the result of the
transaction.
• Means if one transaction is updating the values which the other
transaction is accessing, then the order of these two transactions
will change the result of another transaction.
• Hence a schedule is created to execute the transactions.
Example of schedule
Schedule Schedule Execution
T1 T2 A=B=1000
Read (A) Read (1000)
A = A - 50 A = 1000 - 50
Write (A) Write (950)
Read (B) Read (1000)
B = B + 50 B = 1000 + 50
Write (B) Write (1050)
Commit Commit
Read (A) Read (950)
temp = A * 0.1 temp = 950 * 0.1
A = A - temp A = 950 - 95
Write (A) Write (855)
Read (B) Read (1050)
B = B + temp B = 1050 + 95
Write (B) Write (1145)
Commit Commit
Example of schedule
Schedule Schedule Execution
T1 T2 A=B=1000
Read (A) Read (1000)
Temp = A * 0.1 Temp = 1000 * 0.1
A = A - temp A = 1000 - 100
Write (A) Write (900)
Read (B) Read (1000)
B = B + temp B = 1000 + 100
Write (B) Write (1100)
Commit Commit
Read (A) Read (900)
A = A - 50 A = 900 - 50
Write (A) Write (850)
Read (B) Read (1100)
B = B + 50 B = 1100 + 50
Write (B) Write (1150)
Commit Commit
Serial schedule
• A serial schedule is a schedule in which no transaction starts
until a running transaction has ended.
• A serial schedule is a schedule in which one transaction is
executed completely before starting another transaction.
• Transactions are executed one after the other.
• This type of schedule is called a serial schedule, as
transactions are executed in a serial manner.
Example of Serial Schedule
Serial Schedule Serial Schedule
T1 T2 T1 T2
Read (A) Read (A)
A = A - 50 A = A - 50
Write (A) Write (A)
Read (B) Read (B)
B = B + 50 B = B + 50
Write (B) Write (B)
Commit Commit
Read (A) Read (A)
temp = A * 0.1 temp = A * 0.1
A = A - temp A = A - temp
Write (A) Write (A)
Read (B) Read (B)
B = B + temp B = B + temp
Write (B) Write (B)
Commit Commit
Non-serial Schedule (Interleaved Schedule)
• Schedule that interleave the execution of different
transactions.
• Means second transaction is started before the first one
could end and execution can switch between the transactions
back and forth.
• It contains many possible orders in which the system can
execute the individual operations of the transactions.
Example of Non-serial Schedule (Interleaved Schedule)
Non-serial Schedule Non-serial Schedule
T1 T2 T1 T2
Read (A) Read (A)
A = A - 50 A = A - 50
Write (A) Write (A)
Read (A) Read (A)
temp = A * 0.1 temp = A * 0.1
A = A - temp A = A - temp
Write (A) Write (A)
Read (B) Read (B)
B = B + 50 B = B + 50
Write (B) Write (B)
Commit Commit
Read (B) Read (B)
B = B + temp B = B + temp
Write (B) Write (B)
Commit Commit
Equivalent Schedule
• If two schedules produce the same result after execution,
they are said to be equivalent schedule.
• They may yield the same result for some value and different
results for another set of values.
• That's why this equivalence is not generally considered
significant.
Equivalent Schedule
Schedule-1 (A=B=1000) Schedule-2 (A=B=1000)
T1 T2 T1 T2
Read (A) Read (A)
A = A - 50 A = A - 50
Write (A) Write (A)
Read (A) Read (B)
equivalent
Commit Write (A)
Read (B) Read (B)
preserved.
B = B + temp B = B + 50
Write (B) Write (B)
Commit Commit
Serializability
• A schedule is serializable if it is equivalent to a serial schedule.
• In serial schedules, only one transaction is allowed to execute
at a time i.e. no concurrency is allowed.
• Whereas in serializable schedules, multiple transactions can
execute simultaneously i.e. concurrency is allowed.
• Types (forms) of serializability
• Conflict serializability
• View serializability
Conflicting instructions
• Let li and lj be two instructions of transactions Ti and Tj respectively.
Ti Tj Ti Tj
1. li = read(Q), lj = read(Q) read (Q) read (Q)
li and lj don’t conflict read (Q) read (Q)
Ti Tj Ti Tj
2. li = read(Q), lj = write(Q) read (Q) write(Q)
li and lj conflict write(Q) read (Q)
3. li = write(Q), lj = read(Q) Ti Tj Ti Tj
write(Q) read (Q)
li and lj conflict
read (Q) write(Q)
4. li = write(Q), lj = write(Q) Ti Tj Ti Tj
write(Q) write(Q)
li and lj conflict
write(Q) write(Q)
Conflict serializability
• If a given schedule can be converted into a serial schedule by swapping its
non-conflicting operations, then it is called as a conflict serializable schedule.
Conflict serializability (Example)
T1 T2 T1 T2
Read (A) Read (A)
A = A - 50 A = A - 50
Write (A) Write (A)
Read (A) Read (B)
Temp = A * 0.1 B = B + 50
A = A - temp Write (B)
Write (A) Commit
Read (A)
Read (B) Temp = A * 0.1
B = B + 50 A = A - temp
Write (B) Write (A)
Commit Read (B) Read (B)
B = B + temp B = B + temp
Write (B) Write (B)
Commit Commit
Conflict serializability (Example)
• Example of a schedule that is not conflict serializable:
T1 T2
Read (A)
Write (A)
Read (A)
• Above two schedules S1 and S3 are not view equivalent because initial read
operation in S1 is done by T1 and in S3 it is done by T2.
• Above two schedules S1 and S2 are view equivalent because initial read
operation in S1 is done by T1 and in S2 it is also done by T1.
Updated Read
• If in schedule S1 transaction Ti executes read(Q), and that value was produced
by transaction Tj (if any), then in schedule S2 also transaction Ti must read
the value of Q that was produced by transaction Tj.
S1 S3 S2
T1 T2 T3 T1 T2 T3 T1 T2 T3
Write (A) Write (A) Write (A)
Write (A) Write (A) Read (A)
Read (A) Read (A) Write (A)
• Above two schedules S1 and S3 are not view equal because, in S1, T3 is
reading A that is updated by T2 and in S3, T3 is reading A which is updated by
T1.
• Above two schedules S1 and S2 are view equal because, in S1, T3 is reading A
that is updated by T2 and in S2 also, T3 is reading A which is updated by T2.
Final Write
• If Ti performs the final write on the data value in S1, then it also performs the
final write on the data value in S2.
S1 S3 S2
T1 T2 T3 T1 T2 T3 T1 T2 T3
Write (A) Write (A) Read (A)
Read (A) Write (A) Write (A)
Write (A) Read (A) Write (A)
• Above two schedules S1 and S3 are not view equal because final write
operation in S1 is done by T3 and in S3 final write operation is also done by
T1.
• Above two schedules S1 and S2 are view equal because final write operation
in S1 is done by T3 and in S2 also the final write operation is also done by T3.
View serializable example
• If a schedule is view equivalent to its serial schedule then the given schedule is
said to be view serializable.
Non-Serial Schedule (S1) Serial Schedule (S2)
T1 T2 T1 T2
Read (A) Read (A)
Write (A) Write (A)
Read (A) Read (B)
Write (A) Write (B)
Read (B) Read (A)
Write (B) Write (A)
Read (B) Read (B)
Write (B) Write (B)
• S2 is the serial schedule of S1. If we can prove that they are view equivalent
then we can says that given schedule S1 is view serializable.
View serializable example (Initial Read)
Non-Serial Schedule (S1) Serial Schedule (S2)
T1 T2 T1 T2
Read (A) Read (A)
Write (A) Write (A)
Read (A) Read (B)
Write (A) Write (B)
Read (B) Read (A)
Write (B) Write (A)
Read (B) Read (B)
Write (B) Write (B)
• In schedule S1, transaction T1 first reads the data item X. In S2 also transaction T1 first
reads the data item X.
• In schedule S1, transaction T1 first reads the data item Y. In S2 also the first read
operation on Y is performed by T1.
• The initial read condition is satisfied for both the schedules.
View serializable example (Updated Read)
Non-Serial Schedule (S1) Serial Schedule (S2)
T1 T2 T1 T2
Read (A) Read (A)
Write (A) Write (A)
Read (A) Read (B)
Write (A) Write (B)
Read (B) Read (A)
Write (B) Write (A)
Read (B) Read (B)
Write (B) Write (B)
• In schedule S1, transaction T2 reads the value of X, written by T1. In S2, the same
transaction T2 reads the X after it is written by T1.
• In schedule S1, transaction T2 reads the value of Y, written by T1. In S2, the same
transaction T2 reads the value of Y after it is updated by T1.
• The updated read condition is also satisfied for both the schedules.
View serializable example (Final Write)
Non-Serial Schedule (S1) Serial Schedule (S2)
T1 T2 T1 T2
Read (A) Read (A)
Write (A) Write (A)
Read (A) Read (B)
Write (A) Write (B)
Read (B) Read (A)
Write (B) Write (A)
Read (B) Read (B)
Write (B) Write (B)
• In schedule S1, the final write operation on X is done by transaction T2. In S2 also
transaction T2 performs the final write on X.
• In schedule S1, the final write operation on Y is done by transaction T2. In schedule
S2, final write on Y is done by T2.
• The final write condition is also satisfied for both the schedules.
View serializable example
Non-Serial Schedule (S1) Serial Schedule (S2)
T1 T2 T1 T2
Read (A) Read (A)
Write (A) Write (A)
Read (A) Read (B)
Write (A) Write (B)
Read (B) Read (A)
Write (B) Write (A)
Read (B) Read (B)
Write (B) Write (B)
• Since all the three conditions that checks whether the two schedules are view
equivalent are satisfied in this example, which means S1 and S2 are view equivalent.
• Also, as we know that the schedule S2 is the serial schedule of S1, thus we can say
that the schedule S1 is view serializable schedule.
Concurrency
What is concurrency?
• Concurrency is the ability of a database to allow multiple (more than one)
users to access data at the same time.
• Three problems due to concurrency
• Lost update problem
• Dirty read problem
• Incorrect retrieval problem
Lost update problem
• This problem indicate that if two X=100
transactions T1 and T2 both read the same T1 Time T2
data and update it then effect of first --- T0 ---
update will be overwritten by the second Read X T1 ---
update. --- T2 Read X
Update X
• How to avoid: A transaction T2 must not T3 ---
X=75
update the data item (X) until the --- T4
Update X
transaction T1 can commit data item (X). X=50
--- T5 ---
Dirty read problem
• The dirty read arises when one X=100
transaction update some item and T1 Time T2
then fails due to some reason. This --- T0 ---
updated item is retrieved by another Update X
transaction before it is changed back --- T1
X=50
to the original value. Read X T2 ---
• How to avoid: A transaction T1 must --- T3 Rollback
not read the data item (X) until the --- T4 ---
transaction T2 can commit data item
(X).
Incorrect retrieval problem
• The inconsistent retrieval problem Balance (A=200, B=250, C=150)
arises when one transaction retrieves
data to use in some operation but T1 Time T2
before it can use this data another Read (A)
transaction updates that data and T1 ---
commits. Sum ← 200
• Through this change will be hidden Read (B)
from first transaction and it will T2 ---
Sum ← Sum + 250 = 450
continue to use previous retrieved --- T3 Read (C)
data. This problem is also known as
inconsistent analysis problem. Update (C)
--- T4
• How to avoid: A transaction T2 must 150 → 150 – 50 = 100
not read or update data item (X) until --- T5 Read (A)
the transaction T1 can commit data Update (A)
item (X). --- T6
200 → 200 + 50 = 250
--- T7 COMMIT
Read (C)
T8 ---
Sum ←Sum + 100 = 550
Database Recovery
Database recovery
• There are many situations in which a transaction may not reach a commit or
abort point.
• Operating system crash
• DBMS crash
• System might lose power (power failure)
• Disk may fail or other hardware may fail (disk/hardware failure)
• Human error
• In any of above situations, data in the database may become inconsistent or lost.
• For example, if a transaction has completed 30 out of 40 write instructions to the
database when the DBMS crashes, then the database may be in an inconsistent
state as only part of the transaction’s work was completed.
• Database recovery is the process of restoring the database and the data to a
consistent state.
• This may include restoring lost data up to the point of the event (e.g. system
crash).
Log based recovery method
• The log is a sequence of log records, which
maintains information about update activities on
the database.
• A log is kept on stable storage (i.e HDD).
• Log contains
• Start of transaction
• Transaction-id
• Record-id
• Type of operation (insert, update, delete)
• Old value, new value
• End of transaction that is committed or aborted.
Log based recovery method
• When transaction Ti starts, it registers itself by writing a record
<Ti start> to the log.
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is
written, where V1 is the value of X before the write (the old
value), and V2 is the value to be written to X (the new value).
• When Ti finishes it last statement, the log record <Ti commit> is
written.
• Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
• Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
• Types of log based recovery method
• Immediate database modification
• Deferred database modification
Immediate v/s Deferred database modification
Immediate database modification Deferred database modification
Updates (changes) to the database are Updates (changes) to the database are
applied immediately as they occur without deferred (postponed) until the transaction
waiting to reach to the commit point. commits.
Immediate v/s Deferred database modification
Immediate database modification Deferred database modification
T1
T2
T3
T4
T1
T2
T3
T4
T5
T6
Exercise Give the name of transactions which has already been committed before
checkpoint.
Exercise Give the name of transactions which will perform Redo
operation.
Exercise Give the name of transactions which will perform Undo
operation.
Page table structure
1 2 1
2 1 2
3 4 3
4 101 4
5 202 :
. :
n n 101
Page Table :
• Two pages - page 2 & 5 - are affected by a transaction and copied to new physical
pages. The current page table points to these pages.
• The shadow page table continues to point to old pages which are not changed by
the transaction. So, this table and pages are used for undoing the transaction.
Shadow paging technique
• When transaction start, both the page tables are identical.
• The shadow page table is never changed over the duration of the transaction.
• The current page table will be changed when a transaction performs a write
operation.
• All input and output operations use the current page table.
• Whenever any page is about to be written for the first time
• A copy of this page is made onto an unused page
• The current page table is then made to point to the copy
• The update is performed on the copy
• When the transaction completes, all the modifications which are done by
transaction which are present in current page table are transferred to shadow page
table.
• When the transaction fails, the shadow page table are transferred to current page
table.
What is lock?
• A lock is a variable associated with data item to control concurrent access to
that data item.
Locking is a strategy that is used to
prevent such concurrent access of data.
Lock variable
01
Database
Lock based protocols
• Data items can be locked in two modes :
• Shared (S) mode: When we take this lock we can just read the item but cannot write.
• Exclusive (X) mode: When we take this lock we can read as well as write the item.
• Lock-compatibility matrix
T1
Transaction
Transaction Transaction Time
begin end
Strict two phase locking protocol V/S Rigorous two phase locking
protocol
• Strict two phase locking protocol
• In this protocol, a transaction may release all the shared locks after the Lock Point has
been reached, but it cannot release any of the exclusive locks until the transaction
commits or aborts.
• It ensures that if data is being modified by one transaction, then other transaction
cannot read it until first transaction commits.
• This protocol solves dirty read problem.
R(B) W-TS 0 0 0
W(B)
W(A)
Deadlock
Section – 8
What is deadlock?
• Consider the following two transactions:
T1 T2
Lock-X (A)
Granted for (A)
Write (A)
Lock-X (B)
Granted for (B)
Write (B)
Lock-X (A)
Waiting for (A)
Write (A)
Lock-X (B)
Waiting for (B)
Write (B)
CK
• This wait-for graph has no cycle, so there is no deadlock
LO
AD
state.
DE
C
• Suppose now that transaction D is requesting an item
held by C. Then the edge D → C is added to the
wait-for graph.
• Now this graph contains the cycle.
•B → D → C → B
• It means that transactions B, D and C are all
deadlocked.
Deadlock recovery
• When a deadlock is detected, the system must recover from the
deadlock.
B D
• The most common solution is to roll back one or more
transactions to break the deadlock.
A
CK
• Choosing which transaction to abort is known as victim selection.
LO
AD
• In this wait-for graph transactions B, D and C are deadlocked.
DE
• In order to remove deadlock one of the transaction out of these C
three (B, D, C) transactions must be roll backed.
• We should rollback those transactions that will incur the
minimum cost.
• When a deadlock is detected, the choice of which transaction to
abort can be made using following criteria:
• The transaction which have the fewest locks
• The transaction that has done the least work
• The transaction that is farthest from completion
Deadlock prevention
• A protocols ensure that the system will never enter into a deadlock state.
• Some prevention strategies :
• Require that each transaction locks all its data items before it begins execution
(pre-declaration).
• Impose partial ordering of all data items and require that a transaction can lock data
items only in the order specified by the partial.
Deadlock prevention
• Following schemes use transaction timestamps for the sake of deadlock
prevention alone.
1. Wait-die scheme — non-preemptive
• If an older transaction is requesting a resource which is held by younger transaction,
then older transaction is allowed to wait for it till it is available.
• If an younger transaction is requesting a resource which is held by older transaction,
then younger transaction is killed.
2. Wound-wait scheme — preemptive
• If an older transaction is requesting a resource which is held by younger transaction,
then older transaction forces younger transaction to kill the transaction and release the
resource. Wait-die Wound-wait
• If anO younger transaction
needs a resource held byis requesting
O waitsa resource which
Y diesis held by older transaction,
then Y younger transaction is allowed to wait till older transaction will releases it.
Y needs a resource held by Y dies Y waits
O
Deadlock prevention
• Following schemes use transaction timestamps for the sake of deadlock
prevention alone.
3. Timeout-Based Schemes
• A transaction waits for a lock only for a specified amount of time. After that, the wait
times out and the transaction is rolled back. So deadlocks never occur.
• Simple to implement; but difficult to determine good value of the timeout interval.
Database Management Systems (DBMS)
Than
k
You