[go: up one dir, main page]

0% found this document useful (0 votes)
14 views62 pages

Unit 4 CH1 Transactions

The document discusses transactions in databases, defining them as sequences of actions that form a logical unit of work, and introduces the ACID properties (Atomicity, Consistency, Isolation, Durability) essential for maintaining data integrity. It explains transaction states, the importance of serial and non-serial schedules, and the challenges of concurrent execution, including anomalies like lost updates and dirty reads. Additionally, it covers serializability concepts, including conflict and view serializability, which ensure that concurrent transactions do not compromise database consistency.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views62 pages

Unit 4 CH1 Transactions

The document discusses transactions in databases, defining them as sequences of actions that form a logical unit of work, and introduces the ACID properties (Atomicity, Consistency, Isolation, Durability) essential for maintaining data integrity. It explains transaction states, the importance of serial and non-serial schedules, and the challenges of concurrent execution, including anomalies like lost updates and dirty reads. Additionally, it covers serializability concepts, including conflict and view serializability, which ensure that concurrent transactions do not compromise database consistency.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 62

UNIT IV

Transactions
Introduction:
Sequence of actions that form a single logical unit of work are called transactions.

It is an execution of a user program seen by DBMS as the series of read(access) and


write(change) operations

Transaction basically represents the change in Database


E.g., transaction to transfer $50 from account A to account B:

1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
ACID Properties
To ensure integrity of the data, we require the database
system maintain the following properties of the
transactions:

Atomicity.
Consistency
Isolation
Durability

These properties are often called the ACID properties; the


acronym is derived from the first letter of each of the four
properties.
ACID properties:

Atomicity: Either all operations of the transaction are reflected properly in the
database, or none are.
Example: Transferring money between accounts either completes fully or
doesn't happen at all.

Consistency: Guarantees that the database remains in a valid state before


and after the transaction.
Example: Total balance remains unchanged during money transfers.

Isolation:Ensures that transactions are executed independently without


interference from other concurrent transactions.
Example: Two users withdrawing money at the same time won't affect each
other’s transaction.

Durability: Ensures that once a transaction is committed, the changes are


permanent even in the case of a system failure.
Example: After a successful payment, the transaction details are saved
securely.
Transaction State:

A transaction must be in one of the following states:

• Active, the initial state; the transaction stays in this state while it is
starting execution.

• Partially committed, after the final statement has been executed.

• Failed, after the discovery that normal execution can no longer proceed.

• Aborted, abnormal termination

• Committed, after successful completion.


Transaction State:
Consistency & Transaction:
Users are responsible for the transaction
consistency i.e., the user who submits a
transaction must ensure that when run to
completion by itself against a consistent DB .
The transfer will leave the DB in a consistent
state.
Ex: Funds Transfer
Isolation & Transaction:
The isolation property is ensured by
guaranteeing that even though the actions in
several transactions might interleave , the
next effect is identical to executing all
transactions one after the other in some serial
order.
Atomicity & Durability:
Transaction can be incomplete for 3 reasons.
1.A transaction can be aborted or terminated
unsuccessfully by the DBMS because some
anomaly arise during execution
2.The system may crash when 1 or more
transactions are in progress.
3.The transaction may encounter an unexpected
situation
• The DBMS ensures transaction atomicity by undoing
the action of incomplete transaction. This means that
users can ignore incomplete transaction.
• The DBMS maintains a record called the ‘log’ of all
writes to the database.
• The ‘log’ also ensures durability if system crashes
before the changes made by a completed transaction
are written to the disk. The log is used to remember
and restore the changes when the system restarts
• The DBMS component that ensure atomicity &
durability is called the recovery manager.
Implementation of Atomicity and Durability:

• The recovery-management component of a database


system implements the support for atomicity and durability.
• The shadow-database scheme:
– assume that only one transaction is active at a time.
– a pointer called db_pointer always points to the current
consistent copy of the database.
– all updates are made on a shadow copy of the
database, and db_pointer is made to point to the
updated shadow copy only after the transaction
reaches partial commit and all updated pages have
been flushed to disk.
– in case transaction fails, old consistent copy pointed to
by db_pointer can be used, and the shadow copy can
be deleted.
Implementation of Atomicity and Durability:
Transaction & Schedules :
• A transaction in DBMS is seen as a series or list of actions.
• The actions that can be executed by a transaction includes read
and write of DB objects.
• For example, Object O is a program variable:
R(O) : action of transaction T which is reading an object O.
W(O) : action of a transaction T writing an object O
Abort : denotes the action of transaction aborting
Commit: denotes transaction committing
SCHEDULE
• A schedule is a list of actions from a set of transactions (The
actions include write, read , commit, abort)
• The order in which the actions of a transaction appears in a
schedule must be same as the order in transaction
Transaction Schedule
A B
read(A) read(B) read(A) read(B)
A= A-50 B=B+50 write(A) write(B)
write(A) write(B)
This schedule does not contain an abort or commit for each
transaction
• If the actions of different transactions are executed from start to
finish one by one then this schedule is called Serial Schedule

• A schedule that contains abort or commit for each transaction


whose actions are listed in it is called complete schedule.

• If the action of different transactions are interleaved then it is


called as concurrent/ parallel schedule
Serial Schedule
If the actions of different transactions are executed from start to
finish one by one then this schedule is called Serial Schedule
For example: Suppose there are two transactions T1 and T2
which have some operations. If it has no interleaving of
operations, then there are the following two possible
outcomes:
Execute all the operations of T1 which was followed by all the
operations of T2.
Execute all the operations of T2 which was followed by all the
operations of T1.
Serial Schedule:

serial schedule in which serial schedule in which


T1 is followed by T2 T2 is followed by T1
Non-serial Schedule
• If interleaving of operations is allowed, then
there will be non-serial schedule.
• It contains many possible orders in which the
system can execute the individual operations of
the transactions.
Serializable schedule
• The serializability of schedules is used to find non-
serial schedules that allow the transaction to execute
concurrently without interfering with one another.
• It identifies which schedules are correct when
executions of the transaction have interleaving of
their operations.
• A non-serial schedule will be serializable if its result
is equal to the result of its transactions executed
serially.
• A schedule is serializable when a concurrent
schedule is transformed to a serial schedule
Concurrent Executions:

The database system must control the interaction among the


concurrent transactions to prevent them from destroying the
consistency of the database. It does so through a variety of
mechanisms called concurrency-control schemes.

Concurrency control schemes – mechanisms to achieve


isolation.
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.
Concurrent Executions:

a concurrent schedule
Concurrent Executions:
We consider only two operations: read and write.

We assume that, between a read(Q) instruction and a write(Q)


instruction on a data item Q, a transaction may perform an arbitrary
sequence of operations on the copy of Q that is residing in the local
buffer of the transaction. Thus, the only significant operations of a
transaction, from a scheduling point of view, are its read and write
instructions.
CONCURRENT EXECUTION OF TRANSACTIONS

Advantages of concurrent execution of a transaction


•Decrease waiting time or turnaround time.
•Improve response time
•Increased throughput or resource utilization.
Anomalies due to Interleaved Execution

There are following different types of problems


or conflicts which occur due to concurrent
execution of transaction:
1.Lost update problem (Write – Write conflict)
2.Dirty Read Problem (Write- Read Conflict)
3.Unrepeatable Read Problem(Read-Write
Conflict)
1. Lost update problem
(Write – Write
conflict)
This type of problem occurs Transaction Transaction
when two transactions in Time
T1 T2
database access the same t1 Read(A)
data item and have their
t2 A=A-50
operations in an
interleaved manner that t3 Read(A)
makes the value of some t4 A=A+50
database item incorrect. t5 Write(A)
If there are two transactions t6 Write(A)
T1 and T2 accessing the
same data item value and
then update it, then the
second record overwrites
the first record.
Here, Let’s take the value of A is
100
• At t1 time, T1 transaction reads the value of A i.e., 100.
• At t2 time, T1 transaction deducts the value of A by 50.
• At t3 time, T2 transactions read the value of A i.e., 100.
• At t4 time, T2 transaction adds the value of A by 150.
• At t5 time, T1 transaction writes the value of A data item on the basis of
value seen at time t2 i.e., 50.
• At t6 time, T2 transaction writes the value of A based on value seen at
time t4 i.e., 150.
• So at time T6, the update of Transaction T1 is lost because Transaction T2
overwrites the value of A without looking at its current value.
• Such type of problem is known as the Lost Update Problem.
2. Dirty Read
Transaction
Problem (Write- Time
T1
Transaction T2
Read Conflict) t1 Read(A)
This type of problem t2 A=A+20
occurs when one t3 Write(A)
transaction T1 updates t4 Read(A)
a data item of the t5 A=A+30
database, and then
t6 Write(A)
that transaction fails
due to some reason, Write(A) –
fails due to
but its updates are t7
server
accessed by some down
other transaction.
Here, Let’s take the value of A is
100
• At t1 time, T1 transaction reads the value of A i.e., 100.
• At t2 time, T1 transaction adds the value of A by 20.
• At t3 time, T1transaction writes the value of A (120) in the database.
• At t4 time, T2 transactions read the value of A data item i.e., 120.
• At t5 time, T2 transaction adds the value of A data item by 30.
• At t6 time, T2transaction writes the value of A (150) in the database.
• At t7 time, a T1 transaction fails due to power failure then it is rollback
according to atomicity property of transaction (either all or none).
• So, transaction T2 at t4 time contains a value which has not been
committed in the database. The value read by the transaction T2 is
known as a dirty read.
3. Unrepeatable Read
Problem(Read-Write
Transaction Transaction
Conflict) Time
T1 T2
It is also known as an t1 Read(A)
inconsistent retrieval t2 Read(A)
problem. If a transaction t3 A=A+30
T1 reads a value of data item t4 Write(A)
twice and the data item is t5 Read(A)
changed by another
transaction T2 in between
the two read operations.
Hence T1 access two
different values for its two
read operations of the same
data item.
Here, Let’s take the value of A is
100
• At t1 time, T1 transaction reads the value of A i.e., 100.
• At t2 time, T2transaction reads the value of A i.e., 100.
• At t3 time, T2 transaction adds the value of A data item by
30.
• At t4 time, T2 transaction writes the value of A (130) in the
database.
• Transaction T2 updates the value of A. Thus, when another
read statement is performed by transaction T1, it accesses
the new value of A, which was updated by T2. Such type of
conflict is known as R-W conflict.
Serializability:

• The database system must control concurrent execution of


transactions, to ensure that the database state remains consistent.

• Each transaction must preserve database consistency.

• Serial execution of a set of transactions preserves database


consistency.

• A (possibly concurrent) schedule is serializable if it is equivalent to a


serial schedule.

• Different forms of schedule equivalence give rise to the notions of:

1. Conflict serializability

2. View serializability
Conflict Serializability:

Let us consider a schedule S in which there are two consecutive


instructions Ii and Ij, of transactions Ti and Tj , respectively (i != j).

If Ii and Ij refer to different data items, then we can swap Ii and Ij without
affecting the results of any instruction in the schedule.
If Ii and Ij refer to same data item Q, then the order of two steps may matter.

R(A) R(A)  Non Conflict Pair

R(A) W(A)
W(A) R(A)  Conflict Pair
W(A) W(A)

R(A) W(B)
R(B) W(A)
W(B) R(A)  Non Conflict Pair
W(A) R(B)
W(B) W(A)
R(B) R(A)
Conflict Serializability:

There are four cases that we need to consider:

1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not


matter.
2. Ii = read(Q), Ij = write(Q). The order of Ii and Ij matters.
3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters.
4. Ii = write(Q), Ij = write(Q). The order of Ii and Ij matters.

• Thus, only in the case where both Ii and Ij are read


instructions does the relative order of their execution not
matter.

• We say that Ii and Ij conflict if they are operations by


different transactions on the same data item, and at least one
of these instructions is a write operation.
Conflict Serializability:

If a schedule S can be transformed into a schedule S’ by a series of swaps


of non conflicting instructions, we say that S and S’ are conflict
equivalent.

A schedule S is conflict serializable if it is conflict equivalent to a serial


schedule.

Hint: Check adjacent Non Conflict Pair and then swap the positions
Conflict Serializability: Example1

Is this schedule a
conflict serializable schedule?
Conflict Serializability: Example1

Since the write(A) instruction of T2 does


not conflict with the read(B) instruction
of T1, we can swap these instructions to
generate an equivalent schedule
Conflict Serializability: Example1

Swap the read(B) instruction of T1 with


the read(A) instruction of T2.
Conflict Serializability: Example1
Swap the write(B) instruction of T1
with the write(A) instruction of T2.

Swap the write(B) instruction of T1


with the read(A) instruction of T2.
Conflict Serializability: Example1

Swap
Swap
Swap
Swap

Thus, the concurrent schedule is conflict serializable, since it is


conflict equivalent to the serial schedule.
Conflict Serializability: Example2

T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)

T1 T2 T1 T2
R(A) R(A)
W(A) W(A)
R(A) R(B)
R(B) R(A)
W(A) W(A)

So, The final schedule is equivalent to a serial schedule


Conflict Serializability: Example3

Is this schedule a
conflict serializable schedule?

NO!!
Conflict Serializability: Example1

Is this schedule a
conflict serializable schedule?

NO!!
Conflict Serializability: Example4

Is this schedule a
conflict serializable schedule?

NO!!
View Serializability:

Consider two schedules S and S’, where the same


set of transactions participates in both schedules.
The schedules S and S’ are said to be view
equivalent if three conditions are met:
1. If Ti reads the initial value of object A in S1, it
must also read the initial value of A in S2.
2. If Ti reads a value of A written by Tj in S1, it must
also read the value of A written by Tj in S2.
3. For each data object A, the transaction (if any)
that performs the final write on A in S1 must also
perform the final write on A in S2.
commit

commit

CHECK IF IT IS CONFLICT SERIALIZABLE ??


NO!!!
TO CHECK IF WE HAVE ANY BLIND WRITES IN A
TRANSACTION ???
-a blind write occurs when a transaction writes a value
without reading it.
YES!!!
THEN IT IS VIEW SERIALIZABLE
• A schedule is view serializable if it is view equivalent to Same serial
schedule.
• Every conflict serializable schedule is view serializable, although
the converse is not true.
For example, the schedule shown in Figure 17.1 is view serializable,
although it is not conflict serializable. Incidentally, note that this
example contains blind writes.

• This is not a coincidence; it can be shown that any view serializable


schedule that is not conflict serializable contains a blind write.

• Enforcing or testing view serializability turns out to be much more


expensive, and the concept therefore has little practical use,
although it increases our understanding of serializability.
View Serializability: Example2

Is this schedule a
view serializable schedule?
YES!!
Precedence Graph(Testing of Serializability)
• Precedence Graph or Serialization Graph is used
commonly to test Conflict Serializability of a schedule.
• Assume a schedule S. For S, we construct a graph
known as precedence graph.
• This graph has a pair G = (V, E), where V consists a
set of vertices, and E consists a set of edges. The
set of vertices is used to contain all the transactions
participating in the schedule.
• The set of edges is used to contain all edges Ti ->Tj
for which one of the three conditions holds:
1. Create a node Ti → Tj if Ti executes write (Q) before
Tj executes read (Q).
2. Create a node Ti → Tj if Ti executes read (Q) before
Tj executes write (Q).
3. Create a node Ti → Tj if Ti executes write (Q) before
• If a precedence graph contains a single edge Ti
→ Tj, then all the instructions of Ti are executed
before the first instruction of Tj is executed.
• If a precedence graph for schedule S contains a
cycle, then S is non-serializable.
• If the precedence graph has no cycle, then S is
known as serializable.
Precedence Graph: Examples
Precedence Graph: Examples
Example: To check whether a schedule is conflict
serializable or NOT ?
T1 T2 T3 Precedence Graph
R(X)
R(y) T2
R(X)
R(Y)
T1 T3
R(Z)
- Check the conflict pair in other transaction and
W(Y) draw edge
W(Z) - Check if there is a loop/cycle in graph ?
R(Z) NO!!!.
W(X) Means this is a conflict serializable
W(Z) Then it can be serializable
It is consistent too.

Check the conflict pair


in other transaction
and draw edges
Check graph whose vertex indegree = 0 T1T2T3
T2 is not having any edge pointing T1T3T2
T2T3T1
Remove T2
T2T1T3
Check Indegree for T3 T3T2T1
Remove T3 T3T1T2
So, the possible order T2  T3 T1
T1 T2 T3
R(Y)
R(Z)
W(Z)
R(Y)
R(X)
W(Y)
R(Z)
W(X)
W(Z)
Recoverability:
Recoverable Schedule:
A recoverable schedule is one where, for each pair of
transactions Ti and Tj such that Tj reads a data
item previously written by Ti, the commit operation
of Ti appears before the commit operation of Tj .
Irrecoverable Schedule:
The table below shows a schedule with two
transactions, T1 reads and writes A and that value T1 T2
is read and written by T2. T2 commits. But later R(A)
on, T1 fails. So we have to rollback T1. Since T2 A=A-5
has read the value written by T1, it should also be W(A)
rollbacked. But we have already committed that. R(A)
So this schedule is irrecoverable schedule. When A=A-2
Tj is reading the value updated by Ti and Tj is W(A)
committed before committing of Ti, the schedule commit
R(B)
will be irrecoverable.
*fails
CASCADING & CASCADELESS SCHEDULE
Recoverable with Cascading Rollback:
• The table below shows a schedule with two
transactions, T1 reads and writes A and that value is
read and written by T2. But later on, T1 fails. So we
have to rollback T1. Since T2 has read the value written
by T1, it should also be rollbacked. As it has not
committed, we can rollback T2 as well. So it is
recoverable with cascading rollback. Therefore, if Tj is
reading value updated by Ti and commit of Tj is
delayed till commit of Ti, the schedule is called
recoverable with cascading rollback.
• The phenomenon, in which a single transaction
failure leads to a series of transaction rollbacks, is
called cascading rollback.
Disadv:
•Reduced
Performance Acc to Atomicity property the transaction has to be rolled
•CPU Utilization back if it fails. Thus T2 also should be rolled back as it has
wasted read a value that was updated by T1 caused by Dirty Read
Recoverable with Cascadeless Rollback:
• If a schedule with two transactions, T1 reads and writes A and
commits and that value is read by T2. But if T1 fails before
commit, no other transaction has read its value, so there is no
need to rollback other transaction. So this is a Cascadeless
recoverable schedule. So, if Tj reads value updated by Ti only
after Ti is committed, the schedule will be cascadeless
recoverable.
T1 T2
No other transaction allows R(A) till T1
R(A)
commits or aborts
W(A) What if any other transaction writes ?
Instead of Read ??
R(A) There will be a blind write problem there.
fails So, here comes STRICT RECOVERABLE
to the rescue.
STRICT RECOVERABLE:
No other transaction can read /
One more example of Strict
write on a data item till the schedule which isn’t a serial
current transaction is schedule:
committed.
T1 T2
T1 T2 R(A)
R(A) R(B)
W(A) W(A)
W(B)
Commit Commit
R(A) R(A)
W(A) commit
commit
This is strict schedule but it
doesn’t have to be a serial
schedule
Implementation of Isolation
Transaction Isolation Levels in DBMS:
• In order to maintain consistency in a database, it follows ACID
properties. Among these four properties (Atomicity,
Consistency, Isolation and Durability) Isolation determines
how transaction integrity is visible to other users and
systems. It means that a transaction should take place in a
system in such a way that it is the only transaction that is
accessing the resources in a database system.
• Isolation levels define the degree to which a transaction must
be isolated from the data modifications made by any other
transaction in the database system.
A transaction isolation level is defined by the following phenomena –
Dirty Read – A Dirty read is the situation when a transaction reads a data that
has not yet been committed. For example, Let’s say transaction 1 updates
a row and leaves it uncommitted, meanwhile, Transaction 2 reads the
updated row. If transaction 1 rolls back the change, transaction 2 will have
read data that is considered never to have existed.
Non Repeatable read – Non Repeatable read occurs when a transaction
reads same row twice, and get a different value each time. For example,
suppose transaction T1 reads data. Due to concurrency, another
transaction T2 updates the same data and commit, Now if transaction T1
rereads the same data, it will retrieve a different value.
Phantom Read - A phantom read occurs when, in the course of a transaction,
two identical queries are executed, and the collection of rows returned by
the second query is different from the first. Simple examples: User A runs
the same query twice. In between, User B runs a transaction and commits.
Based on these phenomena, The SQL standard defines four isolation levels :
Read Uncommitted – Read Uncommitted is the lowest isolation level. In this
level, one transaction may read not yet committed changes made by
other transaction, thereby allowing dirty reads. In this level, transactions
are not isolated from each other.
Read Committed – This isolation level guarantees that any data read is
committed at the moment it is read. Thus it does not allows dirty read.
The transaction holds a read or write lock on the current row, and thus
prevent other transactions from reading, updating or deleting it.
Repeatable Read – This is the most restrictive isolation level. The transaction
holds read locks on all rows it references and writes locks on all rows it
inserts, updates, or deletes. Since other transaction cannot read, update
or delete these rows, consequently it avoids non-repeatable read.
Serializable – This is the Highest isolation level. A serializable execution is
guaranteed to be serializable. Serializable execution is defined to be an
execution of operations in which concurrently executing transactions
appears to be serially executing.
The Table is given below clearly depicts the relationship
between isolation levels, read phenomena and locks :

You might also like