Transaction Processing Concepts
Advanced Database Systems
Introduction
What is Transaction?
Business(money) Exchange(dictionary definition)
A unit of a program execution that accesses and possibly
modifies various data objects (tuples, relations)
Action, or series of actions, carried out by user or application,
which accesses or changes contents of database.
Basic operations a transaction can include “actions”:
• Reads, writes
• Special actions: commit, abort
Transaction: Database Read and Write Operations
A database operation is represented as a collection of named data items
Read-item (X)
1. Find the address of the disk block that contains item X
2. Copy the disk block into a buffer in main memory
3. Copy the item X from the buffer to the program variable named X
Write-item (X)
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory
3. Copy item X from the program variable named X into its correct location in the
buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some
later point in time).
Transaction(example)
Example: fund Transfer
transaction to transfer $50 from account A to account B:
For a user it is one activity
To database
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
Failures of various kinds, such as hardware failures and system crashes
Concurrent execution of multiple transactions
Introduction(Cont.)
One criterion for classifying a database system is according to
the number of users who can use the system concurrently.
Single-user VS multi-user systems
A DBMS is single-user if at most one user can use the system at a time
A DBMS is multi-user if many users can use the system and access
database concurrently.
Problem
How to make the simultaneous interactions of multiple users with the
database safe, consistent, correct, and efficient?
Introduction(Cont.)
It depends on
Computing systems(CPU + programming language)
Single-processor computer system(one cpu)
Multiprogramming execute some commands from one process, then suspend
that process and execute some commands from the next process.
A process is resumed at the point where it was suspended whenever it gets its turn
to use the CPU again.
concurrent execution of processes is actually interleaved
Inter-leaved Execution
Pseudo-parallel processing
Multi-processor computer system (multiple CPUs)
Parallel processing
Concurrent Transactions
B B B
CPU2
A A
CPU1 A
CPU1
time
t1 t2 t1 t2
Interleaved processing Parallel processing
(Single processor) (Two or more processors)
Introduction(cont..)
Why concurrency control(during multiple transaction
Execution) is needed?
Three problems are
1. The lost update problem
2. The temporary update (dirty read) problem
3. Incorrect summary problem
A Transaction: A Formal Example
T1
t0
read_item(X);
read_item(Y);
X:=X - 400000;
Y:=Y + 400000;
write _item(X);
tk
write_item(Y);
Problems in concurrent execution of
Transaction(Lost Update)
Occurs when two transactions that access the same database
items have their operations interleaved in a way that makes
the value of some database items incorrect.
Problems in concurrent execution of
Transaction(Dirty Read)
Occurs when one transaction updates a database item and
then the transaction fails for some reason.
Problems in concurrent execution of Transaction
(Incorrect Summary)
Occurs if one transaction is calculating an aggregate summary
function on a number of database items while other transactions are
updating some of these items, the aggregate function may calculate
some values before they are updated and others after they are
updated.
How those problems are solved?
DBMS has a Concurrency Control subsystem to assure database remains
in consistent state despite concurrent execution of transactions.
Other problems
System failures may occur
Types of failures:
System crash
Transaction or system error
Local errors
Disk failure
Physical failures
DBMS has a Recovery Subsystem to protect database against system
failures
Transaction and System Concepts
Transaction State
A transaction is an atomic unit of work that should either be completed
in its entirety or not done at all.
For recovery purposes, the system needs to keep track of when each
transaction starts, terminates, and commits or aborts.
The recovery manager of the DBMS needs to keep track of the
following operations:
Transaction and System Concepts
BEGIN_TRANSACTION: marks start of transaction
READ or WRITE: two possible operations on the data
END_TRANSACTION: marks the end of the read or write
operations; start checking whether everything went according
to plan
COMIT_TRANSACTION: signals successful end of transaction;
changes can be “committed” to DB
Partially committed
ROLLBACK (or ABORT): signals unsuccessful end of
transaction, changes applied to DB must be undone
Transaction States
State transition diagram illustrates how a transaction moves
through its execution states.
A transaction must be in one of these states.
Properties of Transaction
ACID properties
Atomicity
Consistency
Isolation
Durability
Atomicity and Consistency
Atomicity Consistency
Transactions are atomic – Transactions take the
they don’t have parts database from one
(conceptually) consistent state into
can’t be executed another
partially; it should not be In the middle of a
detectable that they transaction the database
interleave with another might not be consistent
transaction
Isolation and Durability
Isolation Durability
The effects of a transaction Once a transaction has
are not visible to other completed, its changes are
transactions until it has made permanent
completed Even if the system crashes,
the effects of a transaction
must remain in place
Properties of Transaction(cont..)
• Transfer $50 from account A to account B
Read(A) Atomicity
A = A - 50 - shouldn’t take money from A
Write(A) without giving it to B
Transaction Consistency
Read(B)
B = B+50 - money isn’t lost or gained
Isolation
Write(B)
- other queries shouldn’t see A or
B change until completion
Durability
- the money does not go back to A
Who will enforce the ACID properties?
The transaction manager
It schedules the operations of transactions
COMMIT and ROLLBACK are used to ensure atomicity
Locks or timestamps are used to ensure consistency and isolation for
concurrent transactions
A log is kept to ensure durability in the event of system failure.
COMMIT and ROLLBACK
COMMIT signals the successful ROLLBACK signals the
end of a transaction unsuccessful end of a transaction
Any changes made by the Any changes made by the
transaction should be saved transaction should be undone
These changes are now visible It is now as if the transaction
to other transactions never existed
The Transaction Log
The transaction log records the The log is stored on disk, not in
details of all transactions memory
Any changes the transaction If the system crashes it is
makes to the database preserved
How to undo these changes Write ahead log rule
When transactions complete The entry in the log must be
and how made before COMMIT
processing can complete
Schedules and Recoverability
What is Schedule?
Sequences that indicate the chronological order in which instructions
of concurrent transactions are executed.
Ordering of execution of operations from various transactions T1,
T2, … , Tn is called a schedule S.
Given multiple transactions,
A schedule is a sequence of interleaved actions from all transactions
Schedules and Recoverability(cont..)
A schedule for a set of transactions must consist of all
instructions of those transactions
Must preserve the order in which the instructions appear in
each individual transaction.
Example
Transaction T1: r1(X); w1(X); r1(Y); w1(Y); c1
Transaction T2: r2(X); w2(X); c2
A schedule, S:
r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y); c1; c2
Example: A “Good” Schedule
One possible schedule,
initially X = 10, Y=10
Resulting Database: X=21,Y=21, X=Y
Example: A “Bad” Schedule
Another possible schedule
Resulting Database X=22,Y=21, X=Y ?
When does Conflicts occur between two
operations?
Two operations conflict if they satisfy ALL three conditions:
1. they belong to different transactions AND
2. they access the same item AND
3. at least one is a write_item()operation
Example.: Transaction T1 T2
Read(X) Read(X)
Read(X) Write(X)
Write(X) Read(X)
Write(X) Write(X)
Read(X) Write(Y)
Serializability of Schedules
What is serializable schedules?
types of schedules that are always considered to be correct when
concurrent transactions are executing.
Serializability of Schedules(classification)
Serial Schedule
Schedule where operations of each transaction are executed
consecutively without any interleaved operations from other
transactions. The opposite of serial is non serial schedule.
No guarantee that results of all serial executions of a given set of
transactions will be identical.
Serializability of Schedules
Objective of serializability is to find non serial schedules that
allow transactions to execute concurrently without interfering
with one another.
In other words, want to find non serial schedules that are
equivalent to some serial schedule. Such a schedule is called
serializable.
When are two schedules equivalent?
Option 1: They lead to same result (result equivalent)
Option 2: The order of any two conflicting operations is the same
(conflict equivalent)
Result Equivalent Schedules
Two schedules are result equivalent if they produce the same
final state of the database
Problem: May produce same result by accident!
S1 S2
read_item(X); read_item(X);
X:=X+10; X:=X*1.1;
write_item(X); write_item(X);
Schedules S1 and S2 are result equivalent for X=100 but not in general
Conflict Equivalent schedule(example)
Schedule S1
T1 T2
read_item(A);
write_item(A);
order doesn’t matter
order matters
read_item(B);
write_item(B);
read_item(A):
write_item(A);
read_item(B);
order matters order
write_item(B);
doesn’t matter
Conflict Equivalence(Example)
Schedule S1’
T1 T2
read_item(A);
read_item(B);
write_item(A); same order as in S1
read_item(A):
write_item(B); same order as in S1 write_item(A);
S1 and S1’ are conflict equivalent read_item(B);
(S1’ produces the same result as S1) write_item(B);
Two schedules are conflict equivalent, if the order of any two conflicting operations is the same in
both schedules
Transaction Support in SQL
A single SQL statement is always considered to be atomic.
There is no explicit Begin_Transaction statement.
SET TRANSACTION statement in SQL2 sets the characteristics of a
transaction.
Access mode
READ only or READ-WRITE
Diagnostic area size
Indicates the number of conditions that can be held simultaneously in the
diagnostic area.
Isolation level
READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ,
SERIALIZABLE