[go: up one dir, main page]

0% found this document useful (0 votes)
29 views69 pages

CC Part 1 No Quizzes

Uploaded by

est09serra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
29 views69 pages

CC Part 1 No Quizzes

Uploaded by

est09serra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 69

Introduction to Transaction

Management

CMPSCI 445

1
Things we want the DB to do
 Execute more than one query/statement at a
time *CONCURRENT EXECUTION*
 reduces latency, increases performance
 allows multi-client interactions
 Combine several statements into a larger
atomic unit *COMPOSITION*
 Build large DB operations by combining
smaller ones
 Similar to combining expressions into
larger expressions in PL
2
Concurrency Control
 Concurrent execution of user programs is
essential for good DBMS performance
 Overlap CPU and I/O
 Leverage CPU core parallelism
 Trivially support multiple clients
 BUT we must cope with failures

 partial state, concurrent op interference


 The transaction is the foundation for:
 Concurrent execution
 Recovery from system failure, incomplete ops
3
What is a Transaction?
 A transaction is the DBMS’s abstract view of
a user program: a sequence of reads and
writes.
 Organizes a sequence of operations into a
history
 Useful for concurrency reasoning
 Also useful for programming in a world of
concurrent data access and mutation

4
A Transaction - Programmers view
BEGIN;  A BEGIN statement to start
SELECT name  A series of statements executed in
FROM T
WHERE …;
order
 Including reads and writes
INSERT INTO T
VALUES(…);

COMMIT;  A COMMIT to end the transaction


 make changes visible
 free up resources

5
A Transaction - DB’s view
BEGIN; BEGIN;

SELECT name READ (R_1)


FROM T READ (R_2)
WHERE …; …
READ (R_K)
INSERT INTO T
VALUES(…); WRITE (R_M)

COMMIT; COMMIT;

 The database has a more physical view


 It manages things in terms of individual records

 Reads and Writes


 From now on, we’ll be considering the DB’s view
6
A simple transaction example

 Imagine a simple
banking application T1: Transfer
 Two database objects:
Begin
• A: balance of
account A A=A+100
• B: balance of B=B-100
account B
End
 Transaction T1:
 “Transfer $100 from
account B to account A”.

7
A simple transaction example
 Transaction T1:  Transaction T2:
 “Transfer $100 from  “withdraw $100 from
account B to account A”. account A”.

T1: Transfer T2: Withdraw


Begin Begin
A=A+100 A=A-100
B=B-100 End
End

8
A simple transaction example
 Transaction T1:  Transaction T2:
 “Transfer $100 from  “withdraw $100 from
account B to account A”. account A”.

T1: Transfer T2: Withdraw


Begin Begin
A=A+100 A=A-100
B=B-100 End
End

 Both want to modify A.


 What happens if A starts with balance $1?
 What if B has less than $100? 9
The ACID Properties
 Database systems ensure the ACID properties:
 Atomicity
 Consistency
 Isolation
 Durability

10
Atomicity
 A very important property guaranteed by the
DBMS for all transactions is that they are atomic.
 User can think of a txn as executing all its actions in
one step, or executing no actions at all.
 DBMS logs all actions so that it can undo the actions of
aborted transactions.
 If it succeeds, the effects of write operations
persist (commit);
 If it fails, no effects of write operations persist
(abort)

11
Atomicity
 A transaction can fail for many reasons
 User program can manually call ABORT
•‘business logic’ requires a different action
 DBMS can abort the transaction at any time
•to free up resources
•to deal with deadlock (more on this later)
 Crashes and network failures
•client can crash or hangup connection
•DB server can crash
•network failures also happen (a lot, actually)
12
Consistency
 Each transaction must leave the database in a
consistent state if the DB is consistent when
the transaction begins.
 Consistency rules vary from DB to DB
• e.g. eventual consistency in NoSQL
• serializability vs. MVCC read consistency
 Consistency rules tell you what read + write
behaviors are excluded
 DB will maintain integrity constraints
 Consistency in ACID is usually alloyed with
Isolation because they both deal with behavioral
restrictions on reads and writes
13
Consistency
 DBMS doesn’t ‘understand’ your data
 consistency rules are about how reads and writes
of individual data can be interleaved
 maintaining ‘higher level’ properties of your data
is the responsibility of the application
 In banking example, sum (A + B) should be
unchanged by execution.

14
Isolation
 Many concurrent transactions are running at
one time.
 Each transaction should be isolated from the
effects of other transactions
 Full isolation is impossible
•resource competition is always an issue
 SERIALIZABILITY is the strongest kind of
isolation

15
Isolation
 With SERIALIZABLE isolation
 Transactions should behave as if nothing
else is executing on the system
 A transaction shouldn’t see any intermediate
state from any other transaction
 Whatever interleaving the DBMS permits, it
should produce a final state equivalent to
some serial execution of all the transactions
•no guarantee of a particular serial order

16
Isolation
 SERIALIZABLE is incredibly restrictive
 most DBs do not run with it as the default
 many DBs cannot actually support it
•in general, or sometimes at all
 many applications don’t need it
 SERIALIZABLE is slow
•it disallows many faster interleavings
 Many DBs run by default in a kind of
consistent read mode

17
Isolation
 We’ll analyze SERIALIZABLE because:
 it’s arguably more intuitive
 it is the standard starting example of
isolation
 But understand that in practice, most DBs run
with a more relaxed consistency

18
Durability

 If transaction commits, its effects will persist


in the database.
 In particular, if the system crashes before
effects are written to disk, they will be redone
 Recovery manager is responsible for this.
 DBs can be restarted after a hard crash (or
power outage)
 and will come back is a valid, consistent
state
 May be missing the last few updates 19
The ACID Properties
 Database systems ensure the ACID properties:
 Atomicity: all operations of transaction reflected
properly in database, or none are.
 Consistency: each transaction in isolation keeps
the database in a consistent state
 Isolation: should be able to understand what’s
going on by considering each separate transaction
independently.
 Durability: updates stay in the DBMS!!!

20
Two transactions
 “Transfer $100 from  “Add 6% interest to
account B to account A” accounts A and B”

T1: Transfer T2: Interest


Begin Begin
A=A+100 A=1.06*A
B=B-100 B=1.06*B
End End

21
Serial execution: T1, then T2
 Starting balances T1: Transfer
 A = 1000 Begin
 B = 2000 A=A+100
 Execute T1 B=B-100
 A = 1100
End
 B = 1900
 Execute T2 T2: Interest
 A = 1166 Begin
 B = 2014 A=1.06*A
B=1.06*B
End
Serial execution: T2, then T1
 Starting balances T2: Interest
 A = 1000 Begin
 B = 2000 A=1.06*A
 Execute T2
B=1.06*B
 A = 1060
End
 B = 2120
 Execute T1 T1: Transfer
 A = 1160 Begin
 B = 2020
A=A+100
B=B-100
End
23
Interleaved execution
 What other results are possible if
operations of T1 and T2 are interleaved?
 Starting balances
 A = 1000
 B = 2000

T1: Transfer T2: Interest


... ...
A=A+100 A=1.06*A
... ...
B=B-100 B=1.06*B
... ...
24
Interleaving operations

T1: Transfer T2: Interest


A=A+100
A=1.06*A
B=B-100
B=1.06*B

Is this interleaving consistent with serializability?

25
Interleaving operations

T1: Transfer T2: Interest


A=A+100
A=1.06*A
B=B-100
B=1.06*B

Is this interleaving consistent with serializability?


Yes. After running this interleaving you have:
A = 1166
B = 2014
equivalent to T1 ; T2
26
Interleaving operations

T1: Transfer T2: Interest


A=A+100
A=1.06*A
B=1.06*B
B=B-100
How about this interleaving?

27
Interleaving operations

T1: Transfer T2: Interest


A = 1166 A=A+100
A=1.06*A
B=1.06*B
B=B-100
How about this interleaving?

28
Interleaving operations

T1: Transfer T2: Interest


A = 1166 A=A+100
A=1.06*A
B=1.06*B B = 2020
B=B-100
How about this interleaving?

29
Interleaving operations

T1: Transfer T2: Interest


A = 1166 A=A+100
A=1.06*A
B=1.06*B B = 2020
B=B-100
How about this interleaving?
NO. Neither T1; T2 or T2; T1 would yield these results

30
Interleaving operations

T1: Transfer T2: Interest


A = 1166 A=A+100
A=1.06*A
B=1.06*B B = 2020
B=B-100
How about this interleaving?
NO. Neither T1; T2 or T2; T1 would yield these results

The issue is that T2 read the intermediate T1 state of A


This is generally known as an atomicity violation, but
using ACID words, it’s a violation of ISOLATION
31
Serializability
 You could always execute transactions in some
forced order
 That would be very slow

 lots of waiting
 So, how do you run multiple transactions but
maintain the illusion of serial execution?
 Implementing serializability requires
understanding effects of operations well
enough to prevent non-serial-equivalent
interleavings

32
Schedules
 The way we’re going to reason about transaction
interleavings is with schedules
 A schedule is a list of operations

 potentially from multiple transactions


 Schedules assume a single thread of execution

 no multi-threading or anything
 Two operations next to each other in the schedule
may be from different transactions
 CANNOT re-order operations within the SAME
transaction
33
Scheduling Transactions
 A transaction is seen by DBMS as sequence of
reads and writes
 read of object O denoted R(O)
 write of object O denoted W(O)
 must end with Abort or Commit
 A schedule of a set of transactions is a list of
all actions where order of two actions from
any transaction must match order in that
transaction.

34
A schedule

T1: Transfer T2: Interest


Read(A)
T1: Transfer T2: Interest
Write(A)
A=A+100
Read(A)
A=1.06*A
Write(A)
B=1.06*B
Read(B)
B=B-100
Write(B)
Read(B)
Write(B)
35
A schedule

T1: Transfer T2: Interest


Read(A)
T1: Transfer T2: Interest
Write(A)
A=A+100
Read(A)
A=1.06*A
Write(A)
B=1.06*B
Read(B)
B=B-100
Write(B)
Read(B)
Write(B)
36
Not A schedule

T1: Transfer T2: Interest


Read(A)
T1: Transfer T2: Interest
Write(A)
A=A+100
Read(B)
A=1.06*A
Write(B)
B=1.06*B
Read(A)
B=B-100
Write(A)
Read(B)
Write(B)
37
Not A schedule
Why is this not a schedule?
T1: Transfer T2: Interest
Read(A)
T1: Transfer T2: Interest
Write(A)
A=A+100
Read(B)
A=1.06*A
Write(B)
B=1.06*B
Read(A)
B=B-100
Write(A)
Read(B)
Write(B)
38
Not A schedule
Why is this not a schedule?
T1: Transfer T2: Interest
Read(A)
T1: Transfer T2: Interest
Write(A)
A=A+100
Read(B)
A=1.06*A
Write(B)
B=1.06*B
Read(A)
B=B-100
Write(A)
Read(B)
Write(B)
39
Scheduling Transactions
 Serial schedule: Schedule that does not interleave the
actions of different transactions.
 Equivalent schedules: For any database state, the effect
(on the set of objects in the database) of executing the
first schedule is identical to the effect of executing the
second schedule.
 Serializable schedule: A schedule that is equivalent to
some serial schedule

40
Serializable Schedule

T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
R(B)
W(B)
Commit
Commit
41
Serializable Schedule
T1 T2
R(A)  Equivalent to T1;T2
W(A)  Each read and write will
read (or write) the same
R(A)
values in this
W(A) interleaving as in T1; T2
R(B)  T1’s R/W of B is not

W(B) influenced by T2’s


operations on A
R(B)
W(B)
Commit
Commit
42
When can actions be re-ordered?
 Let I, J be two consecutive actions of T1 and T2
 I=Read(O), J=Read(O)
 I=Read(O), J=Write(O)
 I=Write(O), J=Read(O)
 I=Write(O), J=Write(O)
 If I and J are both reads, then they can be freely
reordered (with respect to each other).
 In all other cases, order (may) impact outcome
of schedule.

43
Conflicting operations
 Two operations conflict* if:
 they operate on the same data object, and
 at least one is a WRITE.
 Schedule outcome is determined by order of
the conflicting operations.
 non-conflicting operations can be
reordered with each other
 conflicting operations act as a barrier to
reordering
*this is only definitely a conflict for SERIALIZABLE, and even then
not all WRITEs conflict with each other
44
Conflict Serializable Schedules
 Two schedules are conflict equivalent if:
 Involve the same actions of the same transactions
 Every pair of conflicting actions (of committed
txn) are ordered the same way.
 Alternatively: S can be transformed to S’ by swaps
of non-conflicting actions.
 Schedule S is conflict serializable if S is
conflict equivalent to some serial schedule

Every conflict serializable schedule is serializable.


Conflict-serializable schedule

T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
R(B)
W(B)
Commit
Commit
46
Not conflict-serializable
T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
Commit
R(B)
W(B)
Commit
Not conflict-serializable
T1 T2
R(A) This interleaving
W(A) suggests T1; T2
b/c T2 is reading
R(A) T1’s write
W(A)
R(B)
W(B)
Commit
R(B)
W(B)
Commit
Not conflict-serializable
T1 T2
R(A) This interleaving
W(A) suggests T1; T2
b/c T2 is reading
R(A) T1’s write
W(A)
R(B)
This interleaving W(B)
suggests T2; T1
b/c T1 is reading Commit
T2’s write R(B)
W(B)
Commit
Precedence graphs

 Directed graph derived from schedule S:


 Vertex for each transaction
 Edge from Ti to Tj if:
• Ti executes Write(O) before Tj executes Read(O)
• Ti executes Read(O) before Tj executes Write(O)
• Ti executes Write(O) before Tj executes Write(O)

If edge Ti -> Tj appears in precedence graph, then in any


serial schedule equivalent to S, Ti must appear before Tj.

50
Construct precedence graphs:

T1 T2 T1 T2
R(A) R(A)
W(A) W(A)
R(A) R(A)
W(A) W(A)
R(B) R(B)
W(B) W(B)
R(B) Commit
W(B) R(B)
Commit W(B)
Commit Commit

1 2
Construct precedence graphs:

T1 T2 T1 T2
R(A) R(A)
W(A) W(A)
R(A) R(A)
W(A) W(A)
R(B) R(B)
W(B) W(B)
R(B) Commit
W(B) R(B)
Commit W(B)
Commit Commit

1 2 1 2
Conflict serializable
Construct precedence graphs:

T1 T2 T1 T2
R(A) R(A)
W(A) W(A)
R(A) R(A)
W(A) W(A)
R(B) R(B)
W(B) W(B)
R(B) Commit
W(B) R(B)
Commit W(B)
Commit Commit

1 2 1 2
Conflict serializable Not Conflict serializable
Construct precedence graph:
T1 T2 T3
R(A)
• W(O) before R(O) W(A)
• R(O) before W(O)
Commit
• W(O) before W(O)
W(A)
Commit
W(A)
Commit

T1 T2 T3

54
Construct precedence graph:
T1 T2 T3
R(A)
• W(O) before R(O) W(A)
• R(O) before W(O)
Commit
• W(O) before W(O)
W(A)
Commit
W(A)
Commit

T1 T2 T3

55
Construct precedence graph:
T1 T2 T3
R(A)
• W(O) before R(O) W(A)
• R(O) before W(O)
Commit
• W(O) before W(O)
W(A)
Commit
W(A)
Commit

T1 T2 T3

56
Construct precedence graph:
T1 T2 T3
R(A)
• W(O) before R(O) W(A)
• R(O) before W(O)
Commit
• W(O) before W(O)
W(A)
Commit
W(A)
Commit

T1 T2 T3

Cycle means there’s no serial schedule 57


Construct precedence graph:
T1 T2 T3
R(A)
• W(O) before R(O) W(A)
• R(O) before W(O) Re-order to eliminate back edge
W(A)
• W(O) before W(O)
Commit
Commit
W(A)
Commit

T1 T2 T3

Re-order to eliminate back edge 58


Transaction state

 Active: a transaction is active while


executing.
 Partially committed: after the final
statement has executed
 Failed: after discovery that normal
execution cannot proceed
 Aborted: transaction has been rolled-back
and database restored to prior state.
 Committed: after successful completion.

59
Dealing with Abort
 ABORT means halt and ‘undo’ all the
operations of the transaction
 May be issued due to error or user decision
 A rolled-back transaction should leave the
system in a state equivalent to having never
run the transaction in the first place
 If the system is interleaving transactions
 One or more of the transactions could ABORT
 What if one of the other transactions read a value
the aborting transaction wrote?

60
Schedules involving abort
T1 T2
R(A)
W(A) T2 Read a value generated
by T1
R(A)
W(A) This is an
unrecoverable
R(B) schedule
W(B)
Commit
Abort

61
Schedules involving abort
T1 T2
R(A)
W(A) T2 Read a value generated
by T1
R(A)
W(A) This is an
unrecoverable
R(B) schedule
W(B)
Commit And then T2 committed
Abort

62
Schedules involving abort
T1 T2
R(A)
W(A) T2 Read a value generated
by T1
R(A)
W(A) This is an
unrecoverable
R(B) schedule
W(B)
Commit And then T2 committed
Abort

 T1’s Abort should be equivalent to T1 not


executing at all
 This should be schedule equiv. to just running T2
63
Schedules involving abort
 In addition to being serial equivalent, we
need schedules to be recoverable
 A recoverable schedule can handle aborts
of any uncommitted transaction
 Recoverable schedule: transactions commit
only after all transactions whose changes
they read commit.
 commits are delayed until the system can
guarantee no failure

64
Cascading rollback
T1 T2 T3
 Even if schedule is
R(A)
recoverable, several R(B)
transactions may need to be W(A)
rolled back to recover R(A)

correctly. W(A)
R(A)
 Cascading Rollback: a single Abort
transaction failure leading to
a series of rollbacks

65
Cascading rollback
T1 T2 T3
 Even if schedule is
R(A)
recoverable, several R(B)
transactions may need to be W(A)
rolled back to recover R(A)

correctly. W(A)
R(A)
 Cascading Rollback: a single Abort
transaction failure leading to
a series of rollbacks
 T2 read T1’s write, and T3 read T2’s write
 T1’s abort -> T2 to abort -> T3 to abort

66
Cascadeless schedules
 Cascadeless schedule: For
any transactions Ti and Tj: if
Tj reads data written by Ti,
then Ti commits before read
operation of Tj. T1 T2 T3
R(A)
R(B)
T2’s read would have to wait for T1 to commit W(A)
R(A)
T3’s read would have to wait for T2 to commit W(A)
R(A)
Abort
67
Strict schedules

 A schedule is strict if:


 A value written by a transaction T is not read
or overwritten by other transactions until T
either aborts or commits.

 Strict schedules are recoverable and


cascadeless.

68
Properties of schedules

69

You might also like