Week7 Lecture

Course

Introduction to Databases

Professor: Karima Echihabi


Program: Computer Engineering
Session: Fall 2023

1
Databases
Transaction Management Overview
Chapter 16

2
Structure of a DBMS

[Figure: Queries enter at the top and Answers return; the layered stack is Query Optimization and Execution → Relational Operators → Files and Access Methods → Buffer Management → Disk Space Management → DB]

• A typical DBMS has a layered architecture.
• The figure does not show the concurrency control and recovery components.
• This is one of several possible architectures; each system has its own variations.
• The buffer manager plays a crucial role in optimizing data access by efficiently managing the storage and retrieval of data pages in memory.

3
Structure of a DBMS

[Figure: the same layered stack, now accessed by several users (User 1, User 2, User 3), with Concurrency Control and Recovery added as components alongside the Files and Access Methods and Buffer Management layers]

4
Structure of a DBMS

[Figure: the layered stack with the Concurrency Control and Recovery components highlighted]

• Concurrency refers to the decomposability of a program, algorithm, or problem into order-independent or partially-ordered components or units of computation.
• Concurrency Control ensures correct/fast data access when the DB is used by many users.
• Recovery ensures that the DB is fault-tolerant: it is the process of restoring the database to a consistent and valid state after a failure or crash has occurred. Failures can be caused by various reasons, such as hardware failures, software errors, or unexpected system shutdowns.

5
Why concurrency?
• Concurrent execution of user programs is essential for good DBMS performance: it lets the system make efficient use of resources, maintain high throughput, and reduce latency, ultimately providing a more responsive database environment.
• Throughput (transactions per second): concurrent execution allows the DBMS to process multiple transactions simultaneously. With transactions in different stages of execution at the same time, more transactions complete per unit of time.
• Multi-core and single-core: on a multi-core processor, different transactions can be assigned to different cores and processed in parallel. Even on a single core, the DBMS can switch between tasks quickly, creating an illusion of concurrency through time slicing.
• Disk access and CPU utilization: because disk accesses are frequent and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently. While one transaction waits for data to be retrieved from disk, the CPU can execute another transaction that does not depend on that disk access.
• Latency (average response time per transaction): while one transaction waits for a resource (e.g., disk I/O or a lock), other transactions continue to make progress, so individual slow transactions have less impact on overall response time.

6
Example 1
• Consider three queries:
• Q1: UPDATE Account SET savings = savings + 100 WHERE customer_id = 1;
• Q2: UPDATE Account SET savings = savings - 300 WHERE customer_id = 2;
• Q3: UPDATE Account SET savings = savings - 400 WHERE customer_id = 3;
• Now consider this new query:
• Q4: SELECT MIN(savings) FROM Account;
• Suppose the initial savings were 900, 1000 and 1000:
• Q4 returns 900 if run before Q1, Q2, and Q3
• Q4 returns 1000 if run after Q1
• Q4 returns 700 if run after Q1 and Q2
• Q4 returns 600 if run after Q1, Q2 and Q3

7
Example 1
• Consider three queries:
• Q1: UPDATE Account SET savings = savings + 100 WHERE customer_id = 1;
• Q2: UPDATE Account SET savings = savings - 300 WHERE customer_id = 2;
• Q3: UPDATE Account SET savings = savings - 400 WHERE customer_id = 3;
• Now consider this new query:
• Q4: SELECT MIN(savings) FROM Account;
• Suppose the initial savings were 900, 1000 and 1000:
• Q4 returns 900 if run before Q1, Q2, and Q3
• Q4 returns 1000 if run after Q1
• Q4 returns 700 if run after Q1 and Q2
• Q4 returns 600 if run after Q1, Q2 and Q3

The order of operations is important; the user will need a way to specify the order.

9
Example 2
• Consider these two queries:
• Q1: UPDATE Account SET savings = savings * 1.06 WHERE customer_id = 1;
• Q2: UPDATE Account SET savings = savings - 100 WHERE customer_id = 1;
• Suppose savings is initially 1000. What would happen if:
• Q1 executes before Q2: savings = 1060, then savings = 960
• Q2 executes before Q1: savings = 900, then savings = 954
• There is no guarantee that Q1 will execute before Q2 or vice-versa if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.

Solution: use transactions!

10
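A minimal Python sketch of the hazard above (hypothetical names, not part of the lecture): each UPDATE is modeled as a read of savings followed by a write. If both queries read before either writes, the final balance matches no serial order:

```python
def run_serial(savings):
    """Q1 then Q2, each reading the latest value."""
    savings = savings * 1.06   # Q1: add 6% interest
    savings = savings - 100    # Q2: withdraw 100
    return savings

def run_interleaved(savings):
    """Both queries read the initial value before either writes."""
    q1_read = savings          # Q1 reads 1000
    q2_read = savings          # Q2 reads 1000 (stale once Q1 writes)
    savings = q1_read * 1.06   # Q1 writes
    savings = q2_read - 100    # Q2 overwrites: Q1's update is lost
    return savings

print(run_serial(1000))       # a valid serial outcome (Q1 then Q2)
print(run_interleaved(1000))  # 900: matches neither serial order
```

The two valid serial outcomes are 960 (Q1 then Q2) and 954 (Q2 then Q1); the interleaving's 900 silently discards Q1's interest update, which is exactly the lost-update anomaly that transactions must prevent.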
Transactions

• A transaction (“Xact”) is the DBMS’s abstract view of a user program: a sequence of reads and writes. Examples: money transfer, trip package reservation, etc.
• A user’s program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned about what data is read/written from/to the database.
• The Xact Manager controls the execution of transactions.

In database management systems (DBMS), a transaction refers to a logical unit of work that is performed against a database. It is a sequence of one or more database operations, such as reads or writes, that are executed as a single, indivisible unit. The primary goal of a transaction is to ensure the consistency, integrity, and isolation of the database.

11
ACID Properties

• Atomicity: a transaction is treated as a single, indivisible unit of work. Either all the changes made by the transaction are committed to the database, or none of them are. If any part of the transaction fails, the entire transaction is rolled back to its original state.
• Consistency: transactions bring the database from one consistent state to another. The integrity constraints and business rules of the database are maintained, ensuring that the database remains in a valid state throughout the transaction.
• Isolation: transactions are isolated from each other, meaning that the intermediate state of a transaction is not visible to other transactions until it is committed. This isolation prevents interference and conflicts between concurrently executing transactions.
• Durability: once a transaction is committed, its effects on the database persist even in the event of a system failure. The changes made by committed transactions are durable and survive system crashes or restarts.

12
Databases
Concurrency Control
Chapter 17

18
Concurrency Control

• Executing one transaction at a time is safe but slow.
• To improve performance, transactions must be interleaved.
• Interleaving brings a risk of anomalies, so we need to define and ensure correctness.

19
Transaction Schedules

• A schedule is a sequence of actions on data from one or more transactions.
• Actions are: Begin, Read (R), Write (W), Commit (C) and Abort (A).
• Two conventions:

20
Transaction Schedules
Convention 1
• A schedule is a sequence of actions on data from one or more transactions.
• Actions are: Begin, Read (R), Write (W), Commit (C) and Abort (A).

T1          T2
BEGIN
R(A)
W(A)
R(B)
W(B)
COMMIT
            BEGIN
            R(A)
            W(A)
            R(B)
            W(B)
            COMMIT

21
Transaction Schedules
Convention 2
• The same schedule written in Convention 2:
R1(A) W1(A) R1(B) W1(B) R2(A) W2(A) R2(B) W2(B)
• Convention 2 only includes committed transactions, and does not include begin/commit/abort unless necessary.

22
Serial Schedules
Example
• A serial schedule is a schedule where each transaction runs from beginning to end without any interleaving with other transactions.
• Two schedules are equivalent if they:
  • Involve the same transactions
  • Keep the order of actions the same for each individual transaction
  • Leave the DB in the same final state

T1          T2
BEGIN
R(A)
W(A)
R(B)
W(B)
COMMIT
            BEGIN
            R(A)
            W(A)
            R(B)
            W(B)
            COMMIT

23
Serial Schedules
• A serial schedule is a schedule where each transaction runs from beginning to end without any interleaving with other transactions.
• Two schedules are equivalent if they:
  • Involve the same transactions
  • Keep the order of actions the same for each individual transaction
  • Leave the DB in the same final state

Schedule 1 (serial): T1 runs BEGIN R(A) W(A) R(B) W(B) COMMIT, then T2 runs BEGIN R(A) W(A) R(B) W(B) COMMIT.
Schedule 2 (interleaved): T1 runs BEGIN R(A) W(A); then T2 runs BEGIN R(A) W(A) R(B) W(B) COMMIT; then T1 finishes with R(B) W(B) COMMIT.

24
Serial Schedules
• A schedule S is serializable if S is equivalent to some serial schedule.
• Consider the two transactions below:
  • T1: transfer 100 dhs from A to B
  • T2: add 6% to both A and B
• Serial schedule 1, where T1 is followed by T2:
  • A = 1.06 * (A - 100)
  • B = 1.06 * (B + 100)
• Serial schedule 2, where T2 is followed by T1:
  • A = (1.06 * A) - 100
  • B = (1.06 * B) + 100
• Different results, but both are valid serial schedules!

In summary, a serial schedule is a straightforward, sequential execution of transactions, while a serializable schedule allows for concurrency among transactions but ensures that the final result is equivalent to some serial execution. Serializable schedules are crucial for balancing the need for concurrency in database systems with the requirement for maintaining data consistency and integrity.

25
Conflicting Operations
• It is challenging to determine whether two schedules leave the DB in the same final state.
• Instead of a strict equivalence check, adopt a more conservative one:
  • No false positives, but possible false negatives
  • Sacrifice concurrency for correctness
  • Use the notion of “conflicting” operations (read/write)
• Two operations conflict if they:
  • Belong to different transactions
  • Apply to the same object
  • Include at least one write
• The order of non-conflicting operations does not impact the final state of the DB
• Only focus on conflicting operations

28
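The three conditions above translate directly into a predicate. A minimal sketch (hypothetical representation, not from the lecture: each operation is a (transaction, action, object) triple):

```python
def conflicts(op1, op2):
    """Each op is a (txn, action, obj) triple, e.g. (1, 'W', 'A').
    Two operations conflict iff they come from different transactions,
    touch the same object, and at least one of them is a write."""
    t1, a1, o1 = op1
    t2, a2, o2 = op2
    return t1 != t2 and o1 == o2 and 'W' in (a1, a2)

print(conflicts((1, 'R', 'A'), (2, 'W', 'A')))  # True: read-write on A
print(conflicts((1, 'R', 'A'), (2, 'R', 'A')))  # False: two reads never conflict
print(conflicts((1, 'W', 'A'), (1, 'W', 'A')))  # False: same transaction
```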
Example

• A schedule S is conflict serializable if S can be transformed into a serial schedule by swapping consecutive non-conflicting operations of different transactions.
• Example:
T1: R1(A) W1(A) R1(B) W1(B)
T2: R2(A) W2(A) R2(B) W2(B)

29
Example

• A schedule that is not conflict serializable: T2 reads and writes both A and B in between T1’s operations on A and T1’s operations on B, so no sequence of swaps of consecutive non-conflicting operations can untangle the two transactions:

T1: R1(A) W1(A)                         R1(B) W1(B)
T2:             R2(A) W2(A) R2(B) W2(B)

34
Dependency Graph

• Dependency graph: one node per Xact; edge from Ti to Tj if:
  • An operation Oi of Ti conflicts with an operation Oj of Tj
  • Oi appears earlier in the schedule than Oj
• Theorem: a schedule is conflict serializable if and only if its dependency graph is acyclic.

35
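The theorem above suggests a direct test. Below is a minimal Python sketch (hypothetical helper names, not part of the lecture): build the dependency-graph edges from conflicting operation pairs, then check for a cycle with a depth-first search.

```python
def dependency_edges(schedule):
    """schedule: list of (txn, action, obj) triples in execution order.
    Add an edge Ti -> Tj when an earlier operation of Ti conflicts
    with a later operation of Tj (same object, at least one write)."""
    edges = set()
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            if ti != tj and oi == oj and 'W' in (ai, aj):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule):
    """Conflict serializable iff the dependency graph is acyclic."""
    edges = dependency_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    adj = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {}  # node -> 'visiting' (on the DFS stack) or 'done'
    def has_cycle(n):
        state[n] = 'visiting'
        for m in adj[n]:
            if state.get(m) == 'visiting':
                return True            # back edge: cycle found
            if m not in state and has_cycle(m):
                return True
        state[n] = 'done'
        return False
    return not any(n not in state and has_cycle(n) for n in nodes)

# A serializable interleaving of T1 and T2:
ok = [(1, 'R', 'A'), (1, 'W', 'A'), (2, 'R', 'A'), (2, 'W', 'A'),
      (1, 'R', 'B'), (1, 'W', 'B'), (2, 'R', 'B'), (2, 'W', 'B')]
# T2 touches A and B entirely between T1's A-ops and B-ops (slide 34):
bad = [(1, 'R', 'A'), (1, 'W', 'A'), (2, 'R', 'A'), (2, 'W', 'A'),
       (2, 'R', 'B'), (2, 'W', 'B'), (1, 'R', 'B'), (1, 'W', 'B')]
print(is_conflict_serializable(ok))   # True
print(is_conflict_serializable(bad))  # False
```

For `bad`, the edges are T1 → T2 (on A) and T2 → T1 (on B), forming exactly the cycle the next slides draw.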
Example
• A schedule that is not conflict serializable:

T1: R1(A) W1(A)                         R1(B) W1(B)
T2:             R2(A) W2(A) R2(B) W2(B)

Dependency graph: T1 → T2 with an edge labeled A, and T2 → T1 with an edge labeled B.

• The cycle in the graph reveals the problem: the output of T1 depends on T2, and vice-versa.

38
View Serializability
• Schedules S1 and S2 are view equivalent if:
  • Same initial reads: if Ti reads the initial value of A in S1, then Ti also reads the initial value of A in S2
  • Same dependent reads: if Ti reads the value of A written by Tj in S1, then Ti also reads the value of A written by Tj in S2
  • Same winning writes: if Ti writes the final value of A in S1, then Ti also writes the final value of A in S2
• View serializability allows conflict serializable schedules + blind writes.
• Example: two view-equivalent schedules over T1: R1(A) W1(A); T2: W2(A); T3: W3(A), where W2(A) and W3(A) are blind writes.

39
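The three view-equivalence conditions can be checked mechanically. A small Python sketch (hypothetical names; the interleaved example schedule is a reconstruction of the blind-write example above, an assumption on my part): for every read, record which writer produced the value it sees (None means the initial value), and record the final writer of each object.

```python
def reads_from(schedule):
    """schedule: list of (txn, action, obj) triples in execution order.
    Returns (read sources, final writers): the read sources cover both
    the 'same initial reads' (writer None) and 'same dependent reads'
    conditions; the final writers cover 'same winning writes'."""
    last_writer = {}   # obj -> txn that most recently wrote it
    sources = set()    # (reader txn, obj, writer txn or None)
    for txn, action, obj in schedule:
        if action == 'R':
            sources.add((txn, obj, last_writer.get(obj)))
        else:  # 'W'
            last_writer[obj] = txn
    return sources, last_writer

def view_equivalent(s1, s2):
    """Same initial/dependent reads and same winning writes."""
    return reads_from(s1) == reads_from(s2)

# Blind writes W2(A), W3(A) interleaved between T1's read and write:
s_interleaved = [(1, 'R', 'A'), (2, 'W', 'A'), (1, 'W', 'A'), (3, 'W', 'A')]
s_serial      = [(1, 'R', 'A'), (1, 'W', 'A'), (2, 'W', 'A'), (3, 'W', 'A')]
print(view_equivalent(s_interleaved, s_serial))  # True
```

In both schedules T1 reads the initial value of A and T3 performs the winning write, so they are view equivalent; yet the interleaved one is not conflict serializable (W2(A) conflicts both ways with T1's operations on A), illustrating the "+ blind writes" gap.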
Summary
• Transactions are important:
  • Concurrency
  • Recovery
• Xact Properties

Approach #1: Logging. The DBMS logs all actions so that it can undo the actions of aborted transactions. Most common approach.
Approach #2: Shadow Paging. The DBMS makes copies of pages modified by the transactions, and transactions make changes to those copies.
Summary
• Transactions are important:
  • Concurrency
  • Recovery
• Xact Properties

Notion #1: Database Consistency. The database accurately represents the real world it models and obeys its integrity constraints.
Notion #2: Transaction Consistency. If the database is consistent before a transaction starts, it is consistent after the transaction completes.
Summary
• Transactions are important:
  • Concurrency
  • Recovery
• Xact Properties

Approach #1: Pessimistic Concurrency Control. Prevent anomalies before they happen. Example: locking. The most common approach to enforce serializability in DBs, because it is best for environments with high contention.
Approach #2: Optimistic Concurrency Control. Correct anomalies when they happen. Examples: timestamp, multi-version, validation, snapshot isolation.
Summary
• Types of Conflicts

• Read-Write Conflicts (“Unrepeatable Reads”): one transaction reads a data item that another transaction is modifying; reading the same item twice may return two different values.
T1: R(A), R(A), W(A), C
T2: R(A), W(A), C

• Write-Read Conflicts (“Dirty Reads”): one transaction reads a value written by another transaction that has not yet committed; if the writer aborts, the reader has acted on data that never officially existed.
T1: R(A), W(A), R(B), W(B), Abort
T2: R(A), W(A), C

• Write-Write Conflicts (“Lost Updates”): two transactions write the same data item concurrently; one write can be overwritten by the other, losing an update.
T1: W(A), W(B), C
T2: W(A), W(B), C

• Read-Read Conflicts: two transactions read the same data concurrently. Often considered less problematic, this typically does not lead to inconsistencies, but it is still relevant to isolation between transactions.

To manage these conflicts and ensure data consistency, concurrency control mechanisms are employed. Common techniques include locking (transactions acquire locks on data items, preventing other transactions from accessing or modifying the same data concurrently), timestamps (each transaction is assigned a timestamp, and transactions are ordered based on these timestamps to control access to data), and isolation levels (the isolation level of a transaction determines the degree to which it is isolated from other transactions). By addressing these conflicts, database systems balance concurrent execution for improved performance against the consistency and integrity of the data.

• Notions to guarantee concurrency correctness:
  • Serializability
  • Conflict Serializability
  • View Serializability
• Containment: Serial Schedules ⊂ Conflict Serializable Schedules ⊂ View Serializable Schedules (conflict serializable + blind writes) ⊂ All Serializable Schedules

45
Summary
• Types of Conflicts
  • Read-Write Conflicts (“Unrepeatable Reads”)
  • Write-Read Conflicts (“Dirty Reads”)
  • Write-Write Conflicts (“Lost Updates”)
• Notions to guarantee concurrency correctness:
  • Serializability
  • Conflict Serializability (most commonly used because it can be enforced efficiently)
  • View Serializability
• Containment: Serial Schedules ⊂ Conflict Serializable Schedules ⊂ View Serializable Schedules (conflict serializable + blind writes) ⊂ All Serializable Schedules

46


Locking For CC
• Locks enable a DBMS to dynamically generate a
serializable schedule
• without knowing each transaction’s read/write set
ahead of time!
• Locks protect database objects during concurrent
access when there are multiple readers and writers.
• It is the role of the lock manager, within the DBMS,
to decide if a transaction can acquire a lock or not.

47
Locking
• Two main types of locks:
  • Shared Lock (S-LOCK): allows multiple transactions to read the same object at the same time.
  • Exclusive Lock (X-LOCK): allows a transaction to modify an object.

Lock Compatibility Matrix (✓ = compatible, ✗ = incompatible):

         S-Lock   X-Lock
S-Lock     ✓        ✗
X-Lock     ✗        ✗

• Any number of transactions can hold an S-LOCK on an object.
• But if a transaction holds an X-LOCK on an object, no other transaction may hold any lock on this object.

48
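The matrix above can be read as a lookup table. A minimal sketch (hypothetical names, not from the lecture): a lock request is grantable iff it is compatible with every lock currently held on the object.

```python
# S is compatible only with S; X is compatible with nothing.
COMPATIBLE = {
    ('S', 'S'): True,
    ('S', 'X'): False,
    ('X', 'S'): False,
    ('X', 'X'): False,
}

def can_grant(requested, held_modes):
    """Grant `requested` iff it is compatible with every held lock
    (an empty holder list means the object is unlocked)."""
    return all(COMPATIBLE[(requested, h)] for h in held_modes)

print(can_grant('S', ['S', 'S']))  # True: any number of readers
print(can_grant('X', ['S']))       # False: a writer must wait for readers
print(can_grant('S', ['X']))       # False: readers wait for the writer
```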
Two-Phase Locking (2PL)
• Two main types of locks: Shared Lock (S-LOCK),
Exclusive Lock (X-LOCK)
• Xact cannot get new locks after releasing any lock

49
Two-Phase Locking (2PL)
• The most common approach to enforce conflict serializability
• It can be proved that the transactions can be serialized in the
order of their lock points (i.e. the point where a transaction
acquired its final lock).
• Pessimistic Approach:
• Obtains locks to prevent conflicts
• Some locks may have been unnecessary
• Risk of deadlocks: a deadlock is a situation in which two or more
transactions are waiting for one another to give up locks.
• Risk of cascading aborts: abort of one transaction requires abort of
another

50
Two-Phase Locking (2PL)

• Example of a deadlock (T1 requests X1(A) but T2 holds S2(A), while T2 requests S2(B) but T1 holds X1(B); each waits for the other):

T1: X1(B) R1(B) W1(B) X1(A)
T2: S2(A) R2(A) S2(B)

• Example of a cascading abort (a.k.a. rollback): T2 read the value of A written by the uncommitted T1, so T1’s abort forces T2 to abort as well:

T1: R1(A) W1(A) Abort
T2: R2(A) W2(A)

51
Strict Two-Phase Locking (2PL)

• Avoids cascading aborts


• Same as 2PL except all locks are released at once when the
transaction completes (commit or abort)

52
Lock Management
• Lock and unlock requests are handled by the lock manager
• Lock table entry:
• Number of transactions currently holding a lock
• Type of lock held (shared or exclusive)
• Pointer to queue of lock requests
• Locking and unlocking have to be atomic operations
• Lock upgrade: a transaction that holds a shared lock on an object can upgrade it to an exclusive lock

53
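The lock-table entry described above can be sketched as a small class (a simplified, hypothetical model that ignores latching and atomicity): the set of holders, the mode they hold, and a FIFO queue of waiting requests, including the shared-to-exclusive upgrade for a sole holder.

```python
from collections import deque

class LockTableEntry:
    """One entry of the lock table: holders, held mode, waiter queue."""
    def __init__(self):
        self.holders = set()
        self.mode = None        # 'S' or 'X'; None means unlocked
        self.waiters = deque()  # queued (txn, mode) requests

    def request(self, txn, mode):
        """Return True if granted immediately, else enqueue and return False."""
        if not self.holders:
            self.holders, self.mode = {txn}, mode
            return True
        if mode == 'S' and self.mode == 'S' and not self.waiters:
            self.holders.add(txn)   # share with the existing readers
            return True
        if self.holders == {txn} and self.mode == 'S' and mode == 'X':
            self.mode = 'X'         # lock upgrade: sole reader becomes writer
            return True
        self.waiters.append((txn, mode))
        return False

entry = LockTableEntry()
print(entry.request('T1', 'S'))  # True
print(entry.request('T2', 'S'))  # True: S locks are compatible
print(entry.request('T3', 'X'))  # False: queued behind the readers
```

Note the `not self.waiters` check: once a writer is queued, new readers queue behind it instead of joining the holders, a common choice to avoid starving writers.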
Deadlocks

• Deadlock: Cycle of transactions waiting for locks to be released by


each other.
• Two ways of dealing with deadlocks:
• Deadlock avoidance
• Deadlock detection and resolution

54
Deadlock Avoidance

• Assign priorities based on timestamps: an older transaction (smaller timestamp) has higher priority. Assume Ti wants a lock that Tj holds. Two policies are possible:
  • Wait-Die: if Ti has higher priority, Ti waits for Tj; otherwise Ti aborts
  • Wound-Wait: if Ti has higher priority, Tj aborts; otherwise Ti waits
• If a transaction re-starts, make sure it keeps its original timestamp

55
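The two policies differ only in who yields. A tiny sketch (hypothetical names; smaller timestamp = older = higher priority, as stated above):

```python
def wait_die(ts_i, ts_j):
    """Ti requests a lock held by Tj.
    Older (higher-priority) requester waits; younger requester 'dies'."""
    return 'Ti waits' if ts_i < ts_j else 'Ti aborts'

def wound_wait(ts_i, ts_j):
    """Older requester 'wounds' the holder (Tj aborts);
    younger requester waits."""
    return 'Tj aborts' if ts_i < ts_j else 'Ti waits'

print(wait_die(1, 2))    # Ti is older: Ti waits
print(wait_die(2, 1))    # Ti is younger: Ti aborts
print(wound_wait(1, 2))  # Ti is older: Tj aborts
print(wound_wait(2, 1))  # Ti is younger: Ti waits
```

In both policies only the younger transaction can ever be aborted, which is why deadlock cycles cannot form, and why a restarted transaction must keep its original timestamp to avoid starvation.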
Deadlock Detection

• Create a waits-for graph:


• Nodes are transactions
• There is an edge from Ti to Tj if Ti is waiting for Tj to release a lock
• Periodically check for cycles in the waits-for graph

56
Deadlock Detection (Continued)
Example:

T1: S(A), R(A), S(B)
T2: X(B), W(B), X(C)
T3: S(C), R(C), X(A)
T4: X(B)

Waits-for graph: T1 → T2 (T1 wants S(B), T2 holds X(B)); T2 → T3 (T2 wants X(C), T3 holds S(C)); T3 → T1 (T3 wants X(A), T1 holds S(A)); T4 → T2 (T4 wants X(B)). The cycle T1 → T2 → T3 → T1 is a deadlock, resolved by aborting one of the transactions on the cycle.

57
Multiple-Granularity Locks

• Hard to decide what granularity to lock (tuples vs. pages vs. tables).
• Shouldn’t have to decide!
• Data “containers” are nested: the Database contains Tables, which contain Pages, which contain Tuples, which contain Attributes.

Database Lock Hierarchy:
1. Database level (slightly rare)
2. Table level (very common)
3. Page level (common)
4. Tuple level (very common)
5. Attribute level (rare)

58
Solution: New Lock Modes, Protocol

• Allow Xacts to lock at each level, but with a special protocol using new “intention” locks:
• Before locking an item, an Xact must set “intention locks” on all of its ancestors.
• For unlock, go from specific to general (i.e., bottom-up).
• SIX mode: like S & IX at the same time.

Lock compatibility matrix (✓ = compatible, ✗ = incompatible):

      IS   S    SIX  IX   X
IS    ✓    ✓    ✓    ✓    ✗
S     ✓    ✓    ✗    ✗    ✗
SIX   ✓    ✗    ✗    ✗    ✗
IX    ✓    ✗    ✗    ✓    ✗
X     ✗    ✗    ✗    ✗    ✗

59
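The intention-lock matrix above written out as a table (a sketch; True means the two modes may be held on the same node simultaneously). The trailing checks confirm two properties the slide implies: the matrix is symmetric, and SIX behaves exactly like holding S and IX together.

```python
MODES = ['IS', 'S', 'SIX', 'IX', 'X']
COMPAT = {
    'IS':  {'IS': True,  'S': True,  'SIX': True,  'IX': True,  'X': False},
    'S':   {'IS': True,  'S': True,  'SIX': False, 'IX': False, 'X': False},
    'SIX': {'IS': True,  'S': False, 'SIX': False, 'IX': False, 'X': False},
    'IX':  {'IS': True,  'S': False, 'SIX': False, 'IX': True,  'X': False},
    'X':   {'IS': False, 'S': False, 'SIX': False, 'IX': False, 'X': False},
}

for a in MODES:
    for b in MODES:
        # Compatibility is symmetric.
        assert COMPAT[a][b] == COMPAT[b][a]
    # SIX is compatible with exactly the modes compatible with both S and IX.
    assert COMPAT['SIX'][a] == (COMPAT['S'][a] and COMPAT['IX'][a])

print(COMPAT['IX']['IX'])  # True: two writers intending different subtrees
print(COMPAT['S']['IX'])   # False: a whole-table reader blocks tuple writers
```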
Multiple Granularity Lock Protocol

• Each Xact starts from the root of the hierarchy.


• To get S or IS lock on a node, must hold IS or IX on parent node.
• What if Xact holds SIX on parent? S on parent?
• To get X or IX or SIX on a node, must hold IX or SIX on parent
node.
• Must release locks in bottom-up order.

Protocol is correct in that it is equivalent to directly setting


locks at the leaf levels of the hierarchy.

60
Examples

• T1 scans R, and updates a few tuples:
  • T1 gets an SIX lock on R, then repeatedly gets an S lock on tuples of R, and occasionally upgrades to X on the tuples.
• T2 uses an index to read only part of R:
  • T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R.
• T3 reads all of R:
  • T3 gets an S lock on R.
  • OR, T3 could behave like T2; can use lock escalation to decide which.

61
Dynamic Databases

• If we relax the assumption that the DB is a fixed collection of


objects, even Strict 2PL will not assure serializability:
• T1 locks all pages containing sailor records with rating = 1, and finds
oldest sailor (say, age = 71).
• Next, T2 inserts a new sailor; rating = 1, age = 96.
• T2 also deletes oldest sailor with rating = 2 (and, say, age = 80), and
commits.
• T1 now locks all pages containing sailor records with rating = 2, and finds
oldest (say, age = 63).
• No consistent DB state where T1 is “correct”!

62
The Problem

• T1 implicitly assumes that it has locked the set of all sailor records
with rating = 1.
• Assumption only holds if no sailor records are added while T1 is
executing!
• Need some mechanism to enforce this assumption. (Index locking and
predicate locking.)
• Example shows that conflict serializability guarantees serializability
only if the set of objects is fixed!
• Phantom Problem!

63
The Phantom Problem

• A “phantom” is a tuple that is invisible during part of a transaction’s execution but visible during another part of the same execution.
• Example: T1 reads the list of sailors; T2 inserts a new sailor; T1 re-reads: a new sailor appears!
• In a static database: conflict serializability implies serializability.
• In a dynamic database, this may fail due to phantoms.
• Strict 2PL guarantees conflict serializability, but not serializability.

64
The Phantom Problem

• Dealing with phantoms:
  • Lock the entire table, or
  • Lock the index entry for ‘rating=1’ (if an index is available), or
  • Use predicate locks (a lock on an arbitrary predicate).

65
Isolation Levels in SQL

• 1. “Dirty reads”
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
• 2. “Committed reads”
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
• 3. “Repeatable reads”
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
• 4. Serializable transactions
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

66
Isolation Level: Dirty Reads

• “Long duration” WRITE locks


• Strict 2PL
• No READ locks
• Read-only transactions are never delayed
• Possible problem: dirty and inconsistent reads

67
Isolation Level: Read Committed

• “Long duration” WRITE locks


• Strict 2PL
• “Short duration” READ locks
• Only acquire lock while reading (not 2PL)
• Possible problem:
• Unrepeatable reads: when reading same element twice,
may get two different values

68
Isolation Level: Repeatable Read

• “Long duration” WRITE locks


• Strict 2PL
• “Long duration” READ locks
• Strict 2PL
• Possible problem:
• This is not serializable yet!

69
Isolation Level: Serializable

• “Long duration” WRITE locks


• Strict 2PL
• “Long duration” READ locks
• Strict 2PL
• This level also deals with phantoms

70
Summary (Contd.)
• Correctness criteria for isolation is “serializability”
• We use “conflict serializability”: more conservative but easier to enforce
• Two Phase Locking and Strict 2PL:
• Locks implement the notions of conflicts
• Locks are managed by the lock manager
• Deadlocks can happen: we can either prevent them or detect them
• Multiple granularity locking:
• Allows flexible tradeoff between the lock scope in DB and the number of lock entries in the
lock table
• Additional:
  • S2PL is pessimistic; there exist optimistic CC protocols, e.g. Timestamp CC
  • Index latching, phantoms, etc.

71
