Fix serialization anomalies due to race conditions on INSERT. · home201448/postgres@50ca917 · GitHub

Commit 50ca917

Fix serialization anomalies due to race conditions on INSERT.
On insert the CheckForSerializableConflictIn() test was performed before the page(s) which were going to be modified had been locked (with an exclusive buffer content lock). If another process acquired a relation SIReadLock on the heap and scanned to a page on which an insert was going to occur before the page was so locked, a rw-conflict would be missed, which could allow a serialization anomaly to go undetected. The window between the check and the page lock was small, so the bug was generally not noticed unless there was high concurrency with multiple processes inserting into the same table. This was reported by Peter Bailis as bug #11732, by Sean Chittenden as bug #13667, and by others.

The race condition was eliminated in heap_insert() by moving the check down below the acquisition of the buffer lock, which had been the very next statement. Because of the loop locking and unlocking multiple buffers in heap_multi_insert(), a check was added after all inserts were completed. The check before the start of the inserts was left because it might avoid a large amount of work to detect a serialization anomaly before performing all of the inserts and the related WAL logging.

While investigating this bug, other SSI bugs which were even harder to hit in practice were noticed and fixed, an unnecessary check (covered by another check, so redundant) was removed from heap_update(), and comments were improved.

Back-patch to all supported branches.

Kevin Grittner and Thomas Munro
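To make the window concrete, here is a hedged sketch of the kind of interleaving the race allowed. It is not the exact reproducer from the bug reports; the table, values, and session labels are hypothetical. The SELECTs are assumed to use sequential scans (small table, no index), so each takes a relation-level SIReadLock:

    CREATE TABLE t (k int);

    -- session 1
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM t WHERE k = 1;   -- seq scan; sees 0 rows
    INSERT INTO t VALUES (1);             -- conflict check runs here, before
                                          -- the target page is locked

    -- session 2, in the window between session 1's conflict check and its
    -- acquisition of the exclusive buffer content lock
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM t WHERE k = 1;   -- takes relation SIReadLock; scans the
                                          -- page before session 1's tuple exists
    INSERT INTO t VALUES (1);
    COMMIT;

    -- session 1
    COMMIT;

Session 2's SIReadLock arrived after session 1's conflict check, and its scan finished before session 1's tuple was visible, so the rw-conflict from session 2 in to session 1's insert is recorded by neither side. The opposite edge (session 1's read missing session 2's insert) is still detected, but with only one edge of the cycle visible, SSI sees no dangerous structure and both transactions can commit, leaving two rows where each transaction had observed none. With the check performed under the buffer content lock, one of the two should instead fail with a serialization failure (SQLSTATE 40001).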
1 parent 21e634e commit 50ca917

2 files changed: 69 additions, 31 deletions

src/backend/access/heap/heapam.c

Lines changed: 65 additions & 26 deletions
@@ -2163,26 +2163,31 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 */
 	heaptup = heap_prepare_insert(relation, tup, xid, cid, options);
 
+	/*
+	 * Find buffer to insert this tuple into.  If the page is all visible,
+	 * this will also pin the requisite visibility map page.
+	 */
+	buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
+									   InvalidBuffer, options, bistate,
+									   &vmbuffer, NULL);
+
 	/*
 	 * We're about to do the actual insert -- but check for conflict first, to
 	 * avoid possibly having to roll back work we've just done.
 	 *
+	 * This is safe without a recheck as long as there is no possibility of
+	 * another process scanning the page between this check and the insert
+	 * being visible to the scan (i.e., an exclusive buffer content lock is
+	 * continuously held from this point until the tuple insert is visible).
+	 *
 	 * For a heap insert, we only need to check for table-level SSI locks. Our
 	 * new tuple can't possibly conflict with existing tuple locks, and heap
 	 * page locks are only consolidated versions of tuple locks; they do not
-	 * lock "gaps" as index page locks do.  So we don't need to identify a
-	 * buffer before making the call.
+	 * lock "gaps" as index page locks do.  So we don't need to specify a
+	 * buffer when making the call, which makes for a faster check.
 	 */
 	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
 
-	/*
-	 * Find buffer to insert this tuple into.  If the page is all visible,
-	 * this will also pin the requisite visibility map page.
-	 */
-	buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
-									   InvalidBuffer, options, bistate,
-									   &vmbuffer, NULL);
-
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();

@@ -2436,13 +2441,26 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 
 	/*
 	 * We're about to do the actual inserts -- but check for conflict first,
-	 * to avoid possibly having to roll back work we've just done.
+	 * to minimize the possibility of having to roll back work we've just
+	 * done.
 	 *
-	 * For a heap insert, we only need to check for table-level SSI locks. Our
-	 * new tuple can't possibly conflict with existing tuple locks, and heap
+	 * A check here does not definitively prevent a serialization anomaly;
+	 * that check MUST be done at least past the point of acquiring an
+	 * exclusive buffer content lock on every buffer that will be affected,
+	 * and MAY be done after all inserts are reflected in the buffers and
+	 * those locks are released; otherwise there is a race condition.  Since
+	 * multiple buffers can be locked and unlocked in the loop below, and it
+	 * would not be feasible to identify and lock all of those buffers before
+	 * the loop, we must do a final check at the end.
+	 *
+	 * The check here could be omitted with no loss of correctness; it is
+	 * present strictly as an optimization.
+	 *
+	 * For heap inserts, we only need to check for table-level SSI locks. Our
+	 * new tuples can't possibly conflict with existing tuple locks, and heap
 	 * page locks are only consolidated versions of tuple locks; they do not
-	 * lock "gaps" as index page locks do.  So we don't need to identify a
-	 * buffer before making the call.
+	 * lock "gaps" as index page locks do.  So we don't need to specify a
+	 * buffer when making the call, which makes for a faster check.
 	 */
 	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);

@@ -2621,6 +2639,22 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 		ndone += nthispage;
 	}
 
+	/*
+	 * We're done with the actual inserts.  Check for conflicts again, to
+	 * ensure that all rw-conflicts in to these inserts are detected.  Without
+	 * this final check, a sequential scan of the heap may have locked the
+	 * table after the "before" check, missing one opportunity to detect the
+	 * conflict, and then scanned the table before the new tuples were there,
+	 * missing the other chance to detect the conflict.
+	 *
+	 * For heap inserts, we only need to check for table-level SSI locks. Our
+	 * new tuples can't possibly conflict with existing tuple locks, and heap
+	 * page locks are only consolidated versions of tuple locks; they do not
+	 * lock "gaps" as index page locks do.  So we don't need to specify a
+	 * buffer when making the call.
+	 */
+	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
 	 * case we abort.  Note it is OK to do this after releasing the buffer,
@@ -2934,6 +2968,11 @@ heap_delete(Relation relation, ItemPointer tid,
 	/*
 	 * We're about to do the actual delete -- check for conflict first, to
 	 * avoid possibly having to roll back work we've just done.
+	 *
+	 * This is safe without a recheck as long as there is no possibility of
+	 * another process scanning the page between this check and the delete
+	 * being visible to the scan (i.e., an exclusive buffer content lock is
+	 * continuously held from this point until the tuple delete is visible).
 	 */
 	CheckForSerializableConflictIn(relation, &tp, buffer);

@@ -3561,12 +3600,6 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 		goto l2;
 	}
 
-	/*
-	 * We're about to do the actual update -- check for conflict first, to
-	 * avoid possibly having to roll back work we've just done.
-	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
-
 	/* Fill in transaction status data */
 
 	/*
@@ -3755,14 +3788,20 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	}
 
 	/*
-	 * We're about to create the new tuple -- check for conflict first, to
+	 * We're about to do the actual update -- check for conflict first, to
 	 * avoid possibly having to roll back work we've just done.
 	 *
-	 * NOTE: For a tuple insert, we only need to check for table locks, since
-	 * predicate locking at the index level will cover ranges for anything
-	 * except a table scan.  Therefore, only provide the relation.
+	 * This is safe without a recheck as long as there is no possibility of
+	 * another process scanning the pages between this check and the update
+	 * being visible to the scan (i.e., exclusive buffer content lock(s) are
+	 * continuously held from this point until the tuple update is visible).
+	 *
+	 * For the new tuple the only check needed is at the relation level, but
+	 * since both tuples are in the same relation and the check for oldtup
+	 * will include checking the relation level, there is no benefit to a
+	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, &oldtup, buffer);
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf

src/backend/storage/lmgr/predicate.c

Lines changed: 4 additions & 5 deletions
@@ -3217,22 +3217,21 @@ ReleasePredicateLocks(bool isCommit)
 		return;
 	}
 
+	LWLockAcquire(SerializableXactHashLock, LW_EXCLUSIVE);
+
 	Assert(!isCommit || SxactIsPrepared(MySerializableXact));
 	Assert(!isCommit || !SxactIsDoomed(MySerializableXact));
 	Assert(!SxactIsCommitted(MySerializableXact));
 	Assert(!SxactIsRolledBack(MySerializableXact));
 
 	/* may not be serializable during COMMIT/ROLLBACK PREPARED */
-	if (MySerializableXact->pid != 0)
-		Assert(IsolationIsSerializable());
+	Assert(MySerializableXact->pid == 0 || IsolationIsSerializable());
 
 	/* We'd better not already be on the cleanup list. */
 	Assert(!SxactIsOnFinishedList(MySerializableXact));
 
 	topLevelIsDeclaredReadOnly = SxactIsReadOnly(MySerializableXact);
 
-	LWLockAcquire(SerializableXactHashLock, LW_EXCLUSIVE);
-
 	/*
 	 * We don't hold XidGenLock lock here, assuming that TransactionId is
 	 * atomic!
@@ -4369,7 +4368,7 @@ CheckTableForSerializableConflictIn(Relation relation)
 	LWLockAcquire(SerializablePredicateLockListLock, LW_EXCLUSIVE);
 	for (i = 0; i < NUM_PREDICATELOCK_PARTITIONS; i++)
 		LWLockAcquire(PredicateLockHashPartitionLockByIndex(i), LW_SHARED);
-	LWLockAcquire(SerializableXactHashLock, LW_SHARED);
+	LWLockAcquire(SerializableXactHashLock, LW_EXCLUSIVE);
 
 	/* Scan through target list */
 	hash_seq_init(&seqstat, PredicateLockTargetHash);
