nbtree VACUUM: cope with right sibling link corruption. · postgres/postgres@a72b503 · GitHub

Commit a72b503

nbtree VACUUM: cope with right sibling link corruption.

Avoid "right sibling's left-link doesn't match" errors when vacuuming a corrupt nbtree index. Just LOG the issue and press on. That way VACUUM will have a decent chance of finishing off all required processing for the index (and for the table as a whole).

This error was seen in the field from time to time (it's more than a theoretical risk), so giving VACUUM the ability to press on like this has real value. Nothing short of a REINDEX is expected to fix the underlying index corruption, so giving up (by throwing an error) risks making a bad situation far worse. Anything that blocks forward progress by VACUUM like this might go unnoticed for a long time. This could eventually lead to a wraparound/xidStopLimit outage.

Note that _bt_unlink_halfdead_page() has always been able to bail on page deletion when the target page's left sibling page was in an inconsistent state. It now does the same thing (returns false to back out of the second phase of deletion) when it notices sibling link corruption in the target page's right sibling page.

This is similar to the work from commit 5b861ba (later backpatched as commit 43e409c), which taught nbtree to press on with vacuuming an index when page deletion fails to "re-find" a downlink in the target page's parent page. The "re-find" check seems to make VACUUM bail on page deletion more often in practice, but there is no reason to take any chances here.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CAH2-Wzko2q2kP1+UvgJyP9g0mF4hopK0NtQZcxwvMv9_ytGhkQ@mail.gmail.com
Backpatch: 11- (all supported versions).
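
To illustrate the "LOG and press on" behavior described in the commit message, here is a minimal standalone sketch. It is a toy model, not the nbtree code: `ToyPage`, `BlockNo`, `NO_BLOCK`, `toy_unlink_page()` and `toy_vacuum()` are hypothetical names invented for this example, pages live in a plain array, and the message goes to stderr rather than through ereport(). Unlike real nbtree page deletion, the toy also lets a page with no right sibling be unlinked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the real nbtree definitions. */
typedef unsigned int BlockNo;
#define NO_BLOCK ((BlockNo) 0)	/* plays the role of P_NONE in this toy */

typedef struct ToyPage
{
	BlockNo		prev;			/* left sibling link */
	BlockNo		next;			/* right sibling link */
	bool		deleted;
} ToyPage;

/*
 * Try to unlink 'target' from the sibling chain.  If either sibling's link
 * does not point back at the target, report the problem and return false
 * without modifying any page, so the caller can carry on with other pages.
 */
static bool
toy_unlink_page(ToyPage *pages, BlockNo npages, BlockNo target)
{
	BlockNo		leftsib;
	BlockNo		rightsib;

	assert(target != NO_BLOCK && target < npages);
	leftsib = pages[target].prev;
	rightsib = pages[target].next;

	/* Out-of-range links are treated as corruption too */
	if (leftsib >= npages || rightsib >= npages)
		return false;

	/* Left sibling (if any) must link forward to the target */
	if (leftsib != NO_BLOCK && pages[leftsib].next != target)
	{
		fprintf(stderr, "LOG: left sibling %u links to %u, not target %u\n",
				leftsib, pages[leftsib].next, target);
		return false;
	}

	/* Right sibling (if any) must link back to the target */
	if (rightsib != NO_BLOCK && pages[rightsib].prev != target)
	{
		fprintf(stderr, "LOG: right sibling %u links to %u, not target %u\n",
				rightsib, pages[rightsib].prev, target);
		return false;
	}

	/* Both checks passed: splice the target out of the chain */
	if (leftsib != NO_BLOCK)
		pages[leftsib].next = rightsib;
	if (rightsib != NO_BLOCK)
		pages[rightsib].prev = leftsib;
	pages[target].deleted = true;
	return true;
}

/* Visit every candidate; a failed unlink is skipped rather than fatal. */
static unsigned int
toy_vacuum(ToyPage *pages, BlockNo npages,
		   const BlockNo *candidates, unsigned int ncandidates)
{
	unsigned int ndeleted = 0;

	for (unsigned int i = 0; i < ncandidates; i++)
	{
		if (toy_unlink_page(pages, npages, candidates[i]))
			ndeleted++;
	}
	return ndeleted;
}
```

The point of the sketch is the control flow: a corrupt sibling link costs the toy vacuum one page deletion, while the remaining candidates are still processed.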
1 parent 6f1cf2e commit a72b503

1 file changed: +36 -11 lines changed

src/backend/access/nbtree/nbtpage.c

Lines changed: 36 additions & 11 deletions
@@ -1836,13 +1836,6 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
 			leftsib = opaque->btpo_next;
 			_bt_relbuf(rel, lbuf);
 
-			/*
-			 * It'd be good to check for interrupts here, but it's not easy to
-			 * do so because a lock is always held. This block isn't
-			 * frequently reached, so hopefully the consequences of not
-			 * checking interrupts aren't too bad.
-			 */
-
 			if (leftsib == P_NONE)
 			{
 				elog(LOG, "no left sibling (concurrent deletion?) of block %u in \"%s\"",
@@ -1861,6 +1854,9 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
 				}
 				return false;
 			}
+
+			CHECK_FOR_INTERRUPTS();
+
 			lbuf = _bt_getbuf(rel, leftsib, BT_WRITE);
 			page = BufferGetPage(lbuf);
 			opaque = (BTPageOpaque) PageGetSpecialPointer(page);
@@ -1921,11 +1917,40 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
 	rbuf = _bt_getbuf(rel, rightsib, BT_WRITE);
 	page = BufferGetPage(rbuf);
 	opaque = (BTPageOpaque) PageGetSpecialPointer(page);
+
+	/*
+	 * Validate target's right sibling page. Its left link must point back to
+	 * the target page.
+	 */
 	if (opaque->btpo_prev != target)
-		elog(ERROR, "right sibling's left-link doesn't match: "
-			 "block %u links to %u instead of expected %u in index \"%s\"",
-			 rightsib, opaque->btpo_prev, target,
-			 RelationGetRelationName(rel));
+	{
+		/*
+		 * This is known to fail in the field; sibling link corruption is
+		 * relatively common. Press on with vacuuming rather than just
+		 * throwing an ERROR (same approach used for left-sibling's-right-link
+		 * validation check a moment ago).
+		 */
+		ereport(LOG,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg_internal("right sibling's left-link doesn't match: "
+								 "right sibling %u of target %u with leafblkno %u "
+								 "and scanblkno %u spuriously links to non-target %u "
+								 "on level %u of index \"%s\"",
+								 rightsib, target, leafblkno,
+								 scanblkno, opaque->btpo_prev,
+								 targetlevel, RelationGetRelationName(rel))));
+
+		/* Must release all pins and locks on failure exit */
+		if (BufferIsValid(lbuf))
+			_bt_relbuf(rel, lbuf);
+		_bt_relbuf(rel, rbuf);
+		_bt_relbuf(rel, buf);
+		if (target != leafblkno)
+			_bt_relbuf(rel, leafbuf);
+
+		return false;
+	}
+
 	rightsib_is_rightmost = P_RIGHTMOST(opaque);
 	*rightsib_empty = (P_FIRSTDATAKEY(opaque) > PageGetMaxOffsetNumber(page));
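
The failure exit added in the last hunk has to drop every pin and lock taken so far, and it must not release the leaf buffer twice when the target page is the leaf page itself. The following standalone sketch models only that bookkeeping. `ToyBuf`, `toy_getbuf()`, `toy_relbuf()` and `toy_unlink()` are hypothetical names made up for this illustration, the counters stand in for real buffer pins and locks, and the acquisition order is a simplification rather than a description of what nbtpage.c does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer handle: counters stand in for real pins and locks. */
typedef struct ToyBuf
{
	int			pins;
	int			locks;
} ToyBuf;

/* Acquire a pin and a lock together */
static void
toy_getbuf(ToyBuf *b)
{
	b->pins++;
	b->locks++;
}

/* Release a pin and a lock together; a double release trips the assert */
static void
toy_relbuf(ToyBuf *b)
{
	assert(b->pins > 0 && b->locks > 0);
	b->pins--;
	b->locks--;
}

/*
 * Toy second phase of page deletion.  leafbuf arrives pinned and locked by
 * the caller; targetbuf may be the same buffer as leafbuf; lbuf is NULL when
 * the target has no left sibling.  Every pin and lock is released before
 * returning, on the failure path as well as the success path.
 */
static bool
toy_unlink(ToyBuf *leafbuf, ToyBuf *targetbuf, ToyBuf *lbuf, ToyBuf *rbuf,
		   bool rightsib_links_back)
{
	if (lbuf != NULL)
		toy_getbuf(lbuf);
	if (targetbuf != leafbuf)
		toy_getbuf(targetbuf);
	toy_getbuf(rbuf);

	/* (a real implementation would modify the pages here on success) */

	/* Release everything, taking care to release a shared buffer only once */
	if (lbuf != NULL)
		toy_relbuf(lbuf);
	toy_relbuf(rbuf);
	toy_relbuf(targetbuf);
	if (targetbuf != leafbuf)
		toy_relbuf(leafbuf);

	/* false means "backed out"; the caller simply moves on */
	return rightsib_links_back;
}
```

Calling `toy_unlink(&leaf, &leaf, NULL, &right, false)` exercises the case where the target is the leaf page: the buffer is released exactly once, which is the situation the `target != leafblkno` test in the diff guards against.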
