10000 YA attempt at taming worst-case behavior of get_actual_variable_range. · postgres/postgres@b96a096 · GitHub
[go: up one dir, main page]

Skip to content

Commit b96a096

Browse files
committed
YA attempt at taming worst-case behavior of get_actual_variable_range.
We've made multiple attempts at preventing get_actual_variable_range from taking an unreasonable amount of time (3ca930f, fccebe4). But there's still an issue for the very first planning attempt after deletion of a large number of extremal-valued tuples. While that planning attempt will set "killed" bits on the tuples it visits and thereby reduce effort for next time, there's still a lot of work it has to do to visit the heap and then set those bits. It's (usually?) not worth it to do that much work at plan time to have a slightly better estimate, especially in a context like this where the table contents are known to be mutating rapidly. Therefore, let's bound the amount of work to be done by giving up after we've visited 100 heap pages. Giving up just means we'll fall back on the extremal value recorded in pg_statistic, so it shouldn't mean that planner estimates suddenly become worthless. Note that this means we'll still gradually whittle down the problem by setting a few more index "killed" bits in each planning attempt; so eventually we'll reach a good state (barring further deletions), even in the absence of VACUUM. Simon Riggs, per a complaint from Jakub Wartak (with cosmetic adjustments by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com
1 parent 46def52 commit b96a096

File tree

1 file changed

+40
-5
lines changed

1 file changed

+40
-5
lines changed

src/backend/utils/adt/selfuncs.c

Lines changed: 40 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -5655,7 +5655,7 @@ get_variable_range(PlannerInfo *root, VariableStatData *vardata, Oid sortop,
56555655
* and fetching its low and/or high values.
56565656
* If successful, store values in *min and *max, and return true.
56575657
* (Either pointer can be NULL if that endpoint isn't needed.)
5658-
* If no data available, return false.
5658+
* If unsuccessful, return false.
56595659
*
56605660
* sortop is the "<" comparison operator to use.
56615661
*/
@@ -5779,11 +5779,11 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
57795779
}
57805780
else
57815781
{
5782-
/* If min not requested, assume index is nonempty */
5782+
/* If min not requested, still want to fetch max */
57835783
have_data = true;
57845784
}
57855785

5786-
/* If max is requested, and we didn't find the index is empty */
5786+
/* If max is requested, and we didn't already fail ... */
57875787
if (max && have_data)
57885788
{
57895789
/* scan in the opposite direction; all else is the same */
@@ -5814,14 +5814,17 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
58145814

58155815
/*
58165816
* Get one endpoint datum (min or max depending on indexscandir) from the
5817-
* specified index. Return true if successful, false if index is empty.
5817+
* specified index. Return true if successful, false if not.
58185818
* On success, endpoint value is stored to *endpointDatum (and copied into
58195819
* outercontext).
58205820
*
58215821
* scankeys is a 1-element scankey array set up to reject nulls.
58225822
* typLen/typByVal describe the datatype of the index's first column.
58235823
* (We could compute these values locally, but that would mean computing them
58245824
* twice when get_actual_variable_range needs both the min and the max.)
5825+
*
5826+
* Failure occurs either when the index is empty, or we decide that it's
5827+
* taking too long to find a suitable tuple.
58255828
*/
58265829
static bool
58275830
get_actual_variable_endpoint(Relation heapRel,
@@ -5837,6 +5840,8 @@ get_actual_variable_endpoint(Relation heapRel,
58375840
SnapshotData SnapshotNonVacuumable;
58385841
IndexScanDesc index_scan;
58395842
Buffer vmbuffer = InvalidBuffer;
5843+
BlockNumber last_heap_block = InvalidBlockNumber;
5844+
int n_visited_heap_pages = 0;
58405845
ItemPointer tid;
58415846
Datum values[INDEX_MAX_KEYS];
58425847
bool isnull[INDEX_MAX_KEYS];
@@ -5878,6 +5883,12 @@ get_actual_variable_endpoint(Relation heapRel,
58785883
* might get a bogus answer that's not close to the index extremal value,
58795884
* or could even be NULL. We avoid this hazard because we take the data
58805885
* from the index entry not the heap.
5886+
*
5887+
* Despite all this care, there are situations where we might find many
5888+
* non-visible tuples near the end of the index. We don't want to expend
5889+
* a huge amount of time here, so we give up once we've read too many heap
5890+
* pages. When we fail for that reason, the caller will end up using
5891+
* whatever extremal value is recorded in pg_statistic.
58815892
*/
58825893
InitNonVacuumableSnapshot(SnapshotNonVacuumable, RecentGlobalXmin);
58835894

@@ -5891,13 +5902,37 @@ get_actual_variable_endpoint(Relation heapRel,
58915902
/* Fetch first/next tuple in specified direction */
58925903
while ((tid = index_getnext_tid(index_scan, indexscandir)) != NULL)
58935904
{
5905+
BlockNumber block = ItemPointerGetBlockNumber(tid);
5906+
58945907
if (!VM_ALL_VISIBLE(heapRel,
5895-
ItemPointerGetBlockNumber(tid),
5908+
block,
58965909
&vmbuffer))
58975910
{
58985911
/* Rats, we have to visit the heap to check visibility */
58995912
if (index_fetch_heap(index_scan) == NULL)
5913+
{
5914+
/*
5915+
* No visible tuple for this index entry, so we need to
5916+
* advance to the next entry. Before doing so, count heap
5917+
* page fetches and give up if we've done too many.
5918+
*
5919+
* We don't charge a page fetch if this is the same heap page
5920+
* as the previous tuple. This is on the conservative side,
5921+
* since other recently-accessed pages are probably still in
5922+
* buffers too; but it's good enough for this heuristic.
5923+
*/
5924+
#define VISITED_PAGES_LIMIT 100
5925+
5926+
if (block != last_heap_block)
5927+
{
5928+
last_heap_block = block;
5929+
n_visited_heap_pages++;
5930+
if (n_visited_heap_pages > VISITED_PAGES_LIMIT)
5931+
break;
5932+
}
5933+
59005934
continue; /* no visible tuple, try next index entry */
5935+
}
59015936

59025937
/*
59035938
* We don't care whether there's more than one visible tuple in

0 commit comments

Comments
 (0)
0