Add some more defenses against silly estimates to gincostestimate(). · ccneo/postgres@8e79b24 · GitHub

Commit 8e79b24

Add some more defenses against silly estimates to gincostestimate().
A report from Andy Colson showed that gincostestimate() was not being nearly paranoid enough about whether to believe the statistics it finds in the index metapage. The problem is that the metapage stats (other than the pending-pages count) are only updated by VACUUM, and in the worst case could still reflect the index's original empty state even when it has grown to many entries. We attempted to deal with that by scaling up the stats to match the current index size, but if nEntries is zero then scaling it up still gives zero. Moreover, the proportion of pages that are entry pages vs. data pages vs. pending pages is unlikely to be estimated very well by scaling if the index is now orders of magnitude larger than before.

We can improve matters by expanding the use of the rule-of-thumb estimates I introduced in commit 7fb008c: if the index has grown by more than a cutoff amount (here set at 4X growth) since VACUUM, then use the rule-of-thumb numbers instead of scaling. This might not be exactly right but it seems much less likely to produce insane estimates.

I also improved both the scaling estimate and the rule-of-thumb estimate to account for numPendingPages, since it's reasonable to expect that that is accurate in any case, and certainly pages that are in the pending list are neither entry nor data pages.

As a somewhat separate issue, adjust the estimation equations that are concerned with extra fetches for partial-match searches. These equations suppose that a fraction partialEntries / numEntries of the entry and data pages will be visited as a consequence of a partial-match search. Now, it's physically impossible for that fraction to exceed one, but our estimate of partialEntries is mostly bunk, and our estimate of numEntries isn't exactly gospel either, so we could arrive at a silly value. In the example presented by Andy we were coming out with a value of 100, leading to insane cost estimates. Clamp the fraction to one to avoid that.

Like the previous patch, back-patch to all supported branches; this problem can be demonstrated in one form or another in all of them.
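To make the two defenses concrete, below is a minimal, self-contained C sketch of the heuristics just described. It is an illustration only, not the patched selfuncs.c code (that appears in the diff further down); the type StaleGinStats, the function estimate_gin_pages, and the sample numbers in main() are all invented for this example.

#include <math.h>
#include <stdio.h>

#define Min(a, b) ((a) < (b) ? (a) : (b))
#define Max(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the metapage stats as of the last VACUUM. */
typedef struct
{
	double nTotalPages;
	double nEntryPages;
	double nDataPages;
	double nPendingPages;	/* maintained outside VACUUM, so trustworthy */
	double nEntries;
} StaleGinStats;

/*
 * If the stats look sane and the index has grown by no more than 4X since
 * the last VACUUM, scale them to the current size; otherwise fall back to
 * the rule-of-thumb numbers.  nPendingPages is trusted either way, unless
 * it exceeds the index size.
 */
static void
estimate_gin_pages(const StaleGinStats *stats, double numPages,
				   double *numEntryPages, double *numDataPages,
				   double *numEntries, double *numPendingPages)
{
	*numPendingPages = (stats->nPendingPages < numPages) ? stats->nPendingPages : 0;

	if (numPages > 0 && stats->nTotalPages <= numPages &&
		stats->nTotalPages > numPages / 4 &&
		stats->nEntryPages > 0 && stats->nEntries > 0)
	{
		/* Stats look sane enough: scale for growth since the last VACUUM. */
		double scale = numPages / stats->nTotalPages;

		*numEntryPages = ceil(stats->nEntryPages * scale);
		*numDataPages = ceil(stats->nDataPages * scale);
		*numEntries = ceil(stats->nEntries * scale);
		/* ensure we didn't round up too much */
		*numEntryPages = Min(*numEntryPages, numPages - *numPendingPages);
		*numDataPages = Min(*numDataPages,
							numPages - *numPendingPages - *numEntryPages);
	}
	else
	{
		/* Grown past 4X, or never vacuumed: rule of thumb instead. */
		numPages = Max(numPages, 10);
		*numEntryPages = floor((numPages - *numPendingPages) * 0.90);
		*numDataPages = numPages - *numPendingPages - *numEntryPages;
		*numEntries = floor(*numEntryPages * 100);
	}
}

int
main(void)
{
	/* Worst case: the stats still describe the index when it was tiny. */
	StaleGinStats stale = {2, 1, 1, 0, 10};
	double numPages = 10000;	/* the index has since grown to 10000 pages */
	double entryPages, dataPages, entries, pendingPages;

	estimate_gin_pages(&stale, numPages, &entryPages, &dataPages,
					   &entries, &pendingPages);

	/*
	 * Second defense: even a wildly overestimated partialEntries can no
	 * longer push the partial-match proportion above 1.0.
	 */
	double partialEntries = entries * 100;	/* deliberately absurd */
	double partialScale = Min(partialEntries / entries, 1.0);

	printf("entryPages=%.0f dataPages=%.0f entries=%.0f partialScale=%.2f\n",
		   entryPages, dataPages, entries, partialScale);
	return 0;
}

With these sample numbers the index has grown far more than 4X, so the sketch falls through to the rule-of-thumb branch and caps the partial-match proportion at 1.0, which is the failure mode described in the report.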
1 parent 7adbde2 commit 8e79b24

File tree

1 file changed: +52 −29 lines


src/backend/utils/adt/selfuncs.c

Lines changed: 52 additions & 29 deletions
@@ -6897,6 +6897,7 @@ gincostestimate(PG_FUNCTION_ARGS)
 				numEntries;
 	GinQualCounts counts;
 	bool		matchPossible;
+	double		partialScale;
 	double		entryPagesFetched,
 				dataPagesFetched,
 				dataPagesFetchedBySel;
@@ -6923,42 +6924,59 @@ gincostestimate(PG_FUNCTION_ARGS)
 		memset(&ginStats, 0, sizeof(ginStats));
 	}
 
-	if (ginStats.nTotalPages > 0 && ginStats.nEntryPages > 0 && numPages > 0)
+	/*
+	 * Assuming we got valid (nonzero) stats at all, nPendingPages can be
+	 * trusted, but the other fields are data as of the last VACUUM.  We can
+	 * scale them up to account for growth since then, but that method only
+	 * goes so far; in the worst case, the stats might be for a completely
+	 * empty index, and scaling them will produce pretty bogus numbers.
+	 * Somewhat arbitrarily, set the cutoff for doing scaling at 4X growth; if
+	 * it's grown more than that, fall back to estimating things only from the
+	 * assumed-accurate index size.  But we'll trust nPendingPages in any case
+	 * so long as it's not clearly insane, ie, more than the index size.
+	 */
+	if (ginStats.nPendingPages < numPages)
+		numPendingPages = ginStats.nPendingPages;
+	else
+		numPendingPages = 0;
+
+	if (numPages > 0 && ginStats.nTotalPages <= numPages &&
+		ginStats.nTotalPages > numPages / 4 &&
+		ginStats.nEntryPages > 0 && ginStats.nEntries > 0)
 	{
 		/*
-		 * We got valid stats.  nPendingPages can be trusted, but the other
-		 * fields are data as of the last VACUUM.  Scale them by the ratio
-		 * numPages / nTotalPages to account for growth since then.
+		 * OK, the stats seem close enough to sane to be trusted.  But we
+		 * still need to scale them by the ratio numPages / nTotalPages to
+		 * account for growth since the last VACUUM.
 		 */
 		double		scale = numPages / ginStats.nTotalPages;
 
-		numEntryPages = ginStats.nEntryPages;
-		numDataPages = ginStats.nDataPages;
-		numPendingPages = ginStats.nPendingPages;
-		numEntries = ginStats.nEntries;
-
-		numEntryPages = ceil(numEntryPages * scale);
-		numDataPages = ceil(numDataPages * scale);
-		numEntries = ceil(numEntries * scale);
+		numEntryPages = ceil(ginStats.nEntryPages * scale);
+		numDataPages = ceil(ginStats.nDataPages * scale);
+		numEntries = ceil(ginStats.nEntries * scale);
 		/* ensure we didn't round up too much */
-		numEntryPages = Min(numEntryPages, numPages);
-		numDataPages = Min(numDataPages, numPages - numEntryPages);
+		numEntryPages = Min(numEntryPages, numPages - numPendingPages);
+		numDataPages = Min(numDataPages,
+						   numPages - numPendingPages - numEntryPages);
 	}
 	else
 	{
 		/*
-		 * It's a hypothetical index, or perhaps an index created pre-9.1 and
-		 * never vacuumed since upgrading.  Invent some plausible internal
-		 * statistics based on the index page count.  We estimate that 90% of
-		 * the index is entry pages, and the rest is data pages.  Estimate 100
-		 * entries per entry page; this is rather bogus since it'll depend on
-		 * the size of the keys, but it's more robust than trying to predict
-		 * the number of entries per heap tuple.
+		 * We might get here because it's a hypothetical index, or an index
+		 * created pre-9.1 and never vacuumed since upgrading (in which case
+		 * its stats would read as zeroes), or just because it's grown too
+		 * much since the last VACUUM for us to put our faith in scaling.
+		 *
+		 * Invent some plausible internal statistics based on the index page
+		 * count (and clamp that to at least 10 pages, just in case).  We
+		 * estimate that 90% of the index is entry pages, and the rest is data
+		 * pages.  Estimate 100 entries per entry page; this is rather bogus
+		 * since it'll depend on the size of the keys, but it's more robust
+		 * than trying to predict the number of entries per heap tuple.
 		 */
 		numPages = Max(numPages, 10);
-		numEntryPages = floor(numPages * 0.90);
-		numDataPages = numPages - numEntryPages;
-		numPendingPages = 0;
+		numEntryPages = floor((numPages - numPendingPages) * 0.90);
+		numDataPages = numPages - numPendingPages - numEntryPages;
 		numEntries = floor(numEntryPages * 100);
 	}
 
@@ -7084,16 +7102,21 @@ gincostestimate(PG_FUNCTION_ARGS)
 	/*
 	 * Add an estimate of entry pages read by partial match algorithm.  It's a
 	 * scan over leaf pages in entry tree.  We haven't any useful stats here,
-	 * so estimate it as proportion.
+	 * so estimate it as proportion.  Because counts.partialEntries is really
+	 * pretty bogus (see code above), it's possible that it is more than
+	 * numEntries; clamp the proportion to ensure sanity.
 	 */
-	entryPagesFetched += ceil(numEntryPages * counts.partialEntries / numEntries);
+	partialScale = counts.partialEntries / numEntries;
+	partialScale = Min(partialScale, 1.0);
+
+	entryPagesFetched += ceil(numEntryPages * partialScale);
 
 	/*
 	 * Partial match algorithm reads all data pages before doing actual scan,
-	 * so it's a startup cost.  Again, we haven't any useful stats here, so,
-	 * estimate it as proportion
+	 * so it's a startup cost.  Again, we haven't any useful stats here, so
+	 * estimate it as proportion.
 	 */
-	dataPagesFetched = ceil(numDataPages * counts.partialEntries / numEntries);
+	dataPagesFetched = ceil(numDataPages * partialScale);
 
 	/*
 	 * Calculate cache effects if more than one scan due to nestloops or array
