8000 Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE. · postgrespro/postgres@3d351d9 · GitHub
[go: up one dir, main page]

Skip to content

Commit 3d351d9

Browse files
committed
Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
Historically, we've considered the state with relpages and reltuples both zero as indicating that we do not know the table's tuple density. This is problematic because it's impossible to distinguish "never yet vacuumed" from "vacuumed and seen to be empty". In particular, a user cannot use VACUUM or ANALYZE to override the planner's normal heuristic that an empty table should not be believed to be empty because it is probably about to get populated. That heuristic is a good safety measure, so I don't care to abandon it, but there should be a way to override it if the table is indeed intended to stay empty. Hence, represent the initial state of ignorance by setting reltuples to -1 (relpages is still set to zero), and apply the minimum-ten-pages heuristic only when reltuples is still -1. If the table is empty, VACUUM or ANALYZE (but not CREATE INDEX) will override that to reltuples = relpages = 0, and then we'll plan on that basis. This requires a bunch of fiddly little changes, but we can get rid of some ugly kluges that were formerly needed to maintain the old definition. One notable point is that FDWs' GetForeignRelSize methods will see baserel->tuples = -1 when no ANALYZE has been done on the foreign table. That seems like a net improvement, since those methods were formerly also in the dark about what baserel->tuples = 0 really meant. Still, it is an API change. I bumped catversion because code predating this change would get confused by seeing reltuples = -1. Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
1 parent 9511fb3 commit 3d351d9

File tree

20 files changed

+77
-69
lines changed

20 files changed

+77
-69
lines changed

contrib/file_fdw/file_fdw.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -996,7 +996,7 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
996996
/*
997997
* Estimate the number of tuples in the file.
998998
*/
999-
if (baserel->pages > 0)
999+
if (baserel->tuples >= 0 && baserel->pages > 0)
10001000
{
10011001
/*
10021002
* We have # of pages and # of tuples from pg_class (that is, from a

contrib/pgstattuple/pgstatapprox.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -195,6 +195,9 @@ statapprox_heap(Relation rel, output_type *stat)
195195
stat->tuple_count = vac_estimate_reltuples(rel, nblocks, scanned,
196196
stat->tuple_count);
197197

198+
/* It's not clear if we could get -1 here, but be safe. */
199+
stat->tuple_count = Max(stat->tuple_count, 0);
200+
198201
/*
199202
* Calculate percentages if the relation has one or more pages.
200203
*/

contrib/postgres_fdw/postgres_fdw.c

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -692,15 +692,14 @@ postgresGetForeignRelSize(PlannerInfo *root,
692692
else
693693
{
694694
/*
695-
* If the foreign table has never been ANALYZEd, it will have relpages
696-
* and reltuples equal to zero, which most likely has nothing to do
697-
* with reality. We can't do a whole lot about that if we're not
695+
* If the foreign table has never been ANALYZEd, it will have
696+
* reltuples < 0, meaning "unknown". We can't do much if we're not
698697
* allowed to consult the remote server, but we can use a hack similar
699698
* to plancat.c's treatment of empty relations: use a minimum size
700699
* estimate of 10 pages, and divide by the column-datatype-based width
701700< 8000 /code>
* estimate to get the corresponding number of tuples.
702701
*/
703-
if (baserel->pages == 0 && baserel->tuples == 0)
702+
if (baserel->tuples < 0)
704703
{
705704
baserel->pages = 10;
706705
baserel->tuples =

doc/src/sgml/catalogs.sgml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1977,6 +1977,10 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
19771977
the planner. It is updated by <command>VACUUM</command>,
19781978
<command>ANALYZE</command>, and a few DDL commands such as
19791979
<command>CREATE INDEX</command>.
1980+
If the table has never yet been vacuumed or
1981+
analyzed, <structfield>reltuples</structfield>
1982+
contains <literal>-1</literal> indicating that the row count is
1983+
unknown.
19801984
</para></entry>
19811985
</row>
19821986

doc/src/sgml/fdwhandler.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -130,7 +130,8 @@ GetForeignRelSize(PlannerInfo *root,
130130
(The initial value is
131131
from <structname>pg_class</structname>.<structfield>reltuples</structfield>
132132
which represents the total row count seen by the
133-
last <command>ANALYZE</command>.)
133+
last <command>ANALYZE</command>; it will be <literal>-1</literal> if
134+
no <command>ANALYZE</command> has been done on this foreign table.)
134135
</para>
135136

136137
<para>

src/backend/access/gin/ginvacuum.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -727,7 +727,7 @@ ginvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
727727
* entries. This is bogus if the index is partial, but it's real hard to
728728
* tell how many distinct heap entries are referenced by a GIN index.
729729
*/
730-
stats->num_index_tuples = info->num_heap_tuples;
730+
stats->num_index_tuples = Max(info->num_heap_tuples, 0);
731731
stats->estimated_count = info->estimated_count;
732732

733733
/*

src/backend/access/heap/vacuumlazy.c

Lines changed: 15 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -208,7 +208,8 @@ typedef struct LVShared
208208
* live tuples in the index vacuum case or the new live tuples in the
209209
* index cleanup case.
210210
*
211-
* estimated_count is true if reltuples is an estimated value.
211+
* estimated_count is true if reltuples is an estimated value. (Note that
212+
* reltuples could be -1 in this case, indicating we have no idea.)
212213
*/
213214
double reltuples;
214215
bool estimated_count;
@@ -567,31 +568,19 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
567568
/*
568569
* Update statistics in pg_class.
569570
*
570-
* A corner case here is that if we scanned no pages at all because every
571-
* page is all-visible, we should not update relpages/reltuples, because
572-
* we have no new information to contribute. In particular this keeps us
573-
* from replacing relpages=reltuples=0 (which means "unknown tuple
574-
* density") with nonzero relpages and reltuples=0 (which means "zero
575-
* tuple density") unless there's some actual evidence for the latter.
571+
* In principle new_live_tuples could be -1 indicating that we (still)
572+
* don't know the tuple count. In practice that probably can't happen,
573+
* since we'd surely have scanned some pages if the table is new and
574+
* nonempty.
576575
*
577-
* It's important that we use tupcount_pages and not scanned_pages for the
578-
* check described above; scanned_pages counts pages where we could not
579-
* get cleanup lock, and which were processed only for frozenxid purposes.
580-
*
581-
* We do update relallvisible even in the corner case, since if the table
582-
* is all-visible we'd definitely like to know that. But clamp the value
583-
* to be not more than what we're setting relpages to.
576+
* For safety, clamp relallvisible to be not more than what we're setting
577+
* relpages to.
584578
*
585579
* Also, don't change relfrozenxid/relminmxid if we skipped any pages,
586580
* since then we don't know for certain that all tuples have a newer xmin.
587581
*/
588582
new_rel_pages = vacrelstats->rel_pages;
589583
new_live_tuples = vacrelstats->new_live_tuples;
590-
if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0)
591-
{
592-
new_rel_pages = vacrelstats->old_rel_pages;
593-
new_live_tuples = vacrelstats->old_live_tuples;
594-
}
595584

596585
visibilitymap_count(onerel, &new_rel_allvisible, NULL);
597586
if (new_rel_allvisible > new_rel_pages)
@@ -612,7 +601,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
612601
/* report results to the stats collector, too */
613602
pgstat_report_vacuum(RelationGetRelid(onerel),
614603
onerel->rd_rel->relisshared,
615-
new_live_tuples,
604+
Max(new_live_tuples, 0),
616605
vacrelstats->new_dead_tuples);
617606
pgstat_progress_end_command();
618607

@@ -1695,9 +1684,12 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
16951684
vacrelstats->tupcount_pages,
16961685
live_tuples);
16971686

1698-
/* also compute total number of surviving heap entries */
1687+
/*
1688+
* Also compute the total number of surviving heap entries. In the
1689+
* (unlikely) scenario that new_live_tuples is -1, take it as zero.
1690+
*/
16991691
vacrelstats->new_rel_tuples =
1700-
vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
1692+
Max(vacrelstats->new_live_tuples, 0) + vacrelstats->new_dead_tuples;
17011693

17021694
/*
17031695
* Release any remaining pin on visibility map page.
@@ -2434,7 +2426,7 @@ lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
24342426
* dead_tuples, and update running statistics.
24352427
*
24362428
* reltuples is the number of heap tuples to be passed to the
2437-
* bulkdelete callback.
2429+
* bulkdelete callback. It's always assumed to be estimated.
24382430
*/
24392431
static void
24402432
lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,

src/backend/access/nbtree/nbtree.c

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -853,6 +853,7 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
853853
prev_num_heap_tuples = metad->btm_last_cleanup_num_heap_tuples;
854854

855855
if (cleanup_scale_factor <= 0 ||
856+
info->num_heap_tuples < 0 ||
856857
prev_num_heap_tuples <= 0 ||
857858
(info->num_heap_tuples - prev_num_heap_tuples) /
858859
prev_num_heap_tuples >= cleanup_scale_factor)

src/backend/access/table/tableam.c

Lines changed: 9 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -701,18 +701,14 @@ table_block_relation_estimate_size(Relation rel, int32 *attr_widths,
701701
* doesn't happen instantaneously, and it won't happen at all for cases
702702
* such as temporary tables.)
703703
*
704-
* We approximate "never vacuumed" by "has relpages = 0", which means this
705-
* will also fire on genuinely empty relations. Not great, but
706-
* fortunately that's a seldom-seen case in the real world, and it
707-
* shouldn't degrade the quality of the plan too much anyway to err in
708-
* this direction.
704+
* We test "never vacuumed" by seeing whether reltuples < 0.
709705
*
710706
* If the table has inheritance children, we don't apply this heuristic.
711707
* Totally empty parent tables are quite common, so we should be willing
712708
* to believe that they are empty.
713709
*/
714710
if (curpages < 10 &&
715-
relpages == 0 &&
711+
reltuples < 0 &&
716712
!rel->rd_rel->relhassubclass)
717713
curpages = 10;
718714

@@ -727,17 +723,17 @@ table_block_relation_estimate_size(Relation rel, int32 *attr_widths,
727723
}
728724

729725
/* estimate number of tuples from previous tuple density */
730-
if (relpages > 0)
726+
if (reltuples >= 0 && relpages > 0)
731727
density = reltuples / (double) relpages;
732728
else
733729
{
734730
/*
735-
* When we have no data because the relation was truncated, estimate
736-
* tuple width from attribute datatypes. We assume here that the
737-
* pages are completely full, which is OK for tables (since they've
738-
* presumably not been VACUUMed yet) but is probably an overestimate
739-
* for indexes. Fortunately get_relation_info() can clamp the
740-
* overestimate to the parent table's size.
731+
* When we have no data because the relation was never yet vacuumed,
732+
* estimate tuple width from attribute datatypes. We assume here that
733+
* the pages are completely full, which is OK for tables but is
734+
* probably an overestimate for indexes. Fortunately
735+
* get_relation_info() can clamp the overestimate to the parent
736+
* table's size.
741737
*
742738
* Note: this code intentionally disregards alignment considerations,
743739
* because (a) that would be gilding the lily considering how crude

src/backend/catalog/heap.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1015,7 +1015,7 @@ AddNewRelationTuple(Relation pg_class_desc,
10151015
case RELKIND_TOASTVALUE:
10161016
/* The relation is real, but as yet empty */
10171017
new_rel_reltup->relpages = 0;
1018-
new_rel_reltup->reltuples = 0;
1018+
new_rel_reltup->reltuples = -1;
10191019
new_rel_reltup->relallvisible = 0;
10201020
break;
10211021
case RELKIND_SEQUENCE:
@@ -1027,7 +1027,7 @@ AddNewRelationTuple(Relation pg_class_desc,
10271027
default:
10281028
/* Views, etc, have no disk storage */
10291029
new_rel_reltup->relpages = 0;
1030-
new_rel_reltup->reltuples = 0;
1030+
new_rel_reltup->reltuples = -1;
10311031
new_rel_reltup->relallvisible = 0;
10321032
break;
10331033
}

0 commit comments

Comments
 (0)
0