10000 doc: add section about heap-only tuples (HOT) · postgres/postgres@99c1f24 · GitHub
[go: up one dir, main page]

Skip to content

Commit 99c1f24

Browse files
committed
doc: add section about heap-only tuples (HOT)
Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/c59ffbd5-96ac-a5a5-a401-14f627ca1405@postgresql.org Backpatch-through: 11
1 parent 41fde30 commit 99c1f24

File tree

8 files changed

+86
-10
lines changed

8 files changed

+86
-10
lines changed

doc/src/sgml/acronyms.sgml

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -299,9 +299,7 @@
299299
<term><acronym>HOT</acronym></term>
300300
<listitem>
301301
<para>
302-
<ulink
303-
url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD">Heap-Only
304-
Tuples</ulink>
302+
<link linkend="storage-hot">Heap-Only Tuples</link>
305303
</para>
306304
</listitem>
307305
</varlistentry>

doc/src/sgml/catalogs.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3818,7 +3818,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
38183818
<entry>
38193819
If true, queries must not use the index until the <structfield>xmin</structfield>
38203820
of this <structname>pg_index</structname> row is below their <symbol>TransactionXmin</symbol>
3821-
event horizon, because the table may contain broken HOT chains with
3821+
event horizon, because the table may contain broken <link
3822+
linkend="storage-hot">HOT chains</link> with
38223823
incompatible rows that they can see
38233824
</entry>
38243825
</row>

doc/src/sgml/config.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3439,7 +3439,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
34393439
<listitem>
34403440
<para>
34413441
Specifies the number of transactions by which <command>VACUUM</command> and
3442-
<acronym>HOT</acronym> updates will defer cleanup of dead row versions. The
3442+
<link linkend="storage-hot"><acronym>HOT</acronym> updates</link>
3443+
will defer cleanup of dead row versions. The
34433444
default is zero transactions, meaning that dead row versions can be
34443445
removed as soon as possible, that is, as soon as they are no longer
34453446
visible to any open transaction. You may wish to set this to a

doc/src/sgml/indexam.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,8 @@
3737
extant versions of the same logical row; to an index, each tuple is
3838
an independent object that needs its own index entry. Thus, an
3939
update of a row always creates all-new index entries for the row, even if
40-
the key values did not change. (HOT tuples are an exception to this
40+
the key values did not change. (<link linkend="storage-hot">HOT
41+
tuples</link> are an exception to this
4142
statement; but indexes do not deal with those, either.) Index entries for
4243
dead tuples are reclaimed (by vacuuming) when the dead tuples themselves
4344
are reclaimed.

doc/src/sgml/indices.sgml

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -104,7 +104,9 @@ CREATE INDEX test1_id_index ON test1 (id);
104104

105105
<para>
106106
After an index is created, the system has to keep it synchronized with the
107-
table. This adds overhead to data manipulation operations.
107+
table. This adds overhead to data manipulation operations. Indexes can
108+
also prevent the creation of <link linkend="storage-hot">heap-only
109+
tuples</link>.
108110
Therefore indexes that are seldom or never used in queries
109111
should be removed.
110112
</para>
@@ -726,7 +728,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
726728
<para>
727729
Index expressions are relatively expensive to maintain, because the
728730
derived expression(s) must be computed for each row insertion
729-
and non-HOT update. However, the index expressions are
731+
and <link linkend="storage-hot">non-HOT update.</link> However, the index expressions are
730732
<emphasis>not</emphasis> recomputed during an indexed search, since they are
731733
already stored in the index. In both examples above, the system
732734
sees the query as just <literal>WHERE indexedcolumn = 'constant'</literal>

doc/src/sgml/monitoring.sgml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2640,7 +2640,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
26402640
<row>
26412641
<entry><structfield>n_tup_upd</structfield></entry>
26422642
<entry><type>bigint</type></entry>
2643-
<entry>Number of rows updated (includes HOT updated rows)</entry>
2643+
<entry>Number of rows updated (includes <link
2644+
linkend="storage-hot">HOT updated rows</link> updated rows)</entry>
26442645
</row>
26452646
<row>
26462647
<entry><structfield>n_tup_del</structfield></entry>

doc/src/sgml/ref/create_table.sgml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1262,7 +1262,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
12621262
to the indicated percentage; the remaining space on each page is
12631263
reserved for updating rows on that page. This gives <command>UPDATE</command>
12641264
a chance to place the updated copy of a row on the same page as the
1265-
original, which is more efficient than placing it on a different page.
1265+
original, which is more efficient than placing it on a different
1266+
page, and makes <link linkend="storage-hot">heap-only tuple
1267+
updates</link> more likely.
12661268
For a table whose entries are never updated, complete packing is the
12671269
best choice, but in heavily updated tables smaller fillfactors are
12681270
appropriate. This parameter cannot be set for TOAST tables.

doc/src/sgml/storage.sgml

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1046,4 +1046,74 @@ data. Empty in ordinary tables.</entry>
10461046
</sect2>
10471047
</sect1>
10481048

1049+
<sect1 id="storage-hot">
1050+
1051+
<title>Heap-Only Tuples (<acronym>HOT</acronym>)</title>
1052+
1053+
<para>
1054+
To allow for high concurrency, <productname>PostgreSQL</productname>
1055+
uses <link linkend="mvcc-intro">multiversion concurrency
1056+
control</link> (<acronym>MVCC</acronym>) to store rows. However,
1057+
<acronym>MVCC</acronym> has some downsides for update queries.
1058+
Specifically, updates require new versions of rows to be added to
1059+
tables. This can also require new index entries for each updated row,
1060+
and removal of old versions of rows and their index entries can be
1061+
expensive.
1062+
</para>
1063+
1064+
<para>
1065+
To help reduce the overhead of updates,
1066+
<productname>PostgreSQL</productname> has an optimization called
1067+
heap-only tuples (<acronym>HOT</acronym>). This optimization is
1068+
possible when:
1069+
1070+
<itemizedlist>
1071+
<listitem>
1072+
<para>
1073+
The update does not modify any columns referenced by the table's
1074+
indexes, including expression and partial indexes.
1075+
</para>
1076+
</listitem>
1077+
<listitem>
1078+
<para>
1079+
There is sufficient free space on the page containing the old row
1080+
for the updated row.
1081+
</para>
1082+
</listitem>
1083+
</itemizedlist>
1084+
1085+
In such cases, heap-only tuples provide two optimizations:
1086+
1087+
<itemizedlist>
1088+
<listitem>
1089+
<para>
1090+
New index entries are not needed to represent updated rows.
1091+
</para>
1092+
</listitem>
1093+
<listitem>
1094+
<para>
1095+
Old versions of updated rows can be completely removed during normal
1096+
operation, including <command>SELECT</command>s, instead of requiring
1097+
periodic vacuum operations. (This is possible because indexes
1098+
do not reference their <link linkend="storage-page-layout">page
1099+
item identifiers</link>.)
1100+
</para>
1101+
</listitem>
1102+
</itemizedlist>
1103+
</para>
1104+
1105+
<para>
1106+
In summary, heap-only tuple updates can only be created
1107+
if columns used by indexes are not updated. You can
1108+
increase the likelihood of sufficient page space for
1109+
<acronym>HOT</acronym> updates by decreasing a table's <link
1110+
linkend="sql-createtable"><literal>fillfactor</literal></link>.
1111+
If you don't, <acronym>HOT</acronym> updates will still happen because
1112+
new rows will naturally migrate to new pages and existing pages with
1113+
sufficient free space for new row versions. The system view <link
1114+
linkend="pg-stat-all-tables-view">pg_stat_all_tables</link>
1115+
allows monitoring of the occurrence of HOT and non-HOT updates.
1116+
</para>
1117+
</sect1>
1118+
10491119
</chapter>

0 commit comments

Comments
 (0)
0