8000 Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY. · danielcode/postgres@94c014b · GitHub
[go: up one dir, main page]

Skip to content

Commit 94c014b

Browse files
committed
Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY.
Commit 8cb5365, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, make the final state be indisvalid = true and indisready = false, which is otherwise nonsensical. This is pretty ugly but we can't add another column without forcing initdb, and it's too late for that in 9.2. (There's a cleaner fix in HEAD.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. Portions of this need to be back-patched even further, but I'll work on that separately. Problem reported by Amit Kapila, diagnosis by Pavan Deolasee, fix by Tom Lane and Andres Freund.
1 parent ffc3172 commit 94c014b

File tree

16 files changed

+443
-258
lines changed

16 files changed

+443
-258
lines changed

contrib/tcn/tcn.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -140,8 +140,8 @@ triggered_change_notification(PG_FUNCTION_ARGS)
140140
if (!HeapTupleIsValid(indexTuple)) /* should not happen */
141141
elog(ERROR, "cache lookup failed for index %u", indexoid);
142142
index = (Form_pg_index) GETSTRUCT(indexTuple);
143-
/* we're only interested if it is the primary key */
144-
if (index->indisprimary)
143+
/* we're only interested if it is the primary key and valid */
144+
if (index->indisprimary && IndexIsValid(index))
145145
{
146146
int numatts = index->indnatts;
147147

doc/src/sgml/catalogs.sgml

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3389,11 +3389,12 @@
33893389
<entry><type>bool</type></entry>
33903390
<entry></entry>
33913391
<entry>
3392-
If true, the index is currently valid for queries. False means the
3392+
If both this flag and <structfield>indisready</structfield> are
3393+
true, the index is currently valid for queries. False means the
33933394
index is possibly incomplete: it must still be modified by
33943395
<command>INSERT</>/<command>UPDATE</> operations, but it cannot safely
33953396
be used for queries. If it is unique, the uniqueness property is not
3396-
true either.
3397+
guaranteed true either.
33973398
</entry>
33983399
</row>
33993400

doc/src/sgml/ref/drop_index.sgml

Lines changed: 18 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -40,34 +40,33 @@ DROP INDEX [ CONCURRENTLY ] [ IF EXISTS ] <replaceable class="PARAMETER">name</r
4040

4141
<variablelist>
4242
<varlistentry>
43-
<term><literal>IF EXISTS</literal></term>
43+
<term><literal>CONCURRENTLY</literal></term>
4444
<listitem>
4545
<para>
46-
Do not throw an error if the index does not exist. A notice is issued
47-
in this case.
46+
Drop the index without locking out concurrent selects, inserts, updates,
47+
and deletes on the index's table. A normal <command>DROP INDEX</>
48+
acquires exclusive lock on the table, blocking other accesses until the
49+
index drop can be completed. With this option, the command instead
50+
waits until conflicting transactions have completed.
51+
</para>
52+
<para>
53+
There are several caveats to be aware of when using this option.
54+
Only one index name can be specified, and the <literal>CASCADE</> option
55+
is not supported. (Thus, an index that supports a <literal>UNIQUE</> or
56+
<literal>PRIMARY KEY</> constraint cannot be dropped this way.)
57+
Also, regular <command>DROP INDEX</> commands can be
58+
performed within a transaction block, but
59+
<command>DROP INDEX CONCURRENTLY</> cannot.
4860
</para>
4961
</listitem>
5062
</varlistentry>
5163

5264
<varlistentry>
53-
<term><literal>CONCURRENTLY</literal></term>
65+
<term><literal>IF EXISTS</literal></term>
5466
<listitem>
5567
<para>
56-
When this option is used, <productname>PostgreSQL</> will drop the
57-
index without taking any locks that prevent concurrent selects, inserts,
58-
updates, or deletes on the table; whereas a standard index drop
59-
waits for a lock that locks out everything on the table until it's done.
60-
Concurrent drop index is a two stage process. First, we mark the index
61-
both invalid and not ready then commit the change. Next we wait until
62-
there are no users locking the table who can see the index.
63-
</para>
64-
<para>
65-
There are several caveats to be aware of when using this option.
66-
Only one index name can be specified if the <literal>CONCURRENTLY</literal>
67-
parameter is specified. Regular <command>DROP INDEX</> command can be
68-
performed within a transaction block, but
69-
<command>DROP INDEX CONCURRENTLY</> cannot.
70-
The CASCADE option is not supported when dropping an index concurrently.
68+
Do not throw an error if the index does not exist. A notice is issued
69+
in this case.
7170
</para>
7271
</listitem>
7372
</varlistentry>

src/backend/access/heap/README.HOT

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -386,6 +386,37 @@ from the index, as well as ensuring that no one can see any inconsistent
386386
rows in a broken HOT chain (the first condition is stronger than the
387387
second). Finally, we can mark the index valid for searches.
388388

389+
Note that we do not need to set pg_index.indcheckxmin in this code path,
390+
because we have outwaited any transactions that would need to avoid using
391+
the index. (indcheckxmin is only needed because non-concurrent CREATE
392+
INDEX doesn't want to wait; its stronger lock would create too much risk of
393+
deadlock if it did.)
394+
395+
396+
DROP INDEX CONCURRENTLY
397+
-----------------------
398+
399+
DROP INDEX CONCURRENTLY is sort of the reverse sequence of CREATE INDEX
400+
CONCURRENTLY. We first mark the index as not indisvalid, and then wait for
401+
any transactions that could be using it in queries to end. (During this
402+
time, index updates must still be performed as normal, since such
403+
transactions might expect freshly inserted tuples to be findable.)
404+
Then, we clear indisready and set indisvalid, and again wait for transactions
405+
that could be updating the index to end. Finally we can drop the index
406+
normally (though taking only ShareUpdateExclusiveLock on its parent table).
407+
408+
The reason we need the unobvious choice of setting indisvalid is that after
409+
the second wait step begins, we don't want transactions to be touching the
410+
index at all; otherwise they might suffer errors if the DROP finally
411+
commits while they are reading catalog entries for the index. If we set
412+
both indisvalid and indisready false, this state would be indistinguishable
413+
from the first stage of CREATE INDEX CONCURRENTLY --- but in that state, we
414+
*do* want transactions to examine the index, since they must consider it in
415+
HOT-safety checks. So we use this nonsensical combination instead. Note
416+
this means that places that might otherwise test just indisvalid must now
417+
test both flags. If possible, use the IndexIsValid() macro instead.
418+
(9.3 and later have a saner representation for this state.)
419+
389420

390421
Limitations and Restrictions
391422
----------------------------

src/backend/catalog/dependency.c

Lines changed: 20 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -991,7 +991,6 @@ deleteOneObject(const ObjectAddress *object, Relation depRel, int flags)
991991
int nkeys;
992992
SysScanDesc scan;
993993
HeapTuple tup;
994-
Oid depRelOid = depRel->rd_id;
995994

996995
/* DROP hook of the objects being removed */
997996
if (object_access_hook)
@@ -1004,34 +1003,33 @@ deleteOneObject(const ObjectAddress *object, Relation depRel, int flags)
10041003
}
10051004

10061005
/*
1007-
* Close depRel if we are doing a drop concurrently. The individual
1008-
* deletion has to commit the transaction and we don't want dangling
1009-
* references.
1006+
* Close depRel if we are doing a drop concurrently. The object deletion
1007+
* subroutine will commit the current transaction, so we can't keep the
1008+
* relation open across doDeletion().
10101009
*/
10111010
if (flags & PERFORM_DELETION_CONCURRENTLY)
10121011
heap_close(depRel, RowExclusiveLock);
10131012

10141013
/*
10151014
* Delete the object itself, in an object-type-dependent way.
10161015
*
1017-
* Do this before removing outgoing dependencies as deletions can be
1018-
* happening in concurrent mode. That will only happen for a single object
1019-
* at once and if so the object will be invalidated inside a separate
1020-
* transaction and only dropped inside a transaction thats in-progress when
1021-
* doDeletion returns. This way no observer can see dangling dependency
1022-
* entries.
1016+
* We used to do this after removing the outgoing dependency links, but it
1017+
* seems just as reasonable to do it beforehand. In the concurrent case
1018+
* we *must* do it in this order, because we can't make any transactional
1019+
* updates before calling doDeletion() --- they'd get committed right
1020+
* away, which is not cool if the deletion then fails.
10231021
*/
10241022
doDeletion(object, flags);
10251023

10261024
/*
1027-
* Reopen depRel if we closed it before
1025+
* Reopen depRel if we closed it above
10281026
*/
10291027
if (flags & PERFORM_DELETION_CONCURRENTLY)
1030-
depRel = heap_open(depRelOid, RowExclusiveLock);
1028+
depRel = heap_open(DependRelationId, RowExclusiveLock);
10311029

10321030
/*
1033-
* Then remove any pg_depend records that link from this object to
1034-
* others. (Any records linking to this object should be gone already.)
1031+
* Now remove any pg_depend records that link from this object to others.
1032+
* (Any records linking to this object should be gone already.)
10351033
*
10361034
* When dropping a whole object (subId = 0), remove all pg_depend records
10371035
* for its sub-objects too.
@@ -1250,15 +1248,23 @@ AcquireDeletionLock(const ObjectAddress *object, int flags)
12501248
{
12511249
if (object->classId == RelationRelationId)
12521250
{
1251+
/*
< 9D06 /td>1252+
* In DROP INDEX CONCURRENTLY, take only ShareUpdateExclusiveLock on
1253+
* the index for the moment. index_drop() will promote the lock once
1254+
* it's safe to do so. In all other cases we need full exclusive
1255+
* lock.
1256+
*/
12531257
if (flags & PERFORM_DELETION_CONCURRENTLY)
12541258
LockRelationOid(object->objectId, ShareUpdateExclusiveLock);
12551259
else
12561260
LockRelationOid(object->objectId, AccessExclusiveLock);
12571261
}
12581262
else
1263+
{
12591264
/* assume we should lock the whole object not a sub-object */
12601265
LockDatabaseObject(object->classId, object->objectId, 0,
12611266
AccessExclusiveLock);
1267+
}
12621268
}
12631269

12641270
/*

0 commit comments

Comments
 (0)
0