8000 Improve SP-GiST opclass API to better support unlabeled nodes. · patchsoft/postgres@d2ddee6 · GitHub
[go: up one dir, main page]

Skip to content

Commit d2ddee6

Browse files
committed
Improve SP-GiST opclass API to better support unlabeled nodes.
Previously, the spgSplitTuple action could only create a new upper tuple containing a single labeled node. This made it useless for opclasses that prefer to work with fixed sets of nodes (labeled or otherwise), which meant that restrictive prefixes could not be used with such node definitions. Change the output field set for the choose() method to allow it to specify any valid node set for the new upper tuple, and to specify which of these nodes to place the modified lower tuple in. In addition to its primary use for fixed node sets, this feature could allow existing opclasses that use variable node sets to skip a separate spgAddNode action when splitting a tuple, by setting up the node needed for the incoming value as part of the spgSplitTuple action. However, care would have to be taken to add the extra node only when it would not make the tuple bigger than before. (spgAddNode can enlarge the tuple, spgSplitTuple can't.) This is a prerequisite for an upcoming SP-GiST inet opclass, but is being committed separately to increase the visibility of the API change. In passing, improve the documentation about the traverse-values feature that was added by commit ccd6eb4. Emre Hasegeli, with cosmetic adjustments and documentation rework by me Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
1 parent 86f3169 commit d2ddee6

File tree

4 files changed

+115
-63
lines changed

4 files changed

+115
-63
lines changed

doc/src/sgml/spgist.sgml

Lines changed: 65 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,7 @@
114114
</row>
115115
<row>
116116
<entry><literal>box_ops</></entry>
117-
<entry>box</entry>
117+
<entry><type>box</></entry>
118118
<entry>
119119
<literal>&lt;&lt;</>
120120
<literal>&amp;&lt;</>
@@ -183,11 +183,14 @@
183183
Inner tuples are more complex, since they are branching points in the
184184
search tree. Each inner tuple contains a set of one or more
185185
<firstterm>nodes</>, which represent groups of similar leaf values.
186-
A node contains a downlink that leads to either another, lower-level inner
187-
tuple, or a short list of leaf tuples that all lie on the same index page.
188-
Each node has a <firstterm>label</> that describes it; for example,
186+
A node contains a downlink that leads either to another, lower-level inner
187+
tuple, or to a short list of leaf tuples that all lie on the same index page.
188+
Each node normally has a <firstterm>label</> that describes it; for example,
189189
in a radix tree the node label could be the next character of the string
190-
value. Optionally, an inner tuple can have a <firstterm>prefix</> value
190+
value. (Alternatively, an operator class can omit the node labels, if it
191+
works with a fixed set of nodes for all inner tuples;
192+
see <xref linkend="spgist-null-labels">.)
193+
Optionally, an inner tuple can have a <firstterm>prefix</> value
191194
that describes all its members. In a radix tree this could be the common
192195
prefix of the represented strings. The prefix value is not necessarily
193196
really a prefix, but can be any data needed by the operator class;
@@ -202,7 +205,8 @@
202205
tuple, so the <acronym>SP-GiST</acronym> core provides the possibility for
203206
operator classes to manage level counting while descending the tree.
204207
There is also support for incrementally reconstructing the represented
205-
value when that is needed.
208+
value when that is needed, and for passing down additional data (called
209+
<firstterm>traverse values</>) during a tree descent.
206210
</para>
207211

208212
<note>
@@ -343,10 +347,13 @@ typedef struct spgChooseOut
343347
} addNode;
344348
struct /* results for spgSplitTuple */
345349
{
346-
/* Info to form new inner tuple with one node */
350+
/* Info to form new upper-level inner tuple with one child tuple */
347351
bool prefixHasPrefix; /* tuple should have a prefix? */
348352
Datum prefixPrefixDatum; /* if so, its value */
349-
Datum nodeLabel; /* node's label */
353+
int prefixNNodes; /* number of nodes */
354+
Datum *prefixNodeLabels; /* their labels (or NULL for
355+
* no labels) */
356+
int childNodeN; /* which node gets child tuple */
350357

351358
/* Info to form new lower-level inner tuple with all old nodes */
352359
bool postfixHasPrefix; /* tuple should have a prefix? */
@@ -416,29 +423,33 @@ typedef struct spgChooseOut
416423
set <structfield>resultType</> to <literal>spgSplitTuple</>.
417424
This action moves all the existing nodes into a new lower-level
418425
inner tuple, and replaces the existing inner tuple with a tuple
419-
having a single node that links to the new lower-level inner tuple.
426+
having a single downlink pointing to the new lower-level inner tuple.
420427
Set <structfield>prefixHasPrefix</> to indicate whether the new
421428
upper tuple should have a prefix, and if so set
422429
<structfield>prefixPrefixDatum</> to the prefix value. This new
423430
prefix value must be sufficiently less restrictive than the original
424-
to accept the new value to be indexed, and it should be no longer
425-
than the original prefix.
426-
Set <structfield>nodeLabel</> to the label to be used for the
427-
node that will point to the new lower-level inner tuple.
431+
to accept the new value to be indexed.
432+
Set <structfield>prefixNNodes</> to the number of nodes needed in the
433+
new tuple, and set <structfield>prefixNodeLabels</> to a palloc'd array
434+
holding their labels, or to NULL if node labels are not required.
435+
Note that the total size of the new upper tuple must be no more
436+
than the total size of the tuple it is replacing; this constrains
437+
the lengths of the new prefix and new labels.
438+
Set <structfield>childNodeN</> to the index (from zero) of the node
439+
that will downlink to the new lower-level inner tuple.
428440
Set <structfield>postfixHasPrefix</> to indicate whether the new
429441
lower-level inner tuple should have a prefix, and if so set
430442
<structfield>postfixPrefixDatum</> to the prefix value. The
431-
combination of these two prefixes and the additional label must
432-
have the same meaning as the original prefix, because there is
433-
no opportunity to alter the node labels that are moved to the new
434-
lower-level tuple, nor to change any child index entries.
443+
combination of these two prefixes and the downlink node's label
444+
(if any) must have the same meaning as the original prefix, because
445+
there is no opportunity to alter the node labels that are moved to
446+
the new lower-level tuple, nor to change any child index entries.
435447
After the node has been split, the <function>choose</function>
436448
function will be called again with the replacement inner tuple.
437-
That call will usually result in an <literal>spgAddNode</> result,
438-
since presumably the node label added in the split step will not
439-
match the new value; so after that, there will be a third call
440-
that finally returns <literal>spgMatchNode</> and allows the
441-
insertion to descend to the leaf level.
449+
That call may return an <literal>spgAddNode</> result, if no suitable
450+
node was created by the <literal>spgSplitTuple</> action. Eventually
451+
<function>choose</function> must return <literal>spgMatchNode</> to
452+
allow the insertion to descend to the next level.
442453
</para>
443454
</listitem>
444455
</varlistentry>
@@ -492,9 +503,8 @@ typedef struct spgPickSplitOut
492503
<structfield>prefixDatum</> to the prefix value.
493504
Set <structfield>nNodes</> to indicate the number of nodes that
494505
the new inner tuple will contain, and
495-
set <structfield>nodeLabels</> to an array of their label values.
496-
(If the nodes do not require labels, set <structfield>nodeLabels</>
497-
to NULL; see <xref linkend="spgist-null-labels"> for details.)
506+
set <structfield>nodeLabels</> to an array of their label values,
507+
or to NULL if node labels are not required.
498508
Set <structfield>mapTuplesToNodes</> to an array that gives the index
499509
(from zero) of the node that each leaf tuple should be assigned to.
500510
Set <structfield>leafTupleDatums</> to an array of the values to
@@ -561,7 +571,7 @@ typedef struct spgInnerConsistentIn
561571

562572
Datum reconstructedValue; /* value reconstructed at parent */
563573
void *traversalValue; /* opclass-specific traverse value */
564-
MemoryContext traversalMemoryContext;
574+
MemoryContext traversalMemoryContext; /* put new traverse values here */
565575
int level; /* current level (counting from zero) */
566576
bool returnData; /* original data must be returned? */
567577

@@ -580,7 +590,6 @@ typedef struct spgInnerConsistentOut
580590
int *levelAdds; /* increment level by this much for each */
581591
Datum *reconstructedValues; /* associated reconstructed values */
582592
void **traversalValues; /* opclass-specific traverse values */
583-
584593
} spgInnerConsistentOut;
585594
</programlisting>
586595

@@ -599,6 +608,11 @@ typedef struct spgInnerConsistentOut
599608
parent tuple; it is <literal>(Datum) 0</> at the root level or if the
600609
<function>inner_consistent</> function did not provide a value at the
601610
parent level.
611+
<structfield>traversalValue</> is a pointer to any traverse data
612+
passed down from the previous call of <function>inner_consistent</>
613+
on the parent index tuple, or NULL at the root level.
614+
<structfield>traversalMemoryContext</> is the memory context in which
615+
to store output traverse values (see below).
602616
<structfield>level</> is the current inner tuple's level, starting at
603617
zero for the root level.
604618
<structfield>returnData</> is <literal>true</> if reconstructed data is
@@ -615,9 +629,6 @@ typedef struct spgInnerConsistentOut
615629
inner tuple, and
616630
<structfield>nodeLabels</> is an array of their label values, or
617631
NULL if the nodes do not have labels.
618-
<structfield>traversalValue</> is a pointer to data that
619-
<function>inner_consistent</> gets when called on child nodes from an
620-
outer call of <function>inner_consistent</> on parent nodes.
621632
</para>
622633

623634
<para>
@@ -633,17 +644,19 @@ typedef struct spgInnerConsistentOut
633644
<structfield>reconstructedValues</> to an array of the values
634645
reconstructed for each child node to be visited; otherwise, leave
635646
<structfield>reconstructedValues</> as NULL.
647+
If it is desired to pass down additional out-of-band information
648+
(<quote>traverse values</>) to lower levels of the tree search,
649+
set <structfield>traversalValues</> to an array of the appropriate
650+
traverse values, one for each child node to be visited; otherwise,
651+
leave <structfield>traversalValues</> as NULL.
636652
Note that the <function>inner_consistent</> function is
637653
responsible for palloc'ing the
638-
<structfield>nodeNumbers</>, <structfield>levelAdds</> and
639-
<structfield>reconstructedValues</> arrays.
640-
Sometimes accumulating some information is needed, while
641-
descending from parent to child node was happened. In this case
642-
<structfield>traversalValues</> array keeps pointers to
643-
specific data you need to accumulate for every child node.
644-
Memory for <structfield>traversalValues</> should be allocated in
645-
the default context, but each element of it should be allocated in
646-
<structfield>traversalMemoryContext</>.
654+
<structfield>nodeNumbers</>, <structfield>levelAdds</>,
655+
<structfield>reconstructedValues</>, and
656+
<structfield>traversalValues</> arrays in the current memory context.
657+
However, any output traverse values pointed to by
658+
the <structfield>traversalValues</> array should be allocated
659+
in <structfield>traversalMemoryContext</>.
647660
</para>
648661
</listitem>
649662
</varlistentry>
@@ -670,8 +683,8 @@ typedef struct spgLeafConsistentIn
670683
ScanKey scankeys; /* array of operators and comparison values */
671684
int nkeys; /* length of array */
672685

673-
void *traversalValue; /* opclass-specific traverse value */
674686
Datum reconstructedValue; /* value reconstructed at parent */
687+
void *traversalValue; /* opclass-specific traverse value */
675688
int level; /* current level (counting from zero) */
676689
bool returnData; /* original data must be returned? */
677690

@@ -700,6 +713,9 @@ typedef struct spgLeafConsisten F438 tOut
700713
parent tuple; it is <literal>(Datum) 0</> at the root level or if the
701714
<function>inner_consistent</> function did not provide a value at the
702715
parent level.
716+
<structfield>traversalValue</> is a pointer to any traverse data
717+
passed down from the previous call of <function>inner_consistent</>
718+
on the parent index tuple, or NULL at the root level.
703719
<structfield>level</> is the current leaf tuple's level, starting at
704720
zero for the root level.
705721
<structfield>returnData</> is <literal>true</> if reconstructed data is
@@ -797,7 +813,10 @@ typedef struct spgLeafConsistentOut
797813
point. In such a case the code typically works with the nodes by
798814
number, and there is no need for explicit node labels. To suppress
799815
node labels (and thereby save some space), the <function>picksplit</>
800-
function can return NULL for the <structfield>nodeLabels</> array.
816+
function can return NULL for the <structfield>nodeLabels</> array,
817+
and likewise the <function>choose</> function can return NULL for
818+
the <structfield>prefixNodeLabels</> array during
819+
a <literal>spgSplitTuple</> action.
801820
This will in turn result in <structfield>nodeLabels</> being NULL during
802821
subsequent calls to <function>choose</> and <function>inner_consistent</>.
803822
In principle, node labels could be used for some inner tuples and omitted
@@ -807,10 +826,7 @@ typedef struct spgLeafConsistentOut
807826
<para>
808827
When working with an inner tuple having unlabeled nodes, it is an error
809828
for <function>choose</> to return <literal>spgAddNode</>, since the set
810-
of nodes is supposed to be fixed in such cases. Also, there is no
811-
provision for generating an unlabeled node in <literal>spgSplitTuple</>
812-
actions, since it is expected that an <literal>spgAddNode</> action will
813-
be needed as well.
829+
of nodes is supposed to be fixed in such cases.
814830
</para>
815831
</sect2>
816832

@@ -859,11 +875,10 @@ typedef struct spgLeafConsistentOut
859875

860876
<para>
861877
The <productname>PostgreSQL</productname> source distribution includes
862-
several examples of index operator classes for
863-
<acronym>SP-GiST</acronym>. The core system currently provides radix
864-
trees over text columns and two types of trees over points: quad-tree and
865-
k-d tree. Look into <filename>src/backend/access/spgist/</> to see the
866-
code.
878+
several examples of index operator classes for <acronym>SP-GiST</acronym>,
879+
as described in <xref linkend="spgist-builtin-opclasses-table">. Look
880+
into <filename>src/backend/access/spgist/</>
881+
and <filename>src/backend/utils/adt/</> to see the code.
867882
</para>
868883

869884
</sect1>

src/backend/access/spgist/spgdoinsert.c

Lines changed: 32 additions & 7 deletions
< 10000 td data-grid-cell-id="diff-c79f5392f44ec1d2d8b2365c0e8e0c60ab7415533e2d07d359eddfc2029d1f8b-1711-1721-0" data-selected="false" role="gridcell" style="background-color:var(--diffBlob-additionNum-bgColor, var(--diffBlob-addition-bgColor-num));text-align:center" tabindex="-1" valign="top" class="focusable-grid-cell diff-line-number position-relative left-side">
Original file line numberDiff line numberDiff line change
@@ -1705,17 +1705,40 @@ spgSplitNodeAction(Relation index, SpGistState *state,
17051705
/* Should not be applied to nulls */
17061706
Assert(!SpGistPageStoresNulls(current->page));
17071707

1708+
/* Check opclass gave us sane values */
1709+
if (out->result.splitTuple.prefixNNodes <= 0 ||
1710+
out->result.splitTuple.prefixNNodes > SGITMAXNNODES)
1711+
elog(ERROR, "invalid number of prefix nodes: %d",
1712+
out->result.splitTuple.prefixNNodes);
1713+
if (out->result.splitTuple.childNodeN < 0 ||
1714+
out->result.splitTuple.childNodeN >=
1715+
out->result.splitTuple.prefixNNodes)
1716+
elog(ERROR, "invalid child node number: %d",
1717+
out->result.splitTuple.childNodeN);
1718+
17081719
/*
1709-
* Construct new prefix tuple, containing a single node with the specified
1710-
* label. (We'll update the node's downlink to point to the new postfix
1711-
* tuple, below.)
1720+
* Construct new prefix tuple with requested number of nodes. We'll fill
1721+
* in the childNodeN'th node's downlink below.
17121722
*/
1713-
node = spgFormNodeTuple(state, out->result.splitTuple.nodeLabel, false);
1723+
nodes = (SpGistNodeTuple *) palloc(sizeof(SpGistNodeTuple) *
1724+
out->result.splitTuple.prefixNNodes);
1725+
1726+
for (i = 0; i < out->result.splitTuple.prefixNNodes; i++)
1727+
{
1728+
Datum label = (Datum) 0;
1729+
bool labelisnull;
1730+
1731+
labelisnull = (out->result.splitTuple.prefixNodeLabels == NULL);
1732+
if (!labelisnull)
1733+
label = out->result.splitTuple.prefixNodeLabels[i];
1734+
nodes[i] = spgFormNodeTuple(state, label, labelisnull);
1735+
}
17141736

17151737
prefixTuple = spgFormInnerTuple(state,
17161738
out->result.splitTuple.prefixHasPrefix,
17171739
out->result.splitTuple.prefixPrefixDatum,
1718-
1, &node);
1740+
out->result.splitTuple.prefixNNodes,
1741+
nodes);
17191742

17201743
/* it must fit in the space that innerTuple now occupies */
17211744
if (prefixTuple->size > innerTuple->size)
@@ -1807,10 +1830,12 @@ spgSplitNodeAction(Relation index, SpGistState *state,
18071830
* the postfix tuple first.) We have to update the local copy of the
18081831
* prefixTuple too, because that's what will be written to WAL.
18091832
*/
1810-
spgUpdateNodeLink(prefixTuple, 0, postfixBlkno, postfixOffset);
1833+
spgUpdateNodeLink(prefixTuple, out->result.splitTuple.childNodeN,
1834+
postfixBlkno, postfixOffset);
18111835
prefixTuple = (SpGistInnerTuple) PageGetItem(current->page,
18121836
PageGetItemId(current->page, current->offnum));
1813-
spgUpdateNodeLink(prefixTuple, 0, postfixBlkno, postfixOffset);
1837+
spgUpdateNodeLink(prefixTuple, out->result.splitTuple.childNodeN,
1838+
postfixBlkno, postfixOffset);
18141839

18151840
MarkBufferDirty(current->buffer);
18161841

src/backend/access/spgist/spgtextproc.c

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -212,9 +212,14 @@ spg_text_choose(PG_FUNCTION_ARGS)
212212
out->result.splitTuple.prefixPrefixDatum =
213213
formTextDatum(prefixStr, commonLen);
214214
}
215-
out->result.splitTuple.nodeLabel =
215+
out->result.splitTuple.prefixNNodes = 1;
216+
out->result.splitTuple.prefixNodeLabels =
217+
(Datum *) palloc(sizeof(Datum));
218+
out->result.splitTuple.prefixNodeLabels[0] =
216219
Int16GetDatum(*(unsigned char *) (prefixStr + commonLen));
217220

221+
out->result.splitTuple.childNodeN = 0;
222+
218223
if (prefixSize - commonLen == 1)
219224
{
220225
out->result.splitTuple.postfixHasPrefix = false;
@@ -280,7 +285,10 @@ spg_text_choose(PG_FUNCTION_ARGS)
280285
out->resultType = spgSplitTuple;
281286
out->result.splitTuple.prefixHasPrefix = in->hasPrefix;
282287
out->result.splitTuple.prefixPrefixDatum = in->prefixDatum;
283-
out->result.splitTuple.nodeLabel = Int16GetDatum(-2);
288+
out->result.splitTuple.prefixNNodes = 1;
289+
out->result.splitTuple.prefixNodeLabels = (Datum *) palloc(sizeof(Datum));
290+
out->result.splitTuple.prefixNodeLabels[0] = Int16GetDatum(-2);
291+
out->result.splitTuple.childNodeN = 0;
284292
out->result.splitTuple.postfixHasPrefix = false;
285293
}
286294
else

src/include/access/spgist.h

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -90,10 +90,13 @@ typedef struct spgChooseOut
9090
} addNode;
9191
struct /* results for spgSplitTuple */
9292
{
93-
/* Info to form new inner tuple with one node */
93+
/* Info to form new upper-level inner tuple with one child tuple */
9494
bool prefixHasPrefix; /* tuple should have a prefix? */
9595
Datum prefixPrefixDatum; /* if so, its value */
96-
Datum nodeLabel; /* node's label */
96+
int prefixNNodes; /* number of nodes */
97+
Datum *prefixNodeLabels; /* their labels (or NULL for
98+
* no labels) */
99+
int childNodeN; /* which node gets child tuple */
97100

98101
/* Info to form new lower-level inner tuple with all old nodes */
99102
bool postfixHasPrefix; /* tuple should have a prefix? */
@@ -134,7 +137,8 @@ typedef struct spgInnerConsistentIn
134137

135138
Datum reconstructedValue; /* value reconstructed at parent */
136139
void *traversalValue; /* opclass-specific traverse value */
137-
MemoryContext traversalMemoryContext;
140+
MemoryContext traversalMemoryContext; /* put new traverse values
141+
* here */
138142
int level; /* current level (counting from zero) */
139143
bool returnData; /* original data must be returned? */
140144

@@ -163,8 +167,8 @@ typedef struct spgLeafConsistentIn
163167
ScanKey scankeys; /* array of operators and comparison values */
164168
int nkeys; /* length of array */
165169

166-
void *traversalValue; /* opclass-specific traverse value */
167170
Datum reconstructedValue; /* value reconstructed at parent */
171+
void *traversalValue; /* opclass-specific traverse value */
168172
int level; /* current level (counting from zero) */
169173
bool returnData; /* original data must be returned? */
170174

0 commit comments

Comments
 (0)
0