Fix ts_headline() edge cases for empty query and empty search text. · postgres/postgres@bc428b1 · GitHub

Commit bc428b1

Fix ts_headline() edge cases for empty query and empty search text.
tsquery's GETQUERY() macro is only safe to apply to a tsquery that is known non-empty; otherwise it gives a pointer to garbage. Before commit 5a617d7, ts_headline() avoided this pitfall, but only in a very indirect, nonobvious way. (hlCover could not reach its TS_execute call, because if the query contains no lexemes then hlFirstIndex would surely return -1.) After that commit, it fell into the trap, resulting in weird errors such as "unrecognized operator" and/or valgrind complaints. In HEAD, fix this by not calling TS_execute_locations() at all for an empty query. In the back branches, add a defensive check to hlCover() --- that's not fixing any live bug, but I judge the code a bit too fragile as-is.

Also, both mark_hl_fragments() and mark_hl_words() were careless about the possibility of empty search text: in the cases where no match has been found, they'd end up telling mark_fragment() to mark from word indexes 0 to 0 inclusive, even when there is no word 0. This is harmless since we over-allocated the prs->words array, but it does annoy valgrind. Fix so that the end index is -1 and thus mark_fragment() will do nothing in such cases.

Bottom line is that this fixes a live bug in HEAD, but in the back branches it's only getting rid of a valgrind nitpick. Back-patch anyway.

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/c27f642d-020b-01ff-ae61-086af287c4fd@gmail.com
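
The empty-query hazard described in the message can be illustrated outside the backend. The following is only a minimal sketch under stated assumptions: `ToyQuery` and `toy_query_root_is()` are hypothetical stand-ins, not PostgreSQL's TSQuery or GETQUERY(). It shows the same shape of defensive check the commit adds to hlCover(): test the size before touching item storage that is only valid for a non-empty query.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a query (not PostgreSQL's TSQuery):
 * `items` is only meaningful when `size` is greater than zero.
 */
typedef struct ToyQuery
{
    int         size;   /* number of valid entries in items[] */
    const int  *items;  /* item storage; may be garbage or NULL if size == 0 */
} ToyQuery;

/* Report whether the first (root) item of the query has the given kind. */
bool
toy_query_root_is(const ToyQuery *query, int kind)
{
    /* An empty query matches nothing, so never dereference items. */
    if (query->size <= 0)
        return false;

    return query->items[0] == kind;
}
```

Without the size check, the empty case would read through a pointer that refers to no valid item, which is the kind of access valgrind reports.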
1 parent eac34f7 commit bc428b1

3 files changed: 33 additions, 2 deletions

src/backend/tsearch/wparser_def.c

Lines changed: 6 additions & 2 deletions
@@ -2046,6 +2046,9 @@ hlCover(HeadlineParsedText *prs, TSQuery query, int max_cover,
                 nextpmax;
     hlCheck     ch;
 
+    if (query->size <= 0)
+        return false;           /* empty query matches nothing */
+
     /*
      * We look for the earliest, shortest substring of prs->words that
      * satisfies the query. Both the pmin and pmax indices must be words
@@ -2350,7 +2353,8 @@ mark_hl_fragments(HeadlineParsedText *prs, TSQuery query, bool highlightall,
     /* show the first min_words words if we have not marked anything */
     if (num_f <= 0)
     {
-        startpos = endpos = curlen = 0;
+        startpos = curlen = 0;
+        endpos = -1;
         for (i = 0; i < prs->curwords && curlen < min_words; i++)
         {
             if (!NONWORDTOKEN(prs->words[i].type))
@@ -2505,7 +2509,7 @@ mark_hl_words(HeadlineParsedText *prs, TSQuery query, bool highlightall,
         if (bestlen < 0)
         {
             curlen = 0;
-            pose = 0;
+            pose = -1;
             for (i = 0; i < prs->curwords && curlen < min_words; i++)
             {
                 if (!NONWORDTOKEN(prs->words[i].type))
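
Why -1 works as the end index can be seen in a reduced form. The sketch below is an illustration under assumptions, not the backend code: `ToyText`, `toy_mark_range()` and `toy_mark_fallback()` are hypothetical names, and every token is treated as a word (the real loops skip non-word tokens when counting). The point is that an inclusive range starting at 0 with end -1 is empty, while 0 to 0 always touches element 0.

```c
#include <stdbool.h>

#define TOY_MAX_WORDS 8

/* Hypothetical stand-in for the parsed text: a word count plus per-word flags. */
typedef struct ToyText
{
    int     curwords;               /* number of valid words */
    bool    in[TOY_MAX_WORDS];      /* true if the word is in the headline */
} ToyText;

/* Mark words startpos..endpos inclusive; return how many were marked. */
int
toy_mark_range(ToyText *text, int startpos, int endpos)
{
    int     marked = 0;

    for (int i = startpos; i <= endpos; i++)
    {
        text->in[i] = true;
        marked++;
    }
    return marked;
}

/* Fallback when nothing matched: show up to the first min_words words. */
int
toy_mark_fallback(ToyText *text, int min_words)
{
    int     startpos = 0;
    int     endpos = -1;            /* empty inclusive range until a word is seen */
    int     curlen = 0;

    for (int i = 0; i < text->curwords && curlen < min_words; i++)
    {
        curlen++;
        endpos = i;
    }
    return toy_mark_range(text, startpos, endpos);
}
```

With `curwords == 0` the loop never runs, `endpos` stays at -1, and `toy_mark_range()` marks nothing. Had `endpos` started at 0, word 0 would be marked even though it does not exist.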

src/test/regress/expected/tsearch.out

Lines changed: 21 additions & 0 deletions
@@ -1515,6 +1515,27 @@ to_tsquery('english','Lorem') && phraseto_tsquery('english','ullamcorper urna'),
  <b>Lorem</b> ipsum <b>urna</b>. Nullam nullam <b>ullamcorper</b> <b>urna</b>
 (1 row)
 
+-- Edge cases with empty query
+SELECT ts_headline('english',
+  '', ''::tsquery);
+NOTICE:  text-search query doesn't contain lexemes: ""
+LINE 2:   '', ''::tsquery);
+              ^
+ ts_headline
+-------------
+
+(1 row)
+
+SELECT ts_headline('english',
+  'foo bar', ''::tsquery);
+NOTICE:  text-search query doesn't contain lexemes: ""
+LINE 2:   'foo bar', ''::tsquery);
+                     ^
+ ts_headline
+-------------
+ foo bar
+(1 row)
+
 --Rewrite sub system
 CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
 \set ECHO none

src/test/regress/sql/tsearch.sql

Lines changed: 6 additions & 0 deletions
@@ -451,6 +451,12 @@ SELECT ts_headline('english',
 to_tsquery('english','Lorem') && phraseto_tsquery('english','ullamcorper urna'),
 'MaxFragments=100, MaxWords=100, MinWords=1');
 
+-- Edge cases with empty query
+SELECT ts_headline('english',
+  '', ''::tsquery);
+SELECT ts_headline('english',
+  'foo bar', ''::tsquery);
+
 --Rewrite sub system
 
 CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
