Commit 6016005 · hackingwu/postgres
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can make building a large tidbitmap into an O(N^2) operation. To fix, make sure we remove more than the minimum amount of information per call, and add a fallback path to behave sanely if we're unable to fit the bitmap within the requested amount of memory.

This has been wrong since the tidbitmap code was written, so back-patch to all supported branches.
1 parent 44631ee commit 6016005
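The key idea behind the fix is amortization, and a toy model makes it concrete. Below is a minimal, self-contained sketch, not PostgreSQL code (CAP, trim, and simulate are invented for illustration): if each overflow trims the structure back to exactly its cap, the very next insert overflows again, so the O(n) trimming pass runs on nearly every insert and the total cost is O(N^2). Trimming down to half the cap means a pass runs only once per CAP/2 inserts.

/*
 * Illustrative sketch only, not PostgreSQL code: models why trimming back
 * to exactly the cap on every overflow is O(N^2) overall, while trimming
 * to cap/2 amortizes the scans.
 */
#include <stdio.h>

#define CAP 1000

static long scan_cost = 0;

/* Model one trim pass as an O(n) scan over all current entries. */
static int
trim(int nentries, int target)
{
	scan_cost += nentries;
	return target;
}

static long
simulate(int inserts, int target)
{
	int		n = 0;

	scan_cost = 0;
	for (int i = 0; i < inserts; i++)
	{
		if (++n > CAP)
			n = trim(n, target);
	}
	return scan_cost;
}

int
main(void)
{
	/* Old behavior: shrink only to the cap, so nearly every insert rescans. */
	printf("trim to cap:   total scan cost %ld\n", simulate(100000, CAP));
	/* New behavior: shrink to cap/2, so a scan runs once per CAP/2 inserts. */
	printf("trim to cap/2: total scan cost %ld\n", simulate(100000, CAP / 2));
	return 0;
}

With these toy numbers the first strategy does roughly 500 times more scanning work than the second, which is the same effect the commit fixes in tbm_lossify.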

File tree: 1 file changed (+19 −3 lines)

src/backend/nodes/tidbitmap.c
@@ -884,8 +884,11 @@ tbm_lossify(TIDBitmap *tbm)
 	/*
 	 * XXX Really stupid implementation: this just lossifies pages in
 	 * essentially random order.  We should be paying some attention to the
-	 * number of bits set in each page, instead.  Also it might be a good idea
-	 * to lossify more than the minimum number of pages during each call.
+	 * number of bits set in each page, instead.
+	 *
+	 * Since we are called as soon as nentries exceeds maxentries, we should
+	 * push nentries down to significantly less than maxentries, or else we'll
+	 * just end up doing this again very soon.  We shoot for maxentries/2.
 	 */
 	Assert(!tbm->iterating);
 	Assert(tbm->status == TBM_HASH);
@@ -906,7 +909,7 @@ tbm_lossify(TIDBitmap *tbm)
 		/* This does the dirty work ... */
 		tbm_mark_page_lossy(tbm, page->blockno);
 
-		if (tbm->nentries <= tbm->maxentries)
+		if (tbm->nentries <= tbm->maxentries / 2)
 		{
 			/* we have done enough */
 			hash_seq_term(&status);
@@ -919,6 +922,19 @@ tbm_lossify(TIDBitmap *tbm)
 		 * not care whether we visit lossy chunks or not.
 		 */
 	}
+
+	/*
+	 * With a big bitmap and small work_mem, it's possible that we cannot
+	 * get under maxentries.  Again, if that happens, we'd end up uselessly
+	 * calling tbm_lossify over and over.  To prevent this from becoming a
+	 * performance sink, force maxentries up to at least double the current
+	 * number of entries.  (In essence, we're admitting inability to fit
+	 * within work_mem when we do this.)  Note that this test will not fire
+	 * if we broke out of the loop early; and if we didn't, the current
+	 * number of entries is simply not reducible any further.
+	 */
+	if (tbm->nentries > tbm->maxentries / 2)
+		tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
 }
 
 /*
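One detail of the new fallback is worth a second look: doubling nentries naively could overflow int. The clamp to (INT_MAX - 1) / 2 before multiplying guarantees the product stays representable. Here is a standalone sketch of that pattern (Min is defined locally for the example; inside PostgreSQL the macro already exists):

/*
 * Standalone sketch of the overflow-safe doubling used by the fallback.
 * Min() is defined locally here; PostgreSQL provides its own macro.
 */
#include <limits.h>
#include <stdio.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

int
main(void)
{
	int		nentries = INT_MAX - 10;	/* worst case: near the int ceiling */
	int		maxentries;

	/*
	 * Naive "nentries * 2" would overflow here.  Clamping the operand to
	 * (INT_MAX - 1) / 2 first keeps the product in range: the result is
	 * at most INT_MAX - 1.
	 */
	maxentries = Min(nentries, (INT_MAX - 1) / 2) * 2;

	printf("maxentries = %d (INT_MAX = %d)\n", maxentries, INT_MAX);
	return 0;
}

On a typical platform with 32-bit int this prints maxentries = 2147483646, i.e. INT_MAX - 1, rather than wrapping around to a negative value.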
