Description
Although this would be beneficial for all int use cases, I am currently motivated by range / enumerate / itertools.count performance.
What dominates performance in all 3 mentioned utilities is PyLongObject creation:
# PY3.12
from collections import deque
consume = deque(maxlen=0).extend
INT_RANGE = list(range(100_000))
%timeit consume(range(100_000)) # 2.3 ms
%timeit consume(iter(INT_RANGE)) # 0.6 ms
Also, there are FREELISTs:
S1="
import collections as coll
BHOLE = coll.deque(maxlen=0).extend
"
S2="
import collections as coll
BHOLE = coll.deque(maxlen=0).extend
a = list(range(100_000)) # This consumes FREELIST
"
PYEXE="./cpython/main/python.exe"
# Quote the setup variables so the multi-line string is passed as one argument in any shell
$PYEXE -m timeit -s "$S1" 'BHOLE(range(100_000))' # 1.2 ms
$PYEXE -m timeit -s "$S2" 'BHOLE(range(100_000))' # 1.2 ms
$PYEXE -m timeit -s "$S2" 'BHOLE(iter(a))' # 0.3 ms
I checked that FREELIST is utilised when using S1 and is not when using S2.
python/cpython#126865 suggests a 10-20% performance improvement, but I cannot see any big difference in performance between the case where objects are re-used and the case where they are not. Either way, although a 10-20% improvement is great, this proposal would provide an improvement of a different order.
For example, here is the performance difference when using pre-stored ints (range(256) only yields values that are already in the small-int cache):
S="
from collections import deque
from itertools import chain
chain = chain.from_iterable
consume = deque(maxlen=0).extend
"
$PYEXE -m timeit -s "$S" "consume(range(100_000))" # 1.2 ms
$PYEXE -m timeit -s "$S" "consume(chain([range(256)] * (100_000 // 256)))" # 0.5 ms
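The second timing is faster because every value produced by range(256) already exists in CPython's small-int cache, so no new PyLongObject has to be allocated. Below is a minimal sketch for probing which values are cached. It assumes CPython (object identity of ints is an implementation detail), and `is_cached` and `cached_upper_bound` are hypothetical helper names, not part of any API.

```python
def is_cached(n: int) -> bool:
    """Return True if CPython appears to hand out one shared object for n.

    Assumption: CPython-specific behaviour. Building the int from a string
    at run time avoids compile-time constant sharing, so the identity check
    reflects the small-int cache rather than the compiler.
    """
    a = int(str(n))
    b = int(str(n))
    return a is b


def cached_upper_bound(limit: int = 100_000) -> int:
    """Return the largest n in [0, limit) such that 0..n all look cached."""
    last = -1
    for n in range(limit):
        if not is_cached(n):
            break
        last = n
    return last


if __name__ == "__main__":
    print("largest cached non-negative int:", cached_upper_bound())
```

Running this on differently configured builds is one way to confirm that a changed _PY_NSMALLPOSINTS actually took effect.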
So I would dare to suggest increasing _PY_NSMALLPOSINTS to 1000 or 10_000.
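Judging by the timings above (1.2 ms for range versus 0.3 ms for iterating over pre-stored ints), it could provide up to about 4x better performance for values inside the cache.
I estimate the cost at roughly an extra 10 or 100 KB, respectively.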
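The memory cost can be sanity-checked with a rough calculation. The sketch below assumes the current cache size is 257 (the value of _PY_NSMALLPOSINTS in CPython 3.12) and that each extra cached int costs about sys.getsizeof(1) bytes, ignoring alignment padding. `extra_cache_bytes` is a hypothetical helper, and the printed figures depend on the build, so they may differ from the rough estimate above.

```python
import sys


def extra_cache_bytes(new_size: int, current_size: int = 257) -> int:
    """Rough extra memory needed to grow the small-int cache.

    Assumptions: current_size=257 matches _PY_NSMALLPOSINTS in CPython 3.12,
    and each additional cached int costs sys.getsizeof(1) bytes
    (alignment padding is ignored).
    """
    per_int = sys.getsizeof(1)
    return max(new_size - current_size, 0) * per_int


for size in (1000, 10_000):
    # Report the estimate in KiB for each proposed cache size
    print(size, round(extra_cache_bytes(size) / 1024, 1), "KiB")
```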
Given the extensive usage of the mentioned objects (and int in general), I suspect this could have an observable benefit at quite a reasonable cost.
If this is feasible, the exact number needs to be determined. It should not be too hard to find out what cache size would cover a given percentage of use cases.
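One way to approach that measurement is to collect a sample of integer values from real workloads and compute the fraction that a given cache size would cover. The sketch below is only an illustration: `cache_coverage` is a hypothetical helper, the sample list is made up, and the lower bound of -5 is assumed to match CPython's current negative small-int range.

```python
from collections.abc import Iterable


def cache_coverage(values: Iterable[int], cache_size: int, neg: int = 5) -> float:
    """Return the fraction of values that fall in [-neg, cache_size)."""
    total = 0
    hits = 0
    for v in values:
        total += 1
        if -neg <= v < cache_size:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical sample; real data would come from instrumented workloads
sample = [0, 1, 2, 300, 999, 5_000, 12_345, -1]
for size in (257, 1000, 10_000):
    print(size, cache_coverage(sample, size))
```

Plotting coverage against cache size for several representative workloads would show where the returns start to diminish.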