Description
In many places in the Synapse code, some variation of the following code exists:
if foo in ("bar", "baz", "qux"):
    ...
# ... or ...
if foo in ["bar", "baz", "qux"]:
    ...
# ... or ...
STUFF = ["bar", "baz", "qux"]
if foo in STUFF:
    ...
These use tuple and list literals, yet the resulting collections are only ever used for membership checks. Some quick microbenchmarking suggests, however, that set literals are significantly faster here (relatively speaking), for both hits and misses:
$ python3 -m timeit '"nyet" in { "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 }'
10000000 loops, best of 5: 21.8 nsec per loop
$ python3 -m timeit '"baz" in { "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 }'
20000000 loops, best of 5: 18.1 nsec per loop
$ python3 -m timeit '"nyet" in [ "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 ]'
1000000 loops, best of 5: 276 nsec per loop
$ python3 -m timeit '"baz" in [ "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 ]'
5000000 loops, best of 5: 54 nsec per loop
$ python3 -m timeit '"nyet" in ( "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 )'
1000000 loops, best of 5: 264 nsec per loop
$ python3 -m timeit '"baz" in ( "foo", "bar", "baz", 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5 )'
5000000 loops, best of 5: 53.8 nsec per loop
Even for very small collections:
$ python3 -m timeit '"bar" in { "foo", "bar" }'
20000000 loops, best of 5: 18.6 nsec per loop
$ python3 -m timeit '"bar" in [ "foo", "bar" ]'
10000000 loops, best of 5: 35.6 nsec per loop
$ python3 -m timeit '"bar" in ( "foo", "bar" )'
10000000 loops, best of 5: 36.1 nsec per loop
... pretty much only being slower - and even then, only marginally slower - when literally the first element in the collection is a hit:
$ python3 -m timeit '"foo" in { "foo", "bar" }'
20000000 loops, best of 5: 17.8 nsec per loop
$ python3 -m timeit '"foo" in [ "foo", "bar" ]'
20000000 loops, best of 5: 13.9 nsec per loop
$ python3 -m timeit '"foo" in ( "foo", "bar" )'
20000000 loops, best of 5: 14.8 nsec per loop
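For anyone who wants to rerun the small-collection comparison from a script rather than the shell, here is a minimal sketch using the standard `timeit` module. The names `CONTAINERS` and `time_membership` are made up for this example, and absolute numbers will differ by machine and Python version.

```python
import timeit

# The three literal forms compared above, as source text for timeit.
CONTAINERS = {
    "set": '{"foo", "bar"}',
    "list": '["foo", "bar"]',
    "tuple": '("foo", "bar")',
}


def time_membership(needle, number=200_000, repeat=3):
    """Return the best per-check time in nanoseconds for each container kind."""
    results = {}
    for name, literal in CONTAINERS.items():
        # Build a statement such as: 'bar' in {"foo", "bar"}
        stmt = f"{needle!r} in {literal}"
        best = min(timeit.repeat(stmt, number=number, repeat=repeat))
        results[name] = best / number * 1e9
    return results


if __name__ == "__main__":
    for needle in ("foo", "bar", "nyet"):
        print(needle, time_membership(needle))
```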
While I have not analyzed it in detail, I strongly suspect that at least some of these checks are in a hot path. I expect that blindly changing all non-iterated list/tuple literals to set literals across the codebase could provide a significant performance improvement with very little work.
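As a rough way to locate candidates for such a change, the standard `ast` module can flag `in` / `not in` comparisons whose right-hand side is a list or tuple literal made up entirely of constants. This is only a sketch: `find_literal_membership_tests` is a hypothetical helper, not something that exists in Synapse, and it does not catch the named-constant form (`STUFF = [...]`).

```python
import ast


def find_literal_membership_tests(source):
    """Return sorted (line, kind) pairs for `x in <list/tuple literal>` tests.

    Only literals whose elements are all constants are reported, since those
    are the ones that can be swapped for a set literal mechanically.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        # A comparison chain pairs each operator with its right-hand operand.
        for op, right in zip(node.ops, node.comparators):
            if not isinstance(op, (ast.In, ast.NotIn)):
                continue
            if not isinstance(right, (ast.List, ast.Tuple)):
                continue
            if all(isinstance(elt, ast.Constant) for elt in right.elts):
                hits.append((right.lineno, type(right).__name__))
    return sorted(hits)
```

One caveat before changing anything blindly: a set membership check hashes the value being looked up, so if `foo` can ever be unhashable (a list or dict, say), `foo in {...}` raises `TypeError` where the list/tuple version would simply return `False`.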
It appears that the performance benefit comes from list/tuple literals with constant values being compiled to tuples, whereas set literals with constant values are compiled to frozensets, paying the entire set-building cost at compile time while incurring no runtime overhead.
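The compile-time behaviour described above can be checked by looking at the constants stored on a function's code object. The sketch below assumes CPython (this is an implementation detail of its compiler, not a language guarantee), and the function names are invented for illustration.

```python
def uses_set_literal(foo):
    return foo in {"bar", "baz", "qux"}


def uses_list_literal(foo):
    return foo in ["bar", "baz", "qux"]


def constant_types(func):
    """Return the type names of container constants in func's code object."""
    return sorted(
        type(const).__name__
        for const in func.__code__.co_consts
        if isinstance(const, (tuple, frozenset))
    )


if __name__ == "__main__":
    # On CPython, the set literal is expected to appear as a frozenset
    # constant and the list literal as a tuple constant.
    print("set literal: ", constant_types(uses_set_literal))
    print("list literal:", constant_types(uses_list_literal))
```

Note that this only applies when the literal appears directly in the `in` expression; a module-level `STUFF = [...]` remains an ordinary list at runtime, so it would need to be written as a set (or frozenset) explicitly.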
Steps to reproduce
N/A
Version information
Current develop HEAD.