-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-45829: Specialize BINARY_SUBSCR for __getitem__ implemented in Python. #29592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-45829: Specialize BINARY_SUBSCR for __getitem__ implemented in Python. #29592
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really cool! A few notes:
@@ -0,0 +1,2 @@ | |||
Specialize BINARY_SUBSCR for classes with a ``__getitem__`` method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specialize BINARY_SUBSCR for classes with a ``__getitem__`` method | |
Specialize :opcode:`BINARY_SUBSCR` for classes with a ``__getitem__`` method |
Python/ceval.c
Outdated
assert(cache->adaptive.original_oparg == 0); | ||
oparg = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just to clarify that the oparg is still unused (even though we use cache entries)?
Python/specialize.c
Outdated
int flags = code->co_flags; | ||
if (flags & (CO_GENERATOR | CO_COROUTINE | CO_ASYNC_GENERATOR)) { | ||
return SPEC_FAIL_GENERATOR; | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return -1; |
Python/specialize.c
Outdated
if (code->co_nfreevars) { | ||
return SPEC_FAIL_FREE_VARS; | ||
} | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like this, since 0
is a valid failure code. Perhaps return -1
on success and check for that instead (it's sort of a weird interface, but whatever).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a success code, for clarity.
In many other C programs, yes 0
is a failure.
However, in CPython, 0
is usually a success, as -1
(or any negative number in some cases) is a failure.
E.g. PyObject_RichCompareBool
return 0
for False, and -1
for an error.
compile.c
is an unfortunate counter example, where some functions return 0
for a failure and others return 0
for a success.
_Py_IDENTIFIER(__getitem__); | ||
|
||
static int | ||
function_kind(PyCodeObject *code) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function_kind(PyCodeObject *code) { | |
function_spec_fail_kind(PyCodeObject *code) { |
@@ -4914,7 +4945,7 @@ MISS_WITH_CACHE(LOAD_GLOBAL) | |||
MISS_WITH_CACHE(LOAD_METHOD) | |||
MISS_WITH_CACHE(CALL_FUNCTION) | |||
MISS_WITH_CACHE(BINARY_OP) | |||
MISS_WITH_OPARG_COUNTER(BINARY_SUBSCR) | |||
MISS_WITH_CACHE(BINARY_SUBSCR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing it's still worth keeping the logic for oparg counters, just in case we implement a new family that can uses it in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well delete it. It's in the git history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same questions as Brandt, but otherwise this LGTM.
While not in pyperformance, I'm excited for code using typing
, since that module heavily uses __getitem__
(and __class_getitem__
). Awesome!
Type hints are run-once code (at least they should be) so won't be affected. |
Agreed that it's true for the vast majority. But recently I noticed variable annotations sometimes appear in hot code paths and they might be affected. |
Specializes
BINARY_SUBSCR
for classes with a__getitem__
method implemented in Python by making the call directly in the instruction using the same mechanism asCALL_FUNCTION_PY_SIMPLE
.A respectable 1% speedup including a 20% speedup for one benchmark that makes heavy use of
__getitem__
.https://bugs.python.org/issue45829