-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-100554: Add Py_tp_vectorcall
slot to set PyTypeObject.tp_vectorcall
using the PyType_FromSpec
function family.
#123332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Doc/c-api/type.rst
Outdated
* The field :c:member:`~PyTypeObject.tp_vectorcall` can be set since | ||
Python 3.14. On older versions, use :c:member:`~PyTypeObject.tp_new` | ||
and/or :c:member:`~PyTypeObject.tp_init`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use .. versionchanged
, like the 3.9 & 3.11 PyBufferProcs
entries a few lines down.
Modules/_testcapi/heaptype.c
Outdated
static PyType_Slot HeapCTypeVectorcall_slots[] = { | ||
{Py_tp_vectorcall, heapctype_vectorcall}, | ||
{0, 0}, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like a good example of how not to do it, since it leaves tp_call
from the base's metaclass:
>>> import _testcapi
>>> _testcapi.HeapCTypeVectorcall.__call__(_testcapi.HeapCTypeVectorcall, 'Sub', (), {})
<class '__main__.Sub'>
As per the docs, a class supporting vectorcall must also implement tp_call
with the same semantics. Here, the overridden vectorcall function would need to match the behaviour of PyType_Type->tp_call
(including any changes in future versions).
So, you do need to override tp_new
, and things get even more tricky if you allow subclasses/instances of HeapCTypeVectorcall
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, the point of this PR is to be able to do exactly the same as what tp_call
does, just more efficiently, so this difference in behavior would not arise in real-world usage.
But that is less straightforward to test from the outside, hence the the contrived example of returning 123
, which is not even an instance of the type.
(To keep the test reasonably compact, I would prefer not to deal with metaclasses on top)
I will override tp_new and make the type non-inheritable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about leading users to a trap, and I'd prefer to see a full solution for the use case, to get an idea about where in the docs to put warnings, and what kinds of issues this opens.
Putting this in limited API means we should be extra careful about not exposing implementation details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to implement a more representative version but wonder about the following: how should the vectorcall version communicate to the test suite that it was used? If the equivalence requirement you stated is interpreted strictly, it should not be possible to detect any difference.
For example, would it be OK I return two different instances from each mode of construction and then add a comment saying that these are only different for test-related purposes, and that the contract of these operations is that they should be interchangeable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's fine for tests.
@encukou I incorporated the suggestions from your review, please let me know if this is okay now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm convinced! :)
Thank you for the patch!
I sent a PR to your branch with suggested docs changes: wjakob#1
Additions to the limited API should get a C-API WG approval; I opened capi-workgroup/decisions#41
Doc/c-api/type.rst
Outdated
:c:member:`~PyTypeObject.tp_new` and/or | ||
:c:member:`~PyTypeObject.tp_init`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, you should use tp_new and/or tp_init on all versions; in 3.14 you can optimize after you have them.
(I removed this in my PR.)
What shall I do about the (In the past, when I rebased PRs or changed history in other ways, it was mentioned to me that this breaks the review process that some authors use) |
The C API WG vote is next. To solve the conflict, merge in the main branch with a merge commit (or use the “resolve conflicts” button on GitHub). |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The vote is finally done!
The code is good to go in, let me just fix up some flaws in my suggestions for the docs.
Awesome, thank you!! |
Following the merge of python/cpython#123332, type slot 82 is reserved for type vector calls. This means that we can safely intercept this value if provided to nanobind even on older versions of Python where the feature isn't supported yet. In this case, the metaclass-based implementation is used.
This commit adds the
Py_tp_vectorcall
slot to setPyTypeObject.tp_vectorcall
using thePyType_FromSpec()
function family.For context, this field is used to intercept type instantiation (
MyType(args..)
) and implement it using the vector call protocol. It provides access to one of the last missing non-internalPyTypeObject
fields that cannot be set using thePyType_FromSpec()
API.The rationale for this change is that it can significantly improve the efficiency of C extensions built using the limited API. The runtime of a microbenchmark that repeatedly constructs a type with keyword arguments goes from 2.017 sec to 1.030 sec, a ~2x change on an extremely hot code path. All of this wasted time is spent creating temporary objects to go from one calling convention to another, and it can be entirely avoided by enabling the extension to provide a vector call-based constructor.
Having access to this flag is especially useful for binding frameworks that declare custom types written in another programming language (e.g. C++ or Rust) because a large set of downstream projects immediately reap the benefits. The original issue #100554 was created by @davidhewitt, who develops PyO3. I am the developer of nanobind and a coauthor of pybind11.
I would like to propose this feature for Python 3.14. Likely more controversial: I would argue that there is a case even for including it in 3.13, which is technically in the feature-freeze phase.
Fixes issue #100554.
tp_vectorcall
for heap types #100554📚 Documentation preview 📚: https://cpython-previews--123332.org.readthedocs.build/