-
-
Notifications
You must be signed in to change notification settings - Fork 26k
ENH Adds FeatureHasher support to pypy #23023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Adds FeatureHasher support to pypy #23023
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. C++ rocks.
@@ -64,6 +68,42 @@ cdef class StdVectorSentinelIntP(StdVectorSentinel): | |||
return ITYPECODE | |||
|
|||
|
|||
cdef class StdVectorSentinelInt32(StdVectorSentinel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is starting to look like tempita material 😄
THe pypy run did not get it through :( |
Currently, PyPy already timeouts on main. I increased the timeout duration for PyPy in this PR so the test has a chance to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
A green pypy CI would be nice. After that we should revert c008f80 (don't we?) and then merge. |
Reference Issues/PRs
Fixes #11540
What does this implement/fix? Explain your changes.
This PR enables FeatureHasher to work on pypy by using c++ vectors in the transform call. Running this benchmark:
Benchmark
Here are my results:
Any other comments?
Int32 and Int64 were added to the vector sentinel to support the hasher.