ENH: allocate working buffers outside ufunc's inner loop #11510
Comments
@mattip - In my work wrapping Although I wonder: at least in my case the reason I needed a temporary array was that the functions I'm wrapping expect input arrays that are contiguous, and that is not necessarily what one gets passed. Is it the same for you? Is there perhaps a flag to ask the iterator to provide that? p.s. I was puzzled too about what
I was thinking of something that would allocate before starting to iterate over the loops and free afterwards, but that second part is exactly what is missing. Indeed I would like to get a contiguous piece of memory for the inner loop, but
The solution used for masked data is an example of the kind of hack I would like to avoid - there is special code at the end of Maybe we need to add an
I think we could extend as long as we do it before 1.16.0. And perhaps write it such that we can also solve the masked loops (have to admit I do not yet fully understand what happens there).
@seberg did any of these ideas make it into the ufunc refactoring?
It is possible now, yes, if you write it the way that the string ufuncs are written in my PR (also good to merge ;)). It will be slightly awkward right now, since there is currently no way to get the shapes early on (could be an API addition though). So you need to EDIT: I am happy to walk through in detail how to do it with anyone who wants to look into this.
Working on matmul in #11133, and comparing to the `linalg` inner loops, I ran into a need for a working buffer much like linalg. In `umath_linalg.c.src` each iteration of the inner loop `malloc`s/`free`s the working memory. There seems to be no generic support for passing in a working buffer allocated once for the ufunc call.

The `PyUFuncGenericFunction` signature has an `innerloopdata` argument, but I could find no examples of its use in linalg. In the actual inner loops in `umath_linalg.c.src` and elsewhere it is marked as `NPY_UNUSED(func)`.

The only place I could find a use for this argument is in `unmasked_ufunc_loop_as_masked`, where it is used to hold a structure, not a function.
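
To make the request concrete, here is a minimal standalone sketch of an inner loop that receives a pre-allocated working buffer through its last argument instead of calling `malloc`/`free` on every iteration. It does not use the NumPy headers: `ptrdiff_t` stands in for `npy_intp`, and `workspace`, `sumsq_loop` and `sumsq_rows` are hypothetical names invented for illustration. The driver `sumsq_rows` only plays the role of the ufunc machinery; the allocate-once hook that it imitates is the part that is missing in NumPy, so this is an assumption about how such support could look, not existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical scratch space shared by all iterations of one ufunc call. */
typedef struct {
    double *buf;      /* contiguous working memory */
    size_t capacity;  /* number of doubles buf can hold */
} workspace;

/*
 * Same shape as PyUFuncGenericFunction, with ptrdiff_t in place of npy_intp.
 * Core signature (n)->(): dimensions[0] is the outer loop count and
 * dimensions[1] is n. steps[0] and steps[1] are the outer strides of the
 * input and output; steps[2] is the core stride of the input (all in bytes).
 */
static void sumsq_loop(char **args, const ptrdiff_t *dimensions,
                       const ptrdiff_t *steps, void *data)
{
    workspace *ws = (workspace *)data;
    ptrdiff_t outer = dimensions[0], n = dimensions[1];
    char *in = args[0], *out = args[1];

    /* The buffer was sized by the caller before the loop started. */
    assert(ws != NULL && ws->capacity >= (size_t)n);

    for (ptrdiff_t i = 0; i < outer; i++, in += steps[0], out += steps[1]) {
        /* Gather the possibly strided input into the contiguous buffer. */
        for (ptrdiff_t j = 0; j < n; j++) {
            ws->buf[j] = *(double *)(in + j * steps[2]);
        }
        /* Work on the contiguous copy; no malloc/free inside the loop. */
        double acc = 0.0;
        for (ptrdiff_t j = 0; j < n; j++) {
            acc += ws->buf[j] * ws->buf[j];
        }
        *(double *)out = acc;
    }
}

/* Stand-in for the ufunc machinery: allocate once, run the loop, free once.
 * Strides are given in elements and converted to bytes here. */
static int sumsq_rows(const double *x, ptrdiff_t rows, ptrdiff_t n,
                      ptrdiff_t row_stride, ptrdiff_t elem_stride,
                      double *out)
{
    workspace ws;
    ws.capacity = (size_t)(n > 0 ? n : 1);
    ws.buf = malloc(ws.capacity * sizeof(double));
    if (ws.buf == NULL) {
        return -1;
    }

    char *args[2] = {(char *)x, (char *)out};
    ptrdiff_t dimensions[2] = {rows, n};
    ptrdiff_t steps[3] = {
        row_stride * (ptrdiff_t)sizeof(double),   /* input, outer */
        (ptrdiff_t)sizeof(double),                /* output, outer */
        elem_stride * (ptrdiff_t)sizeof(double),  /* input, core */
    };

    sumsq_loop(args, dimensions, steps, &ws);
    free(ws.buf);
    return 0;
}
```

Calling `sumsq_rows(x, 2, 3, 6, 2, out)` on a 2x6 array reads every second element of each row, so the gather step also covers the non-contiguous input case raised in the comments. Carrying a struct rather than a function in the last argument mirrors what `unmasked_ufunc_loop_as_masked` is described as doing above.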