-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
WIP: ENH: Expand gufunc signature to allow flexible dimension specs #11132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c2afb3c
to
b785515
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattip - this is quite wonderful! I did find a few possible problems. It would also be good to add tests actually using the cross product (I guess matmul
will do for the flexible ones!)
ufunc->core_offsets = PyArray_malloc(sizeof(int) * ufunc->nargs); | ||
if (ufunc->core_num_dims == NULL || ufunc->core_dim_ixs == NULL | ||
|| ufunc->core_offsets == NULL) { | ||
/* The next two items will be shrunk later */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next three
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing szs makes it two again, so "fixed"
numpy/core/src/umath/ufunc_object.c
Outdated
|
||
if (!_is_alpha_underscore(signature[i]) && | ||
(frozen_size = _get_size(signature + i)) < 0) { | ||
parse_error = "expect dimension name or non-zero frozen size"; | ||
goto fail; | ||
} | ||
while (j < ufunc->core_num_dim_ix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This while
loop is not necessarily correct for frozen dimensions, since a number does not have to be given exactly the same way (say, a signature of (03)->(3)
. How about the following:
for (j = 0; j < ufunc->core_num_dim_ix; j++) {
if (frozen_size >= 0 ? frozen_size == ufunc->core_dim_szs[j]) :
_is_same_name(signature+i, var_names[j])) {
break;
}
}
(Could also be a while
, but then at least include the test for (in)equality in the condition!)
Though perhaps it is better to just split the frozen and normal paths altogether (see above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by removing code
numpy/core/src/umath/ufunc_object.c
Outdated
goto fail; | ||
} | ||
|
||
if (!_is_alpha_underscore(signature[i]) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here and below it might make more sense to separate out more:
if (_is_alpha...) {
check on var_names
}
else {
frozen_size = _get_...
if (frozen_size < 0) {
parse error
}
check on same frozen size
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by removing frozen size code
@@ -2066,41 +2126,63 @@ _parse_axes_arg(PyUFuncObject *ufunc, int core_num_dims[], PyObject *axes, | |||
*/ | |||
static int | |||
_get_coredim_sizes(PyUFuncObject *ufunc, PyArrayObject **op, | |||
int *core_num_dims, int * flexible_activated, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extraneous space after *
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by removing function
numpy/core/src/umath/ufunc_object.c
Outdated
npy_intp* core_dim_sizes, int **remap_axis) { | ||
int i; | ||
int nin = ufunc->nin; | ||
int nout = ufunc->nout; | ||
int nop = nin + nout; | ||
|
||
for (i = 0; i < ufunc->core_num_dim_ix; ++i) { | ||
core_dim_sizes[i] = -1; | ||
/* support fixed-size dim names */ | ||
core_dim_sizes[i] = ufunc->core_dim_szs[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call the ufunc part core_dim_sizes
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by removing ufunc->core_dim_szs
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -2411,16 +2581,27 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc, | |||
if (i >= nin) { | |||
int dim_offset = ufunc->core_offsets[i]; | |||
int num_dims = core_num_dims[i]; | |||
int has_flexible=0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces around =
Also, can we call this num_flexible
, since that is what it ends up being?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/src/umath/ufunc_object.c
Outdated
int core_index = ufunc->core_dim_ixs[dim_offset + idim]; | ||
if (flexible_activated[core_index]) { | ||
/* skip it */ | ||
has_flexible ++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no space before ++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -4671,6 +4853,7 @@ PyUFunc_FromFuncAndDataAndSignature(PyUFuncGenericFunction *func, void **data, | |||
ufunc->core_dim_ixs = NULL; | |||
ufunc->core_offsets = NULL; | |||
ufunc->core_signature = NULL; | |||
ufunc->core_dim_szs = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also initialize core_dim_flexible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -5017,6 +5200,7 @@ ufunc_dealloc(PyUFuncObject *ufunc) | |||
{ | |||
PyArray_free(ufunc->core_num_dims); | |||
PyArray_free(ufunc->core_dim_ixs); | |||
PyArray_free(ufunc->core_dim_szs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also free ->core_dim_flexible
or we'll have a memory leak...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
numpy/core/tests/test_ufunc.py
Outdated
assert_equal(num_dims, (2, 2, 2)) | ||
assert_equal(ixs, (0, 1, 1, 2, 0, 2)) | ||
assert_equal(szs, (-1, -1, -1)) | ||
assert_equal(flex, (1, 0, 1)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests for the fixed sizes too? Including travesties like (+0003)->(3)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing fixed sizes to a different pull request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"fixed" by removing fixed sizes
I am waiting to proceed until there is consensus around the general validity of the approach. How do you feel about changing the signature and PyUFuncObject c-compatibility vs. PR #11061 with the wrapper function? |
I'd really like the fixed dimensions independently of I'm slightly less sure about the In any case, I do think I like this better than the multiple signatures idea. |
More to the point: for |
I haven't looked over this in detail, but... If we're trying to avoid a scenario where Let's not add special cases unless they also work for these cases too. |
|
|
p.s. If the |
/* | ||
* for each core_num_dim_ix, 1 for flexible (signature has ?) 0 otherwise | ||
*/ | ||
int *core_dim_flexible; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation suggestion: I think we might as well make this a set of flags, so that we can re-use them for future expansions.
In fact, one could then add not just a flag UFUNC_COREDIM_FLEXIBLE
but also a UFUNC_COREDIM_FIXED
- for that case, core_num_dim_idx
core_dim_ixs
could just be set to the size, since there is no need to track the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going ahead and try this - I guess one can see it as my contribution to the BIDS sprint!
It is not clear to me how to refactor these to use the |
b785515
to
f3d1622
Compare
removed fix-dimension gufunc signatures and fixed review comments |
@mattip, @eric-wieser - I should really have read all the comments up here, but didn't have time tonight, yet still was thinking about all this. Some specific thought:
|
@mattip - I am some way to the per-core (for each operand) flags (well, tests pass again). |
And am realizing that my description above is pretty poor - we're not really about broadcasting here (size=1), but rather about ignoring (size=0). |
@mattip - as you can see in #11175, I moved on from your earlier version which included the frozen dimensions. I think all of this is a great idea, and should work. However, the PRs are becoming quite big, and obviously with your reverting of the frozen dimensions, we've also diverged. I think it may be best to proceed in smaller steps, and took the liberty of going ahead with the first two, which seemed uncontroversial (I tried to cherry-pick, etc., so you get appropriate credit):
For flexible in particular, my main question is whether you think going the route of flags is a good idea. Implementation-wise, it doesn't matter so much, but it gives more freedom. On the implementation: #11175 ends up being quite a bit simpler (I think) than the proof of concept here. But again it may be useful to proceed in smaller steps, e.g., first doing the parser and then the consequence. |
The flag proposal looks better. Closing this in favor of PR #11175 |
OK, happy you like it. What I think I'll do is to squash our respective commits in #11175, and then rebase to get rid of the merge conflicts. |
The discussion of issue #9028 yielded two different possible directions for allowing
ndarray
subclasses to override__matmul__
via__array_ufunc__
. This is option 1 - extend the core signature of a generalizedufunc
to allow a single ufuncmatmul
which will implement vector-vector, vector-matrix, matrix-vector, and matrix-matrix multiplication. In this PR, the syntax of a signature is expanded to allow(n?, k),(k, m?)->(n?, m?)
syntax, where the?
means "this dimension is optional" [0],[1]. I also incorporated PR #5015 where signatures can have a fixed dimension(3),(3)->(3)
.What changed?
reserved1
to become aversion
field for downstream consumers of the C-API.ufunc
instantiation now parses the expanded signatureufunc
with flexible or fixed dimensions, extra checks are done to ensure consistencyWhat is yet to be done?
matmul
a trueufunc
, I probably have chages there that belong here and visa-versaWhat is next?
matmul
a trueufunc
ufunc
with a wrapper function that can be used to convert any function into aufunc
.[0]: Note that, following the PEP for matrix multiplication, there is no broadcasting over the optional dimensions, in the example given if the first operand has 2 or more dimensions, the second-to-the-last will be 'n', no
newaxis
will be inserted instead.[1] The syntax '(n,k),(k,m)->(n,m);(n,k),(k)->(n);(k),(k,m)->(m);(k),(k)->()` was also considered and rejected as allowing too much flexibility in stacking alternate core signatures.