-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
np.linalg.solve documentation suggests ambigous behaviour #15349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The
is illegal. |
A hint from the comments: Lines 394 to 399 in d9b1e32
This seems like potentially dangerous behavior. |
Originates from 04ad33e#diff-6858bbac5c5c59bdddf245c69d6a547cR359, cc @pv. |
Hmmm, that exact match may be a bit tricky. Another logic we have is to do it like matmul, and make it: |
Concerning documentation:
Yes I get, that this is the current behavior. But I don't think that this is obvious for new users. I looked a bit around in the documentation, and there is hardly any usage of the Ellipsis. However, it often used when indexing arrays as well as when using As the Ellipsis has a different meaning in the documentation (be the same thing actually) as in code (any additional dimension, just should be broadcastable), further explanation is necessary. But of course, it might be that I just think strangely and it's obvious to everyone else. Concerning broadcasting: |
I didn't really have time to put much thought into it, but this seems to me like it would be a good place to give the user control by a keyword argument, something like if b_is_vector:
gufunc = _umath_linalg.solve1
else:
gufunc = _umath_linalg.solve skipping the checks, whether it can be determined automatically. |
@DerWeh do you want to make a PR with this, I think it would be good to explore at least. The correct route here will be to give a DeprecationWarning, we should not change this over night. Make it |
Another ship that has already sailed, but for future reference and for those unfamiliar with gufunc behavior: if we could use the gufunc |
NumPy's solve() does not handle the ambiguity around x2 being 1-D vector vs. an n-D stack of matrices in the way that the standard specifies. Namely, x2 should be treated as a 1-D vector iff it is 1-dimensional, and a stack of matrices in all other cases. This workaround is borrowed from array-api-strict. See numpy/numpy#15349 and data-apis/array-api#285. Note that this workaround only works for NumPy. CuPy currently does not support stacked vectors for solve() (see https://github.com/cupy/cupy/blob/main/cupy/cublas.py#L43), and the workaround in cupy.array_api.linalg does not seem to actually function.
Previously the np.linalg.solve documentation stated: a : (..., M, M) array_like Coefficient matrix. b : {(..., M,), (..., M, K)}, array_like however, this is inherently ambiguous. For example, if a has shape (2, 2, 2) and b has shape (2, 2), b could be treated as a (2,) stack of (2,) column vectors, in which case the result should have shape (2, 2), or as a single 2x2 matrix, in which case, the result should have shape (2, 2, 2). NumPy resolved this ambiguity in a confusing way, which was to treat b as (..., M) whenever b.ndim == a.ndim - 1, and as (..., M, K) otherwise. A much more consistent way to handle this ambiguity is to treat b as a single vector if and only if it is 1-dimensional, i.e., use b : {(M,), (..., M, K)}, array_like This is consistent with, for instance, the matmul operator, which only uses the special 1-D vector logic if an operand is exactly 1-dimensional, and treats the operands as (stacks of) 2-D matrices otherwise. This updates np.linalg.solve() to use this behavior. This is a backwards compatibility break, as any instance where the b array has more than one dimension and exactly one fewer dimension than the a array will now use the matrix logic, potentially returning a different result with a different shape. The previous behavior can be manually emulated with something like np.solve(a, b[..., None])[..., 0] since b as a (M,) vector is equivalent to b as a (M, 1) matrix (or the user could manually import and use the internal solve1() gufunc which implements the b-as-vector logic). This change aligns the solve() function with the array API, which resolves this broadcasting ambiguity in this way. See https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.solve.html#array_api.linalg.solve and data-apis/array-api#285. Fixes numpy#15349 Fixes numpy#25583
The docstring shows of
numpy.linalg.solve
:which leads to ambiguous conclusions. Let's stick to first choice. This would mean for
a.shape = (M, M)
,b.shape = (N, M)
I should expect a result ofx.shape = (N, M)
.Of course, it cannot really work, as there is always an uncertainty, if
b
should be interpreted as(..., M)
or(..., M, K)
(after all, what should we do ifM==K
?). Nevertheless, the documentation should be more explicit, as new users will be easily confused.I personally would remove the ambiguity completely. One idea would simply be adding an argument e.g.
b_is_matrix: bool
to specify which signature(..., M)
or(..., M, K)
should apply.The other idea would be, to remove
(..., M, K)
completely, as it can be obtained trivially by passingb.T
(ornp.swapaxes(b, -1, -2)
) and using broadcasting.Reproducing code example:
Error message:
Numpy/Python version information:
1.18.1 3.6.1 (default, Nov 11 2017, 14:50:47)
[GCC 4.9.2]
The text was updated successfully, but these errors were encountered: