8000 np.linalg.solve documentation suggests ambigous behaviour · Issue #15349 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content
8000

np.linalg.solve documentation suggests ambigous behaviour #15349

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
DerWeh opened this issue Jan 19, 2020 · 8 comments · Fixed by #25914
Closed

np.linalg.solve documentation suggests ambigous behaviour #15349

DerWeh opened this issue Jan 19, 2020 · 8 comments · Fixed by #25914
Labels
01 - Enhancement 07 - Deprecation 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. component: numpy.linalg

Comments

@DerWeh
Copy link
Contributor
DerWeh commented Jan 19, 2020

The docstring shows of numpy.linalg.solve:

a : (…, M, M) array_like
b : {(…, M,), (…, M, K)}, array_like

which leads to ambiguous conclusions. Let's stick to first choice. This would mean for a.shape = (M, M), b.shape = (N, M) I should expect a result of x.shape = (N, M).

Of course, it cannot really work, as there is always an uncertainty, if b should be interpreted as (..., M) or (..., M, K) (after all, what should we do if M==K?). Nevertheless, the documentation should be more explicit, as new users will be easily confused.

I personally would remove the ambiguity completely. One idea would simply be adding an argument e.g. b_is_matrix: bool to specify which signature (..., M) or (..., M, K) should apply.
The other idea would be, to remove (..., M, K) completely, as it can be obtained trivially by passing b.T (or np.swapaxes(b, -1, -2)) and using broadcasting.

Reproducing code example:

import numpy as np

a = np.ones((1, 1))
b = b = np.ones((2, 1))
np.linalg.solve(a, b)

Error message:

ValueError: solve: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 2 is different from 1)

Numpy/Python version information:

1.18.1 3.6.1 (default, Nov 11 2017, 14:50:47)
[GCC 4.9.2]

@ZisIsNotZis
Copy link

The ... in a and b should be the same thing actually. Therefore,

a.shape = (M, M), b.shape = (N, M)

is illegal. a already indicates that ...=empty, therefore b have to start with M and might have an extra axis after this.

@eric-wieser
Copy link
Member
eric-wieser commented Jan 20, 2020

A hint from the comments:

# We use the b = (..., M,) logic, only if the number of extra dimensions
# match exactly
if b.ndim == a.ndim - 1:
gufunc = _umath_linalg.solve1
else:
gufunc = _umath_linalg.solve
.

This seems like potentially dangerous behavior.

@eric-wieser
Copy link
Member

Originates from 04ad33e#diff-6858bbac5c5c59bdddf245c69d6a547cR359, cc @pv.

@seberg
Copy link
Member
seberg commented Jan 27, 2020

Hmmm, that exact match may be a bit tricky. Another logic we have is to do it like matmul, and make it: {(M,), (..., M, K)}, array_like. I think that would be a subset of the current behaviour, so it could work. To be friendly to current users, we could add a keyword in principle I guess.

@DerWeh
Copy link
Contributor Author
DerWeh commented Jan 28, 2020
8000

Concerning documentation:

The ... in a and b should be the same thing actually. Therefore,

a.shape = (M, M), b.shape = (N, M)

is illegal. a already indicates that ...=empty, therefore b have to start with M and might have an extra axis after this.

Yes I get, that this is the current behavior. But I don't think that this is obvious for new users. I looked a bit around in the documentation, and there is hardly any usage of the Ellipsis.

However, it often used when indexing arrays as well as when using np.einsum.
My first naive line of thought when I read the docstring was, that it should behave like in np.einsum.
However, in np.einsum no exact match is needed.

As the Ellipsis has a different meaning in the documentation (be the same thing actually) as in code (any additional dimension, just should be broadcastable), further explanation is necessary.

But of course, it might be that I just think strangely and it's obvious to everyone else.

Concerning broadcasting:
@seberg this also sounds like a reasonable option. I don't have a strong opinion here, I just feel like the current version results for me often in ugly code, as I want to have the inputs broadcasted.

@DerWeh
Copy link
Contributor Author
DerWeh commented Jan 28, 2020

A hint from the comments:

# We use the b = (..., M,) logic, only if the number of extra dimensions
# match exactly
if b.ndim == a.ndim - 1:
gufunc = _umath_linalg.solve1
else:
gufunc = _umath_linalg.solve

.

This seems like potentially dangerous behavior.

I didn't really have time to put much thought into it, but this seems to me like it would be a good place to give the user control by a keyword argument, something like b_is_matrix or b_is_vector defaulting to None which would be the current behavior of guessing. If set explicitly by the user, we just do

 if  b_is_vector: 
     gufunc = _umath_linalg.solve1 
 else: 
     gufunc = _umath_linalg.solve

skipping the checks, whether it can be determined automatically.

@seberg
Copy link
Member
seberg commented Jan 28, 2020

@DerWeh do you want to make a PR with this, I think it would be good to explore at least. The correct route here will be to give a DeprecationWarning, we should not change this over night. Make it VisibleDeprecationWarning if you think new users are probably writing a bug (but I doubt that is true here).

@WarrenWeckesser
Copy link
Member

Another ship that has already sailed, but for future reference and for those unfamiliar with gufunc behavior: if we could use the gufunc solve1 (from numpy.linalg._umath_linalg) with signature (m,m),(m)->(m) as the public solve function, it would handle broadcasting of stacked matrices and stacked b vectors automatically. The only quirk would be that, by default, the "stacked" b vectors would have to be given with shape (K, M). That is, the default behavior would require the b vectors be in the rows of the input array. If a user's input is more naturally generated with shape (M, K), they could just transpose the input, or use the axes parameter of the gufunc. This would amount to @DerWeh's suggestion to "remove (..., M, K) completely, as it can be obtained trivially by passing b.T (or np.swapaxes(b, -1, -2)) and using broadcasting", and would expose the other features of the gufunc API as a bonus.

asmeurer added a commit to asmeurer/array-api-compat that referenced this issue Feb 22, 2024
NumPy's solve() does not handle the ambiguity around x2 being 1-D vector vs.
an n-D stack of matrices in the way that the standard specifies. Namely, x2
should be treated as a 1-D vector iff it is 1-dimensional, and a stack of
matrices in all other cases. This workaround is borrowed from
array-api-strict.

See numpy/numpy#15349 and
data-apis/array-api#285.

Note that this workaround only works for NumPy. CuPy currently does not
support stacked vectors for solve() (see
https://github.com/cupy/cupy/blob/main/cupy/cublas.py#L43), and the workaround
in cupy.array_api.linalg does not seem to actually function.
asmeurer added a commit to asmeurer/numpy that referenced this issue Mar 1, 2024
Previously the np.linalg.solve documentation stated:

a : (..., M, M) array_like
    Coefficient matrix.
b : {(..., M,), (..., M, K)}, array_like

however, this is inherently ambiguous. For example, if a has shape (2, 2, 2)
and b has shape (2, 2), b could be treated as a (2,) stack of (2,) column
vectors, in which case the result should have shape (2, 2), or as a single 2x2
matrix, in which case, the result should have shape (2, 2, 2).

NumPy resolved this ambiguity in a confusing way, which was to treat b as
(..., M) whenever b.ndim == a.ndim - 1, and as (..., M, K) otherwise.

A much more consistent way to handle this ambiguity is to treat b as a single
vector if and only if it is 1-dimensional, i.e., use

b : {(M,), (..., M, K)}, array_like

This is consistent with, for instance, the matmul operator, which only uses
the special 1-D vector logic if an operand is exactly 1-dimensional, and
treats the operands as (stacks of) 2-D matrices otherwise.

This updates np.linalg.solve() to use this behavior.

This is a backwards compatibility break, as any instance where the b array has
more than one dimension and exactly one fewer dimension than the a array will
now use the matrix logic, potentially returning a different result with a
different shape. The previous behavior can be manually emulated with something
like

np.solve(a, b[..., None])[..., 0]

since b as a (M,) vector is equivalent to b as a (M, 1) matrix (or the user
could manually import and use the internal solve1() gufunc which implements
the b-as-vector logic).

This change aligns the solve() function with the array API, which resolves
this broadcasting ambiguity in this way. See
https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.solve.html#array_api.linalg.solve
and data-apis/array-api#285.

Fixes numpy#15349
Fixes numpy#25583
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement 07 - Deprecation 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. component: numpy.linalg
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
0