8000 ENH: broadcast lstsq · Issue #8720 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: broadcast lstsq #8720

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nschloe opened this issue Mar 1, 2017 · 16 comments · Fixed by #9986
Open

ENH: broadcast lstsq #8720

nschloe opened this issue Mar 1, 2017 · 16 comments · Fixed by #9986
Labels
01 - Enhancement component: numpy.linalg triaged Issue/PR that was discussed in a triage meeting

Comments

@nschloe
Copy link
Contributor
nschloe commented Mar 1, 2017

Many linalg functions are already broadcastable, lstsq isn't.

A workaround is via svd which is already broadcasted:

u, s, v = numpy.linalg.svd(A, full_matrices=False)
uTb = numpy.einsum('ijk,ij->ik', u, b)
xx = numpy.einsum('ijk, ij->ik', v, uTb / s)
@eric-wieser
Copy link
Member

That workaround doesn't look right to me. A truly broadcastable solution would have ...s in those einsums

@nschloe
Copy link
Contributor Author
nschloe commented Mar 1, 2017

Depends on the ordering perhaps. Check

A = numpy.random.rand(7, 3, 2)
b = numpy.random.rand(7, 3)
for k in range(7):
    x, res, rank, sigma = numpy.linalg.lstsq(A[k], b[k])
    print(x)

print

u, s, v = numpy.linalg.svd(A, full_matrices=False)
uTb = numpy.einsum('ijk,ij->ik', u, b)
xx = numpy.einsum('ijk, ij->ik', v, uTb / s)
print(xx)

@eric-wieser
Copy link
Member
eric-wieser commented Mar 1, 2017

My point being that your code only broadcasts to a stack (1d) of matrices, not a grid (2d) or higher. The other linalg functions that broadcast do so for any dimension. I would guess that '...jk,...j->...k' is what you want

Either way, this would be much better to fix in the C code by moving lstsq into the C code. Unfortunately, lstsq is currently not implemented quite right in python, due to a broken lapack bundled with numpy #8376

@eric-wieser
Copy link
Member
eric-wieser commented Mar 7, 2017

Working on this at the moment in https://github.com/eric-wieser/numpy/tree/lstsq-gufunc, which depends on both #3861 and #8649 (edit: both now merged!)

What shape should residuals be? Right now its size depends on the values in A, not just its size

@mhvk
Copy link
Contributor
mhvk commented Mar 25, 2017

What shape should residuals be? Right now its size depends on the values in A, not just its size

Tricky. I fear the only sensible thing may be to let the shape depend on b only (which annoyingly is ordered opposite of what you'd really want for broadcasting) and then use np.nan for the rank-too-small-cases. You may be stuck with creating a new function, and turn this into a to-be-deprecated front-end...

@eric-wieser
Copy link
Member
eric-wieser commented Mar 25, 2017

I fear the only sensible thing may be to let the shape depend on b only

I think depending on the shapes of both A and B is reasonable. The problem is the distinction between the true rank, and the usual rank of a matrix of a given shape.

So the only case in which behaviour would change, is that when passed a matrix with M > N but rank < N, it would return a (K,)-shape array of nans, not a (0,)-shape array.

I can't think of any code that would be able to handle that (0,) shape array in anything but if residuals.

Edit: if residuals is deprecated as of #9718 anyway.

@eric-wieser
Copy link
Member
eric-wieser commented Nov 8, 2017

and then use np.nan for the rank-too-small-cases.

Actually, I think 0 makes more sense, and might be what happens anyway

You may be stuck with creating a new function,

Suggested name? least_sq?

@mhvk
Copy link
Contributor
mhvk commented Nov 8, 2017

Maybe just least_squares? And then you will use it inside lstsq, right?

@eric-wieser
Copy link
Member

This is now super close to achievable - the gufunc under the hood is fully vectorized as of #10890 - we just need to decide if we can change residuals (a compatibility break, maybe)

@mhvk
Copy link
Contributor
mhvk commented Apr 20, 2018

@eric-wieser - I'm actually a bit confused: if in linalg.lstsq, one just changes instances of axis indices on a and b from 0/1 to -2/-1, wouldn't the broadcasting already work exactly as the code is right now? (I'm not saying this solves everything, as I'd like to run quantities through it, but at least it will broadcast, no?)

@eric-wieser
Copy link
Member
eric-wieser commented Apr 20, 2018

The issue is these lines:

numpy/numpy/linalg/linalg.py

Lines 2049 to 2051 in 12114c7

# as documented
if rank != n or m <= n:
resids = array([], result_real_t)

There's no meaninful way to broadcast this, because rank becomes an array.

Arguably it wasn't ever meaningful for residuals.shape to be (0,)

@mhvk
Copy link
Contributor
mhvk commented Apr 20, 2018

Ah, duh, of course rank can be different for every matrix. So, I guess the options are to

  1. Just take out those lines; unlikely anybody would really check the residuals that way anyway (and arguably it was a design mistake to ever do this);
  2. Return an empty array if np.all(rank != n) or m <= n - which is backward compatible (but has little else going for it);
  3. Provide some way to select between 1 and 2.

@eric-wieser
Copy link
Member

Tagging with triage-review to make a decision about the three options proposed by @mhvk, since this was brought up again in #4438 (comment).

@mhvk
Copy link
Contributor
mhvk commented Mar 18, 2020

I'd go for option 1, maybe relatively soon, so that we can see if downstream tests break (quite unlikely, I think).

@eric-wieser
Copy link
Member

Option 1 is implemented in #15777.

@mattip
Copy link
Member
mattip commented Mar 25, 2020

@charris: could you take a look at #15777?

@mattip mattip added triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Mar 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement component: numpy.linalg triaged Issue/PR that was discussed in a triage meeting
Projects
None yet
4 participants
0