8000 ENH: handle empty matrices in qr decomposition by convexset · Pull Request #11593 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: handle empty matrices in qr decomposition #11593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jul 31, 2018

Conversation

convexset
Copy link
Contributor

handle empty matrices in qr decomposition; added tests

if _isEmpty2d(a):
k = min(m, n)
if mode == 'reduced':
# ‘reduced’ : returns q, r with dimensions (M, K), (K, N) (default)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The forward quotes are unicode, causing a test failure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... actually intended to remove those lines; leaving them there and fixing the quotes.

@mattip
Copy link
Member
mattip commented Jul 20, 2018

Needs at least a passing mention in the docs, and in the release notes. It would be cool to add the gufunc signatures for all the linalg routines to the docstrings too, but that is probably a different issue.

@mattip mattip added 01 - Enhancement component: numpy.linalg 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes labels Jul 20, 2018
@convexset
Copy link
Contributor Author

My sense would be that supporting empty matrices should be the expectation, and that non-support should be what is flagged. It seems that this is consistent with dot, tensordot and other operations.

@convexset
Copy link
Contributor Author

But that said, what should be documented and where?

@mattip
Copy link
Member
mattip commented Jul 20, 2018

doc/release/1.16.0-notes.rst under Changes or Improvements.

@convexset
Copy link
Contributor Author

Done.

Previously, a ``LinAlgError`` would be raised when empty matrix ("flat"
or "skinny") is passed in. This has been fixed so that outputs of
appropriate shapes are returned for the various modes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this and the similar note about lstsq for your related work in #11594 could be combined since they are both about handling empty matrices for linalg functions, but maybe that's a pain to coordinate between 2 separate PRs & could be done after I suppose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separated them because for lstsq, I was working in a strange way (site-packages). For qr it was a more straightforward "change and run tests".

@tylerjereddy tylerjereddy changed the title handle empty matrices in qr decomposition ENH: handle empty matrices in qr decomposition Jul 20, 2018
Copy link
Member
@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't allocate the output arrays in two places - the arrays allocated in the normal case are already the right size and shape. It should be sufficient to add q[...] = eye, to correct for anything lapack fails to do.

@convexset
Copy link
Contributor Author

What do you mean in two places?

@eric-wieser
Copy link
Member

Once in the size == 0 path, and once in the regular path. The only place that 0 needs to be a special case is when actually filling the arrays - the existing code already makes all the arrays the right shape, it just sometimes forgets to fill them, leaving them uninitialized

@convexset
Copy link
Contributor Author

Looking further into it, I also note that the raw and economic modes don't match up with the documentation.

    if mode == 'raw':
        return a, tau

    if mode == 'economic':
        if t != result_t :
            a = a.astype(result_t, copy=False)
        return wrap(a.T)

But I'm not inclined to change that (even though the tests would pass) because it might break something and benefit nothing (given deprecation).

…case of empty arrays (they do not play well with empty arrays)
@codecov-io
Copy link
codecov-io commented Jul 29, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@977431a). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #11593   +/-   ##
=========================================
  Coverage          ?    85.7%           
=========================================
  Files             ?      327           
  Lines             ?    82002           
  Branches          ?        0           
=========================================
  Hits              ?    70281           
  Misses            ?    11721           
  Partials          ?        0
Impacted Files Coverage Δ
numpy/linalg/linalg.py 91.05% <100%> (ø)
numpy/linalg/tests/test_linalg.py 97.11% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 977431a...ff9063c. Read the comment docs.

@eric-wieser
Copy link
Member

Patch looks pretty good now, thanks - I do wonder if we can just call the lapack functions though - often the only issue is that we didn't read the docs carefully on how to pass size-0 parameters

@eric-wieser eric-wieser dismissed their stale review July 29, 2018 16:39

New approach taken looks good

@convexset
Copy link
Contributor Author

Let me take another swing at that. I just happened to be looking at the LAPACK documentation earlier today and I'm going have another look. I don't like "not calling anything".

if results['info'] != 0:
raise LinAlgError('%s returns %d' % (routine_name, results['info']))

# compute q
lwork = int(abs(work[0]))
lwork = max(1, int(abs(work[0])))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be surprised if this is necessary for the work spaces, since lapack should be populating them with whatever it really needs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not. But my problem was having to pass in lwork=0 which was invalid (it had to be at least... oh shit I may have made a mistake...)

http://www.netlib.org/lapack/explore-3.1.1-html/dgeqrf.f.html
http://www.netlib.org/lapack/explore-3.1.1-html/zungqr.f.html

Tests pass but I'm going to go fix it...

@@ -875,14 +875,14 @@ def qr(a, mode='reduced'):
# calculate optimal size of work data 'work'
lwork = 1
work = zeros((lwork,), t)
results = lapack_routine(m, n, a, m, tau, work, -1, 0)
results = lapack_routine(m, n, a, max(1, m), tau, work, -1, 0)
Copy link
Member
@eric-wieser eric-wieser Jul 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the fix I was expecting :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... yes... and (the late) Gene Golub is not going to tell me to give back my e-mail account. (I got my gmail invite from him.)

Copy link
Member
@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you'll still need a case to populate the identity matrix when the input is empty but the output is not

assert_almost_equal(np.dot(q, r), a)
assert_almost_equal(np.dot(q.T.conj(), q), np.eye(k))
assert_almost_equal(np.triu(r), r)
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for empty comments - just leave a blank line

@@ -1583,8 +1583,45 @@ def check_qr(self, a):
assert_almost_equal(r2, r1)

def test_qr_empty(self):
a = np.zeros((0, 2))
assert_raises(linalg.LinAlgError, linalg.qr, a)
for m, n in [(3, 0), (0, 3)]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add (0, 0) here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

assert_(r.shape == (k, n))
assert_almost_equal(np.dot(q, r), a)
assert_almost_equal(np.dot(q.T.conj(), q), np.eye(k))
assert_almost_equal(np.triu(r), r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these three tests are interesting - in these tests k == 0, a.size == 0, and r.size == 0 - so there are no values to compare anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll drop some of the ones that are implied

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually - just call self.check_qr(a), and it will do all the work for you

@eric-wieser
Copy link
Member

See my comment above - almost all of your test can be replaced with a call to self.check_qr(a)


self.check_qr(a)

r = np.linalg.qr(a, mode='r')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already covered by check_qr?

@convexset
Copy link
Contributor Author

Great. Non-ugly patches are good.

@eric-wieser eric-wieser removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jul 31, 2018
@eric-wieser
Copy link
Member

@convexset: I think you might have missed my one lingering comment above - one of your tests seems to still duplicate check_qr - or am I missing something?

@eric-wieser eric-wieser added this to the 1.16.0 release milestone Jul 31, 2018
@convexset
Copy link
Contributor Author

@eric-wieser: it's not exactly duplicating check_qr... but it is implied...

@eric-wieser
Copy link
Member

Can you elaborate on in what way it's not duplicated? Does your test test more than what check_qr tests? If so, can you move your extra checks to within check_qr?

r = np.linalg.qr(a, mode='r')
assert_equal(r.dtype, a_dtype)
assert_(isinstance(r, a_type))
assert_equal(r.shape, (k, n))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shape test should probably be in check_qr anyway

Copy link
Contributor Author
@convexset convexset Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's there implicitly.

        # mode == 'reduced'
        q1, r1 = linalg.qr(a, mode='reduced')
        # ...
        assert_(r1.shape == (k, n))
        # ...

        # mode == 'r'
        r2 = linalg.qr(a, mode='r')
        # ...
        assert_almost_equal(r2, r1)

So I took that bit out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I think the explicit shape check you had here was a little clearer, but it's not super important.

assert_(isinstance(r, a_type))
assert_equal(r.shape, (k, n))

h, tau = np.linalg.qr(a, mode='raw')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can stay - for whatever reason, it seems check_qr decided not to test it

(4, 0, 1),
(4, 0, 2),
(4, 2, 0),
(0, 0, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theses are triplets, but you only have two variables. I assume you just want [(4, 0), (0, 4), (0, 0)]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh shit...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@eric-wieser
Copy link
Member

This comment still applies

Copy link
Member
@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commits need squashing, but I can do that when I merge. Let's hope tests pass!

@eric-wieser eric-wieser merged commit 8fdc446 into numpy:master Jul 31, 2018
@eric-wieser
Copy link
Member

Thanks for your persistence and your first contribution, @convexset !

@convexset convexset deleted the handle-edge-cases branch August 1, 2018 05:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0