8000 TST: Fix various incorrect linalg tests by eric-wieser · Pull Request #8369 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

TST: Fix various incorrect linalg tests #8369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 21, 2017

Conversation

eric-wieser
Copy link
Member
@eric-wieser eric-wieser commented Dec 12, 2016

The test cases for linalg supposedly include an empty test case. However, this does not actually test the right thing, since this case is supposed to be square, but is not.

This is because atleast_2d(array([], dtype=double)) returns an array of shape (1, 0), not (0, 0)

Not only that, but when a test is marked as working for both square and non-square inputs, only the square inputs are run, because the non-square ones are shadowed by a copy-and-paste-o. As a result, the non-square inputs don't all even make sense, because they were never being run.


After fixing the tests, fixes a bunch of minor bugs exposed by the updated tests, where instead of LinAlgError being thrown, some much-more-internal error was escaping, that hid the cause of the problem.

@eric-wieser eric-wieser changed the title TST: Correct empty square test case to actually be square, add replac… TST: Correct "empty" square test case to actually be square Dec 12, 2016
@eric-wieser eric-wieser changed the title TST: Correct "empty" square test case to actually be square TST: Fix various incorrect linalg tests Dec 12, 2016
@eric-wieser
Copy link
Member Author

Should LinalgNonsquareTestCase be a subclass of LinalgTestCase?

@@ -338,7 +338,7 @@ def test_sq_cases(self):

class LinalgNonsquareTestCase(object):

def test_sq_cases(self):
Copy link
Member Author
@eric-wieser eric-wieser Dec 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This meant that none of the non-square test cases in TestLstsq were actually being run...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that they are run, they fail, because

>>> import numpy as np
>>> a = np.random.rand(8, 11)
>>> b = np.random.rand(11)
>>> np.linalg.lstsq(a, b)

doesn't make any sense

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has been fixed.

@eric-wieser eric-wieser force-pushed the fix-broken-test branch 3 times, most recently from bcd1c2a to c1e3c2e Compare December 13, 2016 02:41
raise RuntimeError(
'LAPACK did not compute workspace sizes: len(work) = {}, len(iwork) = {}'
.format(lwork, liwork)
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a known and fixed (9 Jun 2010) bug in LAPACK: http://icl.cs.utk.edu/lapack-forum/archives/lapack/msg00899.html/. The scipy.linalg code has a very similar test, that decides to just error out if this bug is present in LAPACK.

This condition is not hit on my machine (Windows, Intel MKL).

However, this fails on Appveyor. What version of LAPACK is being used there, and do we really need to support a version released before python 2.7.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that Accelerate based problems creep up everywhere, I've recently bitten by the same lapack version problem -> Scipy discussion scipy/scipy#6051

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ilayn : It turns out the culprit here is the LAPACK 3.0.0 that is bundled in np.linalg.lapack_lite, which I'm looking at in #8376

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's indeed prehistoric.

@pv
Copy link
Member
pv commented Dec 19, 2016 via email

@eric-wieser
Copy link
Member Author
eric-wieser commented Dec 19, 2016

@pv: I've updated the description. There was a little scope creep here from the commits after 72a3ca4 - I'll cut those back in the next day or two to remedy that. Unfortunately, the only debugging tool I had available to me for the problem I was seeing was the CI servers

# `liwork` for us. But that only works if our version of lapack does
# not have this bug:
# http://icl.cs.utk.edu/lapack-forum/archives/lapack/msg00899.html
# Lapack_lite does have that bug...
nlvl = max( 0, int( math.log( float(min(m, n))/2. ) ) + 1 )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to LAPACK, this should be:

SMLSIZ is returned by ILAENV and is equal to the maximum
size of the subproblems at the bottom of the computation
tree (usually about 25), and
NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 )

For whatever reason, we seem to have decided that log(2) = SMLSIZ = 1, which seems best described as "false". Either way, this is a can of worms, and not one that I think this PR should be opening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we open an issue about this? Is there some example for when this goes bad? (I really hate bugs that might silently create incorrect results, and if this is the case, we should give it some priority)

Copy link
Member Author
@eric-wieser eric-wieser Jan 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a patch in the works that uses lapack's internal mechanism to calculate this correctly. Unfortunately, there's a bug in the modified version of lapack bundled with numpy that makes this work. We'd need to regenerate the lapack c code from a newer-but-not-so-new-to-break-f2c release of lapack (#8376). The next step of fixing this is to actually get the generator running again ( #8381 ).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I have no idea of these things, was just worried this might be forgotten :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess it would do no harm to open an issue. My working tree was branching way too much for a bunch of problems discovered while trying to fix the problems, and I think the best call is to sit tight and wait for PR merge/rejection, rather than increasing the amount of rebases I need to do each time!

atleast_2d(array([], dtype=double)) returns an array of shape (1, 0), not (0, 0)
SVD was previously being too generous with accuracy on doubles. This allows
pinv, the test for which was previously was too imprecise, to pass.
Allows each individual function to inspect the flags of a certain test, and
decide whether an exception will be thrown
These testcases were never used in the first place, due to a typo. This makes
their dimensions match the order of the other test cases, even though those
also did not run
Throw LinAlgError instead with an appropriate message
…m math.log

We could make the log conditional on its argument being non-zero, but that entire expression is wrong anyway.
We could omit that calculation entirely and have LAPACK calculate it for us, but the routine in LAPACK is wrong anyway
We could upgrade the version of lapack shipped in lapack_lite, but the tool to do that is wrong anyway.

Let's leave that can of worms for another time, and just improve the error message for now.
Throw LinAlgError instead with an appropriate message
@eric-wieser
Copy link
Member Author
eric-wieser commented Dec 19, 2016

@pv: Squashed, trimmed-in-scope, and passing. Let me know if anything is still unclear

@@ -61,30 +61,44 @@ def get_rtol(dtype):

class LinalgCase(object):

def __init__(self, name, a, b, exception_cls=None):
def __init__(self, name, a, b, flags=frozenset()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class should be documented, at least for the constructor. This is especially so now with the rather generic flags.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not a simple list instead of a frozenset? I don't see that an immutable set really matters here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on documentation, will do that at some point.

sets make the filtering easier than lists, and frozenset prevents individual tests from modifying the list of tests accidentally.

I'm not saying these are particularly good arguments, but they're what I came up with at the time

def test_nonsq_cases(self):
_check_cases(self.do,
require=CaseFlags.none,
exclude=CaseFlags.generalized | CaseFlags.square | CaseFlags.hermitian | CaseFlags.empty)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. I think these would all be clearer as a list of strings and take up less room to boot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reason to avoid a list of strings is that making a typo in a string would lead to a test simply not running, rather than an AttributeError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My counter argument would be that it isn't important ;) If you were writing fly-by-wire software that had to be 100% correct at all times, then using an uncommon python feature would be justified. In fact, seeing such usage makes me stop and think and track down why it is important. But here I don't think it matters and I've burned up a couple of precious grey cells to no purpose. The simplest approach that works is what I would chose in this case because it makes the code that much more accessible, and likely more error free as it is easier to check.

def do(self, a, b):
def do(self, a, b, flags):
if flags & CaseFlags.empty:
assert_raises(LinAlgError, linalg.eigvalsh, a, 'L')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So checking for an error is the default (no flags). That is somewhat unintuitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Darn, It turns out empty is a confusing name. Errors only occur when an array has size=0.

@charris
Copy link
Member
charris commented Jan 23, 2017

I think the tests mods are overly complicated, which makes them more difficult to follow, which makes them harder to maintain.

Note that the commits are not squashed.

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 9, 2017

@charris I've tried to simplify things now. See the fixup commit above for the diff (I figured it'd be easier to review before applying it). Sorry for the delay on this one



#
# Gufunc test cases
#
def _make_generalized_cases():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really an advantage in having a function with no arguments that operates on a global and is called once? I think the original is cleaner and easier to follow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding global namespace pollution - stops anything accidentally using these local variables in a test. Also handy for code folding, sort of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

GENERALIZED_NONSQUARE_CASES,
GENERALIZED_HERMITIAN_CASES):

def _make_strided_cases():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. I don't see the use in making this a function.

"""
for case in CASES:
# filter by require and exclude
if case.tags & require != require:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is a bit clumsy for a lookup, although I suspect it would require a redesign to improve it. Probably not worth the effort.

@charris
Copy link
Member
charris commented Feb 21, 2017

Couple of style comments. I note this hasn't been squashed, oversight or do you want help?

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 21, 2017

@charris: I partially squashed it, but out of preference of trying to keep logically separate changes, not into one commit. I could probably go a little further

I don't need help performing the squash, but would perha 1241 ps like some advice on what level to squash things to for numpy

@charris
Copy link
Member
charris commented Feb 21, 2017

I'd probably have squashed all the TST commits, then added the BUG fixes on top of that. However, as long as none of the commits errors out it is not a big problem. However, not all of the commit messages are up to snuff. For instance

TST: Correct test cases to actually make sense

These testcases were never used in the first place, due to a typo. This makes
their dimensions match the order of the other test cases, even though those
also did not run

Doesn't impart much information because it requires context, it is basically "fix some stuff", whatever "stuff" is. A virtuoso -- Al Viro of Linux fame -- would probably redo the series of commits into a logical rather than temporal series, but we aren't that demanding...

Anyway, I'm tempted to do a squash commit on this.

@charris charris merged commit 6b9a02a into numpy:master Feb 21, 2017
@charris
Copy link
Member
charris commented Feb 21, 2017

In it goes, thanks Eric.

@eric-wieser
Copy link
Member Author

Glad I didn't start rebasing those logically then!

@eric-wieser
Copy link
Member Author

Thanks, I'll rebase #8649 onto master next time I work on it (and after #8651), so it gets the more thorough testing

@eric-wieser
Copy link
Member Author

Oops, looks like we left a !fixup in there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0