[MRG] test against numpy dev by ogrisel · Pull Request #5609 · scipy/scipy · GitHub

[MRG] test against numpy dev #5609


Merged
merged 2 commits into from Dec 29, 2015

Conversation

ogrisel
Copy link
Contributor
@ogrisel ogrisel commented Dec 15, 2015

This is work in progress for a fix for #5379 building upon the work done for numpy/numpy#6493.

The current scipy tests fail with:

======================================================================
ERROR: test_spearmanr (test_mstats_basic.TestCompareWithStats)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.5.0/lib/python3.5/site-packages/scipy/stats/tests/test_mstats_basic.py", line 974, in test_spearmanr
    r, p = stats.spearmanr(x, y)
  File "/home/travis/virtualenv/python3.5.0/lib/python3.5/site-packages/scipy/stats/stats.py", line 3314, in spearmanr
    t = rs * np.sqrt((n-2) / ((rs+1.0)*(1.0-rs)))
RuntimeWarning: invalid value encountered in sqrt

======================================================================
ERROR: test_tie1 (test_stats.TestCorrSpearmanrTies)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.5.0/lib/python3.5/site-packages/scipy/stats/tests/test_stats.py", line 673, in test_tie1
    sr = stats.spearmanr(x, y)
  File "/home/travis/virtualenv/python3.5.0/lib/python3.5/site-packages/scipy/stats/stats.py", line 3314, in spearmanr
    t = rs * np.sqrt((n-2) / ((rs+1.0)*(1.0-rs)))
RuntimeWarning: invalid value encountered in sqrt

----------------------------------------------------------------------

It might be revealing a regression in numpy master or it might be caused by something that I don't understand.

I am wondering if we should put this numpy dev entry of the scipy build matrix in allow_failure.

I have not had the opportunity to test the upload section of this PR yet as the tests fail.

@rgommers
Copy link
Member

Those tests error because they give warnings that should be filtered out, and numpy master converts DeprecationWarning and RuntimeWarning to errors to nudge us to do that filtering.

I don't see these warnings locally, but it's easy to filter with warnings.filterwarnings; there are plenty of examples in test_stats.py that can be copied.
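A minimal sketch of that filtering pattern (the warning message is taken from the tracebacks above; the `simplefilter('error')` line only mimics numpy master's warnings-as-errors behavior, and the `np.sqrt` call is just a stand-in for the scipy code under test):

```python
import warnings
import numpy as np

with warnings.catch_warnings():
    # Mimic numpy master, which promotes warnings to errors during tests.
    warnings.simplefilter('error')
    # Ignore the specific RuntimeWarning seen in the tracebacks above.
    warnings.filterwarnings('ignore',
                            message='invalid value encountered',
                            category=RuntimeWarning)
    # Without the filter this would raise; with it, we just get a nan.
    result = np.sqrt(np.array([-1.0, 4.0]))
```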

@rgommers
Copy link
Member

allow_failure may make sense, so we don't get all PR statuses red if numpy master has a temporary issue. I can't find an example of what a failed allow_failure would look like right now though.

@rgommers rgommers added the maintenance Items related to regular maintenance tasks label Dec 16, 2015
@ogrisel ogrisel force-pushed the travis-wheel branch 3 times, most recently from e974097 to df87dd4 Compare December 17, 2015 16:27
@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 17, 2015

I can reproduce the runtime warning with numpy master. Will have to investigate.

@rgommers
Copy link
Member

It's not at all visible now that the test actually failed. I suspect that no one will now look at the test result against numpy master, and hence it doesn't help much. So I guess just failing is better - if numpy master has a regression that takes a while to fix, we can temporarily disable the test run. Does that make sense?

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 24, 2015

@rgommers ok, I removed the allow_failures configuration. I also changed the spearmanr code not to raise warnings on the seemingly valid inputs of test_stats. However, that might also silence warnings one might consider legitimate when the inputs are invalid.

@ogrisel ogrisel changed the title [WIP] test against numpy dev [MRG] test against numpy dev Dec 24, 2015
@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 28, 2015

I have amended my commit to use the shorter travis-dev-wheels.scipy.org URL instead of the long rackcdn.com URL. If the tests pass, I think this is ready for merge. The real upload mechanism can only be tested once we merge this PR to master though.

@ev-br
Copy link
Member
ev-br commented Dec 28, 2015

While it's a minor quibble, I do think it's better to filter the warnings in the tests rather than the library code (#5621 (comment) lists the exact warnings).

@larsoner
Copy link
Member

@ev-br you mean catch the warnings using e.g. warnings.catch_warnings(record=True) and then check to make sure that any caught warnings have one of the accepted messages (or some accepted message forms)?
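Something like this hypothetical sketch of the record-and-check pattern (the accepted-message list and the triggering `np.sqrt` call are illustrative, not actual test code):

```python
import warnings
import numpy as np

# Accepted message prefixes; anything else caught is a test failure.
accepted = ('invalid value encountered',)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')     # record every warning raised
    np.sqrt(np.array([-1.0]))           # triggers the RuntimeWarning

# Fail if any caught warning does not match an accepted message form.
unexpected = [w for w in caught
              if not str(w.message).startswith(accepted)]
assert not unexpected, unexpected
```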

@ev-br
Copy link
Member
ev-br commented Dec 28, 2015

@njsmith
Copy link
njsmith commented Dec 28, 2015

IIUC the specific issue here is that scipy is using numpy.testing, and numpy.testing actually has different warning handling behavior depending on whether numpy is a released version or not. This seems kinda wrong, though -- surely what we actually want is that numpy.testing should act differently depending on whether the code being tested is a released version or not. For numpy itself it doesn't make any difference because we're using numpy.testing to test numpy, but when numpy.testing is testing scipy then it does make a difference. Maybe we should fix that?

(This is of course separate from the more general issue of how we want to handle breakage caused by numpy skew, allow_failure, etc.)

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

@ev-br IMHO, scipy.stats.spearmanr should not raise any warning for valid input.

With numpy 1.10.2 the following does not raise a warning:

>>> from scipy import stats
>>> stats.spearmanr([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 2.0, 3.0])
SpearmanrResult(correlation=0.94868329805051377, pvalue=0.051316701949486225)

while it does with numpy master.

The problem with the fact that numpy.testing turns warnings into exceptions is a separate issue, but at least it revealed a problem in the scipy code base IMHO.

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

For reference here is the behavior observed in a shell (outside of the testing environment) with numpy master:

>>> import numpy as np
>>> from scipy import stats
>>> stats.spearmanr([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 2.0, 3.0])
/Users/ogrisel/code/scipy/scipy/stats/stats.py:3314: RuntimeWarning: invalid value encountered in sqrt
  t = rs * np.sqrt((n-2) / ((rs+1.0)*(1.0-rs)))
/Users/ogrisel/code/scipy/scipy/stats/_distn_infrastructure.py:1748: RuntimeWarning: invalid value encountered in greater
  cond1 = (scale > 0) & (x > self.a) & (x < self.b)
/Users/ogrisel/code/scipy/scipy/stats/_distn_infrastructure.py:1748: RuntimeWarning: invalid value encountered in less
  cond1 = (scale > 0) & (x > self.a) & (x < self.b)

/Users/ogrisel/code/scipy/scipy/stats/_distn_infrastructure.py:1749: RuntimeWarning: invalid value encountered in less_equal
  cond2 = cond0 & (x <= self.a)
SpearmanrResult(correlation=0.94868329805051388, pvalue=0.051316701949486114)

I find this really confusing for the user. The input is valid. The fact that the internals of spearmanr compute stuff with non-finite quantities on the diagonal of the correlation matrix is not the user's problem, as those quantities are not returned as part of the output of the function call.

Here are the details of an ipdb session on numpy master:

ipdb> p rs
array([[ 1.       ,  0.9486833],
       [ 0.9486833,  1.       ]])
ipdb> p 1.0 - rs
array([[  0.00000000e+00,   5.13167019e-02],
       [  5.13167019e-02,  -2.22044605e-16]])
ipdb> p (n-2) / ((rs+1.0)*(1.0-rs))
array([[             inf,   2.00000000e+01],
       [  2.00000000e+01,  -4.50359963e+15]])

The close-to-zero negative value on the diagonal looks like a usual rounding issue to me.
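The effect is easy to reproduce in isolation. A minimal sketch (standalone, not scipy code) where a correlation that should be exactly 1.0 comes out one ulp above it:

```python
import numpy as np

rs = np.nextafter(1.0, 2.0)   # 1.0 plus one ulp, i.e. 1.0000000000000002
n = 4

# (1.0 - rs) is a tiny negative number instead of exactly zero...
arg = (n - 2) / ((rs + 1.0) * (1.0 - rs))   # large negative value

# ...so the sqrt argument is negative and the result is nan, which is
# what triggers the RuntimeWarning under numpy master.
with np.errstate(invalid='ignore'):
    t = rs * np.sqrt(arg)
print(1.0 - rs, t)   # -2.220446049250313e-16 nan
```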

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

Here is an alternative fix that does not change the warning handling:

diff --git a/scipy/stats/stats.py b/scipy/stats/stats.py
index 505fe92..36b44ca 100644
--- a/scipy/stats/stats.py
+++ b/scipy/stats/stats.py
@@ -3311,7 +3311,8 @@ def spearmanr(a, b=None, axis=0, nan_policy='propagate'):

     olderr = np.seterr(divide='ignore')  # rs can have elements equal to 1
     try:
-        t = rs * np.sqrt((n-2) / ((rs+1.0)*(1.0-rs)))
+        tmp = (n - 2) / ((rs + 1.0) * (1.0 - rs))
+        t = rs * np.sqrt(tmp.clip(0))
     finally:
         np.seterr(**olderr)

I can update my PR if you think this is the right fix.
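Illustrating the fix on the values printed in the ipdb session above (a standalone sketch, not the actual spearmanr code path):

```python
import numpy as np

# rs and the sqrt argument as shown in the ipdb session above.
rs = np.array([[1.0, 0.9486833],
               [0.9486833, 1.0]])
tmp = np.array([[np.inf, 2.0e+01],
                [2.0e+01, -4.50359963e+15]])

with np.errstate(invalid='ignore'):
    t_old = rs * np.sqrt(tmp)          # nan where tmp went slightly negative

t_new = rs * np.sqrt(tmp.clip(0))      # clipped entry yields 0.0 instead

# The off-diagonal entries (the actual statistic) are unchanged.
```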

@njsmith
Copy link
njsmith commented Dec 29, 2015

Yeah, the ideal would actually be that numpy.testing acts the same regardless of whether you're using a dev version of numpy... and that tests on scipy dev versions would always convert warnings into errors, regardless of what version of numpy they were running against.

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

@njsmith I agree. The behavior of the numpy warnings could be configured in the runtests.py utility, maybe with additional commandline flags.

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

I went on and pushed the clip-based fix. Let me know what you think about it.

@ev-br
Copy link
Member
ev-br commented Dec 29, 2015

OK, the spearmanr glitch is somewhat parallel to uploading wheels. I'll merge this to keep the latter ball rolling, and we can keep discussing the former, if needed. Thanks Olivier, all.

ev-br added a commit that referenced this pull request Dec 29, 2015
@ev-br ev-br merged commit eecc0ed into scipy:master Dec 29, 2015
@ev-br ev-br added this to the 0.18.0 milestone Dec 29, 2015
@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 29, 2015

Thanks!

@njsmith
Copy link
njsmith commented Dec 30, 2015

See also numpy/numpy#6901, which changes numpy's test-runner to always default to release mode.

njsmith added a commit to njsmith/numpy that referenced this pull request Dec 30, 2015
Our test-runner's raise_warning mode traditionally has varied depending
on whether we have a development or release version of numpy: for
development versions we raise on warnings, and for release versions we
don't. This is all very sensible... *if* you're running numpy's test
suite. But our test-runner is also used by other packages like scipy,
and it doesn't make sense for scipy's raise_warning mode to vary
depending on whether *numpy* is a development or release version. (It
should vary depending on whether the scipy-under-test is a development
or release version.) So this commit moves the numpy-version-dependent
raise_warning logic out of the generic NoseTester class and into
numpy-specific code.

(See scipy/scipy#5609 for more discussion.)
njsmith added a commit to njsmith/scipy that referenced this pull request Dec 30, 2015
We want CI tests to fail when they use deprecated functionality or
otherwise raise spurious warnings, so enable raise_warnings="develop"
for development builds.

See scipy/scipy#5609 for related discussion, and numpy/numpy#6901 for a related
change in numpy.
@rgommers
Copy link
Member

Great to have this merged. I see that Python 3.5 wheels are appearing at http://travis-dev-wheels.scipy.org/. Something is not quite right there though - only the very latest wheel should be kept. Now pip will have no way of knowing which one is the most recent, right?

@njsmith
Copy link
njsmith commented Dec 30, 2015

IIUC it keeps the most recent 5, and inserts a timestamp into the version number so that pip can sort them properly.

@pv
Copy link
Member
pv commented Dec 30, 2015

Does pip sort the file names? Or is the part following + just ignored?

@rgommers
Copy link
Member

IIRC pip just takes a random one. But maybe it got smarter recently?

If there are only 5 files, they'll all be recent. But it's still possible to fetch a numpy and a scipy wheel from the wheelhouse that are incompatible - there's actually a pretty decent chance of that happening the moment numpy adds something to its API.

@njsmith
Copy link
njsmith commented Dec 30, 2015

Interesting question! I can't find anything definitive in a quick search, but https://bitbucket.org/pypa/pypi-metadata-formats/issues/37/ordering-of-local-segments suggests that at some point the relevant people at least decided that pip should sort the local version specifiers in the way that we want.

@ogrisel
Copy link
Contributor Author
ogrisel commented Dec 30, 2015

Based on my tests, pip picks the latest version using a deterministic ordering of the local version segment, which is why I made wheelhouse_uploader insert an upload timestamp as a prefix of the local segment (in the wheel filename only).

By default it keeps the last 5 most recent dev builds (per-project / python version / platform tags). So this should all be fine.
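A small sketch of why the timestamp prefix makes the ordering deterministic (the version strings below are made up for illustration, not actual wheelhouse filenames): fixed-width timestamps in the local segment sort lexicographically in chronological order.

```python
# Hypothetical dev-wheel versions with an upload timestamp prefixed
# to the local version segment (the part after the '+').
builds = [
    "0.17.0.dev0+20151228120000.abc1234",
    "0.17.0.dev0+20151230093000.def5678",
    "0.17.0.dev0+20151229180000.0123abc",
]

# Equal-length timestamp strings compare lexicographically the same
# way they compare chronologically, so the max is the newest upload.
latest = max(builds)
print(latest)   # 0.17.0.dev0+20151230093000.def5678
```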

@rgommers
Copy link
Member

okay, thanks for confirming @ogrisel

jaimefrio pushed a commit to jaimefrio/numpy that referenced this pull request Mar 22, 2016
(Commit message identical to the njsmith/numpy commit quoted above.)