ENH: enable OpenBLAS on windows. · Pull Request #9645 · numpy/numpy · GitHub

ENH: enable OpenBLAS on windows. #9645

Merged
merged 13 commits into from Sep 30, 2017

@ghost ghost commented Sep 2, 2017

Previously, #9431 allowed distutils to make use of gfortran to create fortran wrappers that worked with MSVC. For a quick review:

  • Distutils detects objects that cannot be linked with an MSVC lib.
  • Distutils compiles unlinkable objects into a DLL
  • Distutils links Python extension against the DLL
  • Distutils writes __config__ that includes the DLLs in the PATH.

Here, we use distutils to link against the "unlinkable object" openblas.a. Distutils compiles OpenBLAS with gfortran as a DLL and then links against it. Using OpenBLAS addresses the performance restrictions on Windows.

Because gfortran enables BLAS and fortran functionality, some new tests are run that were previously skipped. Some of these newly run tests fail. Such tests are marked as "known failures" on windows because they were previously untested on the CI, thus they are not regressions.

@ghost
Author
ghost commented Sep 2, 2017

@charris Is pytest supported?


os: Visual Studio 2015
Member

We still want to test MSVC as well, it needs to stay working. So instead of removing it all, I suggest to use a build matrix with a mix of the two build configs (without increasing the total number of builds).

Author

Which MSVC are you concerned about? This configuration should test 2008, 2010, and 2015. The "os" has nothing to do with that.

Contributor

Ralf - I'm guessing you mean a lapack-lite build, without OpenBLAS?

Member

No, I didn't mean lapack-lite, that's okay to only test on Linux I think. The install and build_script part here explicitly add MINGW to the path and in the whole file there are only references to Cygwin and MinGW. So if MSVC is still being used, it's very much non-obvious (plus it's unclear what the purpose of the MinGW stuff is).

So either something is wrong (like MSVC being dropped) or this needs a lot of comments explaining what is going on.

Author

MinGW is used to compile the OpenBLAS library. MSVC is used to compile everything else.

Member
@pv pv Sep 9, 2017

IIUC, all the compilers are installed in the appveyor VM regardless of this setting, and Python distutils generally finds MSVC based on registry entries rather than PATH.

If present, MSVC is used by default over mingw C compilers, even if the latter is on PATH.

Contributor

That's my understanding too.

Member

@pv @matthew-brett not sure if you're arguing against including more documentation or not. My comment is not particular to the line os: Visual Studio 2015. There's a lot of opaque stuff in this PR that simply needs explaining.

Member

No, I'm just explaining why it works.

Contributor

Yes, more documentation would be good.

I hate to be annoying, but I have previously done the same work in the windows-wheel-builder repo - here's a PR : numpy/windows-wheel-builder#3 in case it's helpful.

@ghost
Author
ghost commented Sep 2, 2017

So we have File "C:\Python36\lib\site-packages\numpy\f2py\tests\util.py", line 149, in build_module responding "DLL load failed," probably due to skipping the configuration.

@ghost
Author
ghost commented Sep 2, 2017

Would it be okay to xfail the three failing f2py tests since there is no way that they could have been run before?

@charris
Member
charris commented Sep 3, 2017

@xoviat We don't yet support pytest.

@charris charris changed the title appveyor: enable OpenBLAS on windows. ENH: enable OpenBLAS on windows. Sep 3, 2017
@ghost
Author
ghost commented Sep 3, 2017

@charris Let me clarify the question. There are additional tests run on this pull request because numpy.distutils is more capable now after my previous gfortran PR with gfortran on the PATH.

Previously these tests were not run:

numpy.f2py.tests.test_block_docstring.TestBlockDocString.test_block_docstring ... SKIP: No C compiler available
numpy.f2py.tests.test_callback.TestF77Callback.test_string_callback ... SKIP: No C compiler available
numpy.f2py.tests.test_common.TestCommonBlock.test_common_block ... SKIP: No C compiler available

Now they are run, but they fail, hence the red cross. So would it be acceptable to simply xfail these for now on Windows?

@charris
Member
charris commented Sep 4, 2017

You can skip tests by adding dec to the imports from numpy.testing and use @dec.skipif(...). Whether you should skip the tests is the question. Is something missing that needs to be added?
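
As a rough illustration of the conditional skip described above, the sketch below uses the standard library's `unittest.skipIf` as a stand-in for `numpy.testing.dec.skipif`, so that it runs without NumPy. The class name, the `ON_WINDOWS` flag, and the `run_sketch` helper are hypothetical and not taken from the PR.

```python
import sys
import unittest

# Hypothetical flag: True when the tests run on Windows.
ON_WINDOWS = sys.platform == "win32"


class TestCommonBlockSketch(unittest.TestCase):
    # unittest.skipIf stands in for numpy.testing.dec.skipif here.
    @unittest.skipIf(ON_WINDOWS, "Fails with MinGW64 Gfortran")
    def test_common_block(self):
        self.assertEqual(1 + 1, 2)


def run_sketch():
    """Run the sketch suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestCommonBlockSketch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test is reported as skipped rather than failed, which hides the problem; whether that is appropriate is exactly the question raised in the comment.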

@ghost
Author
ghost commented Sep 4, 2017

The error that I receive is:

ImportError: DLL load failed: The specified module could not be found.

The tests apparently invoke the fortran compiler which generates a DLL that must be on the path for the generated module to work correctly. Normally this isn't a problem because the __config__ module automatically extends the path, but I suspect that the __config__ module generated here is thrown away.
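
To make the path extension concrete, here is a minimal sketch of what such a step might look like, assuming the generated DLLs live in a single directory. The function name and its parameters are illustrative only; this is not the code that `numpy.distutils` actually writes into `__config__`.

```python
import os
import sys


def extend_dll_search_path(dll_dir, environ=None, platform=None):
    """Append dll_dir to PATH on Windows so dependent DLLs can be found.

    Returns True if PATH was modified. All names here are illustrative.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    # Only Windows resolves DLL dependencies through PATH.
    if platform != "win32" or not os.path.isdir(dll_dir):
        return False
    current = environ.get("PATH", "")
    environ["PATH"] = current + os.pathsep + dll_dir if current else dll_dir
    return True
```

If a test builds a module in a temporary location and this step never runs for that location, importing the module would plausibly fail with the "DLL load failed" error shown above.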

And then you have a couple Python 2.7 failures, because well, it's Python 2.7.

@ghost
Author
ghost commented Sep 4, 2017

Hrm...the tests that I marked should be skipped are still run.

@pv
Member
pv commented Sep 4, 2017 via email

@pv
Member
pv commented Sep 4, 2017 via email

1. fail tests related to DLL load failure as they were previously untested.
2. fix have_compiler to return false on old compilers
3. xfail some tests that were not working on old Python versions.
@ghost
Author
ghost commented Sep 5, 2017

I've updated the description; the PR should be ready for review.

Member
@pv pv left a comment

The overall build config is the same as for Scipy, so I think everything is fine from implementation POV.

There are some outdated comments, and the non-f2py xfails could use extra comment.

appveyor.yml Outdated

# Install the BLAS library
# - install "openblas.lib" to PYTHON\lib
# - install OpenBLAS.dll to MINGW\bin
Member

These comments are outdated (only openblas.a is installed)

PYTHON_ARCH: 64
TEST_MODE: full

- PYTHON: C:\Python27
Member

I wonder if the tag builds are actually needed for Numpy. The production wheels are built by a different CI config, this one is used only for integration testing.

Author

I don't think they're needed but they could be used to compare differences between the released wheels and the wheels used for testing. A major advantage of this appveyor configuration is that you can actually download the built wheels and inspect them yourself.

@@ -3605,6 +3607,8 @@ def test_varstd(self):
assert_almost_equal(np.sqrt(mXvar0[k]),
mX[:, k].compressed().std())

@dec.knownfailureif(sys.platform=='win32' and sys.version_info < (3, 6),
msg='Fails on Python < 3.6')
Member

Do we actually understand why this fails only on Windows?
If yes, best to add that to the msg= so we don't need to wonder about it.

Author

No, but this appears to have not been tested on the CI before. It's probably a new discovery.

Author

So I opened a new issue for this and updated the description with the issue number.

Member
@pv pv Sep 9, 2017

Thanks. I now remember these also appeared on Scipy, it's probably some issue in old MSVC CRTs.

@@ -263,6 +263,8 @@ def test_complex(self):
self._assert_func(x, x)
self._test_not_equal(x, y)

@dec.knownfailureif(sys.version_info < (3, 0),
msg="Error message different on Python 2")
Member
@pv pv Sep 9, 2017

Why wasn't this issue noticed before?

Member

Or maybe it's only on win32 (in which case there should be a platform check in the knownfailureif).
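
A platform check of the kind suggested here could be combined with the version check in one condition. The sketch below uses a hypothetical `known_failure_if` helper built on the standard library's `unittest.expectedFailure`, standing in for `numpy.testing.dec.knownfailureif` so that it runs without NumPy; the test class is also invented for illustration.

```python
import sys
import unittest


def known_failure_if(condition, msg):
    """Mark a test as an expected failure only when condition is true.

    Hypothetical stand-in for numpy.testing.dec.knownfailureif.
    """
    def decorator(func):
        if not condition:
            return func
        wrapped = unittest.expectedFailure(func)
        # Keep the reason next to the test so nobody has to guess later.
        wrapped.known_failure_reason = msg
        return wrapped
    return decorator


# Restrict the known failure to Python 2 on Windows only.
PY2_ON_WINDOWS = sys.platform == "win32" and sys.version_info < (3, 0)


class TestMessageSketch(unittest.TestCase):
    @known_failure_if(PY2_ON_WINDOWS, "Error message different on Python 2 (win32)")
    def test_message(self):
        self.assertEqual("shapes (2,)", "shapes (2,)")
```

Narrowing the condition this way means the test still runs normally, and can catch regressions, on every platform where it is not known to fail.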

Member
@pv pv Sep 9, 2017

From log: https://ci.appveyor.com/project/charris/numpy/build/1.0.6743/job/b9ro4kpqo4h29ljf

AssertionError: '\nArrays are not equal\n\n(shapes (2L,), (1L, 2L) mismatch)\n x: array([1, 2])\n y: [repr failed for <matrix>: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()]' != '\nArrays are not equal\n\n(shapes (2,), (1, 2) mismatch)\n x: array([1, 2])\n y: [repr failed for <matrix>: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()]'

Probably would be simple to fix to accept also a second variant
msg2 = msg.replace("shapes (2,), (1, 2)", "shapes (2L,), (1L, 2L)")

Author

Good catch.

Author

@pv To clarify, should we replace the message only on Python2 win32?

Member

I'd maybe accept either one, assert_(err == msg or err == msg2, err).
iirc the Ls are due to the integer sizes on win32-32 so they could also appear on other plats.
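
The "accept either one" idea can be sketched as follows. The helper names are hypothetical, and the plain `in` check stands in for `assert_(err == msg or err == msg2, err)`; only the two shape strings come from the log above.

```python
def message_variants(expected):
    """Return the expected message plus a variant with Python 2 long reprs."""
    long_variant = expected.replace(
        "shapes (2,), (1, 2)", "shapes (2L,), (1L, 2L)"
    )
    return (expected, long_variant)


def matches_expected(actual, expected):
    """True if actual equals the expected message or its long-integer variant."""
    return actual in message_variants(expected)
```

Because the check does not depend on the platform, it would also cover any other platform where the `L` suffixes happen to appear.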

@@ -68,7 +68,7 @@ def have_compiler():
         try:
             if not compiler.initialized:
                 compiler.initialize()  # MSVC is different
-        except DistutilsError:
+        except (DistutilsError, ValueError):
Author

BTW this is because MSVC 2008 is a crappy compiler.
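
The effect of widening the `except` clause can be shown with a self-contained sketch. It assumes, as the change implies, that initializing an old MSVC can raise `ValueError`. The `DistutilsError` class below is a local stand-in so the snippet runs without distutils, and `FakeOldMSVC` and `can_initialize` are hypothetical names.

```python
class DistutilsError(Exception):
    """Local stand-in for distutils.errors.DistutilsError."""


def can_initialize(compiler):
    """Return False if initializing the compiler raises a known error."""
    try:
        if not compiler.initialized:
            compiler.initialize()
    except (DistutilsError, ValueError):
        # Treat either error as "no usable compiler" instead of crashing.
        return False
    return True


class FakeOldMSVC(object):
    """Hypothetical compiler whose initialize() raises ValueError."""
    initialized = False

    def initialize(self):
        raise ValueError("unsupported compiler version")
```

With only `DistutilsError` in the clause, the `ValueError` would propagate and the compiler check itself would error out rather than report that no compiler is available.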

@@ -319,6 +319,7 @@ class F2PyTest(object):
module = None
module_name = None

@dec.knownfailureif(sys.platform=='win32', msg='Fails with MinGW64 Gfortran')
Author

Where would the best place be to put the comments? There are multiple tests all with the same cause.

Member

Maybe open an issue and just put the issue number.

Allow the error message to contain "large" arrays.
str(e),
"\nArrays are not equal\n\n"
msg = str(e)
msg2 = msg.replace("shapes (2,), (1, 2)", "shapes (2L,), (1L, 2L)")
Member

Arguments the other way around.
Also, the try: except: isn't necessary if you do replacement in str(e).

Author

I was not thinking here; I copied the replace statement directly from the comment without looking at it myself.

@ghost ghost closed this Sep 9, 2017
@ghost ghost reopened this Sep 9, 2017
@ghost
Author
ghost commented Sep 9, 2017

@rgommers Do you have any other questions about this?

try:
self.assertEqual(msg, msg_reference)
except AssertionError:
self.assertEqual(msg2, msg_reference)
Member
@pv pv Sep 9, 2017

The try: except is not needed. You can make it simpler via msg = msg.replace(...)
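
One way to read this suggestion: normalize the actual message first, then compare once. In the sketch below the function name is hypothetical; the replacement goes from the long-integer form to the plain form, so the reference string stays untouched.

```python
def normalize_long_shapes(message):
    """Rewrite the Python 2 long-integer shape repr to the plain form."""
    return message.replace("shapes (2L,), (1L, 2L)", "shapes (2,), (1, 2)")
```

A test could then call `self.assertEqual(normalize_long_shapes(str(e)), msg_reference)` with no `try`/`except` around it.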

Member
@rgommers rgommers left a comment

Thanks @xoviat, clearer already. I have put some comments for all places that are still hard to understand.

appveyor.yml Outdated
}

install:
- C:\cygwin\bin\du -hs "%LOCALAPPDATA%\pip\Cache"
Member

Need a comment above this line, I think something like

This `install` step only builds and installs OpenBLAS plus build/test dependencies.
SciPy itself is built with `build_script`. Cygwin is only needed for WHAT?

Member

Cygwin is probably only needed for the unix tools (du, find etc) used for cleaning up the pip cache.

Author

Cygwin has nothing to do with the build. This comes from experience dealing with appveyor caches.

appveyor.yml Outdated
- C:\cygwin\bin\du -hs "%LOCALAPPDATA%\pip\Cache"

on_finish:
- ps: |
Member

Add comment, uploads what to where? Not needed for all Appveyor scripts, so good to know why here

appveyor.yml Outdated
mkdir dist
pip wheel -v -v -v --wheel-dir=dist .

ls dist -r | Foreach-Object {
Member

would be good to have a comment here, not sure what PushArtifact does and why there's a Foreach-Object (there should be only one wheel I think, built with pip wheel -v -v -v --wheel-dir=dist . above?)
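
One plausible reading of the loop, offered as an assumption rather than the author's explanation, is that it uploads every file found in `dist` as a build artifact, however many there are. The Python sketch below only builds the commands and does not run them; the helper name is hypothetical, and the `appveyor PushArtifact <path>` form is assumed from the AppVeyor command-line tool.

```python
import glob
import os


def artifact_commands(dist_dir):
    """Build one upload command per file in dist_dir, without running it."""
    paths = sorted(glob.glob(os.path.join(dist_dir, "*")))
    # One command per file, so the loop works for one wheel or several.
    return [
        ["appveyor", "PushArtifact", path]
        for path in paths
        if os.path.isfile(path)
    ]
```

Looping over the directory avoids hard-coding the wheel's file name, which varies with the Python version and architecture of each build.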

} Else {
$OPENBLAS = $env:OPENBLAS_64
}
$clnt = new-object System.Net.WebClient
Member

A comment here or above ps would be useful. I think it's "downloads an already built OpenBLAS library from the path given by $OPENBLAS". That build was done from repo XXX.
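
The download step described in this comment might be sketched in Python as below. `OPENBLAS_64` and the 64-bit architecture value appear in the script; `OPENBLAS_32`, both function names, and the use of `urllib` in place of `System.Net.WebClient` are assumptions made for illustration.

```python
import os
import urllib.request


def select_openblas_url(python_arch, environ=None):
    """Pick the OpenBLAS download URL for the given Python architecture.

    OPENBLAS_64 appears in the script above; OPENBLAS_32 is an assumed name.
    """
    environ = os.environ if environ is None else environ
    key = "OPENBLAS_32" if str(python_arch) == "32" else "OPENBLAS_64"
    return environ[key]


def download(url, destination):
    """Fetch a prebuilt archive to destination and return its path."""
    urllib.request.urlretrieve(url, destination)
    return destination
```

Keeping the URL selection separate from the download makes the architecture logic easy to check without network access.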

Author
@ghost ghost Sep 10, 2017

I actually don't know the repo but matthew is in charge of these builds.

Author

I updated the comment above the ps block.

@rgommers
Member

Thanks @xoviat, all LGTM now

@matthew-brett
Contributor

I think this one's ready to merge?

@charris
Member
charris commented Sep 30, 2017

Let's give it a shot. The folks who know more about this than I do have signed off.

@charris charris merged commit d05fd30 into numpy:master Sep 30, 2017
@charris
Member
charris commented Oct 1, 2017

Thanks @xoviat .

5 participants