8000 BENCH, DOC: Benchmark matmul and update documentation by jakirkham · Pull Request #7034 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BENCH, DOC: Benchmark matmul and update documentation #7034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 16, 2016

Conversation

jakirkham
Copy link
Contributor

Related: #6932

Based on the code path followed by numpy.matmul ( 35790f@numpy/core/src/multiarray/multiarraymodule.c#L2408 ), it appears this should also benefit from the syrk optimization. Added some benchmarks to demonstrate that optimization. Also, updated the release documentation to note where the syrk optimization is used. Finally, resorted some benchmarks to match the order they appear at runtime.

@jakirkham
Copy link
Contributor Author

@njsmith, I remember we were talking awhile back about the code path that matmul follows. So, I figured I should take a look at the source code. It appears to be calling cblas_matrixproduct, which we optimized for using syrk already. ( #6932 ) Since we already get that speedup, I figured we should better document it and maybe add some benchmarks.

@njsmith
Copy link
Member
njsmith commented Jan 16, 2016

Oh, cool, then I misremembered. Sorry about that.
On Jan 16, 2016 11:16, "jakirkham" notifications@github.com wrote:

@njsmith https://github.com/njsmith, I remember we were talking awhile
back about the code path that matmul follows. So, I figured I should take
a look at the source code. It appears to be calling cblas_matrixproduct,
which we optimized for using syrk already. ( #6932
#6932 ) Since we already get that
speedup, I figured we should better document it and maybe add some
benchmarks.


Reply to this email directly or view it on GitHub
#7034 (comment).

Previously, ``gemm`` BLAS operations were used for all matrix products. Now, 8000
if the matrix product is between a matrix and its transpose, it will use
``syrk`` BLAS operations for a performance boost.
``syrk`` BLAS operations for a performance boost. This optimization has been
extended to ``numpy.dot``, ``numpy.inner``, and ``numpy.matmul``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe list @ here explicitly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Added.

@njsmith
Copy link
Member
njsmith commented Jan 16, 2016

Otherwise lgtm and should get into 1.11 since it's mostly a release note update.

…n has been extended to several NumPy operations.
@jakirkham
Copy link
Contributor Author

No worries. It's always nice to get something for free. :)

@njsmith
Copy link
Member
njsmith commented Jan 16, 2016

Lgtm

@jakirkham
Copy link
Contributor Author

Though it does make me wonder how many other functions are optimized that we weren't previously aware of.

@jakirkham
Copy link
Contributor Author

So, Travis checks out. Does AppVeyor even run the benchmarks?

@charris
Copy link
Member
charris commented Jan 16, 2016

No ;) AppVeyor seems to have a Saturday hangover.

@charris
Copy link
Member
charris commented Jan 16, 2016

AppVeyor is running 30 min/build and 12 builds behind, so about 6 hours to catch up. it's turning into a real bottleneck.

@rgommers
Copy link
Member

@charris I think it's OK to merge things unless they have hairy Windows-specific stuff in them. if something fails it'll still show up in one of the later builds (which is a huge improvement over what we had a few weeks ago).

If it stays like this we can solve it with a little bit of money.

@njsmith
Copy link
Member
njsmith commented Jan 16, 2016

The queue here is indeed pretty daunting to look at! https://ci.appveyor.com/project/charris/numpy/history

But yeah, waiting a little longer seems reasonable too

charris added a commit that referenced this pull request Jan 16, 2016
BENCH, DOC: Benchmark matmul and update documentation
@charris charris merged commit d65d871 into numpy:master Jan 16, 2016
@charris
Copy link
Member
charris commented Jan 16, 2016

Well, that's 30 minutes less to wait. Thanks @jakirkham .

@jakirkham jakirkham deleted the bench_matmul branch January 16, 2016 21:55
@jakirkham
Copy link
Contributor Author

Sounds good to me. :) Thanks everyone. If it turns out AppVeyor fails (not sure what I would have done to cause it though), please just ping me.

@jakirkham
Copy link
Contributor Author

I think @matthew-brett mentioned that we might be able to just ask AppVeyor for more job bandwidth and get it for free. ( #7020 (comment) )

@jakirkham
Copy link
Contributor Author

@pv, would it be possible, if it's not too much trouble, to run these benchmarks through airspeed velocity comparing ( a7377d8 ) to ( 25c8d1c )? There should be a roughly 2x speedup between the two. Both commits precede this one by ~10 days.

@rgommers
Copy link
Member

@jakirkham you can just run python runtests.py --bench-compare a7377d8 25c8d1c and get the results right? Or is that not working?

@jakirkham
Copy link
Contributor Author

Yes, I am sure I can. I was just hoping they could be published on the benchmarking website. If it's too much to ask, then it need not be done. Sorry, I should have been more explicit.

@rgommers
Copy link
Member

ah ok, just checking that you were aware that you could run them.

@jakirkham
Copy link
Contributor Author

Well, actually, I just tried and I get this traceback.

Traceback (most recent call last):
  File "runtests.py", line 462, in <module>
  File "runtests.py", line 240, in main
  File "/opt/conda/lib/python2.7/os.py", line 346, in execvp
    _execvpe(file, args)
  File "/opt/conda/lib/python2.7/os.py", line 382, in _execvpe
    func(fullname, *argrest)
OSError: [Errno 2] No such file or directory

@jakirkham
Copy link
Contributor Author

Turns out there was no asv. After installing it and virtualenv things flamed out, but I expect it is because conda and virtualenv don't play nice. Is there some way for me to override the virtualenv install?

@pv
Copy link
Member
pv commented Jan 19, 2016

The benchmarking website is automated, and I'm unfortunately not volunteering to run anything manually.

You can edit asv.conf.json to change virtualenv to conda. Or, check out the commits via git and use --bench instead of --bench-compare and compare the timings manually. Benchmarking the currently checked out code can be done without needing for asv to manage build/installation.

@jakirkham
Copy link
Contributor Author

Ok, that's fine. Thanks for letting me know.

I ran into more issues getting the comparison to work, but I will raise the issue on asv instead of here.

@jakirkham
Copy link
Contributor Author

See related issue here ( airspeed-velocity/asv#367 ).

@jakirkham
Copy link
< 8968 /details-menu>
Contributor Author

So, now I can get it to run. :) However, it says no benchmarks selected. Here is what I see.

$ python runtests.py --bench-compare a7377d8 25c8d1c 
********************************************************************************
WARNING: you have uncommitted changes --- these will NOT be benchmarked!
********************************************************************************
· Fetching recent changes
· Fetching recent changes
· Creating environments
· Discovering benchmarks
·· Uninstalling from py2.7-six
·· Installing into py2.7-six.
· No benchmarks selected

Here is the diff applied to asv.conf.json.

diff --git a/benchmarks/asv.conf.json b/benchmarks/asv.conf.json
index d837b0d..d8dc4c3 100644
--- a/benchmarks/asv.conf.json
+++ b/benchmarks/asv.conf.json
@@ -28,7 +28,7 @@
     // If missing or the empty string, the tool will be automatically
     // determined by looking for tools on the PATH environment
     // variable.
-    "environment_type": "virtualenv",
+    "environment_type": "conda",

     // the base URL to show a commit for the project.
     "show_commit_url": "https://github.com/numpy/numpy/commit/",

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0