Add benchmarks #16723
(Copied from #16864 as requested) Following PR #13511, it appears that there is no reference benchmark for SVMs in scikit-learn or in any side project (sklearn-contrib). This seems quite risky in the long run; maybe we should create one, especially to quantify the impact of changes to C code such as in PR #13511. I have been working quite a bit on the topic of creating reference benchmarks in the past years, leading to the creation of tools in the pytest ecosystem. However, I do not know a good set of reference datasets to start with (apart from creating challenging ones "by hand").
Note: this ticket should probably be renamed to "adding benchmarks to scikit-learn", to address the whole picture. In particular I see at least two kinds of benchmarks: run time/memory benchmarks and model quality (e.g. accuracy) benchmarks.
Of course both kinds should/could be run on single models or on multiple models. For example, classification problems can be tackled by many models in scikit-learn; they would end up in the same benchmark. If this does not fall into this ticket, please reopen #16864.
+1 for adding benchmarks; regressions are very easy to introduce with Cython (and even more so if we ever want to use numba one day). How does asv work in terms of workflow? Does it automatically compare a branch with master, or do PR authors need to run the benchmarks on both branches manually?
We typically try to have these in the tests, but yes, for long-running ones benchmarks may help too.
I think it's a good idea.
I already tried that in a few PRs from people from the Inria team and it was helpful. Obviously I could discuss with them irl so they could easily use the benchmark suite, but if we document it properly it's not difficult to use. I'd need to check that the benchmark suite is not outdated, and I'd like to make a few modifications before we move it here, if we decide to.
+1
What's the procedure? Does it require a vote?
Thanks @jeremiedbb! Maybe an email on the public mailing list? It's purely a developer tool that doesn't affect users or the API, so I doubt it needs a SLEP. Don't hesitate to open a PR once you've made the changes you wanted.
@smarie Indeed, I was considering more run time/memory benchmarks here. However, if we go with ASV, it can also store custom metrics (e.g. model accuracy). The advantage is that it can compute them for commits in the history and identify regressions that way. By itself, the accuracy of a model on a problem is not meaningful, but if it changes a lot it would deserve additional investigation. Maybe we could start with run time benchmarks and add model performance benchmarks in a second step. We do try to test general (e.g. invariance) properties of estimators in the common tests.
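For illustration, a minimal sketch of how such a custom metric could be recorded via ASV's `track_*` benchmark methods (the class name, estimator, and dataset below are made up for the example, not taken from the existing suite):

```python
# Hypothetical ASV benchmark: ASV stores the return value of any track_*
# method, so a model quality metric can be followed across commits just
# like run times.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


class TrackLogisticRegressionAccuracy:
    def setup(self):
        # setup() runs before the benchmark, so data generation is not
        # part of what gets measured.
        self.X, self.y = make_classification(
            n_samples=1_000, n_features=20, random_state=0
        )

    def track_train_accuracy(self):
        est = LogisticRegression(max_iter=1_000, random_state=0)
        return est.fit(self.X, self.y).score(self.X, self.y)
```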
Typically you want to run it on HEAD, then on master, and show the difference; ASV allows doing that in one command. See the pandas contribution guide, linked in the issue description, for more details. To give an example, the output could look something like the sketch below.
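For instance (a command along the lines of what the pandas guide describes; the benchmark name, commit hashes, and timings are invented for illustration):

```
$ asv continuous -f 1.1 upstream/master HEAD
...
       before           after         ratio
     [83d0a2c5]       [9f41de77]
+        152ms           268ms     1.76  svm.SVCBenchmark.time_fit
```

A `+` in front of a row flags a benchmark that got slower by more than the `-f` factor, which is the kind of signal reviewers would look at on a performance-related PR.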
Also, once this is resolved, we might want to archive and mark as no longer maintained https://github.com/scikit-learn/scikit-learn-speed and https://github.com/scikit-learn/ml-benchmarks
The benchmark suite also allows accuracy checks. For now I just check some metric (depending on the model) and/or that the predicted values stay exactly the same between two commits.
I made the changes I wanted. A change in asv broke a feature and I thought there was no workaround, but I just found one, so it's fine. I opened #17026. The documentation is still missing.
We should add a folder with airspeed velocity (ASV) benchmarks inside scikit-learn.
ASV benchmarks done by @jeremiedbb exist in https://github.com/jeremiedbb/scikit-learn_benchmarks but they currently don't get enough exposure (i.e. contributors are not aware of them). I think we should include them, e.g. under an `asv_bench/` folder, and remove the redundant scripts from the `benchmarks/` folder.
I just made a PR to pandas, and it's nice when reviewers can ask to add an ASV benchmark for a PR claiming to improve performance. For this to work, the benchmarks need to be inside the scikit-learn repo. Whether we run them somewhere, and where, is a secondary concern IMO.
For the docs we can "steal" part of the section from the pandas dev guide https://pandas.pydata.org/docs/development/contributing.html#running-the-performance-test-suite
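As a rough sketch of what a file under such a folder could contain (the module path, class, estimator, and data sizes are illustrative only, not the actual suite):

```python
# Hypothetical contents of e.g. asv_bench/benchmarks/svm.py: ASV times every
# method whose name starts with time_ and compares the results across commits.
from sklearn.datasets import make_classification
from sklearn.svm import SVC


class SVCBenchmark:
    def setup(self):
        # Data generation happens in setup() so it is excluded from the timing.
        self.X, self.y = make_classification(
            n_samples=2_000, n_features=50, random_state=0
        )

    def time_fit(self):
        SVC(kernel="rbf", gamma="scale").fit(self.X, self.y)
```

Running `asv continuous` on a branch would then report how `time_fit` changes relative to master.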