DOC Add demo on parallelization with context manager using different backends #25714
Conversation
Thanks for the benchmark! Since we cannot run the benchmark on the CI when building examples, this feels like something that belongs in the benchmark directory. In that case, we can reference the benchmark from doc/computing/parallelism.rst.
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=1_000, random_state=0)
hist_gbdt = HistGradientBoostingClassifier(random_state=0)
```
Does joblib automatically set OMP_NUM_THREADS for backends such as dask or ray?
Hi @thomasjpfan! I don't understand your question; could you rephrase it?
Does joblib control the number of OpenMP threads for other backends? There can be over-subscription because of dask parallelism + OpenMP parallelism. XREF: I know that for loky, joblib will control the environment variables, but I have not checked for other backends.
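One way to check this empirically is to have each worker report the thread-limiting environment variables it sees. This is a sketch, assuming joblib is installed; which variables loky sets, and to what values, is an implementation detail that may vary across joblib versions:

```python
import os

from joblib import Parallel, delayed


def worker_thread_env():
    """Report the thread-limiting variables visible inside a worker."""
    keys = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")
    return {k: os.environ.get(k) for k in keys}


# With the default loky backend, joblib spawns worker processes and can
# set these variables in them to avoid oversubscription; other backends
# (dask, ray) may leave them unset.
results = Parallel(n_jobs=2, backend="loky")(
    delayed(worker_thread_env)() for _ in range(2)
)
print(results)
```

Running the same snippet with `backend="dask"` or the ray joblib backend would show whether those backends propagate the limits.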
That's a good question. I don't think so, but I think we should investigate. It's possible that ray and dask do something smart by default, outside of the joblib connector itself.
Indeed, I was wondering about the best place to add this file and thought that this case is similar to the approximate nearest neighbors example, which is both a benchmark and a demo of a scikit-learn tool (in the case of the current PR, the parameter). I am certainly open to moving it to the benchmark directory, though I tried to keep a beginner-friendly narrative, which I find more appropriate for the example gallery. WDYT?
Currently, examples run nightly to ensure that they work with the latest version of scikit-learn. The example in this PR is a snapshot of performance with fixed versions of scikit-learn and dependencies.
In both cases, I'd like to see the library versions used to generate the example so others can reproduce it.
As far as I understand, the example gallery does not run nightly for files without the
That is true, and I prefer not to have them, because we cannot be sure they still work. Edit: I am +0.25 for adding this example, as long as scikit-learn's version and the dependencies' versions are displayed at the beginning of this example.
Thinking it over, I am +0.25 for adding this example, as long as scikit-learn's version and the dependencies' versions are displayed at the beginning of this example. With non-running examples, I'm still concerned that either the code no longer works or the benchmark results will be out of date when scikit-learn or the dependencies update. Given that the examples are rendered into HTML and are public-facing, we could be showcasing code that may no longer run.
```python
from time import time

import numpy as np
from sklearn.model_selection import cross_validate
from threadpoolctl import threadpool_limits

# N_CORES, clf, X, y and cv are defined earlier in the example.
# Grid of thread counts: powers of two from 1 up to 2 * N_CORES.
n_threads_grid = 2 ** np.arange(np.log2(2 * N_CORES).astype(np.int32) + 1)

for n_threads in n_threads_grid:
    tic = time()
    with threadpool_limits(limits=int(n_threads)):
        cross_validate(clf, X, y, cv=cv)
    toc = time()
    print(f"n_threads: {n_threads}, elapsed time: {toc - tic:.3f} sec")
```
On my laptop (4 physical cores and 8 logical cores) I observe the following:
n_threads: 1, elapsed time: 3.381 sec
n_threads: 2, elapsed time: 3.261 sec
n_threads: 4, elapsed time: 3.235 sec
n_threads: 8, elapsed time: 26.078 sec
The last line reveals a serious oversubscription problem when the number of threads is not limited to the number of physical cores.
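A minimal sketch of the mitigation discussed here: cap the native thread pools at the number of physical cores rather than logical cores. This assumes joblib and threadpoolctl are installed (`only_physical_cores` requires a reasonably recent joblib); the matrix product only stands in for any BLAS/OpenMP-backed workload:

```python
import numpy as np
from joblib import cpu_count
from threadpoolctl import threadpool_limits

# Number of physical cores, ignoring SMT/hyper-threading siblings.
n_physical = cpu_count(only_physical_cores=True)

# Any threadpool-backed computation run under this context manager is
# limited to n_physical threads, avoiding the slowdown observed above
# at n_threads = 8 on a 4-physical-core machine.
rng = np.random.default_rng(0)
a = rng.standard_normal((500, 500))
with threadpool_limits(limits=n_physical):
    result = a @ a.T
print(n_physical, result.shape)
```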
Furthermore, we can observe that the dataset is too small for multi-threading to give a significant speed-up.
n_threads: 8, elapsed time: 26.078 sec
ouch :)
The timings measured in #25714 (comment) are another case of the problem reported by @adrinjalali in #25822: note that the datasets are small in both cases. Furthermore, irrespective of the size of the dataset, we might want to make scikit-learn automatically limit the number of OpenMP threads used in the Cython code to the number of physical cores instead of logical cores. The latter point could be a follow-up to @jeremiedbb's #25918 that changes
Let's put this PR on hold for now because there will not be any simple educational message as long as #25822 is not fixed. Once this is the case, I think it would be interesting to continue this work to show how to use
To me this is more of a user-guide kind of material, which doesn't reproduce on CI but is there as a fixed state, and for that I think it makes sense to have it. Users should be able to reproduce it on their machines, and if we do it in a user guide, then the dataset can be large enough for the points raised here to be valid.
Somewhat related to #30662, as we want the message to hold regardless of the size of the dataset. |
That might be difficult to achieve, but I agree that we should try to protect against the most pathological slowdowns with the default settings. |
Reference Issues/PRs
What does this implement/fix? Explain your changes.
We currently have some very insightful information about parallelism in the user guide, where we even link to this document from @thomasjpfan. But IMHO a demo using different backends might be the cherry on the cake.
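For reference, a minimal sketch of the pattern such a demo would showcase: the same estimator code runs under different joblib backends, with only the context manager changing. This assumes scikit-learn is installed; `"threading"` stands in here for dask or ray, which require extra packages and backend registration, and the estimator and sizes are illustrative, not the ones from the PR:

```python
from joblib import parallel_backend
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=200, random_state=0)
clf = RandomForestClassifier(n_estimators=20, n_jobs=2, random_state=0)

# Switch the joblib backend used by the estimator's internal
# parallelism; third-party backends (dask, ray) plug in the same way
# once registered with joblib.
with parallel_backend("threading"):
    scores = cross_validate(clf, X, y, cv=3)
print(scores["test_score"])
```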
Any other comments?
I tried benchmarking on Spark, but I ran into several compatibility problems, and it did not seem to improve the computing time on a local machine.
This work is a joint effort with @Vincent-Maladiere.