8000 [MRG+2] DOC Correct default n_jobs & reference the glossary by qinhanmin2014 · Pull Request #11808 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG+2] DOC Correct default n_jobs & reference the glossary #11808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 18, 2018
Merged

[MRG+2] DOC Correct default n_jobs & reference the glossary #11808

merged 9 commits into from
Aug 18, 2018

Conversation

qinhanmin2014
Copy link
Member

part of #11771
Correct default n_jobs (n_jobs=None is not the same as n_jobs=1) and reference the glossary.
ping @jnothman @rth this is my solution. If you like it, I'll make similar changes in the whole repo.

Copy link
Member
@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments below, otherwise LGTM.

The number of jobs to use for the computation. This works by breaking
down the pairwise matrix into n_jobs even slices and computing them in
parallel.

If ``None``, then the number of jobs is set to 1 unless current joblib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove "then" and possibly "unless the current"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant: possibly add "the" in "unless current"..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rth So you mean something like: If None, the number of jobs is set to 1 unless the current joblib backend context specifies otherwise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but let's wait for @jnothman 's opinion before propagating these changes everywhere..

Copy link
Member
@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay, but seems very verbose. This seems sufficient:

"Number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means use all processors. See the Glossary for more."

Number of jobs to run in parallel (default 1).
n_jobs : int or None, optional (default=None)
Number of jobs to run in parallel.
``None`` means 1 unless in a ``joblib.parallel_backend`` context.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use :obj:`joblib.parallel_backend` to create an intersphinx link

@qinhanmin2014 qinhanmin2014 changed the title [WIP] DOC Correct default n_jobs & reference the glossary [MRG] DOC Correct default n_jobs & reference the glossary Aug 15, 2018
@qinhanmin2014
Copy link
Member Author

ping @jnothman @rth ready for a review

Copy link
Member
@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we would keep people happier by retaining "If 1, no job-level parallelism is used, which is useful for debugging." But ... whatever.

n_jobs : int,
number of parallel jobs to run
n_jobs : int or None, optional (default=None)
number of parallel jobs to run.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit awkward to keep this lowercase

@@ -2186,10 +2191,11 @@ class MultiTaskLassoCV(LinearModelCV, RegressorMixin):
verbose : bool or integer
Amount of verbosity.

n_jobs : integer, optional
Number of CPUs to use during the cross validation. If ``-1``, use
all the CPUs. Note that this is used only if multiple values for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed this note.

@@ -2014,10 +2018,11 @@ class MultiTaskElasticNetCV(LinearModelCV, RegressorMixin):
verbose : bool or integer
Amount of verbosity.

n_jobs : integer, optional
Number of CPUs to use during the cross validation. If ``-1``, use
all the CPUs. Note that this is used only if multiple values for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed this note

The number of jobs to use for the computation.
If -1 all CPUs are used. This will only provide speedup for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed this note

@qinhanmin2014
Copy link
Member Author

I wonder whether we would keep people happier by retaining "If 1, no job-level parallelism is used, which is useful for debugging." But ... whatever.

@jnothman I add it to the glossary. I'll make similar changes everywhere if you decide to do so.

Copy link
Member
@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM

@@ -1479,8 +1479,9 @@ functions or non-estimator constructors.

``n_jobs`` is an int, specifying the maximum number of concurrently
running jobs. If set to -1, all CPUs are used. If 1 is given, no
parallel computing code is used at all. For n_jobs below -1, (n_cpus +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statement already here is not quite true. Better to say no job-level parallelism or something, and perhaps to note that other parallelism may be done by the numerical processing libraries (as per the FAQ's coverage of this)...

57AE
Copy link
Member
@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment below, otherwise LGTM.

doc/glossary.rst Outdated
@@ -1479,8 +1479,9 @@ functions or non-estimator constructors.

``n_jobs`` is an int, specifying the maximum number of concurrently
running jobs. If set to -1, all CPUs are used. If 1 is given, no
parallel computing code is used at all. For n_jobs below -1, (n_cpus +
1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
job-level parallel computing code is used at all, which is useful for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "no joblib level parallelism is used at all".
I also think it's important to emphasize that other parallelism may be used as mentionned above with a link to the FAQ, as a number of users don't seem to be aware of it.

Possibly,
"""
Even with n_jobs = 1, parallelism may occur be due to numerical processing libraries (see FAQ#section_on_parallelism).
"""
?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I'm not aware of it.
This version is a bit duplicate with the first paragraph, but I think it's beneficial to emphasize the meaning of n_jobs=1.

@qinhanmin2014 qinhanmin2014 added this to the 0.20 milestone Aug 18, 2018
@qinhanmin2014 qinhanmin2014 changed the title [MRG] DOC Correct default n_jobs & reference the glossary [MRG+2] DOC Correct default n_jobs & reference the glossary Aug 18, 2018
@jnothman jnothman merged commit 9b8fd0b into scikit-learn:master Aug 18, 2018
@jnothman
Copy link
Member

Thanks @qinhanmin2014

@qinhanmin2014 qinhanmin2014 deleted the n_jobs-doc branch August 18, 2018 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0