DOC Correct default n_jobs & reference the glossary (#11808) · scikit-learn/scikit-learn@9b8fd0b · GitHub

Commit 9b8fd0b

qinhanmin2014 authored and jnothman committed
DOC Correct default n_jobs & reference the glossary (#11808)
Also improves the glossary entry for n_jobs.
1 parent 07e909a commit 9b8fd0b

47 files changed

+419
-258
lines changed

doc/glossary.rst

Lines changed: 5 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1479,8 +1479,11 @@ functions or non-estimator constructors.
14791479

14801480
``n_jobs`` is an int, specifying the maximum number of concurrently
14811481
running jobs. If set to -1, all CPUs are used. If 1 is given, no
1482-
parallel computing code is used at all. For n_jobs below -1, (n_cpus +
1483-
1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
1482+
joblib level parallelism is used at all, which is useful for
1483+
debugging. Even with ``n_jobs = 1``, parallelism may occur due to
1484+
numerical processing libraries (see :ref:`FAQ <faq_mkl_threading>`).
1485+
For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for
1486+
``n_jobs = -2``, all CPUs but one are used.
14841487

14851488
``n_jobs=None`` means *unset*; it will generally be interpreted as
14861489
``n_jobs=1``, unless the current :class:`joblib.Parallel` backend
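The negative-value rule in the glossary entry above can be sketched in plain Python. The helper name `resolve_n_jobs` and its explicit `n_cpus` argument are hypothetical, chosen only for illustration. Treating `None` as 1 ignores any active joblib backend, and clamping the result to at least 1 and rejecting 0 are assumptions of this sketch, not something the glossary text states.

```python
def resolve_n_jobs(n_jobs, n_cpus):
    """Hypothetical helper: map an ``n_jobs`` value to a worker count."""
    if n_jobs is None:
        # "Unset": generally interpreted as 1 (no parallel_backend context assumed here).
        return 1
    if n_jobs == 0:
        # Assumption of this sketch: zero jobs is treated as an error.
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # Glossary rule: (n_cpus + 1 + n_jobs) workers are used.
        # Clamping to at least 1 is an assumption of this sketch.
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are taken as the maximum number of concurrent jobs.
    return n_jobs


for value in (None, 1, 4, -1, -2):
    print(value, "->", resolve_n_jobs(value, n_cpus=8))
```

With 8 CPUs, `-1` maps to 8 workers and `-2` maps to 7, matching the "all CPUs but one" wording in the entry.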

examples/model_selection/plot_learning_curve.py

Lines changed: 6 additions & 3 deletions
@@ -25,7 +25,7 @@
2525

2626

2727
def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
28-
n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
28+
n_jobs=None, train_sizes=np.linspace(.1, 1.0, 5)):
2929
"""
3030
Generate a simple plot of the test and training learning curve.
3131
@@ -63,8 +63,11 @@ def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
6363
Refer :ref:`User Guide <cross_validation>` for the various
6464
cross-validators that can be used here.
6565
66-
n_jobs : integer, optional
67-
Number of jobs to run in parallel (default 1).
66+
n_jobs : int or None, optional (default=None)
67+
Number of jobs to run in parallel.
68+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
69+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
70+
for more details.
6871
6972
train_sizes : array-like, shape (n_ticks,), dtype float or int
7073
Relative or absolute numbers of training examples that will be used to
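The new docstring wording, "``None`` means 1 unless in a :obj:`joblib.parallel_backend` context", can be checked with a short sketch. It assumes joblib is installed and uses its public `effective_n_jobs` and `parallel_backend` functions; the `"threading"` backend and the value `n_jobs=2` are arbitrary choices for illustration, not something this commit prescribes.

```python
from joblib import effective_n_jobs, parallel_backend

# Outside any backend context, an unset n_jobs resolves to a single job.
outside = effective_n_jobs(None)

# Inside a parallel_backend context, the context's n_jobs is picked up instead.
with parallel_backend("threading", n_jobs=2):
    inside = effective_n_jobs(None)

print("outside context:", outside)
print("inside context:", inside)
```

This is why the default can change from `1` to `None` without altering behaviour for callers who never enter such a context, while still letting a surrounding context control parallelism.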

sklearn/cluster/bicluster.py

Lines changed: 8 additions & 10 deletions
@@ -228,15 +228,14 @@ class SpectralCoclustering(BaseSpectral):
228228
chosen and the algorithm runs once. Otherwise, the algorithm
229229
is run for each initialization and the best solution chosen.
230230
231-
n_jobs : int, optional, default: 1
231+
n_jobs : int or None, optional (default=None)
232232
The number of jobs to use for the computation. This works by breaking
233233
down the pairwise matrix into n_jobs even slices and computing them in
234234
parallel.
235235
236-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
237-
used at all, which is useful for debugging. For n_jobs below -1,
238-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
239-
are used.
236+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
237+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
238+
for more details.
240239
241240
random_state : int, RandomState instance or None (default)
242241
Used for randomizing the singular value decomposition and the k-means
@@ -375,15 +374,14 @@ class SpectralBiclustering(BaseSpectral):
375374
chosen and the algorithm runs once. Otherwise, the algorithm
376375
is run for each initialization and the best solution chosen.
377376
378-
n_jobs : int, optional, default: 1
377+
n_jobs : int or None, optional (default=None)
379378
The number of jobs to use for the computation. This works by breaking
380379
down the pairwise matrix into n_jobs even slices and computing them in
381380
parallel.
382381
383-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
384-
used at all, which is useful for debugging. For n_jobs below -1,
385-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
386-
are used.
382+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
383+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
384+
for more details.
387385
388386
random_state : int, RandomState instance or None (default)
389387
Used for randomizing the singular value decomposition and the k-means

sklearn/cluster/dbscan_.py

Lines changed: 8 additions & 4 deletions
@@ -76,9 +76,11 @@ def dbscan(X, eps=0.5, min_samples=5, metric='minkowski', metric_params=None,
7676
weight may inhibit its eps-neighbor from being core.
7777
Note that weights are absolute, and default to 1.
7878
79-
n_jobs : int, optional (default = 1)
79+
n_jobs : int or None, optional (default=None)
8080
The number of parallel jobs to run for neighbors search.
81-
If ``-1``, then the number of jobs is set to the number of CPU cores.
81+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
82+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
83+
for more details.
8284
8385
Returns
8486
-------
@@ -229,9 +231,11 @@ class DBSCAN(BaseEstimator, ClusterMixin):
229231
The power of the Minkowski metric to be used to calculate distance
230232
between points.
231233
232-
n_jobs : int, optional (default = 1)
234+
n_jobs : int or None, optional (default=None)
233235
The number of parallel jobs to run.
234-
If ``-1``, then the number of jobs is set to the number of CPU cores.
236+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
237+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
238+
for more details.
235239
236240
Attributes
237241
----------

sklearn/cluster/k_means_.py

Lines changed: 8 additions & 10 deletions
@@ -261,14 +261,13 @@ def k_means(X, n_clusters, sample_weight=None, init='k-means++',
261261
the data mean, in this case it will also not ensure that data is
262262
C-contiguous which may cause a significant slowdown.
263263
264-
n_jobs : int
264+
n_jobs : int or None, optional (default=None)
265265
The number of jobs to use for the computation. This works by computing
266266
each of the n_init runs in parallel.
267267
268-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
269-
used at all, which is useful for debugging. For n_jobs below -1,
270-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
271-
are used.
268+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
269+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
270+
for more details.
272271
273272
algorithm : "auto", "full" or "elkan", default="auto"
274273
K-means algorithm to use. The classical EM-style algorithm is "full".
@@ -834,14 +833,13 @@ class KMeans(BaseEstimator, ClusterMixin, TransformerMixin):
834833
the data mean, in this case it will also not ensure that data is
835834
C-contiguous which may cause a significant slowdown.
836835
837-
n_jobs : int
836+
n_jobs : int or None, optional (default=None)
838837
The number of jobs to use for the computation. This works by computing
839838
each of the n_init runs in parallel.
840839
841-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
842-
used at all, which is useful for debugging. For n_jobs below -1,
843-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
844-
are used.
840+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
841+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
842+
for more details.
845843
846844
algorithm : "auto", "full" or "elkan", default="auto"
847845
K-means algorithm to use. The classical EM-style algorithm is "full".

sklearn/cluster/mean_shift_.py

Lines changed: 12 additions & 12 deletions
@@ -53,9 +53,11 @@ def estimate_bandwidth(X, quantile=0.3, n_samples=None, random_state=0,
5353
deterministic.
5454
See :term:`Glossary <random_state>`.
5555
56-
n_jobs : int, optional (default = 1)
56+
n_jobs : int or None, optional (default=None)
5757
The number of parallel jobs to run for neighbors search.
58-
If ``-1``, then the number of jobs is set to the number of CPU cores.
58+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
59+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
60+
for more details.
5961
6062
Returns
6163
-------
@@ -152,14 +154,13 @@ def mean_shift(X, bandwidth=None, seeds=None, bin_seeding=False,
152154
Maximum number of iterations, per seed point before the clustering
153155
operation terminates (for that seed point), if has not converged yet.
154156
155-
n_jobs : int
157+
n_jobs : int or None, optional (default=None)
156158
The number of jobs to use for the computation. This works by computing
157159
each of the n_init runs in parallel.
158160
159-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
160-
used at all, which is useful for debugging. For n_jobs below -1,
161-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
162-
are used.
161+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
162+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
163+
for more details.
163164
164165
.. versionadded:: 0.17
165166
Parallel Execution using *n_jobs*.
@@ -334,14 +335,13 @@ class MeanShift(BaseEstimator, ClusterMixin):
334335
not within any kernel. Orphans are assigned to the nearest kernel.
335336
If false, then orphans are given cluster label -1.
336337
337-
n_jobs : int
338+
n_jobs : int or None, optional (default=None)
338339
The number of jobs to use for the computation. This works by computing
339340
each of the n_init runs in parallel.
340341
341-
If -1 all CPUs are used. If 1 is given, no parallel computing code is
342-
used at all, which is useful for debugging. For n_jobs below -1,
343-
(n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one
344-
are used.
342+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
343+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
344+
for more details.
345345
346346
Attributes
347347
----------

sklearn/cluster/optics_.py

Lines changed: 8 additions & 4 deletions
@@ -118,9 +118,11 @@ def optics(X, min_samples=5, max_bound=np.inf, metric='euclidean',
118118
required to store the tree. The optimal value depends on the
119119
nature of the problem.
120120
121-
n_jobs : int, optional (default=1)
121+
n_jobs : int or None, optional (default=None)
122122
The number of parallel jobs to run for neighbors search.
123-
If ``-1``, then the number of jobs is set to the number of CPU cores.
123+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
124+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
125+
for more details.
124126
125127
Returns
126128
-------
@@ -243,9 +245,11 @@ class OPTICS(BaseEstimator, ClusterMixin):
243245
required to store the tree. The optimal value depends on the
244246
nature of the problem.
245247
246-
n_jobs : int, optional (default=1)
248+
n_jobs : int or None, optional (default=None)
247249
The number of parallel jobs to run for neighbors search.
248-
If ``-1``, then the number of jobs is set to the number of CPU cores.
250+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
251+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
252+
for more details.
249253
250254
Attributes
251255
----------

sklearn/cluster/spectral.py

Lines changed: 4 additions & 2 deletions
@@ -358,9 +358,11 @@ class SpectralClustering(BaseEstimator, ClusterMixin):
358358
Parameters (keyword arguments) and values for kernel passed as
359359
callable object. Ignored by other kernels.
360360
361-
n_jobs : int, optional (default = 1)
361+
n_jobs : int or None, optional (default=None)
362362
The number of parallel jobs to run.
363-
If ``-1``, then the number of jobs is set to the number of CPU cores.
363+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
364+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
365+
for more details.
364366
365367
Attributes
366368
----------

sklearn/compose/_column_transformer.py

Lines changed: 10 additions & 4 deletions
@@ -93,8 +93,11 @@ class ColumnTransformer(_BaseComposition, TransformerMixin):
9393
the stacked result will be sparse or dense, respectively, and this
9494
keyword will be ignored.
9595
96-
n_jobs : int, optional
97-
Number of jobs to run in parallel (default 1).
96+
n_jobs : int or None, optional (default=None)
97+
Number of jobs to run in parallel.
98+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
99+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
100+
for more details.
98101
99102
transformer_weights : dict, optional
100103
Multiplicative weights for features per transformer. The output of the
@@ -666,8 +669,11 @@ def make_column_transformer(*transformers, **kwargs):
666669
non-specified columns will use the ``remainder`` estimator. The
667670
estimator must support `fit` and `transform`.
668671
669-
n_jobs : int, optional
670-
Number of jobs to run in parallel (default 1).
672+
n_jobs : int or None, optional (default=None)
673+
Number of jobs to run in parallel.
674+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
675+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
676+
for more details.
671677
672678
Returns
673679
-------

sklearn/covariance/graph_lasso_.py

Lines changed: 10 additions & 4 deletions
@@ -520,8 +520,11 @@ class GraphicalLassoCV(GraphicalLasso):
520520
than number of samples. Elsewhere prefer cd which is more numerically
521521
stable.
522522
523-
n_jobs : int, optional
524-
number of jobs to run in parallel (default 1).
523+
n_jobs : int or None, optional (default=None)
524+
number of jobs to run in parallel.
525+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
526+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
527+
for more details.
525528
526529
verbose : boolean, optional
527530
If verbose is True, the objective function and duality gap are
@@ -927,8 +930,11 @@ class GraphLassoCV(GraphicalLassoCV):
927930
than number of samples. Elsewhere prefer cd which is more numerically
928931
stable.
929932
930-
n_jobs : int, optional
931-
number of jobs to run in parallel (default 1).
933+
n_jobs : int or None, optional (default=None)
934+
number of jobs to run in parallel.
935+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
936+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
937+
for more details.
932938
933939
verbose : boolean, optional
934940
If verbose is True, the objective function and duality gap are

sklearn/decomposition/dict_learning.py

Lines changed: 29 additions & 11 deletions
@@ -245,8 +245,11 @@ def sparse_encode(X, dictionary, gram=None, cov=None, algorithm='lasso_lars',
245245
max_iter : int, 1000 by default
246246
Maximum number of iterations to perform if `algorithm='lasso_cd'`.
247247
248-
n_jobs : int, optional
248+
n_jobs : int or None, optional (default=None)
249249
Number of parallel jobs to run.
250+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
251+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
252+
for more details.
250253
251254
check_input : boolean, optional
252255
If False, the input arrays X and dictionary will not be checked.
@@ -453,8 +456,11 @@ def dict_learning(X, n_components, alpha, max_iter=100, tol=1e-8,
453456
Lasso solution (linear_model.Lasso). Lars will be faster if
454457
the estimated components are sparse.
455458
456-
n_jobs : int,
457-
Number of parallel jobs to run, or -1 to autodetect.
459+
n_jobs : int or None, optional (default=None)
460+
Number of parallel jobs to run.
461+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
462+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
463+
for more details.
458464
459465
dict_init : array of shape (n_components, n_features),
460466
Initial value for the dictionary for warm restart scenarios.
@@ -648,8 +654,11 @@ def dict_learning_online(X, n_components=2, alpha=1, n_iter=100,
648654
shuffle : boolean,
649655
Whether to shuffle the data before splitting it in batches.
650656
651-
n_jobs : int,
652-
Number of parallel jobs to run, or -1 to autodetect.
657+
n_jobs : int or None, optional (default=None)
658+
Number of parallel jobs to run.
659+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
660+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
661+
for more details.
653662
654663
method : {'lars', 'cd'}
655664
lars: uses the least angle regression method to solve the lasso problem
@@ -943,8 +952,11 @@ class SparseCoder(BaseEstimator, SparseCodingMixin):
943952
its negative part and its positive part. This can improve the
944953
performance of downstream classifiers.
945954
946-
n_jobs : int,
947-
number of parallel jobs to run
955+
n_jobs : int or None, optional (default=None)
956+
Number of parallel jobs to run.
957+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
958+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
959+
for more details.
948960
949961
positive_code : bool
950962
Whether to enforce positivity when finding the code.
@@ -1063,8 +1075,11 @@ class DictionaryLearning(BaseEstimator, SparseCodingMixin):
10631075
the reconstruction error targeted. In this case, it overrides
10641076
`n_nonzero_coefs`.
10651077
1066-
n_jobs : int,
1067-
number of parallel jobs to run
1078+
n_jobs : int or None, optional (default=None)
1079+
Number of parallel jobs to run.
1080+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
1081+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
1082+
for more details.
10681083
10691084
code_init : array of shape (n_samples, n_components),
10701085
initial value for the code, for warm restart
@@ -1214,8 +1229,11 @@ class MiniBatchDictionaryLearning(BaseEstimator, SparseCodingMixin):
12141229
Lasso solution (linear_model.Lasso). Lars will be faster if
12151230
the estimated components are sparse.
12161231
1217-
n_jobs : int,
1218-
number of parallel jobs to run
1232+
n_jobs : int or None, optional (default=None)
1233+
Number of parallel jobs to run.
1234+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
1235+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
1236+
for more details.
12191237
12201238
batch_size : int,
12211239
number of samples in each mini-batch

sklearn/decomposition/kernel_pca.py

Lines changed: 4 additions & 2 deletions
@@ -89,9 +89,11 @@ class KernelPCA(BaseEstimator, TransformerMixin):
8989
9090
.. versionadded:: 0.18
9191
92-
n_jobs : int, default=1
92+
n_jobs : int or None, optional (default=None)
9393
The number of parallel jobs to run.
94-
If `-1`, then the number of jobs is set to the number of CPU cores.
94+
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
95+
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
96+
for more details.
9597
9698
.. versionadded:: 0.18
9799
