@@ -114,9 +114,11 @@ threads than the number of CPUs on a machine. Over-subscription happens when
a program is running too many threads at the same time.

Suppose you have a machine with 8 CPUs. Consider a case where you're running
- a :class:`~GridSearchCV` (parallelized with joblib) with ``n_jobs=8`` over
- a :class:`~HistGradientBoostingClassifier` (parallelized with OpenMP). Each
- instance of :class:`~HistGradientBoostingClassifier` will spawn 8 threads
+ a :class:`~sklearn.model_selection.GridSearchCV` (parallelized with joblib)
+ with ``n_jobs=8`` over a
+ :class:`~sklearn.ensemble.HistGradientBoostingClassifier` (parallelized with
+ OpenMP). Each instance of
+ :class:`~sklearn.ensemble.HistGradientBoostingClassifier` will spawn 8 threads
(since you have 8 CPUs). That's a total of ``8 * 8 = 64`` threads, which
leads to oversubscription of physical CPU resources and to scheduling
overhead.
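
The arithmetic in this hunk can be checked with a minimal sketch. The function and variable names below are hypothetical and are not part of scikit-learn or joblib. The sketch assumes, as the text does, that each worker starts one thread per CPU.

```python
def total_threads(n_jobs: int, threads_per_worker: int) -> int:
    """Number of threads alive at once when every worker runs its own pool."""
    return n_jobs * threads_per_worker


n_cpus = 8                   # machine from the example in the text
n_jobs = 8                   # joblib workers requested via n_jobs=8
threads_per_worker = n_cpus  # assumption: each pool defaults to one thread per CPU

total = total_threads(n_jobs, threads_per_worker)
oversubscription_factor = total / n_cpus  # threads competing for each CPU

print(f"{total} threads on {n_cpus} CPUs "
      f"({oversubscription_factor:.0f} threads per CPU)")
```

A factor above 1 means more runnable threads than CPUs, which is the situation the documentation calls oversubscription.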
@@ -129,9 +131,10 @@ is the default), joblib will tell its child **processes** to limit the
number of threads they can use, so as to avoid oversubscription. In practice
the heuristic that joblib uses is to tell the processes to use ``max_threads
= n_cpus // n_jobs``, via their corresponding environment variable. Back to
- our example from above, since the joblib backend of :class:`~GridSearchCV`
- is ``loky``, each process will only be able to use 1 thread instead of 8,
- thus mitigating the oversubscription issue.
+ our example from above, since the joblib backend of
+ :class:`~sklearn.model_selection.GridSearchCV` is ``loky``, each process will
+ only be able to use 1 thread instead of 8, thus mitigating the
+ oversubscription issue.

Note that:

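
The ``max_threads = n_cpus // n_jobs`` heuristic quoted in the second hunk can be sketched as follows. This is an illustration only, not joblib's implementation: the helper name is hypothetical, the clamp to at least one thread is an assumption for the case ``n_jobs > n_cpus``, and ``OMP_NUM_THREADS`` (the standard OpenMP variable) stands in for the "corresponding environment variable" that the text leaves unnamed.

```python
import os


def max_threads_per_worker(n_cpus: int, n_jobs: int) -> int:
    """Floor-divide the CPUs among the workers, as in the quoted heuristic."""
    # Assumption: never go below 1 thread, even when n_jobs exceeds n_cpus.
    return max(1, n_cpus // n_jobs)


# Example from the text: 8 CPUs shared by 8 worker processes.
limit = max_threads_per_worker(n_cpus=8, n_jobs=8)

# Build the environment a child process could be started with.
# OMP_NUM_THREADS caps the size of OpenMP thread pools in that process.
child_env = dict(os.environ, OMP_NUM_THREADS=str(limit))

print(limit, child_env["OMP_NUM_THREADS"])
```

With 8 CPUs and ``n_jobs=2`` the same helper would allow 4 threads per worker, so the total stays at the number of CPUs.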