Description
GridSearchCV's parameter n_jobs appears to indicate the number of jobs that the work should be split into, but it is actually the number of jobs that can run in parallel. The docs for joblib's Parallel class explain this much more clearly than sklearn's docs do:
n_jobs: int, default: 1
The maximum number of concurrently running jobs, such as the number of Python worker processes when backend="multiprocessing" or the size of the thread-pool when backend="threading". If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used.
Can that be clarified in the docs?
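
To illustrate the distinction, here is a minimal sketch of the arithmetic in the quoted joblib passage. The helper name resolve_n_jobs and the n_cpus argument are hypothetical and not part of sklearn or joblib. Rejecting 0 and clamping the result to at least one worker are assumptions on my part, since the quoted text does not cover those cases.

```python
def resolve_n_jobs(n_jobs: int, n_cpus: int) -> int:
    """Hypothetical helper: maximum number of concurrently running workers.

    Follows the rule quoted from the joblib docs; this is not joblib's code.
    """
    if n_jobs == 0:
        # Assumption: the quoted rule gives 0 no meaning, so reject it.
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all CPUs but one, and so on.
        # Assumption: never go below one worker.
        return max(n_cpus + 1 + n_jobs, 1)
    # Positive values are taken as the concurrency limit itself.
    return n_jobs


if __name__ == "__main__":
    # Show how a few settings resolve on a machine assumed to have 8 CPUs.
    for n_jobs in (1, 4, -1, -2):
        print(n_jobs, "->", resolve_n_jobs(n_jobs, n_cpus=8))
```

The result is only a cap on how many tasks run at the same time. It says nothing about how many tasks the grid search creates, and that is the point the sklearn docs leave unclear.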