Serialization error when using parallelism in cross_val_score with GridSearchCV and a custom estimator #12413


Closed
TomDLT opened this issue Oct 18, 2018 · 25 comments · Fixed by #12531

@TomDLT (Member) commented Oct 18, 2018

Minimal example:

import numpy as np
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.base import ClassifierMixin, BaseEstimator

class Dummy(ClassifierMixin, BaseEstimator):
    def __init__(self, answer=1):
        self.answer = answer

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return np.ones(X.shape[0], dtype='int') * self.answer

n_samples, n_features = 500, 8
X = np.random.randn(n_samples, n_features)
y = np.random.randint(0, 2, n_samples)

dummy = Dummy()
gcv = GridSearchCV(dummy, {'answer': [0, 1]}, cv=5, iid=False, n_jobs=1)
cross_val_score(gcv, X, y, cv=5, n_jobs=5)

# BrokenProcessPool: A task has failed to un-serialize.
# Please ensure that the arguments of the function are all picklable.

Full traceback in the details below.

Interestingly, it does not fail when:

  • calling cross_val_score with n_jobs=1.
  • calling cross_val_score directly on dummy, without GridSearchCV.
  • using an imported classifier, such as LogisticRegression, or even the same custom Dummy classifier imported from another file.

This is a joblib 0.12 issue, different from #12289 or #12389. @ogrisel @tomMoral
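The third bullet above also points to the standard workaround: define the custom estimator in a separate module and import it, so that the worker process can re-import the class by its qualified name instead of looking it up in __main__, where it no longer exists. A minimal sketch, assuming a hypothetical file name my_estimators.py:

# my_estimators.py (hypothetical file name)
import numpy as np
from sklearn.base import ClassifierMixin, BaseEstimator

class Dummy(ClassifierMixin, BaseEstimator):
    def __init__(self, answer=1):
        self.answer = answer

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return np.ones(X.shape[0], dtype='int') * self.answer

# main script: importing the class instead of defining it inline lets
# pickle serialize instances by reference, which the loky workers can resolve.
from my_estimators import Dummy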

Traceback (most recent call last):
  File "/cal/homes/tdupre/work/src/joblib/joblib/externals/loky/process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/cal/homes/tdupre/miniconda3/envs/py36/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'Dummy' on <module 'sklearn.externals.joblib.externals.loky.backend.popen_loky_posix' from '/cal/homes/tdupre/work/src/scikit-learn/sklearn/externals/joblib/externals/loky/backend/popen_loky_posix.py'>

The above exception was the direct cause of the following exception:

BrokenProcessPool                         Traceback (most recent call last)
~/work/src/script_csc/condition_effect/test.py in <module>()
     32 
     33     # fails
---> 34     cross_val_score(gcv, X, y, cv=5, n_jobs=5)
     35     """
     36     BrokenProcessPool: A task has failed to un-serialize.

~/work/src/scikit-learn/sklearn/model_selection/_validation.py in cross_val_score(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, error_score)
    384                                 fit_params=fit_params,
    385                                 pre_dispatch=pre_dispatch,
--> 386                                 error_score=error_score)
    387     return cv_results['test_score']
    388 

~/work/src/scikit-learn/sklearn/model_selection/_validation.py in cross_validate(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, return_train_score, return_estimator, error_score)
    232             return_times=True, return_estimator=return_estimator,
    233             error_score=error_score)
--> 234         for train, test in cv.split(X, y, groups))
    235 
    236     zipped_scores = list(zip(*scores))

~/work/src/joblib/joblib/parallel.py in __call__(self, iterable)
    996 
    997             with self._backend.retrieval_context():
--> 998                 self.retrieve()
    999             # Make sure that we get a last message telling us we are done
   1000             elapsed_time = time.time() - self._start_time

~/work/src/joblib/joblib/parallel.py in retrieve(self)
    899             try:
    900                 if getattr(self._backend, 'supports_timeout', False):
--> 901                     self._output.extend(job.get(timeout=self.timeout))
    902                 else:
    903                     self._output.extend(job.get())

~/work/src/joblib/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    519         AsyncResults.get from multiprocessing."""
    520         try:
--> 521             return future.result(timeout=timeout)
    522         except LokyTimeoutError:
    523             raise TimeoutError()

~/miniconda3/envs/py36/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    403                 raise CancelledError()
    404             elif self._state == FINISHED:
--> 405                 return self.__get_result()
    406             else:
    407                 raise TimeoutError()

~/miniconda3/envs/py36/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    355     def __get_result(self):
    356         if self._exception:
--> 357             raise self._exception
    358         else:
    359             return self._result

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
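The AttributeError in the worker shows the root cause: pickle stores a class defined in the launching script only by reference (module plus qualified name), and a fresh worker process that never executed the class definition cannot resolve that name. A rough illustration with plain pickle, independent of joblib:

import pickle

class Dummy:  # defined in __main__, as in the failing script
    pass

data = pickle.dumps(Dummy())  # succeeds: only the reference __main__.Dummy is stored
obj = pickle.loads(data)      # succeeds here, because this process defines Dummy
# In another process that never defined the class, pickle.loads(data) raises:
# AttributeError: Can't get attribute 'Dummy' on <module ...>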
@TomDLT TomDLT added the Bug label Oct 18, 2018
@TomDLT TomDLT added this to the 0.20.1 milestone Oct 18, 2018
@albertcthomas (Contributor)

I had the same issue a few days ago with a custom estimator. Thanks for the minimal example, @TomDLT.

@glemaitre (Member)

Duplicate of #12250

@sanketchavan08

Hi, thank you for posting this. But if we set n_jobs=1, it will take much longer, right? Because n_jobs=-1 uses all the processors. Did you find any way to solve this issue, or do we need a much more powerful PC?

@albertcthomas (Contributor)

You should not have such an error with the latest version of scikit-learn. Which version are you using?
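For anyone unsure, the installed versions can be checked from Python:

import sklearn
print(sklearn.__version__)  # e.g. '0.20.0'
sklearn.show_versions()     # available since scikit-learn 0.20; also prints
                            # Python, numpy and scipy versions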

@sanketchavan08

The scikit-learn version is 0.20.0, and Python 3.5.

@albertcthomas (Contributor)

This is solved in scikit-learn 0.20.1 and later.

@sanketchavan08 commented Jan 28, 2019

I really appreciate your help, thank you. I will check after updating whether that solves the problem.

After checking with both versions I am still getting the error. The error is the following:
BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

@elayden commented Feb 1, 2019

I am still encountering this problem when using either scikit-learn version 0.20.1 or 0.20.2.

@albertcthomas (Contributor)

@elayden could you post a reproducible example of the error?

@sanketchavan08 commented Feb 2, 2019

Here I am sharing my complete code:

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense

def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

I have checked for both scikit-learn versions 0.20.1 and 0.20.2.
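If this is the same pickling problem as the original report, the same workaround may apply: move build_classifier into its own importable module rather than defining it in the script or notebook that launches the workers, and guard the parallel call on Windows. A sketch only, with a hypothetical file name build_utils.py; it is not a confirmed fix for the Keras case:

# build_utils.py (hypothetical file name)
from keras.models import Sequential
from keras.layers import Dense

def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu', input_dim=11))
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu'))
    classifier.add(Dense(units=1, kernel_initializer='uniform',
                         activation='sigmoid'))
    classifier.compile(optimizer='adam', loss='binary_crossentropy',
                       metrics=['accuracy'])
    return classifier

# main script
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from build_utils import build_classifier

if __name__ == '__main__':  # required on Windows, where workers are spawned
    classifier = KerasClassifier(build_fn=build_classifier,
                                 batch_size=10, epochs=100)
    # X_train and y_train prepared earlier in the script
    accuracies = cross_val_score(estimator=classifier, X=X_train,
                                 y=y_train, cv=10, n_jobs=-1)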

@icoxfog417 commented Feb 7, 2019

I also have this problem with Python 3.6 and scikit-learn 0.20.2.
I can avoid the error by setting n_jobs=1.

@Hien-Trinh-91 commented Feb 9, 2019

@sanketchavan08 I tried the code on Windows and it did not work. I switched to Ubuntu and it worked well. But it only uses the CPU, not the GPU; maybe GPU is not supported for ANN training:
https://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-support-for-deep-or-reinforcement-learning-in-scikit-learn

@kenga commented Feb 18, 2019

I also have this problem on Windows 7, even though I am running on CPUs.
I am using:
joblib 0.13.2
scikit-learn 0.20.2
Python 3.6

@rfernandezv
I also have this problem using:
Python 3.7.1
scikit-learn 0.20.1

but n_jobs=1 helps

@albertcthomas (Contributor)

Could you please open a new issue with a reproducible example? The error seems to concern only Windows and custom estimators implemented using Keras, whereas the original issue concerned any platform and any custom estimator.

@rfernandezv commented Feb 22, 2019

Could you please open a new issue with a reproducible example? The error seems to concern only Windows and custom estimators implemented using Keras, whereas the original issue concerned any platform and any custom estimator.

My problem is the same example explained in the previous comments, and yes, I'm running 64-bit Windows 10 and using Keras. My lines of code are:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('example.csv')
X = dataset.iloc[:, 3:13].values    
y = dataset.iloc[:, 13].values      

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])  
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])   

onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense


def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Error:

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
Traceback (most recent call last):

  File "<ipython-input-4-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

I'm using:
Python 3.7.1
scikit-learn 0.20.1
Spyder 3.3.3

@albertcthomas (Contributor)

Thanks for the details @rfernandezv. What I meant is that the original issue, described in the top comment, is not the same as the one you describe, although it is similar. The original issue was solved and closed. It would be better to open a new issue for your error (related to Windows, Keras and joblib).

@rfernandezv

Thanks for the details @rfernandezv. What I meant is that the original issue, described in the top comment, is not the same as the one you describe, although it is similar. The original issue was solved and closed. It would be better to open a new issue for your error (related to Windows, Keras and joblib).

Thanks @albertcthomas. I opened a new issue #13228

@tsdeepak commented Mar 6, 2019

Using n_jobs=1 solved the error on Windows 10, Python 3.5.

@ShubhamCoder007
You need to install the joblib package.
If the error still happens, set n_jobs=None.

@rfernandezv

Using n_jobs=1 solved the error on Windows 10, Python 3.5.

I know, but I wanted to fix the error while keeping n_jobs=-1.

@llc8888 commented May 16, 2019

import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

Hope this works for you.

@vurdeljica

I get the same error with Python 3.7.3 and scikit-learn 0.20.2 under Windows 10. None of the above worked for me. Can this issue be solved somehow?

@ifranco14
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
Hope this works for you.

What does it do @llc8888?

@Sumirmat97

Having the same issue with scikit-learn v0.20.2.
Resolved it by updating to v0.22.2.

Hope it helps.
