Serialization error when using parallelism in cross_val_score with GridSearchCV and a custom estimator #12413


Closed
TomDLT opened this issue Oct 18, 2018 · 25 comments · Fixed by #12531

@TomDLT (Member) commented Oct 18, 2018

Minimal example:

import numpy as np
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.base import ClassifierMixin, BaseEstimator

class Dummy(ClassifierMixin, BaseEstimator):
    def __init__(self, answer=1):
        self.answer = answer

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return np.ones(X.shape[0], dtype='int') * self.answer

n_samples, n_features = 500, 8
X = np.random.randn(n_samples, n_features)
y = np.random.randint(0, 2, n_samples)

dummy = Dummy()
gcv = GridSearchCV(dummy, {'answer': [0, 1]}, cv=5, iid=False, n_jobs=1)
cross_val_score(gcv, X, y, cv=5, n_jobs=5)

# BrokenProcessPool: A task has failed to un-serialize.
# Please ensure that the arguments of the function are all picklable.

Full traceback in the details below.

Interestingly, it does not fail when:

  • calling cross_val_score with n_jobs=1.
  • calling cross_val_score directly on dummy, without GridSearchCV.
  • using an imported classifier, such as LogisticRegression, or even the same custom Dummy classifier imported from another file.

This is a joblib 0.12 issue, different from #12289 or #12389. @ogrisel @tomMoral
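The third bullet above also points to the standard workaround: define the custom estimator in a separate module and import it, so that the worker process can re-import the class by its qualified name instead of looking it up in __main__, where it no longer exists. A minimal sketch, assuming a hypothetical file name my_estimators.py:

# my_estimators.py (hypothetical file name)
import numpy as np
from sklearn.base import ClassifierMixin, BaseEstimator

class Dummy(ClassifierMixin, BaseEstimator):
    def __init__(self, answer=1):
        self.answer = answer

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return np.ones(X.shape[0], dtype='int') * self.answer

# main script: importing the class instead of defining it inline lets
# pickle serialize instances by reference, which the loky workers can resolve.
from my_estimators import Dummy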

Traceback (most recent call last):
  File "/cal/homes/tdupre/work/src/joblib/joblib/externals/loky/process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/cal/homes/tdupre/miniconda3/envs/py36/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'Dummy' on <module 'sklearn.externals.joblib.externals.loky.backend.popen_loky_posix' from '/cal/homes/tdupre/work/src/scikit-learn/sklearn/externals/joblib/externals/loky/backend/popen_loky_posix.py'>

The above exception was the direct cause of the following exception:

BrokenProcessPool                         Traceback (most recent call last)
~/work/src/script_csc/condition_effect/test.py in <module>()
     32 
     33     # fails
---> 34     cross_val_score(gcv, X, y, cv=5, n_jobs=5)
     35     """
     36     BrokenProcessPool: A task has failed to un-serialize.

~/work/src/scikit-learn/sklearn/model_selection/_validation.py in cross_val_score(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, error_score)
    384                                 fit_params=fit_params,
    385                                 pre_dispatch=pre_dispatch,
--> 386                                 error_score=error_score)
    387     return cv_results['test_score']
    388 

~/work/src/scikit-learn/sklearn/model_selection/_validation.py in cross_validate(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, return_train_score, return_estimator, error_score)
    232             return_times=True, return_estimator=return_estimator,
    233             error_score=error_score)
--> 234         for train, test in cv.split(X, y, groups))
    235 
    236     zipped_scores = list(zip(*scores))

~/work/src/joblib/joblib/parallel.py in __call__(self, iterable)
    996 
    997             with self._backend.retrieval_context():
--> 998                 self.retrieve()
    999             # Make sure that we get a last message telling us we are done
   1000             elapsed_time = time.time() - self._start_time

~/work/src/joblib/joblib/parallel.py in retrieve(self)
    899             try:
    900                 if getattr(self._backend, 'supports_timeout', False):
--> 901                     self._output.extend(job.get(timeout=self.timeout))
    902                 else:
    903                     self._output.extend(job.get())

~/work/src/joblib/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    519         AsyncResults.get from multiprocessing."""
    520         try:
--> 521             return future.result(timeout=timeout)
    522         except LokyTimeoutError:
    523             raise TimeoutError()

~/miniconda3/envs/py36/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    403                 raise CancelledError()
    404             elif self._state == FINISHED:
--> 405                 return self.__get_result()
    406             else:
    407                 raise TimeoutError()

~/miniconda3/envs/py36/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    355     def __get_result(self):
    356         if self._exception:
--> 357             raise self._exception
    358         else:
    359             return self._result

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
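The AttributeError in the worker shows the root cause: pickle stores a class defined in the launching script only by reference (module plus qualified name), and a fresh worker process that never executed the class definition cannot resolve that name. A rough illustration with plain pickle, independent of joblib:

import pickle

class Dummy:  # defined in __main__, as in the failing script
    pass

data = pickle.dumps(Dummy())  # succeeds: only the reference __main__.Dummy is stored
obj = pickle.loads(data)      # succeeds here, because this process defines Dummy
# In another process that never defined the class, pickle.loads(data) raises:
# AttributeError: Can't get attribute 'Dummy' on <module ...>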
@TomDLT TomDLT added the Bug label Oct 18, 2018
@TomDLT TomDLT added this to the 0.20.1 milestone Oct 18, 2018
@albertcthomas (Contributor)

I had the same issue a few days ago with a custom estimator. Thanks for the minimal example, @TomDLT.

@glemaitre (Member)

Duplicate of #12250

@sanketchavan08

Hi, thank you for posting this. But if we set n_jobs=1, it will take much longer, right? Because n_jobs=-1 uses all the processors. Did you find any way to solve this issue, or do we need a much more powerful PC?

@albertcthomas (Contributor)

You should not have such an error with the latest version of scikit-learn. Which version are you using?
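For anyone unsure, the installed versions can be checked from Python:

import sklearn
print(sklearn.__version__)  # e.g. '0.20.0'
sklearn.show_versions()     # available since scikit-learn 0.20; also prints
                            # Python, numpy and scipy versions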

@sanketchavan08

The scikit-learn version is 0.20.0, and Python 3.5.

@albertcthomas (Contributor)

This is solved in scikit-learn 0.20.1 and later.

@sanketchavan08 commented Jan 28, 2019

I really appreciate your help, thank you. I will check after updating whether that solves the problem.

After checking with both versions I am still getting the error. The error is the following:
BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

@elayden commented Feb 1, 2019

I am still encountering this problem when using either scikit-learn version 0.20.1 or 0.20.2.

@albertcthomas (Contributor)

@elayden could you post a reproducible example of the error?

@sanketchavan08 commented Feb 2, 2019

Here I am sharing my complete code:

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense

def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

I have checked for both scikit-learn versions 0.20.1 and 0.20.2.
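If this is the same pickling problem as the original report, the same workaround may apply: move build_classifier into its own importable module rather than defining it in the script or notebook that launches the workers, and guard the parallel call on Windows. A sketch only, with a hypothetical file name build_utils.py; it is not a confirmed fix for the Keras case:

# build_utils.py (hypothetical file name)
from keras.models import Sequential
from keras.layers import Dense

def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu', input_dim=11))
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu'))
    classifier.add(Dense(units=1, kernel_initializer='uniform',
                         activation='sigmoid'))
    classifier.compile(optimizer='adam', loss='binary_crossentropy',
                       metrics=['accuracy'])
    return classifier

# main script
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from build_utils import build_classifier

if __name__ == '__main__':  # required on Windows, where workers are spawned
    classifier = KerasClassifier(build_fn=build_classifier,
                                 batch_size=10, epochs=100)
    # X_train and y_train prepared earlier in the script
    accuracies = cross_val_score(estimator=classifier, X=X_train,
                                 y=y_train, cv=10, n_jobs=-1)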

@icoxfog417 commented Feb 7, 2019

I also have this problem with Python 3.6 and scikit-learn 0.20.2.
I can avoid the error by setting n_jobs=1.

@Hien-Trinh-91 commented Feb 9, 2019

@sanketchavan08 I tried the code on Windows and it did not work. I switched to Ubuntu and it worked well. But it only uses the CPU, not the GPU; maybe GPU is not supported for ANN training:
https://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-support-for-deep-or-reinforcement-learning-in-scikit-learn

@kenga commented Feb 18, 2019

I also have this problem on Windows 7, even though I am running on CPUs.
I am using:
joblib 0.13.2
scikit-learn 0.20.2
Python 3.6

@rfernandezv
I also have this problem using:
Python 3.7.1
scikit-learn 0.20.1

but n_jobs=1 helps

@albertcthomas (Contributor)

Could you please open a new issue with a reproducible example? The error seems to concern only Windows and custom estimators implemented using Keras, whereas the original issue concerned any platform and any custom estimator.

@rfernandezv commented Feb 22, 2019

Could you please open a new issue with a reproducible example? The error seems to concern only Windows and custom estimators implemented using Keras, whereas the original issue concerned any platform and any custom estimator.

My problem is the same example explained in the previous comments, and yes, I'm running 64-bit Windows 10 and using Keras. My lines of code are:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('example.csv')
X = dataset.iloc[:, 3:13].values    
y = dataset.iloc[:, 13].values      

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])  
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])   

onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense


def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Error:

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
Traceback (most recent call last):

  File "<ipython-input-4-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

I'm using:
Python 3.7.1
scikit-learn 0.20.1
Spyder 3.3.3

@albertcthomas (Contributor)

Thanks for the details @rfernandezv. What I meant is that the original issue, described in the top comment, is not the same as the one you describe, although it is similar. The original issue was solved and closed. It would be better to open a new issue for your error (related to Windows, Keras and joblib).

@rfernandezv

Thanks for the details @rfernandezv. What I meant is that the original issue, described in the top comment, is not the same as the one you describe, although it is similar. The original issue was solved and closed. It would be better to open a new issue for your error (related to Windows, Keras and joblib).

Thanks @albertcthomas. I opened a new issue #13228

@tsdeepak commented Mar 6, 2019

Using n_jobs=1 solved the error on Windows 10, Python 3.5.

@ShubhamCoder007
You need to install the joblib package.
If the error still happens, set n_jobs=None.

@rfernandezv

Using n_jobs=1 solved the error on Windows 10, Python 3.5.

I know, but I wanted to fix the error while keeping n_jobs=-1.

@llc8888 commented May 16, 2019

import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

Hope this works for you.

@vurdeljica

I get the same error with Python 3.7.3 and scikit-learn 0.20.2 under Windows 10. None of the above worked for me. Can this issue be solved somehow?

@ifranco14
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
Hope this works for you.

What does it do @llc8888?

@Sumirmat97

Having the same issue with scikit-learn v0.20.2.
Resolved it by updating to v0.22.2.

Hope it helps.
