cross_val_score issue with n_jobs = -1 on Windows · Issue #13228 · scikit-learn/scikit-learn · GitHub

cross_val_score issue with n_jobs = -1 on Windows #13228


Closed
rfernandezv opened this issue Feb 22, 2019 · 45 comments

Comments

@rfernandezv

Description

The error is thrown when using n_jobs = -1 with the function cross_val_score. If I use n_jobs = 1, it works fine.

Steps/Code to Reproduce

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('example.csv')
X = dataset.iloc[:, 3:13].values    
y = dataset.iloc[:, 13].values      

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])  
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])   

onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:, 1:]

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense


def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11))
    classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, epochs = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Expected Results

I expect my example to run the cross-validation folds in parallel.

Actual Results

Error:

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
Traceback (most recent call last):

  File "<ipython-input-4-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

I think the problem is related to Windows, Keras, and joblib.

Versions

I'm using:

  • 64-bit Windows 10
  • Python 3.7.1
  • scikit-learn 0.20.1
  • Spyder 3.3.3

Thanks for the help.

@jnothman
Member

Put build_classifier in a separate module and import it
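
For instance, a minimal sketch of that pattern (the file name build_fn.py is just a placeholder):

# build_fn.py -- a small importable module holding only the builder
from keras.models import Sequential
from keras.layers import Dense

def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu', input_dim=11))
    classifier.add(Dense(units=1, kernel_initializer='uniform',
                         activation='sigmoid'))
    classifier.compile(optimizer='adam', loss='binary_crossentropy',
                       metrics=['accuracy'])
    return classifier

# main script -- import the builder by name instead of defining it inline
from build_fn import build_classifier
from keras.wrappers.scikit_learn import KerasClassifier

classifier = KerasClassifier(build_fn=build_classifier, batch_size=10, epochs=100)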

@christian-steinmeyer

May I ask how this fixes the issue, and whether this is intended behavior?

@albertcthomas
Contributor

@rfernandezv or anyone else having this error, could you please try @jnothman's solution and tell us if this solves the issue?

@christian-steinmeyer

For me, this seems to do the trick. My code hasn't finished yet (total runtime is many hours), but so far there are no errors, which wasn't the case before.

@jnothman
Member

Something can only be unpickled if it can be imported by name.
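
A quick way to see this (a standalone sketch, independent of Keras): the standard pickle module stores a function only as a module-plus-name reference, so the worker that unpickles it must be able to import that name.

import pickle
import pickletools

def build_classifier():
    pass

payload = pickle.dumps(build_classifier)
# the disassembly shows a reference like '__main__' 'build_classifier';
# nothing about the function body is stored in the payload
pickletools.dis(payload)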

@albertcthomas
Contributor

This should work with cloudpickle, maybe with the next release of joblib.
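
For illustration, a minimal sketch of the difference, assuming the standalone cloudpickle package is installed:

import pickle
import cloudpickle

def build_classifier():
    return 'model'

blob = cloudpickle.dumps(build_classifier)  # serialized by value, not by name
rebuilt = pickle.loads(blob)                # no importable module required
print(rebuilt())                            # -> 'model'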

@rfernandezv
Author

@rfernandezv or anyone else having this error, could you please try @jnothman's solution and tell us if this solves the issue?

I tried but I still get the same error.

from newFile import build_classifier_function
# Evaluating the ANN

classifier = KerasClassifier(build_fn = build_classifier_function, batch_size = 10, epochs = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

Error:

Traceback (most recent call last):

  File "<ipython-input-50-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

I updated joblib to version 0.13.2 and still have the same error.
It only works for me with n_jobs = 1, not -1.

@albertcthomas
Contributor

cross_val_score uses the version of joblib vendored with scikit-learn, so updating the standalone joblib won't solve the error; you need to wait for the vendored joblib to be updated before you can try it in your case.

However, I don't know why you still have the error when importing from another module.
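
One way to see the two copies side by side on scikit-learn 0.20, where joblib is still vendored (a small sketch; both imports exist in that release):

import joblib                                            # the standalone package you upgraded
from sklearn.externals import joblib as vendored_joblib  # the copy cross_val_score uses

print(joblib.__version__)           # e.g. 0.13.2
print(vendored_joblib.__version__)  # fixed by the scikit-learn release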

@tomMoral
Contributor
tomMoral commented Mar 7, 2019

I could not reproduce your bug with the latest version of sklearn.
Could you update your version to 0.20.2 and report if there are any changes?

For the record, with the newest versions of joblib this should work even if you do not put the build_classifier function in a new module, as we are relying on cloudpickle.

@rfernandezv
Author

I could not reproduce your bug with the latest version of sklearn.
Could you update your version to 0.20.2 and report if there are any changes?

For the record, with the newest versions of joblib this should work even if you do not put the build_classifier function in a new module, as we are relying on cloudpickle.

I tried, but I got the same error :(

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

Updated to version 0.20.3

import sklearn
print(sklearn.__version__)  # 0.20.3

@tomMoral
Contributor
tomMoral commented Mar 8, 2019 via email

@rfernandezv
Author

Look, this is the complete error:

accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
Traceback (most recent call last):

  File "<ipython-input-6-cc51c2d2980a>", line 1, in <module>
    accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

@albertcthomas
Contributor

@rfernandezv I could not reproduce the error either with the following code on Windows:

from keras.layers import Dense
from keras.models import Sequential
from sklearn.model_selection import cross_val_score
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

from sklearn.datasets import make_classification

X, y = make_classification(n_features=11)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)


def build_classifier():
    classifier = Sequential()
    classifier.add(Dense(units=6, kernel_initializer='uniform',
                         activation='relu', input_dim=11))
    classifier.add(
        Dense(units=6, kernel_initializer='uniform', activation='relu'))
    classifier.add(
        Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
    classifier.compile(
        optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return classifier


classifier = KerasClassifier(
    build_fn=build_classifier, batch_size=10, epochs=100)
accuracies = cross_val_score(
    estimator=classifier, X=X_train, y=y_train, cv=10, n_jobs=-1)

The versions that I'm using:
Windows 10
Python 3.6.6
NumPy 1.15.4
SciPy 1.2.0
Scikit-Learn 0.20.2
Keras 2.2.4
Tensorflow 1.12.0

@olgazju
olgazju commented Mar 11, 2019

I have the same error with cross_val_score and n_jobs=-1

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

Python 3.7.2
scikit-learn==0.20.3
joblib==0.13.2
windows 8.1

@christian-steinmeyer
christian-steinmeyer commented Mar 19, 2019

I've run into this problem again, in a slightly different form, using scikit-learn 0.20.3:
I've set up a grid search in a for loop (depending on a balance strategy, as I have imbalanced data). Now, interestingly, this runs through smoothly when executed for the first time; however, on the second run I get a similar error to the one above.

Here's a narrowed down version of my code:

from other.module import multi_metric_score, create_model, early_stopping

# ... define parameters

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=256, verbose=0)
multi_metric_scorer = make_scorer(multi_metric_score, greater_is_better=True)

for balance_strategy, _ in balance_strategies.items():
    # ... get X, y in dependence of balance_strategy

    skfs = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
    classifier = GridSearchCV(estimator=model, param_grid=parameters, cv=skfs, scoring=multi_metric_scorer, verbose=1, n_jobs=-1,
                              return_train_score=True)
    # ... define ratio
    results = classifier.fit(X, y, callbacks=early_stopping, class_weight=[ratio, 1.0-ratio])
    data = pd.DataFrame(results.cv_results_['params'])
    data['mean_test_score'] = results.cv_results_['mean_test_score']
    data['mean_fit_time'] = results.cv_results_['mean_fit_time']
    data['balance_strategy'] = balance_strategy

And here is the console output including the complete stack trace:

....    
Fitting 3 folds for each of 1 candidates, totalling 3 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
Using TensorFlow backend.
Using TensorFlow backend.
Using TensorFlow backend.
...
[Parallel(n_jobs=-1)]: Done   3 out of   3 | elapsed:   12.3s finished
2019-03-19 13:39:38.850213: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
Fitting 3 folds for each of 1 candidates, totalling 3 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\externals\loky\backend\queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\externals\loky\backend\reduction.py", line 243, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\externals\loky\backend\reduction.py", line 236, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\externals\cloudpickle\cloudpickle.py", line 284, in dump
    return Pickler.dump(self, obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 409, in dump
    self.save(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 852, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 808, in _batch_appends
    save(tmp[0])
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 736, in save_tuple
    save(element)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 808, in _batch_appends
    save(tmp[0])
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 808, in _batch_appends
    save(tmp[0])
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 805, in _batch_appends
    save(x)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 521, in 
8000
save
    self.save_reduce(obj=obj, *rv)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 634, in save_reduce
    save(state)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 847, in _batch_setitems
    save(v)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\pickle.py", line 496, in save
    rv = reduce(self.proto)
TypeError: can't pickle _thread.RLock objects
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/chris/Documents/Academics/Uni/Georg-August-University/Computer Science/Masterarbeit/code/src/model-grid-search.py", line 42, in <module>
    results = classifier.fit(X, y, callbacks=early_stopping, class_weight=[ratio, 1.0-ratio])
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\model_selection\_search.py", line 722, in fit
    self._run_search(evaluate_candidates)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\model_selection\_search.py", line 1191, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\model_selection\_search.py", line 711, in evaluate_candidates
    cv.split(X, y, groups)))
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()
  File "C:\ProgramData\Miniconda\envs\nnmp\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
_pickle.PicklingError: Could not pickle the task to send it to the workers.
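
The can't pickle _thread.RLock objects step at the bottom of the remote traceback suggests that by the second iteration the object shipped to the workers already holds live TensorFlow state (a built model or session). One thing to try, sketched here by reusing the names from the snippet above (so not a confirmed fix and not self-contained), is to construct a fresh, unfitted wrapper inside the loop:

# sketch: recreate the unfitted KerasClassifier each iteration so no live
# TensorFlow objects (sessions, locks) ride along to the worker processes
for balance_strategy, _ in balance_strategies.items():
    model = KerasClassifier(build_fn=create_model, epochs=100,
                            batch_size=256, verbose=0)
    classifier = GridSearchCV(estimator=model, param_grid=parameters, cv=skfs,
                              scoring=multi_metric_scorer, verbose=1,
                              n_jobs=-1, return_train_score=True)
    results = classifier.fit(X, y, callbacks=early_stopping,
                             class_weight=[ratio, 1.0 - ratio])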

@rfernandezv
Author

@albertcthomas, I ran your code but it does not work

Traceback (most recent call last):

  File "<ipython-input-5-8e611add1011>", line 2, in <module>
    estimator=classifier, X=X_train, y=y_train, cv=10, n_jobs=-1)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 930, in __call__
    self.retrieve()

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "C:\Users\Richar\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()

  File "C:\Users\Richar\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

@shahabsh92

Hi, I have exactly the same issue as @rfernandezv.
I get the same error if I use any number other than 1 for n_jobs in the cross_val_score function.
The code is the same.

Python 3.7.1
Keras 2.2.4
TensorFlow 1.13.1
scipy 1.1.0
Numpy 1.15.4
Sklearn 0.20.2

@rfernandezv
Author
rfernandezv commented Apr 27, 2019

I have not found a solution yet. @shahabsh92, did you try updating your sklearn version?
I updated to version 0.20.3 but it did not work.

@kush-daga

Has anyone been able to solve this issue? I'm getting the same error in a conda environment with Python 3.7.

@shahabsh92

@rfernandezv yes, I created another environment in Anaconda with the latest version of sklearn (0.20.3) + keras-gpu and it didn't work either. It only works with n_jobs = 1 and nothing else.

@albertcthomas
Contributor

@shahabsh92 you mean that you have the same error as @rfernandezv when you run the code in this comment?

@kush-daga

I feel it's an issue with the IPython console in Spyder. I ran the same code in the terminal of VS Code using the Python console, and also in Jupyter; it worked.

@nicholasg97

I feel it's an issue with the IPython console in Spyder. I ran the same code in the terminal of VS Code using the Python console, and also in Jupyter; it worked.

I concur, I'm having the same issue with Spyder. Restarting the kernel doesn't work...

@shahabsh92

@albertcthomas yes, exactly. I get errors in both Spyder and Jupyter, both BrokenProcessPool like @rfernandezv.

That's the error in the IPython console of Spyder:

exception calling callback for <Future at 0x2139edbbe10 state=finished raised BrokenProcessPool>
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "D:\Softwares\Anacoda\lib\multiprocessing\queues.py", line 113, in get
    return ForkingPickler.loads(res)
  File "D:\Softwares\Anacoda\lib\site-packages\keras\__init__.py", line 3, in <module>
    from . import utils
  File "D:\Softwares\Anacoda\lib\site-packages\keras\utils\__init__.py", line 6, in <module>
    from . import conv_utils
  File "D:\Softwares\Anacoda\lib\site-packages\keras\utils\conv_utils.py", line 9, in <module>
    from .. import backend as K
  File "D:\Softwares\Anacoda\lib\site-packages\keras\backend\__init__.py", line 88, in <module>
    sys.stderr.write('Using TensorFlow backend.\n')
AttributeError: 'NoneType' object has no attribute 'write'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 375, in __call__
    self.parallel.dispatch_next()
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 797, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 506, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1016, in submit
    raise self._flags.broken
sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

Traceback (most recent call last):

  File "", line 4, in <module>
    estimator=classifier, X=X_train, y=y_train, cv=10, n_jobs=-1)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\model_selection\_validation.py", line 402, in cross_val_score
    error_score=error_score)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\model_selection\_validation.py", line 240, in cross_validate
    for train, test in cv.split(X, y, groups))

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 996, in __call__
    self.retrieve()

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 899, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 517, in wrap_future_result
    return future.result(timeout=timeout)

  File "D:\Softwares\Anacoda\lib\concurrent\futures\_base.py", line 405, in result
    return self.__get_result()

  File "D:\Softwares\Anacoda\lib\concurrent\futures\_base.py", line 357, in __get_result
    raise self._exception

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\_base.py", line 625, in _invoke_callbacks
    callback(self)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 375, in __call__
    self.parallel.dispatch_next()

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 797, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 506, in apply_async
    future = self._workers.submit(SafeFunction(func))

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)

  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 1016, in submit
    raise self._flags.broken

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.


@albertcthomas
Contributor

@rfernandezv and @shahabsh92, do you still have the error if you run your script using
python script.py in a terminal or the Anaconda prompt?
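
For reference, process-based backends on Windows spawn workers that re-import the main script, so the parallel call needs to live under an import guard when run as a plain script. A minimal runnable sketch of the pattern (using LogisticRegression as a stand-in so it runs without Keras):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def main():
    X, y = make_classification(n_samples=500, n_features=11, random_state=0)
    clf = LogisticRegression(solver='lbfgs')
    print(cross_val_score(clf, X, y, cv=10, n_jobs=-1))

if __name__ == '__main__':  # required on Windows: each worker re-imports this file
    main()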

@shahabsh92

@albertcthomas I tried it in the terminal and I got an error again.

I will post pictures of the error. Something I find interesting is that there are three errors, and the first one complains about build_classifier, saying that it cannot find it (an AttributeError).

[Screenshots attached: TerminalError1, TerminalError2, TerminalError3]

@tomMoral
Contributor
tomMoral commented May 1, 2019

The errors labeled BrokenProcessPool are caused directly by the error:

Traceback (most recent call last):
  File "D:\Softwares\Anacoda\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 393, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "D:\Softwares\Anacoda\lib\multiprocessing\queues.py", line 113, in get
    return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'build_classifier' on <module '__main__' (built-in)>

This is an error in the serialization process for both tracebacks you showed. (The previous one is because keras tries to write to sys.stderr, which is None.)

The serialisation process for loky should be conducted with cloudpickle and should not fail, but it seems something is wrong here. Could you please confirm that you are using version 0.20.3 of scikit-learn? If yes, could you share the script you are running?
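
As a hedged workaround sketch for the 'NoneType' object has no attribute 'write' part specifically (Keras writing its backend banner to a stderr that does not exist in the spawned worker), one can give the process a usable stream before Keras is imported, e.g. at the top of the module that defines build_classifier:

import os
import sys

if sys.stderr is None:  # can happen in subprocesses spawned from GUI consoles
    sys.stderr = open(os.devnull, 'w')

import keras  # safe now: the 'Using TensorFlow backend.' banner has somewhere to go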

@rfernandezv
Author
rfernandezv commented May 2, 2019

@albertcthomas, I still have the problem, but when I ran it in Visual Studio Code using the Jupyter server everything worked fine. I think the problem is Anaconda.

@shahabsh92 you can run it in VS Code

@albertcthomas
Contributor
albertcthomas commented May 2, 2019

@rfernandezv and @shahabsh92: I cannot reproduce your error when running the code I posted in the comment above, both in cmder and in the Anaconda prompt, in a conda env with the same configuration as @shahabsh92:

python 3.7.1
sklearn 0.20.3
tensorflow 1.13.1
keras 2.2.4
numpy 1.15.4
scipy 1.1.0

I'm using conda 4.6.14

@streetr0ck

So there's still no solution to this problem?
I think the majority of people on this forum who have this problem are following the same program in the same tutorial, and I am one of them today.

I bought a new laptop with an i5-8300H processor and a GTX 1050 Ti Max-Q because I thought the problem came from the hardware, but obviously that's not the case.

If someone has found a solution to this problem I would very much like to know it.

@rth
Member
rth commented Jul 26, 2019

@streetr0ck Which versions of Python, scikit-learn and joblib are you using? https://scikit-learn.org/stable/developers/contributing.html#how-to-make-a-good-bug-report

@streetr0ck
streetr0ck commented Jul 26, 2019

@rth I'm using:
Windows-10-10.0.17134-SP0
Python 3.6.8 |Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)]
scikit-learn version is 0.20.3.
joblib version is 0.13.0.

@rfernandezv
Author

@streetr0ck
I was using Spyder and had the problem, but now I run it in Visual Studio Code and the same code works correctly.

@streetr0ck

@rfernandezv
I had some problems with Visual Studio Code, but after launching the Python script I was able to use all the cores. It worked fine.
It's a pity there's no solution for Spyder.
Thank you for the help.

@streetr0ck
streetr0ck commented Jul 28, 2019

@rfernandezv
I'm just curious to know if you had a broken display in the terminal, as shown in the picture attached to this message.

[Screenshot attached: Capture d’écran (26)]

@trungnghia2009

I also got the same problem in Spyder (it works only with n_jobs = 1), but in VS Code it worked well.
I'm using Windows for the tutorial.

[Screenshot attached: parallel computing]

@streetr0ck

Thank you very much @trungnghia2009 for the answer.

@suvofalcon
suvofalcon commented Oct 29, 2019

How will n_jobs=-1 work, or will it work at all, if you run your TensorFlow backend on a GPU?

In my opinion, n_jobs=-1 will try to use all the cores for CV, and each of them will try to start a TF session on the GPU.

In short, n_jobs=-1 doesn't work for me when using a GPU.
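
If several worker processes do have to share one GPU, a sketch for the TF 1.x API current at the time (assumes tensorflow 1.x with the standalone keras package; it may still not be enough, since each process allocates its own CUDA context) is to enable on-demand memory allocation inside build_classifier:

import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand, not all upfront
K.set_session(tf.Session(config=config))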

@hanss01
hanss01 commented Jan 3, 2020

so you all come from deep learning a-z right? XD

@metalglove

I am having the same problem.

@ghost
ghost commented Jan 19, 2020

For all you deep learning a-z and??

Problem solved, in a way. I upgraded joblib from 0.13.2 to 0.14.1 and the program completes using Spyder with TensorFlow-GPU and n_jobs = -1. However, you do not see all the console output (the progress animations).

conda install joblib=0.14.1

Here are the results, print (accuracies), using joblib 0.13.2 and running from Anaconda command prompt
[0.83125001 0.83875 0.82999998 0.83125001 0.86624998 0.86374998
0.83375001 0.82249999 0.84375 0.85250002]
print(mean) = 0.8413749992847442
print(variance) = 0.014158142154973179

Here are the results, print (accuracies), using joblib 0.14.1 running within Spyder
[0.84375 0.83499998 0.83249998 0.83125001 0.8725 0.83875
0.83125001 0.82749999 0.80874997 0.86624998]
print(mean) = 0.8387499928474427
print(variance) = 0.017642281365836756

I also updated scikit-learn to 0.22.1, but I feel that really has nothing to do with this working.

Edit: to add the mean and variance
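
For context: recent scikit-learn releases (0.22 among them) depend on the standalone joblib package rather than the vendored copy, which is why upgrading joblib itself now changes cross_val_score's behavior. A quick check sketch:

import sklearn
import joblib

print(sklearn.__version__)  # 0.22.1 in the comment above
print(joblib.__version__)   # 0.14.1 -- the copy scikit-learn now actually uses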

@blurred-machine

For me, the same error came up, but it was solved when I removed
n_jobs=-1
from the code; then it worked perfectly.
Maybe there is some issue with multithreading compatibility for parallel computing.
Can you help me understand what the real reason could be?

@negrinij
negrinij commented May 21, 2020

Hi, just in case it's helpful: I had to update Anaconda to 1.9.12 and Spyder to 4.1.3, their current latest versions. Updating joblib and the other packages mentioned above did not do the trick for me. As other users mentioned, maybe it is something with Spyder/Anaconda.

@Jolomi-Tosanwumi

Any solution to this issue? I am having the same errors when n_jobs is set to any number greater than 1. I am trying to do cross validation on a Keras model wrapped in sklearn's KerasClassifier.

@mattiaguerri

Issue still persists, the suggested solution does not fix it.
