GridSearchCV cannot be paralleled when custom scoring is used #10054
Comments
This is a limitation of pickling. Define my_custome_loss_func in a module you import, or at least not in a closure. |
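To make the pickling limitation concrete, here is a minimal sketch using only the standard pickle module. The names is_picklable, make_loss and closure_loss are made up for illustration and are not from the original report. Newer joblib backends may serialize functions differently, so treat this as an illustration of plain pickle behaviour only.

```python
import math
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_loss(scale):
    # Hypothetical factory: the inner function is a closure, so pickle
    # has no importable name to look it up by.
    def closure_loss(y_true, y_pred):
        return scale * abs(y_true - y_pred)
    return closure_loss


closure_loss = make_loss(2.0)

# math.hypot lives in an importable module, so pickle stores a reference to it.
print("importable function:", is_picklable(math.hypot))
# The closure and the lambda cannot be found by name at module level.
print("closure:", is_picklable(closure_loss))
print("lambda:", is_picklable(lambda a, b: abs(a - b)))
```

Pickle stores functions by reference (module name plus function name), which is why a function that can be imported by name survives the trip to a worker process and a closure does not.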
Wow. Was wondering what was wrong with my custom loss function. Turns out, it was n_jobs=-1 that was causing this issue! Removed that and the error vanishes.
|
Is this still an issue in version 0.20?
|
Yep.
On Tue, 9 Oct 2018 at 5:05 PM, Joel Nothman wrote:
Is this still an issue in version 0.20?
|
@fx86 can you please provide code to reproduce or at least the error message? |
@amueller Code is above.
|
That's not the same issue. There might be an issue open for this one already, though. cc @ogrisel ? |
The user is providing a function that isn't picklable. Please put rmsle into a module that can be imported. |
“Please put rmsle into a module that can be imported.”
If this were mentioned in the documentation with an example, it would save
a lot of head scratching.
On Wed, 10 Oct 2018 at 6:26 AM, Joel Nothman wrote:
The user is providing a function that isn't picklable. Please put rmsle
into a module that can be imported.
|
Where in the docs would you see it? Somewhere specific to scoring, even though it is relevant to all parallel processing where a custom function can be provided? Under http://scikit-learn.org/0.20/glossary.html#term-n-jobs?? http://scikit-learn.org/0.20/faq.html?? |
I feel this is an important caveat when writing a custom scoring function,
and it could be mentioned in
http://scikit-learn.org/stable/modules/model_evaluation.html#defining-your-scoring-strategy-from-metric-functions
The only other way I would have found it is with a site search on Google or on
Stack Overflow.
On Wed, Oct 10, 2018 at 8:54 AM Joel Nothman wrote:
Where in the docs would you see it? Somewhere specific to scoring, even
though it is relevant to all parallel processing where a custom function
can be provided? Under
http://scikit-learn.org/0.20/glossary.html#term-n-jobs??
http://scikit-learn.org/0.20/faq.html??
|
Would you like to submit a pull request?
|
Yes, I would. Is there a target date you guys are looking at for this? |
@fx86 whenever you can ;) No rush |
Hi, note that this issue is also present when using k-fold validation with scikit-learn via the cross_val_score() function. This has been reported to the scikit-learn team, and the fix is targeted for the next version, 0.20.1. See here for more info: #12250 |
Thanks. Newbie question here: how would one import a module with a custom scorer? I have the custom scorer below. How would I import that in a module? Hope this helps others as well.
def rmse_cv(y_true, y_pred) : |
Put rmse_cv in a file called custom.py, for instance, then |
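As a rough sketch of what that could look like: the block below writes a custom.py into a temporary directory only so that it runs on its own. In practice you would save custom.py next to your script. The body of rmse_cv is a hypothetical root-mean-squared-error implementation, since only its signature appears in this thread.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical body for rmse_cv; the thread only shows its signature.
CUSTOM_SOURCE = '''
import math

def rmse_cv(y_true, y_pred):
    """Root mean squared error over two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
'''

# Write custom.py into a temporary directory and make it importable.
module_dir = tempfile.mkdtemp()
Path(module_dir, "custom.py").write_text(CUSTOM_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

from custom import rmse_cv  # the same import you would write in your own script

# Because rmse_cv lives in an importable module, pickle stores only a
# reference to it and hands back the same function object on load.
restored = pickle.loads(pickle.dumps(rmse_cv))
print(restored is rmse_cv)
print(restored([1.0, 2.0], [1.0, 4.0]))
```

With the function imported this way, it can be wrapped as make_scorer(rmse_cv, greater_is_better=False) and passed as scoring, as in the cross_val_score call further down this thread.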
Thanks. Also with make_scorer? I imported the function and put make_scorer in my script (the one I imported into), and it said ‘Predict score’ object has no attribute ‘__name’ |
@GinoWoz1 I'm using it like this. How are you using it?
|
Thanks @fx86, that helped to remove the name error. But I'm still getting the below error with n_jobs=-1.
Put this in a separate file and import it. I can't get the code block working for the life of me... I hit the "insert code" option but it just doesn't render... sorry.
Function (need to indent after def... can't get it to work):
def rmse_cv(y_true, y_pred) :
Then run the below...
rkfold = RepeatedKFold(n_splits=5,n_repeats=5)
url = 'https://github.com/GinoWoz1/Learnings/raw/master/'
X_train = pd.read_csv(url + 'X_trainGA.csv',index_col= 'Unnamed: 0')
X_train.rename(columns={'Constant Term':'tax'},inplace=True)
elnet_final = ElasticNet()
elnet_pipe = Pipeline([('std',StandardScaler()),
cross_val = cross_val_score(elnet_pipe,X_train,y_train,scoring=make_scorer(rmse_cv,greater_is_better=False),cv=rkfold,n_jobs=-1) |
@GinoWoz1 This seems to be working okay for me. I'm on the following versions - `Python 3.6.0 :: Continuum Analytics, Inc. scikit-image==0.13.1 scikit-learn==0.20.0` |
Thanks, it was a package conflict error. It is working now. |
No need to open an issue first, @fx86 |
Same issue happened to me |
Hi,
I met a problem with the code:
in which I used a custom scoring object in GridSearchCV(...) and set n_jobs = 2.
I got the following error message:
It seems that if and only if n_jobs is set to 1 can the program be run.
Any ideas?