Make it possible to pickle a fit `GridSearchCV` and `RandomizedSearchCV` #1801

jnothman · 2013-03-22T07:16:03Z

Fix for #1789:

Don't make best_estimator_'s methods available by copying them onto the search object. Instead, expose them through properties that delegate to best_estimator_ (this also adds properties for decision_function and transform, where previously only predict, predict_proba and score had been available).
Also, don't declare CVScoreTuple in a closure.
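
To illustrate the closure point: pickle stores a class by its module and name and looks it up again by that name, so a namedtuple type created inside a function cannot be found. The following is only a minimal sketch, not the code in this patch. The names `ModuleScore`, `make_local_score` and `can_pickle` are hypothetical, and it assumes the snippet is run as a script or imported as a module.

```python
import pickle
from collections import namedtuple

# Defined at module level: pickle can look the class up by name.
# (Hypothetical name, not the tuple type used in the patch.)
ModuleScore = namedtuple("ModuleScore", ["parameters", "mean_score"])


def make_local_score():
    # Defined inside a function: the class is not an attribute of the
    # module, so pickle cannot find it by name.
    LocalScore = namedtuple("LocalScore", ["parameters", "mean_score"])
    return LocalScore({"C": 1.0}, 0.9)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


module_ok = can_pickle(ModuleScore({"C": 1.0}, 0.9))
local_ok = can_pickle(make_local_score())
```

Here `module_ok` reflects the module-level definition and `local_ok` the closure-style one; only the former can be serialised.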

amueller · 2013-03-22T08:34:23Z

Great, I think this is the right solution. Not sure we need the underscore in front of CVScoreTuple, though. Could you please make the tests more than smoke tests? For example, unpickle and check that the results of predict are the same.
You could also assert that the results of transform and decision_function are the same as those of best_estimator_. Which estimator did you test transform with? That only works with LDA, right?

One downside of your solution is that it is no longer possible to duck-type the presence of decision_function (for example). Not sure this would be possible, though. If a base estimator doesn't provide a given function, an AttributeError is raised, correct?
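
The round-trip check suggested above could look roughly like the sketch below. It uses hypothetical stand-ins (`ThresholdClassifier`, `ToySearch`) rather than real scikit-learn classes, so it only illustrates the shape of such a test: fit, pickle, unpickle, then compare the delegated methods against best_estimator_.

```python
import pickle


class ThresholdClassifier:
    """Hypothetical stand-in for an estimator with a single parameter."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def decision_function(self, X):
        return [x - self.threshold for x in X]

    def predict(self, X):
        return [int(score > 0) for score in self.decision_function(X)]


class ToySearch:
    """Hypothetical stand-in for a parameter search over thresholds."""

    def __init__(self, estimator_cls, thresholds):
        self.estimator_cls = estimator_cls
        self.thresholds = thresholds

    def fit(self, X, y):
        def n_correct(est):
            return sum(p == t for p, t in zip(est.predict(X), y))

        candidates = [self.estimator_cls(t) for t in self.thresholds]
        self.best_estimator_ = max(candidates, key=n_correct)
        return self

    # Properties live on the class, so nothing method-like is stored
    # in the instance __dict__ that pickle would have to serialise.
    @property
    def predict(self):
        return self.best_estimator_.predict

    @property
    def decision_function(self):
        return self.best_estimator_.decision_function


X = [-2.0, -1.0, 0.5, 1.5, 3.0]
y = [0, 0, 0, 1, 1]
search = ToySearch(ThresholdClassifier, thresholds=[0.0, 1.0, 2.0]).fit(X, y)

# Round trip through pickle and compare against the original object.
restored = pickle.loads(pickle.dumps(search))
same_predictions = restored.predict(X) == search.predict(X)
same_decisions = (
    restored.decision_function(X) == search.best_estimator_.decision_function(X)
)
```

If the real test costs about as much as a smoke test, comparing outputs this way also catches state that is silently lost during serialisation.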

jnothman · 2013-03-23T13:04:41Z

I'll add some non-smoke testing of the pickle soon.

I'm only smoke-testing transform, etc. with test_grid_search.MockClassifier. And estimators with L1 regularisation also implement transform.

To answer your second point (I hadn't been sure of the answer myself):

>>> class O(object):
...     @property
...     def a(self):
...         return self.b  # raises AttributeError until b is set
...
>>> o = O()
>>> hasattr(o, 'a')
False
>>> o.b = 5
>>> hasattr(o, 'a')
True

I.e. if AttributeError is raised in property.__get__, hasattr will return False.

This too should be tested somehow, but I'm not sure if it's for this patch, or generally an issue for meta-classifiers and functions only available after fit().

This suggests that anything relying on best_estimator_ is best implemented as a property (including score)!
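
Applied to a meta-estimator, the same behaviour can be sketched as follows. `Delegator` and `OnlyPredicts` are hypothetical names for illustration, not classes from the patch; the point is that hasattr follows whatever the wrapped estimator provides, and only once fit() has set best_estimator_.

```python
class OnlyPredicts:
    """Hypothetical estimator that offers predict but no decision_function."""

    def predict(self, X):
        return [0 for _ in X]


class Delegator:
    """Hypothetical meta-estimator that delegates through properties."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # A real search would pick the best candidate; here the wrapped
        # estimator is simply reused.
        self.best_estimator_ = self.estimator
        return self

    @property
    def predict(self):
        return self.best_estimator_.predict

    @property
    def decision_function(self):
        return self.best_estimator_.decision_function


wrapper = Delegator(OnlyPredicts())
before_fit = hasattr(wrapper, "predict")            # best_estimator_ not set yet
wrapper.fit([[0.0], [1.0]])
after_fit = hasattr(wrapper, "predict")             # delegated attribute exists
has_decision = hasattr(wrapper, "decision_function")  # missing on OnlyPredicts
```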

jnothman · 2013-03-23T21:49:02Z

Actually it looks like GridSearchCV.score is broken: it uses a deprecated definition of scorer, and only when the classifier doesn't have a score function. Is that correct, or useful for any particular purpose?

And on reflection, I'm not really sure how or why to test pickling: since pickle itself works and is tested, isn't it enough to assume that things are fine as long as no error occurs?

larsmans · 2013-03-27T16:30:37Z

LGTM, 👍 for merge.

(@GaelVaroquaux might like to give his opinion too, since this involves properties...)

GaelVaroquaux · 2013-03-27T17:16:41Z

> (@GaelVaroquaux might like to give his opinion too, since this involves properties...)

Valid use case for properties: the adapter pattern, which is what they are for, IMHO.

larsmans · 2013-03-27T20:08:13Z

@amueller Merge?

amueller · 2013-03-31T15:57:47Z

Sorry, I have been offline for a week, as you might have noticed. LGTM.

@jnothman If the runtime of a real test is basically the same as the smoke test, I don't see why we shouldn't do it. No idea what could break if you don't test for it ;)

amueller · 2013-04-02T08:33:12Z

Merged by rebase. Thanks :)

jnothman added 2 commits March 22, 2013 18:10

ENH/FIX make best_estimator_'s predict functions available in paramet…

baa4742

…er search Avoids copying an unpicklable method to parameter search's __dict__. Also adds decision_function and transform where only predict and predict_proba were available before.

FIX make *SearchCV picklable

7f19c20

Define CVScoreTuple in the module namespace.

jnothman mentioned this pull request Mar 23, 2013

Problem with ducktyping and meta-estimators #1805

Closed

amueller closed this Apr 2, 2013
