-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[WIP] ENH Adds caching to multimetric scoring with a wrapper class #14261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is a nice solution, I think. Memory could be a small problem, but unlikely. This assumes all estimators have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good. Add what's new, please
cache = {} | ||
names = ['predict', 'predict_proba', 'decision_function', 'score'] | ||
cache_funcs = {name: getattr(estimator, name) for name in names if | ||
hasattr(estimator, name)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well also check that it is callable??
|
||
cache = {} | ||
names = ['predict', 'predict_proba', 'decision_function', 'score'] | ||
cache_funcs = {name: getattr(estimator, name) for name in names if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe name it orig_funcs
Codecov Report
@@ Coverage Diff @@
## master #14261 +/- ##
==========================================
- Coverage 96.82% 96.75% -0.07%
==========================================
Files 394 394
Lines 71916 71949 +33
Branches 7904 7906 +2
==========================================
- Hits 69630 69616 -14
- Misses 2263 2315 +52
+ Partials 23 18 -5
Continue to review full report at Codecov.
|
Is there a way to write an estimator without a Edit: joblib makes things not writable |
6ed3613
to
ae8d7a4
Compare
I am unhappy with the two approaches I tried:
|
This PR updated with:
|
For some reason I feel iffy about the cache not checking X, even though Generally I would kind of prefer not to use these hacks and expand the scorer interface to allow returning dicts of scores so we can do The Right Thing™ directly. That would also allow us to use |
I'm happy with investigating the scorer returning dict solution, but I am not sure that will quickly remedy the present efficiency problem. Efficiently calculating per-class and micro average prf would require some rewriting of the metrics anyway... |
Why does setattr fail with joblib?
|
The issue is how class Hello:
@property
def hello():
return "world"
h = Hello()
setattr(h, "hello", 1) |
Does directly using `vars(est)[attr] = thing` work??
|
Does not seem to work. To get this to work, we would need to define a setter for the property: class MyObj:
def _hello(sefl):
return "world"
@property
def hello(self):
return self._hello
@hello.setter
def hello(self, func):
self._hello = func
def another_hello():
return "a whole new world"
obj = MyObj()
old_func = obj.hello
print("before setattr", obj.hello())
setattr(obj, "hello", another_hello)
print("after setattr", obj.hello())
setattr(obj, "hello", old_func)
print("restore", obj.hello()) |
The issue is not with the |
Reference Issues/PRs
Resolves #10802
Alternative to #10979
What does this implement/fix? Explain your changes.
Wraps the estimator, and caches the results of
predict
,predict_proba
,decision_function
, andscore
during_multimetric_score
.