-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG+1] fix _BaseComposition._set_params with nested parameters #9945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I had intended this case to be handled by _BaseComposition._set_params in
the case of Pipeline. Why isn't that working??
But this fix should be helpful for metaestimators that don't need to define
their own set_params. Please add a test for that case...
|
I've tried tracing the _BaseComposition._set_params logic, but still can't
see why this bug should occur, or why this patch should fix it. Running the
snippet, what's passed to set_params is {'a__b', 'a__b__alpha'}; then
{'a__b'} is handled by _BaseComposition._set_params; then set_params is
asked to set 'b__alpha' alone (and then 'alpha').
|
I pushed a more extensive fix to your branch. I hope you don't mind. |
Codecov Report
@@ Coverage Diff @@
## master #9945 +/- ##
==========================================
+ Coverage 96.17% 96.17% +<.01%
==========================================
Files 336 336
Lines 62533 62540 +7
==========================================
+ Hits 60138 60145 +7
Misses 2395 2395
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amueller, let me know if you find this use of groupby illegible. Using defaultdict may be neater.
b2e5a7b
to
5129522
Compare
5129522
to
1c07283
Compare
28a3235
to
b7b4464
Compare
You can't determine even local parameters with only class name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an expert on the Pipeline code, but this looks good to me, with a small comment.
I double-checked that the tests were failing on master Python 3.6.
sklearn/tests/test_pipeline.py
Outdated
('b', DummyRegressor()) | ||
])) | ||
]) | ||
params = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a collections.OrderedDict
to make this test more future-proof?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shrug. I think it distorts the purpose of the test somewhat if we only test with an OrderedDict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit clearer (without some of the subtlety of python version-specific dict ordering):
estimator.set_params(a__b__alpha=0.001, a__b=Lasso())
set_params
set_params
set_params
I rename the PR title because it did not seem appropriate any more. Feel free to use a better one. |
Pushed a change that should fix the flake8 error. |
@jnothman you understood the problem, I guess, because you pushed the other fix? As I said, that fix might be a bit cleaner. Just for the record: |
LGTM. This behavior is more natural. |
Your fix is not sufficient if, say, the user sets |
I'd missed your original comment about this potential solution! |
I'm happy to merge when Travis says my new assertion is okay. |
Hey, can you have a look at my issue here: This very similar case still fails. |
yes, you've diagnosed it well. thanks, I'll try work on a solution
|
Issue where estimator is changed as well as its parameter: scikit-learn#9945 (comment)
Thanks for that very quick fix! Indeed your fix does also work for my original SO minimal example. To close the issue there: Would you want to add an answer there referring to your fix, then I could accept that and close this issue there? |
Fixes #9944.
This seemed the simplest fix.
This kind of means that the protection in
_BaseComposition._set_params
is not enough.We could make that method work by grouping the things in
BaseEstimator.set_params
according to their prefixes and callset_params
only once per prefix. That seems like a slightly cleaner solution, but not sure it's worth the effort.This will go away in Python3.6 as iteration is guaranteed to be ordered then.