-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Copy _make_estimator parameters #12690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -124,7 +125,7 @@ def _make_estimator(self, append=True, random_state=None): | |||
sub-estimators. | |||
""" | |||
estimator = clone(self.base_estimator_) | |||
estimator.set_params(**dict((p, getattr(self, p)) | |||
estimator.set_params(**dict((p, copy.deepcopy(getattr(self, p))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense rather copy the parameters in the set_params
Well you can add a smoke test and if it segfaults we know that we messed up, right? |
@wyegelwel do you intend to complete this? |
I'm running into the same issue and confirmed that the patch fixes it. What is left to get this over the line? Does it just need a smoke test? @amueller |
@aadamson yes, it seems it needs a smoke test and a contributor. You're welcome to take it on |
Great, thanks. I'll take a look this week. |
@cstefansen I had a branch that I never pushed with a smoke test that should cover this. See #18985. |
Closing as resolved in #19580 |
Reference Issues/PRs
Fixes #12623
What does this implement/fix? Explain your changes.
The issue is that when trees are created by the base ensemble class, the parameters are not copied. This is problematic in the case of the criterion parameter if a Criterion object is passed in. When this happens, all the trees share the same object and mutate it. If n_jobs > 1, they are mutating the same object concurrently.
Any other comments?
I couldn't think of a good test for this fix. Since the repro case causes a segfault it is hard to add a test case that doesn't cause grief.