-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[WIP] FIX avoid making deepcopy in clone #9696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this.
I think @amueller was proposing that we don't copy at all. I now realise that not copying at all will break grid search over pipeline where steps are substituted.
We will have to think carefully about this change.
sklearn/base.py
Outdated
if deepcopy is None: | ||
warnings.warn("A simple copy will be performed after 0.22 instead of a" | ||
" deep copy. Set 'deepcopy=True' if you wish to make a" | ||
" deep copy of the objects which are not estimators.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Objects -> parameters
sklearn/base.py
Outdated
Use ``deepcopy=True`` to get the previous behavior. | ||
|
||
deepcopy : boolean, optional | ||
Whether to make a deep copy or a simple copy of the objects that ware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ware -> are
All of them, I think.
…On 7 September 2017 at 23:02, Guillaume Lemaitre ***@***.***> wrote:
@jnothman <https://github.com/jnothman> I will need to turn deepcopy to
True or False to avoid triggering the warning.
Do we call all clone with deepcopy=False for all the current call or only
a subset of them.
@ogrisel <https://github.com/ogrisel> should also have the answer since
that he raised the issue.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#9696 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6yxdU9aHJHoBlo96lbkpzaH71R--ks5sf-lEgaJpZM4POLMH>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the deepcopy=True case should be tested..?
Otherwise, the code looks okay (I'm not happy that it touches so much, and maybe we should note that we intend to get rid of the parameter).
Now I suppose we should merge it and wait for people to complain that it broke their code?
If safe is false, clone will fall back to a deep copy on objects | ||
that are not estimators. | ||
|
||
If safe is false, clone will fall back to a copy (deep copy or simple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be better described as "if True, the argument to clone must be an estimator (i.e. support get_params
). Otherwise the object will be copied ..."
@@ -26,7 +27,7 @@ def _first_and_last_element(arr): | |||
return arr[0, 0], arr[-1, -1] | |||
|
|||
|
|||
def clone(estimator, safe=True): | |||
def clone(estimator, safe=True, deepcopy=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been preferring deepcopy='warn'
and similar lately over None.
I wonder if we can instead use magic to check if the caller is internal to this library, and if so, just replace deepcopy='warn'
with deepcopy=False
.
import inspect
def is_caller_sklearn():
caller = inspect.current_frame().f_back.f_back
return caller is not None and caller.f_globals['__name__'].startswith('sklearn.')
I think we can close this for the same reason as #5563 (comment) |
Reference Issue
closes #5563
What does this implement/fix? Explain your changes.
Do not trigger a deep copy of the parameters when cloning estimators.
Any other comments?