[MRG] FIX: Allow coef_=None with warm_start=True in MultiTaskElasticNet by larsoner · Pull Request #12844 · scikit-learn/scikit-learn · GitHub


Closed
wants to merge 1 commit into from

Conversation

larsoner
Contributor

This restores the ability to have clf.coef_ = None during fitting with clf = MultiTaskElasticNet(..., warm_start=True). Code that works with the latest release started emitting the following error on sklearn master (the modified test reproduces it if the changes to coordinate_descent.py are omitted):

sklearn/linear_model/tests/test_coordinate_descent.py:410: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/utils/testing.py:196: in assert_warns_message
    result = func(*args, **kw)
sklearn/linear_model/coordinate_descent.py:1816: in fit
    check_random_state(self.random_state), random)

One potential alternative is that I could set warm_start=False in my code when the user supplies coef_ = None, rather than always passing warm_start=True. Feel free to close if that's the preferred way, though other people might also be hit by this if they relied on warm_start=True followed by clf.coef_ = None working.

I just modified an existing test to avoid adding to the test time, but I can split it off if that seems worthwhile.
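The pattern at issue can be sketched without scikit-learn itself. `WarmStartModel` below is a hypothetical toy class (not the actual sklearn implementation) whose fit uses the `getattr` check this PR proposes, so an explicit `coef_ = None` behaves like a cold start instead of crashing:

```python
import numpy as np

class WarmStartModel:
    """Toy stand-in illustrating the warm-start pattern under discussion."""

    def __init__(self, warm_start=True):
        self.warm_start = warm_start

    def fit(self, X, y):
        n_features = X.shape[1]
        # The proposed check: a missing *or* None coef_ both mean
        # "start from zeros"; a real array is used as the warm start.
        coef = getattr(self, "coef_", None)
        if not self.warm_start or coef is None:
            coef = np.zeros(n_features)
        # (solver iterations would go here)
        self.coef_ = coef
        return self

model = WarmStartModel(warm_start=True)
model.coef_ = None  # explicit None: no previous solution yet
model.fit(np.ones((3, 2)), np.ones(3))
print(model.coef_)  # [0. 0.]
```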

@massich
Contributor
massich commented Dec 21, 2018

LGTM

@NicolasHug
Member

Just for my own understanding, when does it make sense to want warm-starting while having no coefficients to start with?

@larsoner
Contributor Author
larsoner commented Dec 21, 2018

@NicolasHug IIUC it's just a convenience at our end: we can always pass warm_start=True, and the value of coef_ provided by the user (in our function, which we pass to sklearn) determines whether there is actually a warm start. As mentioned in the issue, the alternative would be for us to set warm_start = coef_ is not None rather than always using warm_start=True. The code in sklearn used to allow our lazy always-warm_start=True behavior by checking if self.coef_ is None, but this was changed to a hasattr call. Using getattr(self, 'coef_', None) restores backward compatibility with the old behavior of allowing coef_=None with warm_start=True.
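The difference between the two checks can be shown in isolation (a sketch; `Est` is just a bare stand-in object, not an sklearn estimator):

```python
class Est:
    pass

est = Est()
est.coef_ = None  # explicit "no previous solution"

# check on sklearn master: the attribute exists, so this takes the
# warm-start path and hands None to the solver, causing the error above
warm_by_hasattr = hasattr(est, "coef_")                    # True

# check proposed in this PR: None falls through to the cold-start path
warm_by_getattr = getattr(est, "coef_", None) is not None  # False

print(warm_by_hasattr, warm_by_getattr)  # True False
```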

@larsoner
Contributor Author

... by "our function" I mean a function we have in another library (MNE); callers of that function don't need to care about the warm_start param at all (it's hidden from them) because it's implied by coef_ is not None (and previously respected by sklearn).

@NicolasHug
Member

Ok thanks.

So I guess my question becomes: how can coef_ be None? As far as I understand, coef_ either does not exist (prior to the first call to fit), or is set to a non-None value once fit() is called.

If your use case is 'we allow our users to set est.coef_ to some arbitrary value, which can be None', I don't think this should be supported in scikit-learn, because as far as scikit-learn is concerned, est.coef_ is read-only (I think).

@larsoner
Contributor Author

So I guess my question becomes: how can coef_ be None?

In the latest sklearn release, self.coef_ is None is permitted after instantiation and before the fit, and is treated as an alias for np.zeros:

https://github.com/scikit-learn/scikit-learn/blob/0.20.X/sklearn/linear_model/coordinate_descent.py#L1799

In sklearn master, though, the "not yet fitted" state is now signified by not hasattr(self, 'coef_'), which breaks this pattern.

If your use case is 'we allow our users to set est.coef_ to some arbitrary value, which can be None', I don't think this should be supported in scikit-learn, because as far as scikit-learn is concerned, est.coef_ is read-only (I think).

No, our use case is to repeat instantiation + fit steps, supplying a properly shaped coef_ value each time (except on the first run, where it is None). This allows changing instantiation params (alpha, tol, etc.) while warm-starting.
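The repeat instantiation + fit pattern might look like the sketch below. It uses the portable variant mentioned earlier in the thread (warm_start is only requested when a previous coef_ actually exists, so it runs on any sklearn version); the data and the sequence of alphas are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import MultiTaskElasticNet

rng = np.random.RandomState(0)
X = rng.randn(30, 8)
Y = rng.randn(30, 2)

coef = None  # first run: nothing to warm-start from
for alpha in [1.0, 0.5, 0.1]:
    # Re-instantiate so alpha can change between fits; only ask for a
    # warm start when there is a previous solution to start from.
    clf = MultiTaskElasticNet(alpha=alpha, warm_start=coef is not None)
    if coef is not None:
        clf.coef_ = coef  # seed the solver with the previous solution
    clf.fit(X, Y)
    coef = clf.coef_

print(coef.shape)
```

With the getattr-based check from this PR, the `warm_start=coef is not None` and `if coef is not None:` guards would be unnecessary: one could always pass warm_start=True and set clf.coef_ = coef unconditionally.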

@NicolasHug
Member
NicolasHug commented Dec 21, 2018

In the latest sklearn release, self.coef_ is None is permitted after instantiation and before the fit, and is treated as an alias for np.zeros

That seems to be legacy code from the days when setting attributes in __init__ was permitted:

https://github.com/scikit-learn/scikit-learn/blame/72334f1e8d746103eae16951630b2f6f91389f41/sklearn/linear_model/coordinate_descent.py#L1350

but this is not the case anymore.

Would need input from core devs. @amueller maybe?

EDIT:

our use case is to repeat instantiation + fit steps, supplying a properly shaped coef_ value

So you're still doing something like est.coef_ = some_value, right? I think those attributes are read-only.

@amueller
Member
amueller commented Dec 21, 2018

I don't see why we should support this. This is not in line with any standard contracts in sklearn. If you want to do this, I feel it's up to you (MNE) to ensure something sensible happens.
