[MRG] FIX: Allow coef_=None with warm_start=True in MultiTaskElasticNet by larsoner · Pull Request #12844 · scikit-learn/scikit-learn · GitHub


Closed
wants to merge 1 commit into from

Conversation

larsoner
Contributor

This restores the ability to have clf.coef_ = None during fitting with clf = MultiTaskElasticNet(..., warm_start=True). Code that works with the latest release started emitting the following error on sklearn master (the modified test reproduces it if the changes to coordinate_descent.py are omitted):

sklearn/linear_model/tests/test_coordinate_descent.py:410: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/utils/testing.py:196: in assert_warns_message
    result = func(*args, **kw)
sklearn/linear_model/coordinate_descent.py:1816: in fit
    check_random_state(self.random_state), random)

One potential alternative is that I could set warm_start=False in my code when the user supplies coef_ = None, rather than always passing warm_start=True. Feel free to close if that's the preferred way, though other people might also be hit by this if they relied on warm_start=True followed by clf.coef_ = None working.

I just modified an existing test to avoid adding to the test time, but I can split it off if that seems worthwhile.
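The pattern at issue can be sketched without scikit-learn itself. `WarmStartModel` below is a hypothetical toy class (not the actual sklearn implementation) whose fit uses the `getattr` check this PR proposes, so an explicit `coef_ = None` behaves like a cold start instead of crashing:

```python
import numpy as np

class WarmStartModel:
    """Toy stand-in illustrating the warm-start pattern under discussion."""

    def __init__(self, warm_start=True):
        self.warm_start = warm_start

    def fit(self, X, y):
        n_features = X.shape[1]
        # The proposed check: a missing *or* None coef_ both mean
        # "start from zeros"; a real array is used as the warm start.
        coef = getattr(self, "coef_", None)
        if not self.warm_start or coef is None:
            coef = np.zeros(n_features)
        # (solver iterations would go here)
        self.coef_ = coef
        return self

model = WarmStartModel(warm_start=True)
model.coef_ = None  # explicit None: no previous solution yet
model.fit(np.ones((3, 2)), np.ones(3))
print(model.coef_)  # [0. 0.]
```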

@massich
Contributor
massich commented Dec 21, 2018

LGTM

@NicolasHug
Member

Just for my own understanding, when does it make sense to want warm-starting while having no coefficients to start with?

@larsoner
Contributor Author
larsoner commented Dec 21, 2018

@NicolasHug IIUC it's just a convenience at our end: we can always pass warm_start=True, and the value of coef_ provided by the user (in our function, which we pass to sklearn) determines whether there is actually a warm start. As mentioned in the issue, the alternative would be for us to set warm_start = coef_ is not None rather than always using warm_start=True. The code in sklearn used to allow our lazy always-warm_start=True behavior by checking if self.coef_ is None, but this was changed to a hasattr call. Using getattr(self, 'coef_', None) restores backward compatibility with the old behavior of allowing coef_=None with warm_start=True.
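The difference between the two checks can be shown in isolation (a sketch; `Est` is just a bare stand-in object, not an sklearn estimator):

```python
class Est:
    pass

est = Est()
est.coef_ = None  # explicit "no previous solution"

# check on sklearn master: the attribute exists, so this takes the
# warm-start path and hands None to the solver, causing the error above
warm_by_hasattr = hasattr(est, "coef_")                    # True

# check proposed in this PR: None falls through to the cold-start path
warm_by_getattr = getattr(est, "coef_", None) is not None  # False

print(warm_by_hasattr, warm_by_getattr)  # True False
```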

@larsoner
Contributor Author

... by "our function" I mean a function we have in another library (MNE); callers of that function don't need to care about the warm_start param at all (it's hidden from them) because it's implied by coef_ is not None (and previously respected by sklearn).

@NicolasHug
Member

Ok thanks.

So I guess my question becomes: how can coef_ be None? As far as I understand, coef_ either does not exist (prior to the first call to fit), or is set to a non-None value once fit() is called.

If your use case is 'we allow our users to set est.coef_ to some arbitrary value, which can be None', I don't think this should be supported in scikit-learn, because as far as scikit-learn is concerned, est.coef_ is read-only (I think).

@larsoner
Contributor Author

So I guess my question becomes: how can coef_ be None?

In the latest sklearn release, self.coef_ is None is permitted after instantiation and before the fit, and is treated as an alias for np.zeros:

https://github.com/scikit-learn/scikit-learn/blob/0.20.X/sklearn/linear_model/coordinate_descent.py#L1799

In sklearn master, though, the "not yet fitted" state is now signified by not hasattr(self, 'coef_'), which breaks this pattern.

If your use case is 'we allow our users to set est.coef_ to some arbitrary value, which can be None', I don't think this should be supported in scikit-learn, because as far as scikit-learn is concerned, est.coef_ is read-only (I think).

No, our use case is to repeat instantiation + fit steps, supplying a properly shaped coef_ value each time (except on the first run, where it is None). This allows changing instantiation params (alpha, tol, etc.) while warm-starting.
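The repeat instantiation + fit pattern might look like the sketch below. It uses the portable variant mentioned earlier in the thread (warm_start is only requested when a previous coef_ actually exists, so it runs on any sklearn version); the data and the sequence of alphas are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import MultiTaskElasticNet

rng = np.random.RandomState(0)
X = rng.randn(30, 8)
Y = rng.randn(30, 2)

coef = None  # first run: nothing to warm-start from
for alpha in [1.0, 0.5, 0.1]:
    # Re-instantiate so alpha can change between fits; only ask for a
    # warm start when there is a previous solution to start from.
    clf = MultiTaskElasticNet(alpha=alpha, warm_start=coef is not None)
    if coef is not None:
        clf.coef_ = coef  # seed the solver with the previous solution
    clf.fit(X, Y)
    coef = clf.coef_

print(coef.shape)
```

With the getattr-based check from this PR, the `warm_start=coef is not None` and `if coef is not None:` guards would be unnecessary: one could always pass warm_start=True and set clf.coef_ = coef unconditionally.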

@NicolasHug
Member
NicolasHug commented Dec 21, 2018

In the latest sklearn release, self.coef_ is None is permitted after instantiation and before the fit, and is treated as an alias for np.zeros

That seems to be legacy code from the days when setting attributes in __init__ was permitted:

https://github.com/scikit-learn/scikit-learn/blame/72334f1e8d746103eae16951630b2f6f91389f41/sklearn/linear_model/coordinate_descent.py#L1350

but this is not the case anymore.

Would need input from core devs. @amueller maybe?

EDIT:

our use case is to repeat instantiation + fit steps, supplying a properly shaped coef_ value

So you're still doing something like est.coef_ = some_value, right? I think those attributes are read-only.

@amueller
Member
amueller commented Dec 21, 2018

I don't see why we should support this. This is not in line with any standard contracts in sklearn. If you want to do this, I feel it's up to you (MNE) to ensure something sensible happens.
