8000 [NOMRG] new warmstart API for GBDTs by NicolasHug · Pull Request #15105 · scikit-learn/scikit-learn · GitHub

Closed

Conversation

@NicolasHug (Member) commented Sep 27, 2019:

Ping @adrinjalali: is this what was decided during the sprint?

(CI will break because I'm not catching the deprecation warnings)

@NicolasHug changed the title from "[NOMRG] new warmstart API + grid search for GBDTs" to "[NOMRG] new warmstart API for GBDTs" on Oct 3, 2019
@adrinjalali (Member) left a comment:

Left a few notes as I was reading. Yeah this is what we had in mind IIRC (from the estimator's perspective).

The implementation in the pipeline would be trickier.

@@ -106,10 +109,16 @@ def fit(self, X, y):

        rng = check_random_state(self.random_state)

        # For backward compat (for now)
        if self.warm_start:
            warnings.warn("warm_start parameter is deprecated", DeprecationWarning)
@adrinjalali (Member):

Also mention how the user can fix it / what they should do instead.

@NicolasHug (Member, author):

Sure, this is just some sort of proof of concept.

In the specific case of the hist GBDT, we don't need to deprecate anything since the warm-start hasn't even been released yet :)
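For illustration, a warning along the lines suggested above might look like the following. This is a sketch, not this PR's actual wording; it assumes the warm_start_with fit parameter proposed in this thread.

import warnings

# Hypothetical message: tell users what to call instead of warm_start.
warnings.warn(
    "The 'warm_start' parameter is deprecated and will be removed in a "
    "future release. To grow an already fitted model, pass the new value "
    "at fit time instead, e.g. est.fit(X, y, warm_start_with={'max_iter': 200}).",
    DeprecationWarning,
)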

@NicolasHug (Member, author):

> The implementation in the pipeline would be trickier.

Are you sure? For the fit param, we could just do e.g. pipe.fit(X, y, warm_start_with={'hist_gbdt__max_iter': 100})

and the _warmstartable_parameters attribute could just be a property that we would set depending on the last step? (This is assuming that only the last step of the pipeline can be warm-started, which I guess is reasonable.)
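A minimal sketch of that idea (hypothetical code, not from this PR; the WarmStartablePipeline name is made up, and the '+' marker used in the PR's parameter format is ignored here for brevity):

from sklearn.pipeline import Pipeline


class WarmStartablePipeline(Pipeline):

    @property
    def _warmstartable_parameters(self):
        # Expose the last step's warm-startable parameters under the usual
        # 'step_name__param' naming, e.g. ['hist_gbdt__max_iter'].
        name, last_step = self.steps[-1]
        return [name + '__' + param
                for param in getattr(last_step, '_warmstartable_parameters', [])]

    def fit(self, X, y=None, warm_start_with=None):
        if warm_start_with is None:
            return super().fit(X, y)
        # Strip the 'step_name__' prefix and route the dict to the last step's
        # fit through the standard 'step_name__param' fit-param mechanism.
        name = self.steps[-1][0]
        stripped = {key.split('__', 1)[1]: value
                    for key, value in warm_start_with.items()}
        return super().fit(X, y, **{name + '__warm_start_with': stripped})

With that in place, pipe.fit(X, y, warm_start_with={'hist_gbdt__max_iter': 100}) would forward {'max_iter': 100} to the hist_gbdt step.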

@adrinjalali (Member):

I was thinking of allowing warm start, or refitting the pipeline after changing a parameter anywhere in it, and only refitting the steps after the changed one. So in a sense, the pipeline could warm-start with almost any of its parameters.

@NicolasHug (Member, author):

You mean, when doing pipe.set_params(second_step__something=12), then not fitting the first step again when calling pipe.fit()?

I feel like this is yet another kind of warm-starting, mostly orthogonal to the kind of warm-starting that this API is concerned with?

@adrinjalali (Member):

So for now, we forget about warm start on the pipeline for this PR, and we should test metaestimators such as Pipeline and the voting estimators.
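For instance, a metaestimator test could look like the sketch below. It is hypothetical: it assumes the _warmstartable_parameters property proposed in this PR and the ['+step_name__param'] format shown in the diff, neither of which exists in released scikit-learn.

from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def test_pipeline_exposes_last_step_warmstartable_params():
    pipe = Pipeline([('scaler', StandardScaler()),
                     ('hist_gbdt', HistGradientBoostingClassifier())])
    # The '+' prefix follows the format used in this PR's diff.
    assert pipe._warmstartable_parameters == ['+hist_gbdt__max_iter']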

@NicolasHug (Member, author):

Are you happy with how it looks so far, @adrinjalali? (The test failure is unrelated.)

I think we should write a SLEP for this.

@adrinjalali (Member) left a comment:

Once we decide on the API, I think we should put the warm-startable params higher in the meta-estimator hierarchy, but otherwise looks good.

And yeah, we should write a SLEP for the new API.

    def _warmstartable_parameters(self):
        # This property exposes the _warmstartable_parameters attribute, e.g.
        # ['+last_step_name__param']
        # We consider that only the last step can be warm-started. The first
        # steps are transformers, which we assume are not warm-startable.
@adrinjalali (Member):

Do we need to make this design choice? I may have asked this question before, but don't remember your logic behind it.

@NicolasHug (Member, author):

A pipeline is either:

  • a sequence of transformers
  • a sequence of transformers + a predictor as the last step

It's safe to assume that transformers are not warm-startable. Maybe in the future one of them will be? We can worry about that when it happens.

@adrinjalali (Member):

You could have a word-embedding transformer as a step, which is very often warm-started. We may not have that inside sklearn, but the pipeline's API should support it, I think.

@NicolasHug (Member, author):

I agree we should leave the possibility open, and make the code future-proof. This is one of the reasons why _warmstartable_parameters is a list.

But I don't think we should implement support for that; we have no use-case ATM.

Concretely, supporting warm-start for transformers right now would mean writing code that isn't used (it would require updating the _fit method that fits all the transformers).

@adrinjalali (Member):

I see. Yeah fair.
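To make the convention discussed in this thread concrete, here is a sketch of what the estimator-side declaration might look like. MyGBDT is a made-up name, and reading the '+' prefix as "may only grow between fits" is an assumption, not something stated in the thread.

class MyGBDT:
    # Assumption: '+' flags a parameter whose value may only increase
    # across successive warm-started fits (e.g. adding more iterations).
    _warmstartable_parameters = ['+max_iter']

    def __init__(self, max_iter=100):
        self.max_iter = max_iter

    def fit(self, X, y, warm_start_with=None):
        if warm_start_with is not None:
            # Grow the already fitted model up to the new max_iter
            # instead of refitting from scratch.
            self.max_iter = warm_start_with['max_iter']
            # ... continue training from the fitted state ...
        # ... otherwise fit from scratch ...
        return self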

@adrinjalali (Member):

@jnothman I'd appreciate your thoughts on this one.

@NicolasHug (Member, author):

The benefits of this approach aren't clear to me, and we have enough SLEPs to deal with ATM. Closing; might re-open one day.
