8000 [MRG] Added check for idempotence of fit() by NicolasHug · Pull Request #12328 · scikit-learn/scikit-learn · GitHub

Merged (12 commits into scikit-learn:master, Oct 29, 2018)

Conversation

NicolasHug (Member):

Reference Issues/PRs

Follows #12305 (comment)

What does this implement/fix? Explain your changes.

This PR checks that fit() is idempotent for all non-meta estimators, that is, est.fit(X) is equivalent to est.fit(X).fit(X).
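For illustration, the check amounts to fitting twice on the same data and asserting that the predictions do not change. A minimal sketch of the idea (the helper name check_fit_idempotent_sketch and the toy data are placeholders, not the exact code added in this PR):

import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def check_fit_idempotent_sketch(estimator):
    # Fit once, record predictions, fit again on the same data, and check
    # that the predictions are unchanged. The estimator is assumed to have
    # a fixed random_state; otherwise the two fits could legitimately differ.
    X, y = make_classification(random_state=0)
    X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

    est = clone(estimator).fit(X_train, y_train)
    pred_first = est.predict(X_test)

    est.fit(X_train, y_train)  # second fit on the same data
    pred_second = est.predict(X_test)

    np.testing.assert_allclose(pred_first, pred_second)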

Any other comments?

# Fit again
est.fit(X_train, y_train)

if hasattr(est, 'predict'):
Member:

Can we consider resuscitating the concept of assert_same_model (#4841)?

NicolasHug (Member, Author):

Ideally yes...

I tried to do that by comparing the attributes ending with _ but ran into tons of edge cases, so I went with the prediction comparison, which is probably enough for this issue.

Basically the question is, is it really worth it? It's doable but there's no doubt the code is going to be a pain to maintain.

The code in #4841 ignores the comparison of some attributes, in particular the attributes that are themselves estimators (see here), and it's already pretty involved!

Member:

I don't mind it being approximate and efficient in most cases, rather than having the complexity of #4841... but it should be reusable, rather than repeated ad-hoc.

Member:

A further option may be to make each estimator responsible for its tests of equivalence.

NicolasHug (Member, Author):

A further option may be to make each estimator responsible for its tests of equivalence

But I guess for those that have some deeply nested attributes that need to be checked for equality, the code would still be rather complex / ad hoc, and duplicated across all those estimators?

Member:

I think most of the time we can presume that identical predictions/transformations are strong enough (although if all attributes can be tested for equality, that would be faster in many cases). Where those methods do not exist, we may need an alternative.
Asserting that two models are not equal would be harder. A function is_model_equal returning {yes, likely, no, unknown} would be useful and could be strengthened over time.

NicolasHug (Member, Author):

So what do you suggest exactly? Should I turn this test into is_model_equal and return either likely if it passes or unknown if it fails?

Member:

If transform etc. outputs are not equal, we know the models are not equal. If we can test that the objects are identical, we know the models are identical. If we show that predict, transform, etc. match on random data, then it is likely they are equal; and if we have no way to test, then it is unknown.

Here the test would pass if anything but 'no' was returned.

Don't we have similar assertions elsewhere? I'd expect this to be used multiple times already.
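A rough sketch of how such a comparator could encode the logic described above (the name is_model_equal comes from this discussion; the implementation is only an illustration, skipping the 'yes' case of comparing the fitted objects themselves and glossing over sparse outputs):

import numpy as np


def is_model_equal(est_a, est_b, X):
    # Hypothetical helper: if any shared prediction method disagrees on X,
    # the models are definitely not equal ('no'); if every available method
    # agrees, they are 'likely' equal; if nothing can be compared, 'unknown'.
    compared_any = False
    for method in ('predict', 'predict_proba', 'decision_function',
                   'transform'):
        if hasattr(est_a, method) and hasattr(est_b, method):
            if not np.allclose(getattr(est_a, method)(X),
                               getattr(est_b, method)(X)):
                return 'no'
            compared_any = True
    return 'likely' if compared_any else 'unknown'

The check in this PR would then simply assert that the returned value is not 'no', as suggested above.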

NicolasHug (Member, Author):

Don't we have similar assertions elsewhere?

I have no idea to be honest.

I still don't understand what you want to do regarding this PR:

  • Do you want to have is_model_equal?
  • If yes, in this PR? Another one? Should I create a new issue so we can discuss requirements / implementation details there?
  • Should we still merge this PR, and if not, do you want to reimplement it once we have is_model_equal?

Member:

Omg. I'm sorry. I was coming at this thinking that we had tests with similar checks of model equivalence... But maybe they were only in pull requests, or my imagination, or aren't expressed like this. Okay. Let's not bother about making it generic in this issue.

NicolasHug (Member, Author):

Python 2 tests are failing... Do we care?

amueller (Member):

It's actually a NumPy error: np.stack doesn't exist in that version. I think we're still supporting that version of NumPy.
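(np.stack was only added in NumPy 1.10, so on the older NumPy versions still supported alongside Python 2 an equivalent can be spelled with np.concatenate. A small workaround sketch, not necessarily what was done in the PR:)

import numpy as np

arrays = [np.array([1, 2]), np.array([3, 4])]

# np.stack(arrays) needs NumPy >= 1.10; adding a new leading axis to each
# array and concatenating produces the same result on older versions.
stacked = np.concatenate([a[np.newaxis, ...] for a in arrays], axis=0)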

jnothman (Member) left a comment:

I think I'd still rather have this structured to use a generic model comparator. It would be more readable, apart from anything else. But I'm okay with it as it stands.

if hasattr(est, 'predict'):
    pred_1 = est.predict(X_test)
if hasattr(est, 'predict_proba'):
    pred_proba_1 = est.predict_proba(X_test)
Member:

Note that we don't need to test predict if predict_proba is tested. This is the kind of rationalisation and optimisation that a generic function would benefit from, whereas here it would look ugly.
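The reason predict is subsumed: for most scikit-learn classifiers, predict returns the class with the highest predicted probability, so identical predict_proba outputs imply identical predict outputs. A small illustration (this argmax relationship is the usual convention, not a guarantee for every estimator):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
clf = LogisticRegression(random_state=0).fit(X, y)

proba = clf.predict_proba(X)
# The labels recovered from the probabilities match predict(), so comparing
# predict_proba outputs already covers predict for this model.
assert np.array_equal(clf.classes_[np.argmax(proba, axis=1)], clf.predict(X))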

jnothman (Member) left a comment:

Actually I've realised where we have logic like this. And I think the code is cleaner to read, if not something we should refactor into a reusable function: check_estimators_pickle. Can we make this look more like that, or refactor?

NicolasHug (Member, Author):

Thanks for the ref, it looks much better like this indeed.

I still kept my original data generation part, because using make_blobs as in check_estimators_pickle would result in a consensus error from RANSACRegressor (I unsuccessfully tried various parameters and random seeds).
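The pattern borrowed from check_estimators_pickle records each method's output in a dict keyed by method name, refits, and compares. A sketch of that shape of code, with placeholder names rather than the exact diff:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

CHECK_METHODS = ('predict', 'transform', 'decision_function', 'predict_proba')


def record_outputs(estimator, X_test):
    # Collect the output of every method the fitted estimator exposes,
    # keyed by method name, as check_estimators_pickle does.
    return {method: getattr(estimator, method)(X_test)
            for method in CHECK_METHODS if hasattr(estimator, method)}


X, y = make_classification(random_state=0)
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

est = LogisticRegression(random_state=0).fit(X_train, y_train)
before = record_outputs(est, X_test)

est.fit(X_train, y_train)  # fit a second time on the same data
after = record_outputs(est, X_test)

for method in before:
    np.testing.assert_allclose(before[method], after[method],
                               err_msg='%s changed after refitting' % method)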

X_train, X_test, y_train, _ = train_test_split(X, y)
# some estimators expect a square matrix
X_train = pairwise_estimator_convert_X(X_train, estimator)
X_test = pairwise_estimator_convert_X(X_train, estimator)
Member:

X_test = pairwise_estimator_convert_X(X_test, estimator)

TomDLT (Member) commented Oct 15, 2018:

Maybe we could improve the test by adding more diverse samples in X_test, such as samples from a shifted distribution, samples with large values, or samples used during training. What do you think?
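One way to build such a broader test set (a sketch of the idea only; as discussed below, whether every estimator accepts such inputs is exactly the difficulty):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(random_state=0)
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

# Held-out samples, samples from a shifted distribution, samples with large
# values, and samples already seen during training.
X_test_diverse = np.vstack([
    X_test,
    X_test + 5.0,
    X_test * 1e3,
    X_train[:10],
])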

NicolasHug (Member, Author):

Yes, I agree that ideally the tests should be more thorough.

In practice, though, I found it a bit tricky to come up with data generation that works for all estimators. In a lot of cases some would not converge, or would not reach a consensus, some expect non-negative data, etc.

Maybe that's just an indication that those tests aren't really appropriate and that we should resort to a more in-depth approach like in #4841.

amueller (Member):

I think they are appropriate, we just need to have estimator tags to check which data the estimators want ;)

jnothman (Member) left a comment:

Otherwise LGTM

# Fit for the first time
estimator.fit(X_train, y_train)

result = dict()
Member:

= {} is more idiomatic

random_state=rng)
# some estimators expect a square matrix
X_train = pairwise_estimator_convert_X(X_train, estimator)
X_test = pairwise_estimator_convert_X(X_test, estimator)
Member:

This can't be right. X_test for pairwise needs to be of shape (test samples, train samples)

NicolasHug (Member, Author):

We're using test_size=.5 so it works...

But otherwise, is there a built-in utility like pairwise_estimator_convert_X that I could use instead?

Member:

I don't think so?
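For reference, the shape constraint under discussion: for a pairwise estimator the training matrix must be (n_train, n_train) while the test matrix must be (n_test, n_train), e.g. a kernel between test and train samples. A small illustration of the shapes using linear_kernel (pairwise_estimator_convert_X is the helper already used in the diff above; this snippet only shows the expected shapes, not its implementation):

import numpy as np
from sklearn.metrics.pairwise import linear_kernel

rng = np.random.RandomState(0)
X_train = rng.rand(40, 5)
X_test = rng.rand(10, 5)

K_train = linear_kernel(X_train, X_train)  # (40, 40): square, train vs train
K_test = linear_kernel(X_test, X_train)    # (10, 40): test rows, train columns

assert K_train.shape == (40, 40)
assert K_test.shape == (10, 40)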

jnothman (Member) left a comment:

Please add a what's new entry under "changes to estimator checks", or whatever the heading was, for v0.20.

TomDLT merged commit f6b0c67 into scikit-learn:master on Oct 29, 2018
TomDLT (Member) commented Oct 29, 2018:

Thanks @NicolasHug!
