check_estimator is not sufficiently general

I've been writing a lot of scikit-learn compatible code lately, and I love the idea of the general checks in check_estimator to make sure that my code is scikit-learn compatible. But in nearly all cases, I'm finding that these checks are not general enough for the objects I'm creating (for example: MSTClustering, which fails because it doesn't support specifying the number of clusters through an n_clusters argument).

Digging through the code, there are a lot of hard-coded special cases for estimators within scikit-learn itself. This implies that, absent those special cases, scikit-learn's own estimators would not pass the general estimator checks, which is obviously a huge issue.

Making these checks sufficiently general would be hard; I imagine it would be a rather large project, and would probably even involve designing an API so that estimators themselves can tune the checks that are run on them.
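To make the idea of estimators tuning their own checks a bit more concrete, here is a minimal sketch of what such an API might look like. Everything in it is hypothetical: `check_tags`, `run_checks`, `ALL_CHECKS`, the two check functions, and the toy `CutoffClusterer` are names invented for illustration and are not part of scikit-learn. It is only meant to show the shape of the mechanism, not a proposed implementation.

```python
# Hypothetical sketch: none of these names exist in scikit-learn.


def check_has_fit(estimator):
    """Every estimator is expected to expose a callable fit method."""
    assert callable(getattr(estimator, "fit", None)), "missing fit()"


def check_accepts_n_clusters(estimator):
    """Only meaningful for clusterers parameterized by a cluster count."""
    assert "n_clusters" in estimator.get_params(), "missing n_clusters"


# Each check is paired with the tag that must be truthy for it to apply.
# None means the check always runs.
ALL_CHECKS = [
    (check_has_fit, None),
    (check_accepts_n_clusters, "takes_n_clusters"),
]


def run_checks(estimator):
    """Run applicable checks; map each check name to passed/skipped/failed."""
    # Estimators opt out of checks by declaring tags (assumed attribute name).
    tags = getattr(estimator, "check_tags", {})
    results = {}
    for check, required_tag in ALL_CHECKS:
        # An undeclared tag defaults to True, so the check runs by default.
        if required_tag is not None and not tags.get(required_tag, True):
            results[check.__name__] = "skipped"
            continue
        try:
            check(estimator)
            results[check.__name__] = "passed"
        except AssertionError as exc:
            results[check.__name__] = f"failed: {exc}"
    return results


class CutoffClusterer:
    """Toy clusterer configured by a distance cutoff instead of n_clusters."""

    check_tags = {"takes_n_clusters": False}

    def __init__(self, cutoff=1.0):
        self.cutoff = cutoff

    def get_params(self):
        return {"cutoff": self.cutoff}

    def fit(self, X):
        return self


results = run_checks(CutoffClusterer())
print(results)
```

In this sketch an estimator that declares nothing still gets every check, so the default stays strict; only an explicit tag turns a check off, and the skip is reported rather than silently dropped.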

In the meantime, I think it would be better for us not to recommend that people run this code on their own estimators, or at least to let them know that failing the estimator checks does not necessarily imply non-compatibility.
