[MRG+2] Fix empty input data common checks by ogrisel · Pull Request #4245 · scikit-learn/scikit-learn

Merged

Conversation

@ogrisel (Member) commented Feb 13, 2015

This PR includes #4214, but also adds common tests that currently fail on some estimators that do not behave consistently and that probably need to be fixed on a case-by-case basis.

@agramfort (Member)

travis is not happy.

@amueller (Member)

can you rebase?

@raghavrv (Member)

I think these are intended to be smoke tests?

If so, here is a summary of the 11 failures (1 error message included); a small reproduction sketch follows the list:

Needs handling of 0 samples

  • sklearn.preprocessing.data.StandardScaler (?)
  • sklearn.kernel_approximation.Nystroem

Needs handling of 0 features

  • sklearn.preprocessing.data.MinMaxScaler

Needs rewording of the error message for the 0 samples test

  • sklearn.linear_model.ridge.RidgeClassifierCV
  • sklearn.linear_model.ridge.RidgeClassifier

Needs rewording of the error message for the 0 features test

  • sklearn.ensemble.forest.ExtraTreesRegressor
  • sklearn.ensemble.forest.ExtraTreesClassifier
  • sklearn.ensemble.forest.RandomForestRegressor
  • sklearn.ensemble.forest.RandomForestClassifier

Others:

  • sklearn.linear_model.coordinate_descent.MultiTaskLassoCV and
  • sklearn.linear_model.coordinate_descent.MultiTaskElasticNetCV
    complain that they are not supposed to be used for the single-task case.
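
For context, a minimal reproduction sketch of the two degenerate shapes exercised here (not part of the PR; the estimator choice and exact messages are just for illustration and depend on the scikit-learn version):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# The two degenerate shapes exercised by the new common checks.
X_zero_samples = np.empty((0, 3))   # 0 samples, 3 features
X_zero_features = np.empty((3, 0))  # 3 samples, 0 features

for est, X in [(StandardScaler(), X_zero_samples),
               (MinMaxScaler(), X_zero_features)]:
    try:
        est.fit(X)
        print(type(est).__name__, "accepted the empty input silently")
    except Exception as exc:
        print(type(est).__name__, "raised:", exc)
```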

@ogrisel ogrisel force-pushed the fix-empty-input-data-common-checks branch from 39dbaf7 to 25898de Compare February 16, 2015 10:01
@ogrisel (Member Author) commented Feb 16, 2015

I rebased. I know the tests don't pass: as @ragv summarized, many estimators are not consistent with the 0-samples or 0-features edge cases.

I am under the impression that supporting either of those two cases is never useful and can be a source of confusion when the user is not aware of the shape of the data structures they feed to sklearn estimators. This is why I think we should always raise an exception consistently, with an informative error message.

I will work on fixing the failures. We can discuss if maintaining exceptions is desirable or not.
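
One way to get such a consistent, informative failure is check_array's ensure_min_samples / ensure_min_features parameters (both default to 1 in current scikit-learn); a small sketch, not necessarily the exact code touched by this PR:

```python
import numpy as np
from sklearn.utils import check_array

for X in (np.empty((0, 3)), np.empty((3, 0))):
    try:
        # With the defaults ensure_min_samples=1 and ensure_min_features=1,
        # empty inputs are rejected with an explicit message.
        check_array(X)
    except ValueError as exc:
        # e.g. "Found array with 0 sample(s)/feature(s) ... while a minimum of 1 is required"
        print(exc)
```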

@amueller (Member)

+1 on failing with a good error.

@ogrisel ogrisel force-pushed the fix-empty-input-data-common-checks branch from 25898de to 549d98c Compare February 16, 2015 14:18
@ogrisel (Member Author) commented Feb 16, 2015

Somewhat related: I opened #4252 to track the inconsistent handling of 1D input. Fixing #4252 is probably not required to fix the remaining failures of this PR, but it's good to keep in mind as we move closer to 1.0.

# Convert data
# ensure_2d=False because there are actually unit test checking we fail
# for 1d. FIXME make this consistent in the future.
X = check_array(X, dtype=DTYPE, ensure_2d=False, accept_sparse="csc")
@ogrisel (Member Author):

@amueller I removed ensure_2d=False here and no unit test failed, contrary to what the FIXME inline comment says. Do you remember which tests used to fail?

This change is technically not required for this PR. I can revert it if you prefer to deal with this case in another PR such as the one tackling #4252 for instance.

Member:

That was a tree-specific test.

Member:

Or rather forest-specific.

@ogrisel (Member Author):

It no longer fails, apparently.
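
For readers following along, a small illustration of what ensure_2d controls (behaviour on 1d input has changed across scikit-learn versions; recent releases raise by default):

```python
import numpy as np
from sklearn.utils import check_array

X_1d = np.array([1.0, 2.0, 3.0])

try:
    check_array(X_1d)  # ensure_2d=True is the default
except ValueError as exc:
    print("rejected 1d input:", exc)

# Opting out keeps the 1d array as-is (this is what the removed flag allowed).
print(check_array(X_1d, ensure_2d=False).shape)  # (3,)
```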

@ogrisel (Member Author) commented Feb 16, 2015

I fixed all the failures revealed by the new checks.

@ogrisel ogrisel changed the title [WIP] Fix empty input data common checks [MRG] Fix empty input data common checks Feb 16, 2015
@ogrisel ogrisel added the Bug label Feb 23, 2015
@ogrisel ogrisel added this to the 0.16 milestone Feb 23, 2015
# Check that all estimator yield informative messages when
# trained on empty datasets
multi_output = name.startswith('MultiTask')
yield (check_estimators_empty_data_messages,
Member:

why don't you use the "make_y_multioutput" helper?

@ogrisel (Member Author):

I cannot find what you are talking about in the code base.

Member:

Sorry I meant y = multioutput_estimator_convert_y_2d(name, y)

@amueller (Member)

Maybe add a whatsnew (helpful for looking up when something was fixed), and maybe use the multioutput helper. Otherwise +1

@@ -318,6 +319,28 @@ def check_estimators_dtypes(name, Estimator):
        pass


def check_estimators_empty_data_messages(name, Estimator, multi_output=False):
    with warnings.catch_warnings(record=True):
Member:

I would use the ignore warnings decorator.

Member:

Can you explain why you need this? I am missing something here.

@ogrisel (Member Author):

> I would use the ignore warnings decorator.

Indeed.

> Can you explain why you need this? I am missing something here.

I will relaunch the tests without the catch to make sure.
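
For orientation, a rough sketch of what a check like check_estimators_empty_data_messages asserts; the real implementation lives in sklearn/utils/estimator_checks.py and differs in detail (the helper name below, assert_empty_data_error, is hypothetical):

```python
import numpy as np
from sklearn.linear_model import Ridge

def assert_empty_data_error(estimator):
    # Fitting on a 0-sample design matrix must raise a ValueError whose
    # message points at the offending dimension.
    X_zero_samples = np.empty((0, 3))
    y = np.array([])
    try:
        estimator.fit(X_zero_samples, y)
    except ValueError as exc:
        assert "0 sample(s)" in str(exc) or "0 feature(s)" in str(exc), str(exc)
    else:
        raise AssertionError("fit on empty data did not raise ValueError")

assert_empty_data_error(Ridge())
```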

@ogrisel ogrisel force-pushed the fix-empty-input-data-common-checks branch from 090ad08 to 67f6ad1 Compare February 25, 2015 14:33
@ogrisel (Member Author) commented Feb 25, 2015

@arjoly I addressed your comments.

@amueller I don't know which helper you are referring to...

@ogrisel ogrisel changed the title [MRG] Fix empty input data common checks [MRG+1] Fix empty input data common checks Feb 25, 2015
@arjoly (Member) commented Feb 25, 2015

LGTM

@arjoly arjoly changed the title [MRG+1] Fix empty input data common checks [MRG+2] Fix empty input data common checks Feb 25, 2015
X_zero_features = np.empty(0).reshape(3, 0)
# the following y should be accepted by both classifiers and regressors
# and ignored by unsupervised models
if multi_output:
Member:

So here you could do

    y = multioutput_estimator_convert_y_2d(name, y)

and remove the multioutput argument. This is how it is done in all the other tests.
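
For readers unfamiliar with that helper, a hedged approximation of what it does (the function below is illustrative only; the authoritative version is multioutput_estimator_convert_y_2d in sklearn.utils.estimator_checks at the time of this PR):

```python
import numpy as np

def convert_y_2d_for_multitask(name, y):
    # MultiTask* estimators expect a 2d y (one column per task); everything
    # else keeps the original 1d y.
    if name.startswith('MultiTask'):
        return np.reshape(y, (-1, 1))
    return y

y = np.array([1, 0, 2])
print(convert_y_2d_for_multitask('MultiTaskLassoCV', y).shape)  # (3, 1)
print(convert_y_2d_for_multitask('RidgeClassifier', y).shape)   # (3,)
```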

@ogrisel ogrisel force-pushed the fix-empty-input-data-common-checks branch from 7d7b95a to feceab6 Compare February 26, 2015 09:46
@ogrisel ogrisel merged commit feceab6 into scikit-learn:master Feb 26, 2015
@ogrisel (Member Author) commented Feb 26, 2015

Thanks @amueller, I addressed your comment, squashed and merged.
