Ensure that fitting on 1D input data is consistent across estimators · Issue #4252 · scikit-learn/scikit-learn · GitHub
Closed
@ogrisel

Description

As noted by @amueller in issue #3440 "Input validation refactoring", fitting on 1D input is currently not consistent across estimators.

Depending on the model, the following case is treated either as an array of 100 samples with 1 feature (as MinMaxScaler does, for instance) or as 1 sample with 100 features, which is what most models currently do even though it is almost always counter-intuitive.

import numpy as np

x = np.linspace(0, 1, 100)  # 1D array of shape (100,)
estimator.fit(x, y)  # estimator and y stand for any model and its target

Most models use check_X_y(X, y) or check_array(X) and leave the ensure_2d=True kwarg at its default value. The current behavior of ensure_2d is to cast a 1D array to a row vector (1 sample with len(X) features). This was done for backward compatibility reasons.
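To make the two readings concrete, here is a minimal sketch using only NumPy. It does not call any scikit-learn validation code; the variable names `as_samples` and `as_features` are made up for illustration, and the explicit reshapes only mimic the two interpretations described above.

```python
import numpy as np

x = np.linspace(0, 1, 100)  # 1D array, shape (100,)

# Reading A: 100 samples with 1 feature each (column vector).
as_samples = x.reshape(-1, 1)

# Reading B: 1 sample with 100 features (row vector).
as_features = x.reshape(1, -1)

print(as_samples.shape)   # (100, 1)
print(as_features.shape)  # (1, 100)
```

Passing an explicitly reshaped 2D array to fit removes the ambiguity regardless of how a given estimator handles 1D input.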

I think this behavior is counter-intuitive, and we should break backward compatibility for this edge case so that 1D arrays are always treated as multi-sample collections of a single feature rather than the opposite.

I mark this issue as an API discussion. It should be tackled before version 1.0.
