[MRG+1] NaN handling MinMaxScaler by LucijaGregov · Pull Request #11005 · scikit-learn/scikit-learn · GitHub

[MRG+1] NaN handling MinMaxScaler #11005

Merged
merged 13 commits into from
Apr 21, 2018

Contributor
@LucijaGregov LucijaGregov commented Apr 21, 2018

Reference Issues/PRs

Partially addresses #10404

What does this implement/fix? Explain your changes.

Pass NaN values through in MinMaxScaler.

Any other comments?

@glemaitre glemaitre changed the title (WiP) Nan handling minmax scaler [WIP] Nan handling minmax scaler Apr 21, 2018
@glemaitre glemaitre changed the title [WIP] Nan handling minmax scaler [WIP] NaN handling MinMaxScaler Apr 21, 2018
@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None):
                              "You may consider to use MaxAbsScaler instead.")

         X = check_array(X, copy=self.copy, warn_on_dtype=True,
-                        estimator=self, dtype=FLOAT_DTYPES)
+                        estimator=self, dtype=FLOAT_DTYPES,
+                        force_all_finite="allow-nan")

         data_min = np.min(X, axis=0)
Member

It should be np.nanmin, because np.min([1, 2, 3, np.nan]) returns np.nan.
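
The point of this comment can be checked directly. The snippet below is a minimal sketch using the same values as the comment; the variable names are illustrative only and do not come from the PR.

```python
import numpy as np

# One feature column containing a missing value (same values as the comment).
values = np.array([1.0, 2.0, 3.0, np.nan])

# np.min propagates NaN: one missing entry poisons the whole reduction.
plain_min = np.min(values)

# np.nanmin ignores NaN entries and reduces over the remaining values.
nan_aware_min = np.nanmin(values)

print(plain_min)      # nan
print(nan_aware_min)  # 1.0
```

With np.min, a single missing entry in a column would make that column's fitted minimum NaN, which is why the NaN-aware reduction is needed once NaN is allowed through input validation.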

@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None):
                              "You may consider to use MaxAbsScaler instead.")

         X = check_array(X, copy=self.copy, warn_on_dtype=True,
-                        estimator=self, dtype=FLOAT_DTYPES)
+                        estimator=self, dtype=FLOAT_DTYPES,
+                        force_all_finite="allow-nan")

         data_min = np.min(X, axis=0)
         data_max = np.max(X, axis=0)
Member

np.max -> np.nanmax
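
Taken together, the two review suggestions amount to fitting the per-feature range with NaN-aware reductions and letting NaN flow through the arithmetic unchanged. The following is a rough sketch of that idea in plain NumPy, not the scikit-learn implementation; the function name `nan_aware_min_max` and its `feature_range` handling are assumptions made for illustration.

```python
import numpy as np


def nan_aware_min_max(X, feature_range=(0.0, 1.0)):
    """Hypothetical helper: scale each column, ignoring NaN when fitting."""
    X = np.asarray(X, dtype=float)
    # NaN-aware reductions, as requested in the review.
    data_min = np.nanmin(X, axis=0)
    data_max = np.nanmax(X, axis=0)
    data_range = data_max - data_min
    # Guard against division by zero for constant columns.
    data_range[data_range == 0.0] = 1.0
    low, high = feature_range
    scale = (high - low) / data_range
    # NaN entries stay NaN because arithmetic on NaN yields NaN.
    return (X - data_min) * scale + low


X = np.array([[1.0, 10.0],
              [np.nan, 20.0],
              [3.0, np.nan],
              [5.0, 30.0]])
X_scaled = nan_aware_min_max(X)
print(X_scaled)
```

In this sketch the missing entries come out as NaN in the same positions, while the observed values in each column are mapped onto [0, 1].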

@glemaitre glemaitre changed the title [WIP] NaN handling MinMaxScaler [MRG] NaN handling MinMaxScaler Apr 21, 2018
@@ -276,6 +276,10 @@ class MinMaxScaler(BaseEstimator, TransformerMixin):

Notes
-----

Member

Remove this empty line

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
by :user: `Lucija Gregov <LucijaGregov>`

Member

Remove this empty line

@@ -73,7 +73,7 @@
                'RANSACRegressor', 'RadiusNeighborsRegressor',
                'RandomForestRegressor', 'Ridge', 'RidgeCV']

-ALLOW_NAN = ['QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer']
+ALLOW_NAN = ['Imputer', 'SimpleImputer', 'MICEImputer', 'MinMaxScaler', 'QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer']
Member

There is some repetition in this line.
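
For illustration, the repetition is easy to see by collapsing the list to its first occurrences. This is only a sketch of what a de-duplicated list could look like; the name `raw` is hypothetical, and the list that was finally committed may differ in content or order.

```python
# The list as written in the diff under review (hypothetical variable name).
raw = ['Imputer', 'SimpleImputer', 'MICEImputer', 'MinMaxScaler',
       'QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer']

# dict.fromkeys keeps the first occurrence of each name and preserves order.
ALLOW_NAN = list(dict.fromkeys(raw))

print(ALLOW_NAN)
```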

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
Member

no space after :issue:

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
by :user: `Lucija Gregov <LucijaGregov>`
Member

no space after :user:

@@ -94,6 +94,10 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404`
Member

:class:`preprocessing.MinMaxScaler`

@glemaitre glemaitre changed the title [MRG] NaN handling MinMaxScaler [MRG+ 1] NaN handling MinMaxScaler Apr 21, 2018
@glemaitre glemaitre changed the title [MRG+ 1] NaN handling MinMaxScaler [MRG+1] NaN handling MinMaxScaler Apr 21, 2018
@glemaitre
Member

ping @rth

Member
@rth rth left a comment

See minor comment below

LGTM otherwise.

@@ -94,6 +94,9 @@ Preprocessing
other features in a round-robin fashion. :issue:`8478` by
:user:`Sergey Feldman <sergeyf>`.

- Updated :class:`preprocessing.MinMaxScaler` to pass through NaN values. :issue:`10404`
by :user:`Lucija Gregov <LucijaGregov>`
Member

Dot missing at the end.

@rth rth merged commit f1aedf6 into scikit-learn:master Apr 21, 2018
@jnothman jnothman mentioned this pull request Jun 16, 2018
7 tasks