-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG+1] NaN handling MinMaxScaler #11005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG+1] NaN handling MinMaxScaler #11005
Conversation
sklearn/preprocessing/data.py
Outdated
@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None): | |||
"You may consider to use MaxAbsScaler instead.") | |||
|
|||
X = check_array(X, copy=self.copy, warn_on_dtype=True, | |||
estimator=self, dtype=FLOAT_DTYPES) | |||
estimator=self, dtype=FLOAT_DTYPES, | |||
force_all_finite="allow-nan") | |||
|
|||
data_min = np.min(X, axis=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be np.nanmin
because np.min([1, 2, 3, np.nan]) will return np.nan
sklearn/preprocessing/data.py
Outdated
@@ -340,7 +340,8 @@ def partial_fit(self, X, y=None): | |||
"You may consider to use MaxAbsScaler instead.") | |||
|
|||
X = check_array(X, copy=self.copy, warn_on_dtype=True, | |||
estimator=self, dtype=FLOAT_DTYPES) | |||
estimator=self, dtype=FLOAT_DTYPES, | |||
force_all_finite="allow-nan") | |||
|
|||
data_min = np.min(X, axis=0) | |||
data_max = np.max(X, axis=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.max
-> np.nanmax
Add a what's new entry similarly to: https://github.com/scikit-learn/scikit-learn/pull/10437/files#diff-e1d2c7e554aa3e481389d1faa2eb5136 |
It would also be nice to have a note: |
sklearn/preprocessing/data.py
Outdated
@@ -276,6 +276,10 @@ class MinMaxScaler(BaseEstimator, TransformerMixin): | |||
|
|||
Notes | |||
----- | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this empty line
doc/whats_new/v0.20.rst
Outdated
@@ -94,6 +94,10 @@ Preprocessing | |||
other features in a round-robin fashion. :issue:`8478` by | |||
:user:`Sergey Feldman <sergeyf>`. | |||
|
|||
- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404` | |||
by :user: `Lucija Gregov <LucijaGregov>` | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this empty line
sklearn/utils/estimator_checks.py
Outdated
@@ -73,7 +73,7 @@ | |||
'RANSACRegressor', 'RadiusNeighborsRegressor', | |||
'RandomForestRegressor', 'Ridge', 'RidgeCV'] | |||
|
|||
ALLOW_NAN = ['QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer'] | |||
ALLOW_NAN = ['Imputer', 'SimpleImputer', 'MICEImputer', 'MinMaxScaler', 'QuantileTransformer', 'Imputer', 'SimpleImputer', 'MICEImputer'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There some repetition in this line.
doc/whats_new/v0.20.rst
Outdated
@@ -94,6 +94,10 @@ Preprocessing | |||
other features in a round-robin fashion. :issue:`8478` by | |||
:user:`Sergey Feldman <sergeyf>`. | |||
|
|||
- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. L 8000 earn more.
no space after :issue:
doc/whats_new/v0.20.rst
Outdated
@@ -94,6 +94,10 @@ Preprocessing | |||
other features in a round-robin fashion. :issue:`8478` by | |||
:user:`Sergey Feldman <sergeyf>`. | |||
|
|||
- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404` | |||
by :user: `Lucija Gregov <LucijaGregov>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no space after :user:
doc/whats_new/v0.20.rst
Outdated
@@ -94,6 +94,10 @@ Preprocessing | |||
other features in a round-robin fashion. :issue:`8478` by | |||
:user:`Sergey Feldman <sergeyf>`. | |||
|
|||
- Updated :class: `MinMaxScaler` to pass through NaN values. :issue: `10404` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`preprocessing.MinMaxScaler`
ping @rth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See minor comment below
LGTM otherwise.
doc/whats_new/v0.20.rst
Outdated
@@ -94,6 +94,9 @@ Preprocessing | |||
other features in a round-robin fashion. :issue:`8478` by | |||
:user:`Sergey Feldman <sergeyf>`. | |||
|
|||
- Updated :class:`preprocessing.MinMaxScaler` to pass through NaN values. :issue:`10404` | |||
by :user:`Lucija Gregov <LucijaGregov>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dot missing at the end.
Reference Issues/PRs
partially adressed #10404
What does this implement/fix? Explain your changes.
Pass-through NaN value in
MinMaxScaler
Any other comments?