_handle_zeros_in_scale causing improper scaling when using StandardScaler()

Describe the bug

The function _handle_zeros_in_scale checks scale == 0.0 with no floating-point tolerance. As a result, floating-point rounding can make the check miss a scale that should be zero, so the scale is not reset to 1.0. StandardScaler() can then return incorrectly scaled values, because the scale stays near 0 instead of being set to 1, which introduces numerical instability.

Steps/Code to Reproduce

from sklearn.preprocessing import StandardScaler
import numpy as np

# Constant column filled with 14.62: triggers the issue
data_fails = np.full((1000, 1), 14.62, dtype=float)
# Constant column filled with 100.0: works as intended
data_works = np.full((1000, 1), 100.0, dtype=float)

scaler_fails = StandardScaler()
scaler_works = StandardScaler()

scaled_fails = scaler_fails.fit_transform(data_fails)  # returns array filled with -1.0
scaled_works = scaler_works.fit_transform(data_works)  # returns array filled with 0.0

print('\n Results: \n\n')
print(scaled_fails[0][0])
print(scaled_works[0][0])

Expected Results

Both scaled results should be zero vectors, since both inputs are constant-valued vectors.

Actual Results

Standard scaling subtracts the mean and divides by the standard deviation when the corresponding flags (with_mean and with_std) are set, as in the example above, where both keep their default value of True. The variance of a constant-valued vector is 0, so the resulting scale should be caught and replaced by 1 in _handle_zeros_in_scale. However, this does not happen because of small deviations introduced by floating-point representation. The centered data are then divided by a tiny floating-point value, which produces incorrect scaling in StandardScaler().
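The effect described above can be sketched with plain NumPy. This is only a stand-in: StandardScaler computes its statistics with its own internal routines, so the exact numbers may differ, and whether the variance of the 14.62 column comes out as a tiny nonzero value rather than exactly 0.0 can depend on the NumPy version and platform. The point is that an exact == 0.0 comparison only catches a value that is bit-for-bit zero.

```python
import numpy as np

# Same constant columns as in the reproduction above.
data_fails = np.full((1000, 1), 14.62, dtype=float)
data_works = np.full((1000, 1), 100.0, dtype=float)

# Plain NumPy stand-ins for the statistics StandardScaler estimates
# (assumption: sklearn's internal computation differs in detail).
var_fails = data_fails.var(axis=0)[0]
var_works = data_works.var(axis=0)[0]

# An exact comparison only catches variances that are bit-for-bit zero.
print(var_fails, var_fails == 0.0)
print(var_works, var_works == 0.0)
```

The 100.0 column sums exactly in binary floating point, so its variance is exactly zero. The 14.62 column has no exact binary representation, so rounding in the mean can leave a tiny residual.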

The error occurs at line 77 of my version of _data, inside the function _handle_zeros_in_scale. The line currently reads:
scale[scale == 0.0] = 1.0
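One possible direction, shown here only as a minimal sketch, is to compare against a small tolerance instead of exact zero. The function name handle_zeros_with_tolerance and the threshold of 10 times machine epsilon are hypothetical choices made for illustration; they are not the library's API, and a suitable tolerance would need to be decided by the maintainers.

```python
import numpy as np


def handle_zeros_with_tolerance(scale):
    """Hypothetical variant of the zero check that uses a tolerance.

    Entries of `scale` whose magnitude is below a small threshold are
    treated as zero and replaced by 1.0, so dividing by them is a no-op.
    """
    # Work on a float copy so the caller's array is left untouched.
    scale = np.asarray(scale, dtype=float).copy()
    # Illustrative threshold (an assumption): 10 * machine epsilon.
    tol = 10 * np.finfo(scale.dtype).eps
    scale[np.abs(scale) < tol] = 1.0
    return scale


result = handle_zeros_with_tolerance([0.0, 1e-16, 2.5])
print(result.tolist())  # [1.0, 1.0, 2.5]
```

With a check like this, a scale of about 1e-16 produced by rounding would be treated the same as an exact 0.0, while a genuinely small but meaningful scale above the threshold would be left alone.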

Versions

Python dependencies:
pip: 20.0.2
setuptools: 47.1.1.post20200604
sklearn: 0.22.1
numpy: 1.18.1
scipy: 1.4.1
Cython: None
pandas: 1.0.3
matplotlib: 3.2.1
joblib: 0.15.1
