Description
The issue:
When applying StandardScaler to a big matrix, the memory overhead is very high.
Example:
import numpy as np
from sklearn.preprocessing import StandardScaler

big = np.random.random([495982, 4098])  # this is around 8 GB
scaler = StandardScaler()
scaler.fit(big)  # this will require nearly another 16 GB of RAM
I guess it uses some lookup tables to speed up the standard-deviation computation, but doubling the required RAM might be too much in some cases. A flag to enable a slower but less memory-intensive version would be nice.
Are there any solutions to reduce memory consumption?
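
One workaround that should bound the peak memory: StandardScaler supports partial_fit, so the mean and variance can be accumulated over row chunks instead of over the whole matrix at once, and constructing the scaler with copy=False lets transform scale the array in place. A minimal sketch; the chunk size of 10_000 rows is a placeholder to tune against the RAM you can spare:

import numpy as np
from sklearn.preprocessing import StandardScaler

big = np.random.random([495982, 4098])

# copy=False asks transform() to scale in place instead of allocating a full copy
scaler = StandardScaler(copy=False)

chunk = 10_000  # rows per batch; placeholder, tune to available memory
for start in range(0, big.shape[0], chunk):
    # partial_fit updates the running statistics one batch at a time,
    # so temporary allocations scale with the batch, not the whole matrix
    scaler.partial_fit(big[start:start + chunk])

big = scaler.transform(big)  # in place thanks to copy=False

Whether transform can really avoid the copy depends on the input's dtype and layout; for a C-contiguous float64 array like the one above it should.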