More memory efficient LinearClassifierMixin.predict · Issue #16381 · scikit-learn/scikit-learn
Open · rth opened this issue Feb 4, 2020 · 1 comment

rth (Member) commented Feb 4, 2020

I'm running out of memory with LogisticRegression.predict on a dataset with:

  • n_samples = 1M, n_features = 8k, n_classes = 800

This happens because LinearClassifierMixin.predict needs to compute the decision function for all classes, which involves an (n_samples, n_features) x (n_features, n_classes) multiplication and produces a dense (n_samples, n_classes) matrix (in my case ~6 GB).
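For scale, a quick back-of-the-envelope estimate of that dense score matrix, using the dataset sizes above and assuming NumPy's default float64 dtype:

```python
# Rough estimate of the (n_samples, n_classes) decision scores held in memory.
# Numbers are the ones from the dataset described above; float64 is assumed.
n_samples, n_classes = 1_000_000, 800
bytes_per_float64 = 8

scores_gb = n_samples * n_classes * bytes_per_float64 / 1024**3
print(f"dense (n_samples, n_classes) scores: ~{scores_gb:.1f} GB")  # ~6.0 GB
```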

Batching over n_samples is a solution, but it would be nice to have a standard way of doing that, either as a utility function used as a wrapper, e.g. batch_apply(LinearClassifierMixin.predict)(X), or as a predict parameter, e.g. LinearClassifierMixin.predict(X, batch_size=1000).
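For reference, a minimal manual-batching sketch of what such a helper could do. Neither batch_apply nor a batch_size parameter exists in scikit-learn today, and the name predict_in_batches below is purely illustrative:

```python
import numpy as np

def predict_in_batches(estimator, X, batch_size=10_000):
    """Call estimator.predict on row chunks of X to bound peak memory.

    Works for dense arrays and scipy.sparse matrices, since both support
    row slicing; only a (batch_size, n_classes) score block is materialized
    at a time inside predict.
    """
    n_samples = X.shape[0]
    predictions = [
        estimator.predict(X[start:start + batch_size])
        for start in range(0, n_samples, batch_size)
    ]
    return np.concatenate(predictions)
```

Usage would then be something like y_pred = predict_in_batches(clf, X, batch_size=1000), at the cost of repeated Python-level calls into predict.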

This is somewhat related to parallelizing predict in general (since batching also happens there). I can't find the corresponding issue right now.

This particular case is made worse by the fact that LogisticRegression.coef_ is not enforced to be float32 for float32 sparse training input (with the liblinear solver); see the related issue #8769.
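As a stop-gap sketch (an assumption on my part, not a library feature), one can downcast the fitted attributes of an already-trained classifier before predicting; whether the scores actually come out as float32 also depends on X itself being float32, and it trades some numerical precision for memory:

```python
import numpy as np

# Stop-gap sketch, not a scikit-learn feature: downcast the fitted attributes of
# an already-trained classifier `clf` so that, for float32 input X, the dense
# (n_samples, n_classes) decision scores are float32 instead of float64.
clf.coef_ = clf.coef_.astype(np.float32)
clf.intercept_ = clf.intercept_.astype(np.float32)
```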

rth added the Performance label on Feb 4, 2020
jnothman (Member) commented Feb 4, 2020 via email

cmarmo added the module:linear_model label on Mar 29, 2022