EFF Speed-up MiniBatchDictionaryLearning by avoiding multiple validation #25490

jeremiedbb · 2023-01-26T15:47:10Z

MiniBatchDictionaryLearning calls public functions that themselves call other public functions and classes. We end up validating the parameters and the input/dict twice per minibatch. When the batch size is large this has barely any impact, but for small batch sizes the overhead can be very detrimental.

For instance, here's a profiling result in the extreme case batch_size=1.

This PR removes the first validation block, which is a parameter validation coming from sparse_encode. In the profile taken after the change, that first block is gone. I intend to deal with the other ones in follow-up PRs.
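The general pattern described above can be sketched as follows. This is only an illustration under assumptions, not the code in this PR: `_validate`, `_encode`, `encode`, and `fit_minibatches` are hypothetical names, and a least-squares solve stands in for the actual sparse coding step. The public function validates once and delegates to a private helper, and the minibatch loop validates up front and then calls the private helper directly.

```python
import numpy as np

# Counter used only to make the validation overhead visible in this sketch.
VALIDATION_CALLS = {"count": 0}


def _validate(X, dictionary):
    """Hypothetical stand-in for parameter and input validation."""
    VALIDATION_CALLS["count"] += 1
    X = np.asarray(X, dtype=float)
    dictionary = np.asarray(dictionary, dtype=float)
    if X.ndim != 2 or dictionary.ndim != 2:
        raise ValueError("X and dictionary must be 2D arrays")
    if X.shape[1] != dictionary.shape[1]:
        raise ValueError("X and dictionary must have the same number of features")
    return X, dictionary


def _encode(X, dictionary):
    """Private helper: assumes its inputs have already been validated."""
    # Least-squares codes, a simple stand-in for a sparse coding solver.
    return np.linalg.lstsq(dictionary.T, X.T, rcond=None)[0].T


def encode(X, dictionary):
    """Public entry point: validate once, then delegate to the private helper."""
    X, dictionary = _validate(X, dictionary)
    return _encode(X, dictionary)


def fit_minibatches(X, dictionary, batch_size=1):
    """Validate once up front, then skip validation inside the batch loop."""
    X, dictionary = _validate(X, dictionary)
    codes = [
        _encode(X[start:start + batch_size], dictionary)
        for start in range(0, X.shape[0], batch_size)
    ]
    return np.vstack(codes)
```

With this layout, the number of validation calls no longer grows with the number of minibatches, which is where the cost shows up when the batch size is small.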

jeremiedbb · 2023-01-26T17:37:27Z

I opened an alternative in #25493, which involves a lot less refactoring.

ogrisel

I think this refactoring can make sense whether or not we merge #25493. WDYT?

jeremiedbb · 2023-01-27T13:09:13Z

Well now that it's done :)

jjerphan

Nice observation, @jeremiedbb! This LGTM.

I just have one remark. I also think we can merge this PR independently from #25493 as @ogrisel mentioned in #25490 (review) once a changelog entry is added. What do you think?

sklearn/decomposition/_dict_learning.py

jeremiedbb · 2023-03-06T18:26:07Z

I added a what's new entry. Yes, let's merge this one first.

jeremiedbb added 2 commits January 26, 2023 16:19

refactor sparse encode to better avoid validation

313ff65

more explicit docstrings

50f7e6b

jeremiedbb added the Performance label Jan 26, 2023

github-actions bot added the module:decomposition label Jan 26, 2023

Merge branch 'main' into less-validation-sparse-coding

6c89e34

jeremiedbb mentioned this pull request Jan 26, 2023

EFF Speed-up MiniBatchDictionaryLearning by avoiding multiple validation #25493

Closed

ogrisel approved these changes Jan 27, 2023


ogrisel added the Waiting for Second Reviewer label Feb 7, 2023

jjerphan approved these changes Mar 6, 2023


sklearn/decomposition/_dict_learning.py (resolved)

what's new entry

664b48e

Merge branch 'main' into less-validation-sparse-coding

ac7a542

jjerphan merged commit 408f561 into scikit-learn:main Mar 7, 2023
