ENH Optimize dot product order for LogisticRegression for dense matrices #19571
Conversation
np.linalg.multi_dot quickly chooses the best order for the multiplication of three matrices. See: https://github.com/numpy/numpy/blob/860c8b82939d1535351f7b651c284a55efe21b10/numpy/linalg/linalg.py#L2742
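As an editorial aside (not part of the PR), a minimal sketch of why the multiplication order matters for the product X.T @ dX @ s used here. The shapes below are made up for illustration only.

```python
import numpy as np

n_samples, n_features, n_cols = 10_000, 100, 100  # hypothetical sizes

rng = np.random.default_rng(0)
X = rng.standard_normal((n_samples, n_features))
dX = rng.standard_normal((n_samples, n_features))
s = rng.standard_normal((n_features, n_cols))

# Rough multiplication counts of the two parenthesizations.
cost_chain = 2 * n_samples * n_features * n_cols                       # X.T @ (dX @ s)
cost_gram_first = n_samples * n_features**2 + n_features**2 * n_cols   # (X.T @ dX) @ s
print(cost_chain, cost_gram_first)  # 200_000_000 vs 101_000_000

# multi_dot picks the cheaper order; the result is the same either way.
assert np.allclose(np.linalg.multi_dot([X.T, dX, s]), X.T @ (dX @ s))
```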
Commits 3b60347 to 22a620d
Here are some results of a first simple benchmark on dense matrices.

```
$ asv run -b MultiDotLogReg
· Creating environments
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Benchmarking conda-py3.9-cython-joblib-numpy-scipy-threadpoolctl
[100.00%] ··· multi_dot_logreg.MultiDotLogReg.time_hessian_grad_prod           ok
```

Timings for each `s_n_cols_prop / mult_strategy` combination:

| n_samples | n_features | 0.5 / chain_np_dot | 0.5 / use_multi_dot | 1 / chain_np_dot | 1 / use_multi_dot | 2 / chain_np_dot | 2 / use_multi_dot |
|-----------|------------|--------------------|---------------------|------------------|-------------------|------------------|-------------------|
| 100       | 100        | 96.3±20μs          | 126±30μs            | 158±10μs         | 183±10μs          | 278±30μs         | 257±60μs          |
| 100       | 1000       | 5.63±0.9ms         | 6.46±2ms            | 15.8±5ms         | 15.0±4ms          | 37.6±6ms         | 33.0±4ms          |
| 1000      | 100        | 939±400μs          | 1.84±0.5ms          | 1.44±0.5ms       | 811±200μs         | 3.10±1ms         | 877±200μs         |
| 1000      | 1000       | 55.5±10ms          | 50.9±10ms           | 117±10ms         | 114±9ms           | 234±20ms         | 169±20ms          |
| 10000     | 100        | 9.85±1ms           | 10.00±0.2ms         | 14.8±2ms         | 6.97±1ms          | 27.2±3ms         | 6.56±0.3ms        |
| 10000     | 1000       | 550±50ms           | 689±30ms            | 1.31±0.03s       | 740±50ms          | 2.55±0.09s       | 710±30ms          |
| 100000    | 100        | 107±7ms            | 123±20ms            | 199±30ms         | 75.3±8ms          | 346±40ms         | 102±20ms          |
| 100000    | 1000       | 5.98±0.1s          | 5.62±0.4s           | 13.9±0.5s        | 6.66±0.07s        | 27.5±0.4s        | 6.74±0.06s        |

Edit: link to the first revision of the asv script.
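The linked script isn't reproduced here; purely as an illustration, a rough asv-style sketch of what such a benchmark could look like. The parameter names and the timed method name are taken from the output above; everything else (setup, shapes, RNG) is an assumption, not the actual script.

```python
import numpy as np


class MultiDotLogReg:
    # Parameter grid mirroring the table above.
    param_names = ["n_samples", "n_features", "s_n_cols_prop", "mult_strategy"]
    params = (
        [100, 1_000, 10_000, 100_000],
        [100, 1_000],
        [0.5, 1, 2],
        ["chain_np_dot", "use_multi_dot"],
    )

    def setup(self, n_samples, n_features, s_n_cols_prop, mult_strategy):
        rng = np.random.default_rng(42)
        self.X = rng.standard_normal((n_samples, n_features))
        self.dX = rng.standard_normal((n_samples, n_features))
        n_cols = max(1, int(s_n_cols_prop * n_features))
        self.s = rng.standard_normal((n_features, n_cols))

    def time_hessian_grad_prod(
        self, n_samples, n_features, s_n_cols_prop, mult_strategy
    ):
        if mult_strategy == "chain_np_dot":
            self.X.T.dot(self.dX.dot(self.s))
        else:
            np.linalg.multi_dot([self.X.T, self.dX, self.s])
```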
Nice 4x speed-up!
Note that this speed-up is present only for matrices that are sufficiently big and with a large […]. We have no clue yet about performance for sparse matrices either. 😉
Do the benchmark results change if the matrices are F-ordered?
I have just updated the asv benchmark script to introduce memory layouts. Though performance differs a bit in some configurations with Fortran-ordered matrices, the change still comes with a speed-up in most cases.
Note that in both cases, as […]. Moreover, the runtimes for […].
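For anyone reproducing the memory-layout comparison, a minimal sketch (independent of the actual benchmark script) of how C- and Fortran-ordered inputs can be built and checked:

```python
import numpy as np

rng = np.random.default_rng(0)
X_c = rng.standard_normal((10_000, 100))  # C-ordered, the numpy default
X_f = np.asfortranarray(X_c)              # same values, Fortran-ordered copy
dX = rng.standard_normal((10_000, 100))
s = rng.standard_normal((100, 50))

assert X_c.flags["C_CONTIGUOUS"] and X_f.flags["F_CONTIGUOUS"]

# Only the memory layout (and hence BLAS performance) differs; results match.
assert np.allclose(
    np.linalg.multi_dot([X_c.T, dX, s]),
    np.linalg.multi_dot([X_f.T, dX, s]),
)
```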
I am +1 on moving forward with this since it is already a net benefit for dense matrices.
We can explore sparse matrices in another PR.
@thomasjpfan: I have modified the title and description of the PR accordingly.
Please add an entry to the change log at doc/whats_new/v1.0.rst with tag |Efficiency|. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.
(It would be interesting to try the out parameter of multi_dot, but it is only available for numpy >= 1.19.)
+1 for adding a TODO comment explicitly mentioning the minimum numpy version requirement.
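A hedged sketch of what using that parameter could look like once the minimum numpy version allows it; the array names and shapes below are placeholders, not the PR's variables.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1_000, 20))
dX = rng.standard_normal((1_000, 20))
s = rng.standard_normal((20, 5))

# Preallocate the output once and let multi_dot write into it
# (the `out` keyword requires numpy >= 1.19).
ret = np.empty((20, 5))
np.linalg.multi_dot([X.T, dX, s], out=ret)
```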
Also +1, once the what's new entry for 1.0 is added.
If we have not benchmarked the sparse case, why are we changing that code path? We have changed the operation order for the sparse case, unless I'm missing something. I would rather we reverted to the previous code for sparse.
I agree with @rth. The order should be kept the same for the sparse case.
Alternatively, and if we aren't in a rush, can we have the sparse matrix case(s) treated in this PR as well? IMO, this would avoid adding intermediary commits meant to preserve the code path for sparse matrices, which might add complexity to the implementation. What do you think?
I looked at it briefly at the time, and as far as I can tell the sparse case is not obvious, because it might depend on both the matrix dimensions and the sparsity of each array. It would probably be better to look into it in a separate PR in any case.
I'm not sure I follow; it would be

```diff
--- a/sklearn/linear_model/_logistic.py
+++ b/sklearn/linear_model/_logistic.py
@@ -233,7 +233,10 @@ def _logistic_grad_hess(w, X, y, alpha, sample_weight=None):
     def Hs(s):
         ret = np.empty_like(s)
-        ret[:n_features] = X.T.dot(dX.dot(s[:n_features]))
+        if sparse.issparse(X):
+            ret[:n_features] = X.T.dot(dX.dot(s[:n_features]))
+        else:
+            ret[:n_features] = np.linalg.multi_dot([X.T, dX, s[:n_features]])
         ret[:n_features] += alpha * s[:n_features]
         # For the fit intercept case.
```

wouldn't it? We don't care about individual commits since PRs are squashed anyway.
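As an editorial aside, a quick sanity check (not part of the diff above): the dense branch only changes the evaluation order, not the result. The shapes and the dense `dX` below are made-up assumptions.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
X_dense = rng.standard_normal((500, 30))
X_sparse = sparse.random(500, 30, density=0.1, format="csr", random_state=0)
dX = rng.standard_normal((500, 30))
s = rng.standard_normal(30)

# Dense branch: multi_dot may reorder the products, the values stay the same.
assert np.allclose(
    np.linalg.multi_dot([X_dense.T, dX, s]),
    X_dense.T.dot(dX.dot(s)),
)

# Sparse branch: keep the original right-to-left chaining.
ret_sparse = X_sparse.T.dot(dX.dot(s))
assert ret_sparse.shape == (30,)
```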
Before we merge this quickly, can we first benchmark with […]?
Also, if the logic is as simple as https://github.com/numpy/numpy/blob/860c8b82939d1535351f7b651c284a55efe21b10/numpy/linalg/linalg.py#L2742, we can put that before […].
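For reference, a rough re-statement (in plain Python, not a proposal for the sklearn code) of the cost comparison that numpy function performs for three matrices:

```python
import numpy as np


def dot_three_best_order(A, B, C):
    """Multiply A @ B @ C in the cheaper of the two possible parenthesizations.

    Simplified restatement of the three-matrix case of np.linalg.multi_dot,
    assuming 2-D inputs only.
    """
    a0, a1 = A.shape
    b1 = B.shape[1]
    c1 = C.shape[1]
    cost_left_first = a0 * a1 * b1 + a0 * b1 * c1   # (A @ B) @ C
    cost_right_first = a1 * b1 * c1 + a0 * a1 * c1  # A @ (B @ C)
    if cost_left_first < cost_right_first:
        return (A @ B) @ C
    return A @ (B @ C)
```

For the Hessian product here, A = X.T is (n_features, n_samples), B = dX is (n_samples, n_features) and C = s[:n_features]; when n_samples is much larger than n_features and s has enough columns, forming the small (n_features, n_features) product first wins, which matches the large-n_samples rows of the benchmark above.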
@rth: I misread what you wrote initially; I agree with you. @lorentzenchr has a good point. We could have the test executed once by defining […]. Please note that my benchmarks aren't comparing running times of this code path for […]. I would suggest inspecting running times for smaller […]. Let me know if it is a high priority, as I am currently busy. If it's the case and people want to have it move forward, I am not against other developers contributing to this PR. 🙂
This small change (back), in addition to a benchmark with s_n_cols_prop=0.1, and we're good to merge, I think.
+1 for reverting the sparse case to the original code as suggested in #19571 (review), and for merging the […].
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Sorry for this week's hiccups; here are the raw results from the latest benchmark with, as requested, s_n_cols_prop=0.1.
LGTM. Just one nitpick.
@jjerphan Thank you very much.
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
ENH Optimize dot product order for LogisticRegression for dense matrices (scikit-learn#19571)
* Use multi_dot for Hessian and gradient product. np.linalg.multi_dot quickly chooses the best order for the multiplication of three matrices.
Reference Issues/PRs
Contributes to solve #17684 for sklearn/linear_model/_logistic.py.

What does this implement/fix? Explain your changes.

np.linalg.multi_dot quickly chooses the best order for the multiplication of three dense matrices (see its implementation). This PR uses it for the Hessian and gradient dot product.
Any other comments?
Sparse matrices are to be explored in another PR.
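As a possible starting point for that follow-up, a hedged sketch (nothing here is from this PR) of how the two evaluation orders could be timed on sparse inputs, where the better choice likely depends on density as well as shape:

```python
import numpy as np
from scipy import sparse
from timeit import timeit

rng = np.random.default_rng(0)
X = sparse.random(100_000, 100, density=0.01, format="csr", random_state=0)
d = rng.random(100_000)
s = rng.standard_normal(100)
dX = X.multiply(d[:, np.newaxis]).tocsr()  # row-scaled X, kept sparse

# Right-to-left chaining (the order kept for sparse X in this PR).
t_chain = timeit(lambda: X.T.dot(dX.dot(s)), number=10)

# Gram-first order: build the small (n_features, n_features) matrix first.
t_gram = timeit(lambda: X.T.dot(dX).dot(s), number=10)

print(f"chain: {t_chain:.3f}s  gram-first: {t_gram:.3f}s")
```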