PERF Optimize dot product order #17684
Turns out [1] https://stackoverflow.com/questions/45852228/how-is-numpy-multi-dot-slower-than-numpy-dot

Yes, indeed, that's why benchmarking is useful in any case.
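A minimal sketch of the kind of check the StackOverflow link is about (not from the thread; shapes and repeat counts are illustrative): for very small matrices, `multi_dot`'s order-selection bookkeeping can outweigh any savings from a better multiplication order.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
small = [rng.random((4, 4)) for _ in range(3)]

# Both spellings compute the same product; only the overhead differs.
nested = small[0].dot(small[1]).dot(small[2])
chained = np.linalg.multi_dot(small)
assert np.allclose(nested, chained)

t_nested = timeit.timeit(lambda: small[0].dot(small[1]).dot(small[2]), number=5000)
t_multi = timeit.timeit(lambda: np.linalg.multi_dot(small), number=5000)
print(f"nested dot: {t_nested:.4f} s, multi_dot: {t_multi:.4f} s")
```

Absolute timings are machine-dependent, which is exactly why each replacement is worth benchmarking.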
I replaced the nested dot calls with `multi_dot`. The code in my Jupyter notebook looks something like this (each `%%timeit` line starts a new cell):

```python
import numpy as np

from sklearn.datasets import load_digits
from sklearn.decomposition import FastICA
from sklearn import linear_model

X, y = load_digits(return_X_y=True)
X.shape
# >>> (1797, 64)

transformer = FastICA(n_components=7, random_state=0)

%%timeit -r 20
X_transformed = transformer.fit_transform(X)

clf = linear_model.BayesianRidge()

%%timeit -r 20
clf.fit(X, y)

# Try with a bigger dataset
X = np.tile(X, (8, 8))
y = np.tile(y, 8)
X.shape, y.shape
# >>> ((14376, 512), (14376,))
```
Thanks for checking @d3b0unce! So it might be worth it there. For the first occurrence in `_samples_generator` I did see a small performance improvement when I benchmarked it earlier.
Hi @rth, I tested again with a (20000, 30) dataset, on a multi-core machine. The dataset was created like this:

```python
import numpy as np

np.random.seed(0)
X = np.random.rand(20000, 30)
y = np.random.rand(20000)
```
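For reference, a self-contained sketch of the comparison being benchmarked (illustrative shapes, not the exact scikit-learn code): `multi_dot` picks the cheaper parenthesization automatically, while nested `dot` always evaluates left to right.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((10, 1000))
B = rng.random((1000, 100))
C = rng.random((100, 5))

nested = A.dot(B).dot(C)                  # forces the (A B) C order
chained = np.linalg.multi_dot([A, B, C])  # free to pick A (B C) instead
assert np.allclose(nested, chained)

t_nested = timeit.timeit(lambda: A.dot(B).dot(C), number=200)
t_multi = timeit.timeit(lambda: np.linalg.multi_dot([A, B, C]), number=200)
print(f"(A B) C: {t_nested:.4f} s, multi_dot: {t_multi:.4f} s")
```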
Should I commit the changes for the classes below as a part of PR #17737?
Great!
Yes, please, it would be easier for reviewers to evaluate if it's all in one branch.
We should benchmark when using sparse data as well. By the way, do you understand this?

```python
import numpy as np
from scipy.sparse import random

X = random(m=10, n=100)
Y = random(m=100, n=100)
Z = random(m=100, n=10)
np.linalg.multi_dot([X, Y, Z])
```
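A sketch of a workaround (density and seed values are arbitrary): `multi_dot` doesn't understand scipy.sparse inputs, but chaining the matrices with `@` keeps the computation sparse and matches the dense `multi_dot` result.

```python
import numpy as np
from scipy.sparse import random as sparse_random

X = sparse_random(10, 100, density=0.1, random_state=0)
Y = sparse_random(100, 100, density=0.1, random_state=1)
Z = sparse_random(100, 10, density=0.1, random_state=2)

sparse_result = (X @ Y @ Z).toarray()  # evaluated left to right, stays sparse throughout
dense_result = np.linalg.multi_dot([X.toarray(), Y.toarray(), Z.toarray()])
assert np.allclose(sparse_result, dense_result)
```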
Yeah, I guess it doesn't work on sparse, and doesn't raise a clean exception either.
Yes, that makes sense. I got these errors when I changed them in the wrong places, such as in
I pushed the changes for FastICA, NMF, BayesianRidge, and ARDRegression in #17737. Do you want the changes for other modules that have not been benchmarked yet to go in a separate PR?
Sounds good. Let's merge that one first, I'll try to review in a day or so.
np.linalg.multi_dot has been shown to be faster for the dot product of 3 matrices. See: scikit-learn#17684 (comment)
When multiplying 3 or more matrices, the parenthesization doesn't change the result, but it can have a very significant impact on the number of operations and on performance; see https://en.wikipedia.org/wiki/Matrix_chain_multiplication
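A quick worked example (shapes are illustrative, not from the PR): counting the scalar multiplications shows why the order matters, since an (m, k) by (k, n) product costs m * k * n multiplications.

```python
def cost(m, k, n):
    # Multiplying an (m, k) matrix by a (k, n) matrix takes m * k * n
    # scalar multiplications with the textbook algorithm.
    return m * k * n

# A: (10, 1000), B: (1000, 100), C: (100, 5)
left = cost(10, 1000, 100) + cost(10, 100, 5)   # (A B) C
right = cost(1000, 100, 5) + cost(10, 1000, 5)  # A (B C)

print(left, right)  # 1005000 550000: the right-associated order does roughly half the work
```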
For matrix multiplication of dense arrays there is numpy.linalg.multi_dot, and I think we should use it. To find existing occurrences where it could be used, see for instance the result of
Ideally each replacement should be benchmarked.
For matrix multiplication with safe_sparse_dot, using a combination of sparse and dense matrices, some of this could apply as well, though defining a general heuristic is probably a bit more difficult there.