
PERF: use higher level BLAS functions #13210


Closed
jeremiedbb opened this issue Feb 21, 2019 · 10 comments
Labels
help wanted · Moderate · Performance

Comments

@jeremiedbb
Member

Now that we have access to the Cython bindings of BLAS functions from SciPy, we have access to level 3 functions which weren't present in the vendored CBLAS. It would be interesting to check whether they could be used in some places instead of a loop of lower-level functions.

For example, in cd_fast I saw this:

# XtA = np.dot(X.T, R) - beta * w
for i in range(n_features):
    XtA[i] = (_dot(n_samples, &X[0, i], 1, &R[0], 1)
              - beta * w[i])

This is a loop of dot calls (level 1) that could be replaced by a single gemv call (level 2).
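
For illustration, here is a minimal sketch of what the gemv version could look like, written directly against SciPy's Cython BLAS bindings rather than scikit-learn's internal helpers. It is not the actual scikit-learn code: it assumes double precision (the real code uses the floating fused type) and a Fortran-ordered X of shape (n_samples, n_features), and the helper name xta_gemv is made up.

# cython: boundscheck=False, wraparound=False
from scipy.linalg.cython_blas cimport dcopy, dgemv

cdef void xta_gemv(int n_samples, int n_features,
                   double[::1, :] X, double[::1] R,
                   double[::1] w, double beta,
                   double[::1] XtA) nogil:
    cdef:
        int inc = 1
        double one = 1.0
        double minus_beta = -beta
        char *trans = b'T'  # use X.T
    # XtA <- w
    dcopy(&n_features, &w[0], &inc, &XtA[0], &inc)
    # XtA <- 1.0 * X.T @ R + (-beta) * XtA, i.e. np.dot(X.T, R) - beta * w
    dgemv(trans, &n_samples, &n_features, &one, &X[0, 0], &n_samples,
          &R[0], &inc, &minus_beta, &XtA[0], &inc)

Handing the whole loop to a single gemv call lets the BLAS implementation apply its own vectorization and threading, which is the motivation for moving to higher-level routines.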

@rth
Member
rth commented Feb 21, 2019

Good idea!

@jakirkham might be interested as well.

@jakirkham
Contributor

So, funny story: we previously had PR #11507, which does exactly what you suggest for this cd_fast case. However, with the old vendored ATLAS reference BLAS it failed. Since you have nicely reworked scikit-learn to use SciPy's Cython BLAS, it now passes! 🎉 Thanks for reworking scikit-learn to use the Cython BLAS. 😄

@GaelVaroquaux
Member

I think that this can be closed, as #11507 has been merged. No?

@rth
Member
rth commented Feb 25, 2019

Yes, but are we sure there are no other places in the code where higher-level BLAS functions could be used? Someone could look for them...

@GaelVaroquaux
Member

Sounds right.

@jakirkham
Contributor

Other potential candidates include...

cdef floating[:] H = np.dot(Q, w)

for ii in range(n_features):
    tmp += w[ii] * H[ii]

# XtA = np.dot(X.T, R) - l2_reg * W.T
for ii in range(n_features):
    for jj in range(n_tasks):
        XtA[ii, jj] = _dot(
            n_samples, X_ptr + ii * n_samples, 1,
            &R[0, 0] + jj, n_tasks
            ) - l2_reg * W[jj, ii]

for center_idx in range(n_clusters):
    center_squared_norms[center_idx] = _dot(
        n_features, &centers[center_idx, 0], center_stride,
        &centers[center_idx, 0], center_stride)

for center_idx in range(n_clusters):
    center_squared_norms[center_idx] = _dot(
        n_features, &centers[center_idx, 0], 1,
        &centers[center_idx, 0], 1)
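
Of these, the multi-task XtA snippet maps most directly onto a higher-level routine: the double loop of level-1 dot calls is a single level-3 gemm. Here is a minimal sketch, again written against SciPy's Cython BLAS rather than scikit-learn's helpers and not the actual code: it assumes double precision and Fortran-ordered X (n_samples, n_features), R (n_samples, n_tasks) and XtA (n_features, n_tasks) (the snippet above actually strides through a C-ordered R), WT is a hypothetical Fortran-ordered copy of W.T, and the name xta_gemm is made up.

# cython: boundscheck=False, wraparound=False
from scipy.linalg.cython_blas cimport dcopy, dgemm

cdef void xta_gemm(int n_samples, int n_features, int n_tasks,
                   double[::1, :] X, double[::1, :] R,
                   double[::1, :] WT, double l2_reg,
                   double[::1, :] XtA) nogil:
    cdef:
        int inc = 1
        int size = n_features * n_tasks
        double one = 1.0
        double minus_l2 = -l2_reg
        char *transa = b'T'  # use X.T
        char *transb = b'N'  # use R as is
    # XtA <- W.T (both buffers are contiguous, so copy them flat)
    dcopy(&size, &WT[0, 0], &inc, &XtA[0, 0], &inc)
    # XtA <- 1.0 * X.T @ R + (-l2_reg) * XtA, i.e. np.dot(X.T, R) - l2_reg * W.T
    dgemm(transa, transb, &n_features, &n_tasks, &n_samples,
          &one, &X[0, 0], &n_samples, &R[0, 0], &n_samples,
          &minus_l2, &XtA[0, 0], &n_features)

The first snippet (accumulating tmp from w[ii] * H[ii]) is a plain inner product of w and H and could become a single dot call, while the k-means squared-norm loops don't map onto a single higher-level BLAS routine as directly.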

@cmarmo added the Moderate and help wanted labels and removed the Sprint label on Apr 1, 2021
@melissawm

Other potential candidates include...

Should each of these be in its own PR, or can we group them (say, all the _gemv substitutions) and submit one big PR?

@amueller
Member

Hm, do we want benchmarks for each of them? Then doing them one by one might be best.

@lorentzenchr
Member

The hints in #13210 (comment) about the coordinate descent solvers are addressed in #22972.

The ones about k-means were refactored (files renamed) in #17622 and are, I think, obsolete given the work that preceded that PR.

@jeremiedbb
Member Author

Closing based on the last comments in #22972.
