[cuBLAS] relax the restrictions on the use of cublasLt · Issue #153590 · pytorch/pytorch · GitHub
[cuBLAS] relax the restrictions on the use of cublasLt #153590
@JacoCheung

Description

🚀 The feature, motivation and pitch

Hi team, when I call torch.addmm(input, mat1, mat2) (an out-of-place operation) with all three inputs being 2D tensors (i.e. matrices), I see an explicit D2D memcpy, because the call falls back to the non-cublasLt routine, which only supports in-place operation.

See the nsys timeline: the D2D overhead is nearly half that of the actual GEMM.
[Image: nsys timeline]
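
To make the extra copy concrete, here is a minimal pure-Python sketch of an out-of-place addmm built on top of an in-place GEMM. It is not PyTorch's implementation: `gemm_inplace` and `addmm_via_inplace_gemm` are hypothetical names used only for illustration, and nested lists stand in for device buffers. The list copy plays the role of the D2D memcpy seen in the timeline.

```python
def gemm_inplace(alpha, a, b, beta, c):
    """In-place GEMM: overwrite c with alpha * (a @ b) + beta * c."""
    m, k, n = len(a), len(b), len(b[0])
    for i in range(m):
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            c[i][j] = alpha * acc + beta * c[i][j]


def addmm_via_inplace_gemm(inp, mat1, mat2, beta=1.0, alpha=1.0):
    """Out-of-place addmm implemented with an in-place GEMM (illustrative)."""
    # The extra copy: the result buffer must start out holding `inp`,
    # because the in-place GEMM reads and overwrites the same matrix.
    out = [row[:] for row in inp]
    gemm_inplace(alpha, mat1, mat2, beta, out)
    return out


inp = [[1.0, 2.0], [3.0, 4.0]]
mat1 = [[1.0, 0.0], [0.0, 1.0]]  # identity, so mat1 @ mat2 == mat2
mat2 = [[10.0, 20.0], [30.0, 40.0]]
result = addmm_via_inplace_gemm(inp, mat1, mat2)
```

A routine that can read C and write a separate D does not need the copy step, which is the point of this request.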

A conventional linear layer with bias, by contrast, incurs no such D2D copy.

The restrictions on using cublasLt are listed here. cublasLt is activated only when beta = 1 and input is a 1D tensor. However, according to the latest cublasLt doc:

This function supports both in-place matrix multiplication (C == D and Cdesc == Ddesc) and out-of-place matrix multiplication (C != D, both matrices must have the same data type, number of rows, number of columns, batch size, and memory order). In the out-of-place case, the leading dimension of C can be different from the leading dimension of D. Specifically the leading dimension of C can be 0 to achieve row or column broadcast. If Cdesc is omitted, this function assumes it to be equal to Ddesc.
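
The quoted indexing rule can be sketched in plain Python. This is only an illustration of the arithmetic, not a call into cublasLt: `matmul_out_of_place` is a hypothetical helper, and it assumes column-major storage where element (i, j) of C lives at `c[i + j * ldc]`. With `ldc = 0` every column of C reads the same length-m vector, so a bias vector is broadcast without being materialized as a full matrix, and beta is an ordinary scalar.

```python
def matmul_out_of_place(alpha, a, b, beta, c, ldc, m, n, k):
    """Return D = alpha * (A @ B) + beta * C as a new column-major buffer.

    A is m x k and B is k x n, both column-major with tight leading
    dimensions. C is read through `ldc`: element (i, j) is c[i + j * ldc].
    """
    d = [0.0] * (m * n)  # D has its own storage, leading dimension m
    for j in range(n):
        for i in range(m):
            acc = sum(a[i + p * m] * b[p + j * k] for p in range(k))
            d[i + j * m] = alpha * acc + beta * c[i + j * ldc]
    return d


m, n, k = 2, 3, 2
a = [1.0, 0.0, 0.0, 1.0]            # 2 x 2 identity, column-major
b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # 2 x 3, columns (1,2), (3,4), (5,6)
bias = [10.0, 20.0]                 # length-m vector used as C with ldc = 0

d = matmul_out_of_place(1.0, a, b, 1.0, bias, 0, m, n, k)
d_half = matmul_out_of_place(1.0, a, b, 0.5, bias, 0, m, n, k)
```

Here `d` holds each column of B plus the bias vector, and `d_half` shows the same call with beta = 0.5.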

There are no restrictions on beta or on the input dims, so I wonder whether you could relax the condition for enabling cublasLt. Thanks!

Alternatives

No response

Additional context

No response

cc @ptrblck @msaroufim @eqy @jerryzh168 @csarofeen @xwang233 @jianyuh @nikitaved @mruberry @walterddr @lezcano

Labels

module: cublas (Problem related to cublas support)
module: cuda (Related to torch.cuda, and CUDA support in general)
module: linear algebra (Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
