Description
🚀 The feature, motivation and pitch
Having ctypes bindings for cuBLASLt available in core would make it easier to experiment early with new function variants (such as quantized int8 GEMMs).
Example of such ctypes bindings: https://github.com/OpenBMB/cpm_kernels/tree/master/cpm_kernels/library
Including some bindings like this in core would be great (maybe under `torch.cuda.ctypes.cublasLt` or something similar)!
Examples of such C bindings: https://github.com/TimDettmers/bitsandbytes/blob/18e827d666fa2b70a12d539ccedc17aa51b2c97c/csrc/ops.cu#L434
Another set of bindings is now in bitsandbytes, but having them directly available in Python would make them more approachable for experimentation and benchmarking.
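To illustrate what "directly available in Python" could look like, here is a minimal sketch that loads cuBLASLt with plain `ctypes` and queries its version through `cublasLtGetVersion`. The helper names `load_cublaslt` and `cublaslt_version` are hypothetical, and the short library name passed to `find_library` is an assumption about how the library is installed. This is not an existing PyTorch API.

```python
import ctypes
import ctypes.util


def load_cublaslt():
    """Return a ctypes handle to cuBLASLt, or None if it cannot be loaded.

    Hypothetical helper: the short name "cublasLt" is an assumption about
    the local installation (e.g. libcublasLt.so on the loader path).
    """
    path = ctypes.util.find_library("cublasLt")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # cublasLtGetVersion takes no arguments and returns a size_t.
        lib.cublasLtGetVersion.argtypes = []
        lib.cublasLtGetVersion.restype = ctypes.c_size_t
    except (OSError, AttributeError):
        # Library failed to load or does not export the expected symbol.
        return None
    return lib


def cublaslt_version():
    """Return the cuBLASLt version as an int, or None if unavailable."""
    lib = load_cublaslt()
    return None if lib is None else int(lib.cublasLtGetVersion())


if __name__ == "__main__":
    print("cuBLASLt version:", cublaslt_version())
```

Real matmul entry points would additionally need handle, descriptor, and layout types declared on the ctypes side, which is the part that bindings shipped in core would take care of.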
There might be problems with library versions, but then some bindings could be versioned as well, e.g. `torch.cuda.ctypes.cublasltV8` or something like that, so that the user is responsible for using the correct bindings for their experiments.
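The versioning idea could be sketched roughly as follows: each set of bindings lives under an explicit version tag, and the user opts into one by name rather than relying on auto-detection. Everything here (`_VERSIONED_BINDINGS`, `get_bindings`, the `"V11"` tag, and the placeholder namespaces) is hypothetical and only illustrates the shape of such an API.

```python
from types import SimpleNamespace

# Hypothetical registry: one namespace of bindings per version tag.
# The namespaces are placeholders; real ones would hold ctypes wrappers.
_VERSIONED_BINDINGS = {
    "V8": SimpleNamespace(tag="V8"),
    "V11": SimpleNamespace(tag="V11"),
}


def get_bindings(tag):
    """Return the bindings registered under an explicit version tag.

    The caller is responsible for choosing a tag that matches the
    library actually installed; nothing is auto-detected here.
    """
    try:
        return _VERSIONED_BINDINGS[tag]
    except KeyError:
        known = ", ".join(sorted(_VERSIONED_BINDINGS))
        raise LookupError(
            f"no bindings registered for {tag!r}; known tags: {known}"
        ) from None


if __name__ == "__main__":
    print(get_bindings("V8").tag)
```

Requiring an explicit tag keeps the failure mode obvious: asking for bindings that do not exist raises immediately instead of silently calling into a mismatched library.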
Alternatives
No response
Additional context
No response