-
Notifications
You must be signed in to change notification settings - Fork 13.6k
Open
Labels
B-unstableBlocker: Implemented in the nightly compiler and unstable.Blocker: Implemented in the nightly compiler and unstable.C-tracking-issueCategory: An issue tracking the progress of sth. like the implementation of an RFCCategory: An issue tracking the progress of sth. like the implementation of an RFCO-NVPTXTarget: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.htmlTarget: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.htmlS-tracking-design-concernsStatus: There are blocking design concerns.Status: There are blocking design concerns.S-tracking-needs-summaryStatus: It's hard to tell what's been done and what hasn't! Someone should do some investigation.Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.
Description
Here's a suggestion for an update to the tracking issue to include concerns. Partially copied for japaric's original post and added concern A066 s from and links to relevant issues.
If you have the possibility you should take a look @RDambrosio016
Feature gate #![feature(abi_ptx)]
This ABI is intended to be used when generating code for device (GPU) targets like nvptx64-nvidia-cuda
. It is used to generate kernels ("global functions") that work as an entry point from host (cpu) code. Functions that do not use the "ptx-kernel" ABI are "device functions" and only callable from kernels and device functions. Device functions are specifically not usable from host (cpu) code.
Public API
The following code
#![no_std]
#![feature(abi_ptx)]
#[no_mangle]
pub extern "ptx-kernel" fn foo() {}
Produces
.version 3.2
.target sm_30
.address_size 64
// .globl foo
.visible .entry foo()
{
ret;
}
Steps / History
- Fix broken passing of kernel arguments (Fix codegen bug in "ptx-kernel" abi related to arg passing #94703)
- Replace
PassMode::Direct
with something else (nvptx "ptx-kernel" ABI (feature: abi_ptx) uses PassMode::Direct for Aggregates #117271) - Re-enable ptx CI tests to avoid future breakage (Re-enable nvptx tests #96842)
- Emit error for kernels with return value other than
()
- Emit error if a kernel is called directly.
- Fix the problem where Rust generates types the LLVM PTX cannot select (NVPTX: "LLVM ERROR: Cannot select" when returning struct with 3byte size from "device function" #97174)
- Resolve unresolved questions
- Create an RFC that specifies the safe way to use this abi (I assume this will be required @pnkfelix?)
- Document feature (https://doc.rust-lang.org/reference/items/external-blocks.html#abi)
- Stabilization PR
Unresolved Questions
- Resolve what kind of stability guarantees can be made about the generated ptx.
- The ABI of kernels have been previously changed for a major version bump and the ptx-interoperability doc is still outdated.
- PTX is an ISA with many versions. The newest is major version 7. Do we need to reserve the possibility of breaking things when moving to a new major version?
- Figure out what llvm does in relations to the
nvptx64-nvidia-cuda
target and the__global__
modifier.
- What kind of types should be allowed to use as arguments in kernels. Should it be a hard error to use these types or only a warning (Global and device kernels are unsound rust-cuda/wg#11)
- The most important part is to find a minimal but useful subset of Rust types that can be used in kernels. raw pointers, primitive types and
#[repr(C)]
types seems like a good start (no slices, tuples, references, etc). - Using mutable references is almost certain UB except for a few unusable special cases (spawning a single thread only)
- There are many convenient types in Rust which do not have a stable ABI (
&[T]
,(T, U)
, etc). Are there some types that do not have a stable representation but can be relied on having an identical representation for sequential compilation with a given rustc version? If so are there any way we could pass them safely between host and device code compiled with the same rustc version?
- The most important part is to find a minimal but useful subset of Rust types that can be used in kernels. raw pointers, primitive types and
- This unstable feature is one of the last stoppers to using
nvptx64-nvidia-cuda
on stable Rust. The target seems to still have a few bugs (NVPTX backend metabug #38789). Should this feature be kept unstable to avoid usage ofnvptx64-nvidia-cuda
until it has been verified to be usable. - How should shared be supported? Is it necessary to do that from the go?
Notes
- It is not possible to emulate kernels with
#[naked]
functions as the.entry
directive needs to be emited for nvptx kernels.
Metadata
Metadata
Assignees
Labels
B-unstableBlocker: Implemented in the nightly compiler and unstable.Blocker: Implemented in the nightly compiler and unstable.C-tracking-issueCategory: An issue tracking the progress of sth. like the implementation of an RFCCategory: An issue tracking the progress of sth. like the implementation of an RFCO-NVPTXTarget: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.htmlTarget: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.htmlS-tracking-design-concernsStatus: There are blocking design concerns.Status: There are blocking design concerns.S-tracking-needs-summaryStatus: It's hard to tell what's been done and what hasn't! Someone should do some investigation.Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.