Tracking issue for the "ptx-kernel" ABI

@RDambrosio016

Here's a suggestion for an update to the tracking issue to include concerns. Partially copied for japaric's original post and added concern A066 s from and links to relevant issues.

If you have the possibility you should take a look @RDambrosio016

Feature gate #![feature(abi_ptx)]

This ABI is intended to be used when generating code for device (GPU) targets like nvptx64-nvidia-cuda. It is used to generate kernels ("global functions") that work as an entry point from host (cpu) code. Functions that do not use the "ptx-kernel" ABI are "device functions" and only callable from kernels and device functions. Device functions are specifically not usable from host (cpu) code.

Public API

The following code

#![no_std]
#![feature(abi_ptx)]

#[no_mangle]
pub extern "ptx-kernel" fn foo() {}

Produces

.version 3.2
.target sm_30
.address_size 64

	// .globl	foo

.visible .entry foo()
{
	ret;
}

Steps / History

Unresolved Questions

Resolve what kind of stability guarantees can be made about the generated ptx.
- The ABI of kernels have been previously changed for a major version bump and the ptx-interoperability doc is still outdated.
- PTX is an ISA with many versions. The newest is major version 7. Do we need to reserve the possibility of breaking things when moving to a new major version?
- Figure out what llvm does in relations to the nvptx64-nvidia-cuda target and the __global__ modifier.
What kind of types should be allowed to use as arguments in kernels. Should it be a hard error to use these types or only a warning (Global and device kernels are unsound rust-cuda/wg#11)
- The most important part is to find a minimal but useful subset of Rust types that can be used in kernels. raw pointers, primitive types and #[repr(C)] types seems like a good start (no slices, tuples, references, etc).
- Using mutable references is almost certain UB except for a few unusable special cases (spawning a single thread only)
- There are many convenient types in Rust which do not have a stable ABI (&[T], (T, U), etc). Are there some types that do not have a stable representation but can be relied on having an identical representation for sequential compilation with a given rustc version? If so are there any way we could pass them safely between host and device code compiled with the same rustc version?
This unstable feature is one of the last stoppers to using nvptx64-nvidia-cuda on stable Rust. The target seems to still have a few bugs (NVPTX backend metabug #38789). Should this feature be kept unstable to avoid usage of nvptx64-nvidia-cuda until it has been verified to be usable.
How should shared be supported? Is it necessary to do that from the go?

Notes

It is not possible to emulate kernels with #[naked] functions as the .entry directive needs to be emited for nvptx kernels.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Public API

Steps / History

Unresolved Questions

Notes

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Description

Public API

Steps / History

Unresolved Questions

Notes

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions