Support of pipelined accesses in the runtime · Issue #1554 · oneapi-src/unified-runtime · GitHub
Open
@jinz2014

Description

I have a question about the runtime support needed to migrate the following CUDA code. Thanks.

auto pipe = cuda::make_pipeline();

// pipeline load W/X and compute WX;
pipe.producer_acquire();
cuda::memcpy_async(W_shared + (threadIdx.y * tx + threadIdx.x) * vec_size,
                   W + (idx * feat_out + j) * feat_in +
                       (threadIdx.y * tx + threadIdx.x) * vec_size,
                   cuda::aligned_size_t<W_copy_size>(W_copy_size), pipe);
cuda::memcpy_async(X_shared + (threadIdx.y * tx + threadIdx.x) * vec_size,
                   X + (batch_idx * feat_in) +
                       (threadIdx.y * tx + threadIdx.x) * vec_size,
                   cuda::aligned_size_t<X_copy_size>(X_copy_size), pipe);
pipe.producer_commit();
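
The fragment above shows only the producer half of the pattern. For context, here is a minimal self-contained sketch of the full acquire/commit/wait/release cycle that a migrated version would have to reproduce. The kernel `staged_scale`, the host helper `run_staged_scale`, the `float4` element type, and the block size are all hypothetical and chosen for illustration; they are not taken from the original application.

```cuda
#include <cuda/pipeline>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread stages one float4 from global memory into
// shared memory through a thread-scope pipeline, then writes it back doubled.
__global__ void staged_scale(const float4* in, float4* out, int n) {
  extern __shared__ float4 tile[];  // one float4 slot per thread in the block
  int gid = blockIdx.x * blockDim.x + threadIdx.x;
  if (gid >= n) return;  // safe: no block-wide barrier is used below

  auto pipe = cuda::make_pipeline();  // thread-scope pipeline

  // Producer half: submit the asynchronous copy and commit the stage.
  pipe.producer_acquire();
  cuda::memcpy_async(tile + threadIdx.x, in + gid,
                     cuda::aligned_size_t<sizeof(float4)>(sizeof(float4)),
                     pipe);
  pipe.producer_commit();

  // Consumer half: wait for the committed copy before reading shared memory.
  pipe.consumer_wait();
  float4 v = tile[threadIdx.x];
  pipe.consumer_release();

  out[gid] = make_float4(2.0f * v.x, 2.0f * v.y, 2.0f * v.z, 2.0f * v.w);
}

// Hypothetical host helper. Assumes host_in.size() is a multiple of 4.
// Error checking of the CUDA runtime calls is omitted for brevity.
std::vector<float> run_staged_scale(const std::vector<float>& host_in) {
  const int n = static_cast<int>(host_in.size() / 4);
  const size_t bytes = static_cast<size_t>(n) * sizeof(float4);
  std::vector<float> host_out(static_cast<size_t>(n) * 4, 0.0f);
  if (n == 0) return host_out;

  float4* d_in = nullptr;
  float4* d_out = nullptr;
  cudaMalloc(&d_in, bytes);
  cudaMalloc(&d_out, bytes);
  cudaMemcpy(d_in, host_in.data(), bytes, cudaMemcpyHostToDevice);

  const int threads = 128;
  const int blocks = (n + threads - 1) / threads;
  // Third launch parameter: dynamic shared memory for the per-block tile.
  staged_scale<<<blocks, threads, threads * sizeof(float4)>>>(d_in, d_out, n);

  cudaMemcpy(host_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return host_out;
}
```

Each thread reads back only the slot it copied itself, so the sketch needs no `__syncthreads()`; code in which threads consume data staged by other threads would need a block-level barrier after `consumer_wait()`.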

Labels

cuda (CUDA adapter specific issues), question (Further information is requested)
