Description
Currently, the Unified Runtime L0 (Level Zero) adapter exposes the bfloat16 conversion extension only on PVC devices.
On other devices that support native bfloat16 conversions (BMG, DG2, LunarLake), a check for the bfloat16 conversion extension fails.
The following is a simple reproducer:
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    // Create a SYCL device (default device)
    sycl::device device{sycl::default_selector_v};

    // Get the name of the device
    std::string device_name = device.get_info<sycl::info::device::name>();
    std::cout << "Device: " << device_name << std::endl;

    // Get the list of supported extensions
    std::vector<std::string> extensions =
        device.get_info<sycl::info::device::extensions>();

    // Print the supported extensions
    std::cout << "Supported extensions:" << std::endl;
    for (const auto& ext : extensions) {
        std::cout << " " << ext << std::endl;
    }
    return 0;
}
If we build it with the DPC++ compiler and run it on DG2 with the L0 backend, the output is:
Device: Intel(R) Arc(TM) A770 Graphics
Supported extensions:
cl_khr_il_program
cl_khr_subgroups
cl_intel_subgroups
cl_intel_subgroups_short
cl_intel_required_subgroup_size
cl_khr_fp16
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_3d_image_writes
ur_exp_command_buffer
ur_exp_multi_device_compile
ur_exp_usm_p2p
If we run it with ONEAPI_DEVICE_SELECTOR=opencl:1 (which corresponds to the Arc A770), the output is:
Device: Intel(R) Arc(TM) A770 Graphics
Supported extensions:
cl_khr_byte_addressable_store
cl_khr_device_uuid
cl_khr_fp16
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_icd
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_intel_command_queue_families
cl_intel_subgroups
cl_intel_required_subgroup_size
cl_intel_subgroups_short
cl_khr_spir
cl_intel_accelerator
cl_intel_driver_diagnostics
cl_khr_priority_hints
cl_khr_throttle_hints
cl_khr_create_command_queue
cl_intel_subgroups_char
cl_intel_subgroups_long
cl_khr_il_program
cl_intel_mem_force_host_memory
cl_khr_subgroup_extended_types
cl_khr_subgroup_non_uniform_vote
cl_khr_subgroup_ballot
cl_khr_subgroup_non_uniform_arithmetic
cl_khr_subgroup_shuffle
cl_khr_subgroup_shuffle_relative
cl_khr_subgroup_clustered_reduce
cl_intel_device_attribute_query
cl_khr_extended_bit_ops
cl_khr_suggested_local_work_size
cl_intel_split_work_group_barrier
cl_intel_spirv_media_block_io
cl_intel_spirv_subgroups
cl_khr_spirv_linkonce_odr
cl_khr_spirv_no_integer_wrap_decoration
cl_intel_unified_shared_memory
cl_khr_mipmap_image
cl_khr_mipmap_image_writes
cl_ext_float_atomics
cl_khr_external_memory
cl_intel_planar_yuv
cl_intel_packed_yuv
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_image2d_from_buffer
cl_khr_depth_images
cl_khr_3d_image_writes
cl_intel_media_block_io
cl_intel_bfloat16_conversions
cl_intel_create_buffer_with_properties
cl_intel_subgroup_local_block_io
cl_intel_subgroup_matrix_multiply_accumulate
cl_intel_subgroup_split_matrix_multiply_accumulate
cl_khr_integer_dot_product
cl_khr_gl_sharing
cl_khr_gl_depth_images
cl_khr_gl_event
cl_khr_gl_msaa_sharing
cl_intel_va_api_media_sharing
cl_intel_sharing_format_query
cl_khr_pci_bus_info
Would it be possible to change the behavior so that the adapter checks for the L0 bfloat16 conversion extension instead of checking the device type?
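For reference, the kind of check that fails on the L0 backend can be sketched as a plain lookup in the reported extension list. This is only an illustration: `has_extension`, `level_zero_sample`, and `opencl_sample` are hypothetical names that are not part of SYCL or Unified Runtime, and the two sample lists are abbreviated copies of the outputs above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not a SYCL or Unified Runtime API): returns true if
// `name` appears as an exact entry in the reported extension list.
bool has_extension(const std::vector<std::string>& extensions,
                   const std::string& name) {
    return std::find(extensions.begin(), extensions.end(), name) !=
           extensions.end();
}

// Abbreviated copy of the extension list reported on DG2 with the L0 backend.
std::vector<std::string> level_zero_sample() {
    return {"cl_khr_il_program", "cl_khr_fp16", "ur_exp_usm_p2p"};
}

// Abbreviated copy of the extension list reported on the same GPU with OpenCL.
std::vector<std::string> opencl_sample() {
    return {"cl_khr_fp16", "cl_intel_bfloat16_conversions",
            "cl_khr_pci_bus_info"};
}
```

With these samples, `has_extension(level_zero_sample(), "cl_intel_bfloat16_conversions")` is false while the same call on `opencl_sample()` is true. In the reproducer, the list returned by `device.get_info<sycl::info::device::extensions>()` could be passed in the same way.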