Ignore errors getting device memory using NVML by elezar · Pull Request #1374 · NVIDIA/k8s-device-plugin · GitHub

elezar
Member
@elezar elezar commented Aug 19, 2025

On certain systems, the NVML nvmlDeviceGetMemoryInformation API is not supported and returns an error. In these cases we ignore these errors and log a warning instead. This means that:

  • For the GPU Device Plugin, memory limits will be enforced for MPS partitioning.
  • For GFD, no nvidia.com/gpu.memory label will be generated.
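The GFD behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `memoryReader`, `fakeDevice`, `totalMemory`, and `memoryLabels` are hypothetical names, the interface is a stand-in rather than the go-nvml API, and reporting the label value in MiB is an assumption.

```go
package main

import (
	"errors"
	"log"
	"strconv"
)

// memoryReader is a hypothetical stand-in for an NVML device handle.
// It is NOT the go-nvml API; it only models "query total memory, maybe fail".
type memoryReader interface {
	TotalMemoryBytes() (uint64, error)
}

// errNotSupported models the error returned on systems where the
// memory query is not supported (hypothetical sentinel).
var errNotSupported = errors.New("not supported")

// totalMemory returns the total device memory and true on success.
// On failure it logs a warning and returns (0, false) instead of
// propagating the error, so callers can continue without memory info.
func totalMemory(d memoryReader) (uint64, bool) {
	total, err := d.TotalMemoryBytes()
	if err != nil {
		log.Printf("WARNING: ignoring error getting device memory: %v", err)
		return 0, false
	}
	return total, true
}

// memoryLabels builds the memory-related labels for a device. When the
// memory query fails, the nvidia.com/gpu.memory label is simply omitted.
// Assumption: the label value is expressed in MiB.
func memoryLabels(d memoryReader) map[string]string {
	labels := map[string]string{}
	if total, ok := totalMemory(d); ok {
		labels["nvidia.com/gpu.memory"] = strconv.FormatUint(total/(1024*1024), 10)
	}
	return labels
}

// fakeDevice is a test double that returns a fixed value or error.
type fakeDevice struct {
	total uint64
	err   error
}

func (f fakeDevice) TotalMemoryBytes() (uint64, error) { return f.total, f.err }
```

With this shape, `memoryLabels(fakeDevice{err: errNotSupported})` returns an empty map and logs one warning, while a device reporting 16 GiB yields a `nvidia.com/gpu.memory` label of `16384`. Returning a boolean rather than an error keeps the "missing memory info is not fatal" decision in one place.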

Backport of #1356

On certain systems, the NVML nvmlDeviceGetMemoryInformation API
is not supported and returns an error. In these cases we ignore
these errors and log a warning instead. This means that:

* For the GPU Device Plugin, memory limits will be enforced for
  MPS partitioning.
* For GFD, no nvidia.com/gpu.memory label will be generated.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar added this to the v0.17.x milestone Aug 19, 2025
Collaborator
@ArangoGutierrez ArangoGutierrez left a comment

LGTM-Backport

@elezar elezar merged commit 00b814c into NVIDIA:release-0.17 Aug 20, 2025
8 of 9 checks passed
3 participants