Add DeviceRequests to HostConfig to support NVIDIA GPUs by tiborvass · Pull Request #38828 · moby/moby · GitHub
Merged
Add DeviceRequests to HostConfig to support NVIDIA GPUs
This patch hard-codes support for NVIDIA GPUs.
In a future patch it should move out into its own Device Plugin.

Signed-off-by: Tibor Vass <tibor@docker.com>
Tibor Vass committed Mar 18, 2019
commit 8f936ae8cf6c39bf3ce21a25b231383dac3212e6
42 changes: 42 additions & 0 deletions api/swagger.yaml
@@ -210,6 +210,43 @@ definitions:
       PathInContainer: "/dev/deviceName"
       CgroupPermissions: "mrw"

+  DeviceRequest:
+    type: "object"
+    description: "A request for devices to be sent to device drivers"
+    properties:
+      Driver:
+        type: "string"
+        example: "nvidia"
+      Count:
+        type: "integer"
+        example: -1
+      DeviceIDs:
+        type: "array"
+        items:
+          type: "string"
+        example:
+          - "0"
+          - "1"
+          - "GPU-fef8089b-4820-abfc-e83e-94318197576e"
+      Capabilities:
+        description: |
+          A list of capabilities; an OR list of AND lists of capabilities.
+        type: "array"
+        items:
+          type: "array"
+          items:
+            type: "string"
+        example:
+          # gpu AND nvidia AND compute
Member

Do we need an example for OR here?

Suggested change
-          # gpu AND nvidia AND compute
+          # gpu AND nvidia AND compute, OR gpu AND intel
+          - ["gpu", "nvidia", "compute"]
+          - ["gpu", "intel"]

Contributor Author

No it's fine, the reason I put it there is so that we can support it in the future without breaking the API.

Member

If we don't want to support OR yet, we should error out if len(capabilities) > 1.

Contributor Author

What I meant is that it is supported, but not from the CLI.

+          - ["gpu", "nvidia", "compute"]
+      Options:
+        description: |
+          Driver-specific options, specified as key/value pairs. These options
+          are passed directly to the driver.
+        type: "object"
+        additionalProperties:
+          type: "string"
+
   ThrottleDevice:
     type: "object"
     properties:
@@ -421,6 +458,11 @@ definitions:
         items:
           type: "string"
           example: "c 13:* rwm"
+      DeviceRequests:
+        description: "a list of requests for devices to be sent to device drivers"
+        type: "array"
+        items:
+          $ref: "#/definitions/DeviceRequest"
       DiskQuota:
         description: "Disk limit (in bytes)."
         type: "integer"
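To make the schema above concrete, here is a minimal Go sketch that builds one `DeviceRequests` entry and prints the JSON a client might place under `HostConfig` in a `POST /containers/create` body. The `DeviceRequest` struct is redeclared locally from the fields this PR adds; `hostConfig`, `allGPUsRequest`, and `encodeHostConfig` are hypothetical names used only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DeviceRequest mirrors the fields added by this PR. It is redeclared here
// so the sketch compiles without importing the Docker packages.
type DeviceRequest struct {
	Driver       string
	Count        int
	DeviceIDs    []string
	Capabilities [][]string
	Options      map[string]string
}

// hostConfig is a hypothetical, trimmed-down stand-in for HostConfig.
type hostConfig struct {
	DeviceRequests []DeviceRequest
}

// allGPUsRequest asks the "nvidia" driver for every device (Count -1)
// that offers "gpu" AND "nvidia" AND "compute".
func allGPUsRequest() DeviceRequest {
	return DeviceRequest{
		Driver:       "nvidia",
		Count:        -1,
		Capabilities: [][]string{{"gpu", "nvidia", "compute"}},
	}
}

// encodeHostConfig renders the HostConfig fragment as JSON.
func encodeHostConfig(reqs ...DeviceRequest) (string, error) {
	b, err := json.Marshal(hostConfig{DeviceRequests: reqs})
	return string(b), err
}

func main() {
	out, err := encodeHostConfig(allGPUsRequest())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the sketch adds no `omitempty` tags, the unset `DeviceIDs` and `Options` fields are encoded as `null`.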
11 changes: 11 additions & 0 deletions api/types/container/host_config.go
@@ -244,6 +244,16 @@ func (n PidMode) Container() string {
 	return ""
 }

+// DeviceRequest represents a request for devices from a device driver.
+// Used by GPU device drivers.
+type DeviceRequest struct {
+	Driver       string            // Name of device driver
+	Count        int               // Number of devices to request (-1 = All)
+	DeviceIDs    []string          // List of device IDs as recognizable by the device driver
+	Capabilities [][]string        // An OR list of AND lists of device capabilities (e.g. "gpu")
Member

We were discussing Capabilities as a name for this, as it could be confused with capabilities on the container itself (i.e. Linux capabilities), but I can't come up with good alternatives; perhaps Features, but I'm not sure that's a good match.

Contributor Author

I understand it but on the other hand, it's literally a list of what the device is capable of doing, what capabilities it provides. In this case it provides "gpu" capability, as well as nvidia-specific capabilities like "compute", etc.

Contributor Author

The only other names I can think of are "requirements" or "constraints", but I'm unsure.

Member

It only matches if all of these are matched, correct? "constraints" could work, but possibly too generic? idk. Naming is really hard on this one

+	Options      map[string]string // Options to pass onto the device driver
+}
+
 // DeviceMapping represents the device mapping between the host and the container.
 type DeviceMapping struct {
 	PathOnHost string
@@ -327,6 +337,7 @@ type Resources struct {
 	CpusetMems        string          // CpusetMems 0-2, 0,1
 	Devices           []DeviceMapping // List of devices to map inside the container
 	DeviceCgroupRules []string        // List of rule to be added to the device cgroup
+	DeviceRequests    []DeviceRequest // List of device requests for device drivers
 	DiskQuota         int64           // Disk limit (in bytes)
 	KernelMemory      int64           // Kernel memory limit (in bytes)
 	KernelMemoryTCP   int64           // Hard limit for kernel TCP buffer memory (in bytes)
38 changes: 38 additions & 0 deletions daemon/devices_linux.go
@@ -0,0 +1,38 @@
+package daemon // import "github.com/docker/docker/daemon"
+
+import (
+	"github.com/docker/docker/api/types/container"
+	"github.com/docker/docker/pkg/capabilities"
+	specs "github.com/opencontainers/runtime-spec/specs-go"
+)
+
+var deviceDrivers = map[string]*deviceDriver{}
+
+type deviceDriver struct {
+	capset     capabilities.Set
+	updateSpec func(*specs.Spec, *deviceInstance) error
+}
+
+type deviceInstance struct {
+	req          container.DeviceRequest
+	selectedCaps []string
+}
+
+func registerDeviceDriver(name string, d *deviceDriver) {
+	deviceDrivers[name] = d
+}
+
+func (daemon *Daemon) handleDevice(req container.DeviceRequest, spec *specs.Spec) error {
+	if req.Driver == "" {
+		for _, dd := range deviceDrivers {
+			if selected := dd.capset.Match(req.Capabilities); selected != nil {
Member

One thing I'm wondering: here, we match capabilities against the driver. So if a machine has (e.g.) two GPUs, and one of them supports "capA" and one of them "capB", then the driver would register itself with all of those (so driver says: "I provide capA and capB") correct?

This could result in a situation where none of the GPUs supports the requested list of capabilities, for example:

| Request | GPU-A | GPU-B | Driver | Driver Match | GPU Match |
|---|---|---|---|---|---|
| "capA,capB" | "capA, capC" | "capB, capC" | "capA,capB,capC" | yes | no |

What would happen in that case (i.e., conversion to OCI succeeds and the hook is registered, but no GPU is found)? Will a proper error be produced?

Contributor Author

I could make it an OR list of ANDs as well instead of a map.

Member (@thaJeztah, Mar 15, 2019)

Perhaps we should, if this is a concern; in that case the driver would report itself as:

{
  "capabilities": [
    ["capA", "capB"],
    ["capB", "capC"]
  ]
}

Could even decide to make it just return a list of capabilities for each GPU (then we can even determine the number of GPUs available);

{
  "capabilities": [
    ["capA", "capB"],
    ["capA", "capB"],
    ["capA", "capB"],
    ["capA", "capB"],
    ["capB", "capC"]
  ]
}

But perhaps that breaks the abstraction

Contributor Author

I suggest we punt on this, since the problem is extremely unlikely to occur at this time and the structure belongs to the device driver, so it is internal and we can change it later. The API is what needs to be locked down.

+				return dd.updateSpec(spec, &deviceInstance{req: req, selectedCaps: selected})
+			}
+		}
+	} else if dd := deviceDrivers[req.Driver]; dd != nil {
+		if selected := dd.capset.Match(req.Capabilities); selected != nil {
+			return dd.updateSpec(spec, &deviceInstance{req: req, selectedCaps: selected})
+		}
+	}
+	return incompatibleDeviceRequest{req.Driver, req.Capabilities}
+}
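The review thread above asks what happens when the driver advertises the union of its devices' capabilities but no single device covers the request. The following sketch reproduces that scenario with the reviewer's capA/capB/capC example. It is only an illustration: `covers`, `union`, and `anyDeviceCovers` are hypothetical helpers, and plain `map[string]bool` values stand in for the PR's capability sets.

```go
package main

import "fmt"

// covers reports whether every capability in want is present in have.
func covers(have map[string]bool, want []string) bool {
	for _, c := range want {
		if !have[c] {
			return false
		}
	}
	return true
}

// union merges per-device capability sets into one driver-level set.
func union(devices ...map[string]bool) map[string]bool {
	out := map[string]bool{}
	for _, d := range devices {
		for c := range d {
			out[c] = true
		}
	}
	return out
}

// anyDeviceCovers reports whether at least one single device satisfies want.
func anyDeviceCovers(want []string, devices ...map[string]bool) bool {
	for _, d := range devices {
		if covers(d, want) {
			return true
		}
	}
	return false
}

func main() {
	gpuA := map[string]bool{"capA": true, "capC": true}
	gpuB := map[string]bool{"capB": true, "capC": true}
	want := []string{"capA", "capB"}

	// The driver-level check passes because the union holds capA and capB...
	fmt.Println("driver match:", covers(union(gpuA, gpuB), want))
	// ...yet neither GPU offers both capabilities on its own.
	fmt.Println("GPU match:", anyDeviceCovers(want, gpuA, gpuB))
}
```

Reporting one capability list per device, as suggested in the thread, would let the second check run inside the daemon, at the cost of exposing more of the driver's internals.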
11 changes: 11 additions & 0 deletions daemon/errors.go
@@ -80,6 +80,17 @@ func (e invalidIdentifier) Error() string {

 func (invalidIdentifier) InvalidParameter() {}

+type incompatibleDeviceRequest struct {
+	driver string
+	caps   [][]string
+}
+
+func (i incompatibleDeviceRequest) Error() string {
+	return fmt.Sprintf("could not select device driver %q with capabilities: %v", i.driver, i.caps)
+}
+
+func (incompatibleDeviceRequest) InvalidParameter() {}
+
 type duplicateMountPointError string

 func (e duplicateMountPointError) Error() string {
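The empty `InvalidParameter()` method in this hunk works as a marker: it carries no behavior and only tags the error type. A minimal sketch of how such a marker can be detected follows. The `invalidParameter` interface and the `isInvalidParameter` helper are assumptions made for this example; the daemon's own classification helpers are not part of this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// invalidParameter is a hypothetical local copy of the marker interface.
type invalidParameter interface {
	InvalidParameter()
}

// incompatibleDeviceRequest is copied from the hunk above.
type incompatibleDeviceRequest struct {
	driver string
	caps   [][]string
}

func (i incompatibleDeviceRequest) Error() string {
	return fmt.Sprintf("could not select device driver %q with capabilities: %v", i.driver, i.caps)
}

func (incompatibleDeviceRequest) InvalidParameter() {}

// isInvalidParameter reports whether err, or an error it wraps, carries the marker.
func isInvalidParameter(err error) bool {
	var target invalidParameter
	return errors.As(err, &target)
}

func main() {
	err := incompatibleDeviceRequest{driver: "nvidia", caps: [][]string{{"gpu", "intel"}}}
	fmt.Println(err)
	fmt.Println("invalid parameter:", isInvalidParameter(err))
	fmt.Println("invalid parameter:", isInvalidParameter(errors.New("unrelated failure")))
}
```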
107 changes: 107 additions & 0 deletions daemon/nvidia_linux.go
@@ -0,0 +1,107 @@
+package daemon
+
+import (
+	"os/exec"
+	"strconv"
+
+	"github.com/containerd/containerd/contrib/nvidia"
+	"github.com/docker/docker/pkg/capabilities"
+	"github.com/opencontainers/runtime-spec/specs-go"
+	"github.com/pkg/errors"
+)
+
+// TODO: nvidia should not be hard-coded, and should be a device plugin instead on the daemon object.
+// TODO: add list of device capabilities in daemon/node info
+
+var errConflictCountDeviceIDs = errors.New("cannot set both Count and DeviceIDs on device request")
+
+// stolen from github.com/containerd/containerd/contrib/nvidia
+const nvidiaCLI = "nvidia-container-cli"
+
+// These are NVIDIA-specific capabilities stolen from github.com/containerd/containerd/contrib/nvidia.allCaps
+var allNvidiaCaps = map[nvidia.Capability]struct{}{
+	nvidia.Compute:  {},
+	nvidia.Compat32: {},
+	nvidia.Graphics: {},
+	nvidia.Utility:  {},
+	nvidia.Video:    {},
+	nvidia.Display:  {},
+}
+
+func init() {
+	if _, err := exec.LookPath(nvidiaCLI); err != nil {
+		// do not register Nvidia driver if helper binary is not present.
+		return
+	}
+	capset := capabilities.Set{"gpu": struct{}{}, "nvidia": struct{}{}}
+	nvidiaDriver := &deviceDriver{
+		capset:     capset,
+		updateSpec: setNvidiaGPUs,
+	}
+	// Also advertise the NVIDIA-specific capabilities (compute, utility, ...),
+	// so that requests naming them can match this driver.
+	for c := range allNvidiaCaps {
+		nvidiaDriver.capset[string(c)] = struct{}{}
+	}
+	registerDeviceDriver("nvidia", nvidiaDriver)
+}
+
+func setNvidiaGPUs(s *specs.Spec, dev *deviceInstance) error {
+	var opts []nvidia.Opts
+
+	req := dev.req
+	if req.Count != 0 && len(req.DeviceIDs) > 0 {
+		return errConflictCountDeviceIDs
+	}
+
+	if len(req.DeviceIDs) > 0 {
+		var ids []int
+		var uuids []string
+		for _, devID := range req.DeviceIDs {
+			id, err := strconv.Atoi(devID)
+			if err == nil {
+				ids = append(ids, id)
+				continue
+			}
+			// if not an integer, then assume UUID.
+			uuids = append(uuids, devID)
+		}
+		if len(ids) > 0 {
+			opts = append(opts, nvidia.WithDevices(ids...))
+		}
+		if len(uuids) > 0 {
+			opts = append(opts, nvidia.WithDeviceUUIDs(uuids...))
+		}
+	}
+
+	if req.Count < 0 {
+		opts = append(opts, nvidia.WithAllDevices)
+	} else if req.Count > 0 {
+		opts = append(opts, nvidia.WithDevices(countToDevices(req.Count)...))
+	}
+
+	var nvidiaCaps []nvidia.Capability
+	// req.Capabilities contains device capabilities, some but not all are NVIDIA driver capabilities.
+	for _, c := range dev.selectedCaps {
+		nvcap := nvidia.Capability(c)
+		if _, isNvidiaCap := allNvidiaCaps[nvcap]; isNvidiaCap {
+			nvidiaCaps = append(nvidiaCaps, nvcap)
+			continue
+		}
+		// TODO: nvidia.WithRequiredCUDAVersion
+		// for now we let the prestart hook verify cuda versions but errors are not pretty.
+	}
+
+	if nvidiaCaps != nil {
+		opts = append(opts, nvidia.WithCapabilities(nvidiaCaps...))
+	}
+
+	return nvidia.WithGPUs(opts...)(nil, nil, nil, s)
+}
+
+// countToDevices returns the list 0, 1, ... count-1 of deviceIDs.
+func countToDevices(count int) []int {
+	devices := make([]int, count)
+	for i := range devices {
+		devices[i] = i
+	}
+	return devices
+}
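The way `setNvidiaGPUs` interprets `Count` and `DeviceIDs` can be summarized without the containerd dependency. The sketch below restates that logic under stated assumptions: `selection` and `resolve` are hypothetical names, and the result is a plain struct rather than the `nvidia.Opts` values the real function builds.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

var errConflict = errors.New("cannot set both Count and DeviceIDs on device request")

// selection is a hypothetical summary of what a device request resolves to.
type selection struct {
	All     bool     // every device was requested (Count < 0)
	Indices []int    // devices addressed by integer index
	UUIDs   []string // devices addressed by anything that is not an integer
}

// resolve follows the branching in setNvidiaGPUs: Count and DeviceIDs are
// mutually exclusive, integer IDs are treated as indices, other IDs as UUIDs,
// and a positive Count expands to the indices 0..Count-1.
func resolve(count int, deviceIDs []string) (selection, error) {
	var sel selection
	if count != 0 && len(deviceIDs) > 0 {
		return sel, errConflict
	}
	for _, id := range deviceIDs {
		if n, err := strconv.Atoi(id); err == nil {
			sel.Indices = append(sel.Indices, n)
			continue
		}
		sel.UUIDs = append(sel.UUIDs, id)
	}
	switch {
	case count < 0:
		sel.All = true
	case count > 0:
		for i := 0; i < count; i++ {
			sel.Indices = append(sel.Indices, i)
		}
	}
	return sel, nil
}

func main() {
	fmt.Println(resolve(-1, nil))
	fmt.Println(resolve(2, nil))
	fmt.Println(resolve(0, []string{"0", "GPU-fef8089b-4820-abfc-e83e-94318197576e"}))
	fmt.Println(resolve(1, []string{"0"}))
}
```

One consequence worth noting is that a positive `Count` always selects the lowest-numbered devices, since it expands to indices starting at 0.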
24 changes: 16 additions & 8 deletions daemon/oci_linux.go
@@ -85,7 +85,7 @@ func setResources(s *specs.Spec, r containertypes.Resources) error {
 	return nil
 }

-func setDevices(s *specs.Spec, c *container.Container) error {
+func (daemon *Daemon) setDevices(s *specs.Spec, c *container.Container) error {
 	// Build lists of devices allowed and created within the container.
 	var devs []specs.LinuxDevice
 	devPermissions := s.Linux.Resources.Devices
@@ -122,6 +122,13 @@ func setDevices(s *specs.Spec, c *container.Container) error {

 	s.Linux.Devices = append(s.Linux.Devices, devs...)
 	s.Linux.Resources.Devices = devPermissions
+
+	for _, req := range c.HostConfig.DeviceRequests {
+		if err := daemon.handleDevice(req, s); err != nil {
+			return err
+		}
+	}
+
 	return nil
 }

@@ -751,7 +758,7 @@ func (daemon *Daemon) createSpec(c *container.Container) (retSpec *specs.Spec, e
 	if err := daemon.initCgroupsPath(parentPath); err != nil {
 		return nil, fmt.Errorf("linux init cgroups path: %v", err)
 	}
-	if err := setDevices(&s, c); err != nil {
+	if err := daemon.setDevices(&s, c); err != nil {
 		return nil, fmt.Errorf("linux runtime spec devices: %v", err)
 	}
 	if err := daemon.setRlimits(&s, c); err != nil {
@@ -818,15 +825,16 @@ func (daemon *Daemon) createSpec(c *container.Container) (retSpec *specs.Spec, e
 		return nil, fmt.Errorf("linux mounts: %v", err)
 	}

+	if s.Hooks == nil {
+		s.Hooks = &specs.Hooks{}
+	}
 	for _, ns := range s.Linux.Namespaces {
 		if ns.Type == "network" && ns.Path == "" && !c.Config.NetworkDisabled {
 			target := filepath.Join("/proc", strconv.Itoa(os.Getpid()), "exe")
-			s.Hooks = &specs.Hooks{
-				Prestart: []specs.Hook{{
-					Path: target,
-					Args: []string{"libnetwork-setkey", "-exec-root=" + daemon.configStore.GetExecRoot(), c.ID, daemon.netController.ID()},
-				}},
-			}
+			s.Hooks.Prestart = append(s.Hooks.Prestart, specs.Hook{
+				Path: target,
+				Args: []string{"libnetwork-setkey", "-exec-root=" + daemon.configStore.GetExecRoot(), c.ID, daemon.netController.ID()},
+			})
 		}
 	}

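The last hunk switches from assigning `s.Hooks` to appending to `s.Hooks.Prestart`. A plausible reading, given that `setDevices` runs earlier in `createSpec` and the NVIDIA path relies on a prestart hook, is that the old assignment would have discarded any hook registered before it. The sketch below shows the difference using minimal stand-in types; `hook`, `hooks`, `spec`, the helper functions, and both hook paths are hypothetical and not the runtime-spec types.

```go
package main

import "fmt"

// hook, hooks, and spec are minimal hypothetical stand-ins for the
// runtime-spec Hook, Hooks, and Spec types.
type hook struct{ Path string }
type hooks struct{ Prestart []hook }
type spec struct{ Hooks *hooks }

// overwriteHooks mimics the old code path: it replaces whatever was there.
func overwriteHooks(s *spec, h hook) {
	s.Hooks = &hooks{Prestart: []hook{h}}
}

// appendHook mimics the new code path: it keeps hooks registered earlier.
func appendHook(s *spec, h hook) {
	if s.Hooks == nil {
		s.Hooks = &hooks{}
	}
	s.Hooks.Prestart = append(s.Hooks.Prestart, h)
}

// specWithGPUHook returns a spec that already carries one prestart hook,
// as if a device driver had registered it earlier.
func specWithGPUHook() *spec {
	return &spec{Hooks: &hooks{Prestart: []hook{{Path: "/hypothetical/gpu-hook"}}}}
}

func main() {
	network := hook{Path: "/hypothetical/network-hook"}

	old := specWithGPUHook()
	overwriteHooks(old, network)
	fmt.Println("overwrite keeps", len(old.Hooks.Prestart), "hook(s)")

	cur := specWithGPUHook()
	appendHook(cur, network)
	fmt.Println("append keeps", len(cur.Hooks.Prestart), "hook(s)")
}
```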
2 changes: 2 additions & 0 deletions docs/api/version-history.md
@@ -49,6 +49,8 @@ keywords: "API, Docker, rcli, REST, documentation"
 * `GET /info` now returns information about `DataPathPort` that is currently used in swarm
 * `GET /info` now returns `PidsLimit` boolean to indicate if the host kernel has
   PID limit support enabled.
+* `POST /containers/create` now accepts `DeviceRequests` as part of `HostConfig`.
+  Can be used to set Nvidia GPUs.
 * `GET /swarm` endpoint now returns DataPathPort info
 * `POST /containers/create` now takes `KernelMemoryTCP` field to set hard limit for kernel TCP buffer memory.
 * `GET /service` now returns `MaxReplicas` as part of the `Placement`.
23 changes: 23 additions & 0 deletions pkg/capabilities/caps.go
@@ -0,0 +1,23 @@
+// Package capabilities allows to generically handle capabilities.
+package capabilities
+
+// Set represents a set of capabilities.
+type Set map[string]struct{}
+
+// Match tries to match set with caps, which is an OR list of AND lists of capabilities.
+// The matched AND list of capabilities is returned; or nil if none are matched.
+func (set Set) Match(caps [][]string) []string {
+	if set == nil {
+		return nil
+	}
+anyof:
+	for _, andList := range caps {
+		for _, cap := range andList {
+			if _, ok := set[cap]; !ok {
+				continue anyof
+			}
+		}
+		return andList
+	}
+	return nil
+}
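A short usage sketch may help with the "OR list of AND lists" wording. `Set` and `Match` are copied from the diff above so the example runs standalone (the inner loop variable is renamed to avoid shadowing the built-in `cap`); the capability names in `main` are example values only.

```go
package main

import "fmt"

// Set and Match are copied from pkg/capabilities in this PR.
type Set map[string]struct{}

func (set Set) Match(caps [][]string) []string {
	if set == nil {
		return nil
	}
anyof:
	for _, andList := range caps {
		for _, c := range andList {
			if _, ok := set[c]; !ok {
				continue anyof // one missing capability rejects this AND list
			}
		}
		return andList // first AND list that is fully covered wins
	}
	return nil
}

func main() {
	driver := Set{"gpu": {}, "nvidia": {}, "compute": {}}

	// ("gpu" AND "intel") OR ("gpu" AND "nvidia" AND "compute"):
	// the first alternative fails on "intel", the second is fully covered.
	fmt.Println(driver.Match([][]string{{"gpu", "intel"}, {"gpu", "nvidia", "compute"}}))

	// No alternative is covered, so Match returns nil.
	fmt.Println(driver.Match([][]string{{"gpu", "amd"}}) == nil)
}
```

Alternatives are tried in order, so when several AND lists are satisfiable the earliest one in the request is the one returned.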