8000 fix: cadvisor can pickup oom events by michaellzc · Pull Request #121 · sourcegraph/deploy-sourcegraph-helm · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@michaellzc
Copy link
Member
@michaellzc michaellzc commented May 5, 2022

Follow up sourcegraph/deploy-sourcegraph-docker#804

Checklist

Test plan

tested on a live gke cluster

deploy sourcegraph as usual

deploy a stress test pod

apiVersion: v1
kind: Pod
metadata:
  name: stress
  namespace: sourcegraph
spec:
  containers:
    - name: stress
      image: progrium/stress
      command:
        - sleep
        - infinity
      resources:
        requests:
          memory: 100Mi
        limits:
          memory: 128Mi

exec into the pod and run

stress --cpu 1 --vm-bytes 300M --timeout 10s

oom events show up as expected

CleanShot 2022-05-05 at 15 48 49

@michaellzc michaellzc force-pushed the 05-05-fix_cadvisor_can_pickup_oom_events branch from 7dbaeed to 7a9386c Compare May 5, 2022 23:00
@michaellzc
Copy link
Member Author

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@michaellzc michaellzc force-pushed the 05-05-fix_cadvisor_can_pickup_oom_events branch from 7a9386c to ca565f2 Compare May 5, 2022 23:03
Comment on lines +85 to +89
{{- if .Values.cadvisor.containerSecurityContext.privileged }}
- name: kmsg
mountPath: /dev/kmsg
readOnly: true
{{- end }}
Copy link
Member Author
@michaellzc michaellzc May 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I am overcomplicating things.

without privileged and we mount the /dev/kmsg, users would see the below logs from cadvisor.

k logs daemonset/cadvisor                                                     14:05:31
Found 2 pods, using pod/cadvisor-qkdlk
W0505 21:05:23.014872       1 machine_libipmctl.go:63] There are no NVM devices!
W0505 21:05:23.068489       1 manager.go:296] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: operation not permitted

If we don't mount /dev/kmsg, users will always see something like

 Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory

either one of them is not ideal, so I figure we should just don't mount unnecessary stuff when it has no used (privielged=false)

Comment on lines +127 to +131
{{- if .Values.cadvisor.containerSecurityContext.privileged }}
- name: kmsg
hostPath:
path: /dev/kmsg
{{- end }}
Copy link
Member Author
@michaellzc michaellzc May 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no docker's --device equivalent in Kubernetes, so we have to mount it as a regular volume.

@michaellzc michaellzc force-pushed the 05-05-fix_cadvisor_can_pickup_oom_events branch 2 times, most recently from aa07085 to 80903f6 Compare May 5, 2022 23:21
@michaellzc michaellzc force-pushed the 05-05-fix_cadvisor_can_pickup_oom_events branch from 80903f6 to 7773c31 Compare May 5, 2022 23:22
@michaellzc michaellzc marked this pull request as ready for review May 6, 2022 00:32
@michaellzc michaellzc requested a review from a team May 6, 2022 01:28
Co-authored-by: Crystal Augustus <91073224+caugustus-sourcegraph@users.noreply.github.com>
@michaellzc michaellzc merged commit c17e4ba into main May 6, 2022
@michaellzc michaellzc deleted the 05-05-fix_cadvisor_can_pickup_oom_events branch May 6, 2022 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

0