Fix cadvisor unable to report oomkill events by michaellzc · Pull Request #804 · sourcegraph/deploy-sourcegraph-docker · GitHub

michaellzc commented Apr 23, 2022

context: https://sourcegraph.slack.com/archives/C02Q9G9A59S/p1650667926243749?thread_ts=1650662753.412789&cid=C02Q9G9A59S

On managed instances, every cadvisor container reports the following error:

$ docker logs cadvisor 2>&1  | grep "OOM"
W0420 16:11:04.614936       1 manager.go:296] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory

We think this is why container_oom_events_total is always zero: cadvisor cannot open /dev/kmsg, so it disables OOM event detection entirely.
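
A quick way to confirm the diagnosis is to check whether the kernel log device is visible where cadvisor runs. The following is only a minimal sketch, not part of this PR: `kmsg_status` is a hypothetical helper name, and it checks nothing beyond whether the path exists, is a character device, and is readable.

```shell
# Hypothetical helper (not part of this PR): report whether a kernel log
# device is present and readable at the given path (default: /dev/kmsg).
kmsg_status() {
  path="${1:-/dev/kmsg}"
  if [ ! -e "$path" ]; then
    echo "missing"          # the case cadvisor hits in the log above
  elif [ ! -c "$path" ]; then
    echo "not-a-device"     # exists, but is not a character device
  elif [ ! -r "$path" ]; then
    echo "unreadable"       # exists, but the current user cannot read it
  else
    echo "ok"
  fi
}
```

Running `kmsg_status` on the host shows whether the device exists there at all. Inside the container, something like `docker exec cadvisor ls -l /dev/kmsg` (assuming the image ships `ls`) would be expected to fail with "No such file or directory" before this change.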

Test plan

Pending validation in https://github.com/sourcegraph/deploy-sourcegraph-managed/pull/444

It works.

[Screenshot: CleanShot 2022-04-25 at 17 39 58]
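
For anyone who wants to repeat the validation without a dashboard, the metric can also be read straight from cadvisor's Prometheus endpoint. This is a rough sketch under stated assumptions: `sum_oom_events` is a hypothetical helper, and it assumes the standard Prometheus text format, where each sample line is the metric name, optional labels, the value, and an optional timestamp.

```shell
# Hypothetical helper: sum all container_oom_events_total samples from
# Prometheus text-format metrics read on stdin.
sum_oom_events() {
  awk '
    /^container_oom_events_total[{ ]/ {
      line = $0
      if (line ~ /[{]/) {
        sub(/^.*[}] */, "", line)    # drop the metric name and label set
      } else {
        sub(/^[^ ]+ +/, "", line)    # drop the bare metric name
      }
      split(line, f, " ")            # f[1] = value, f[2] = optional timestamp
      total += f[1]
    }
    END { print total + 0 }
  '
}
```

Usage would look something like `curl -s http://localhost:8080/metrics | sum_oom_events`, where the host and port are assumptions that depend on how cadvisor is exposed in your deployment. A result that stays at 0 even after a container has been OOM-killed would point back at the /dev/kmsg problem.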

# Uncomment to enable container monitoring on MacOS
# - '/var/run/docker.sock:/var/run/docker.sock:ro'
devices:
- '/dev/kmsg'

All flags are documented here: https://github.com/google/cadvisor/blob/master/README.md#quick-start-running-cadvisor-in-a-docker-container

These two flags were not in the README when managed instances (MI) were first set up (I think), which is how we missed them:

google/cadvisor#2150
google/cadvisor#2545
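
For plain `docker run` deployments, the equivalent of the `devices:` entry above can be sketched as follows. This assumes the two flags in question are `--privileged` and `--device=/dev/kmsg`, as listed in the README quick start; only the `/dev/kmsg` device appears in the diff shown here, so treat `--privileged` as an assumption. `cadvisor_run_cmd` is a hypothetical helper that prints the command rather than running it, and the volume list is illustrative, not copied from this repository.

```shell
# Hypothetical helper: print (not run) a docker command that starts cadvisor
# with access to the kernel log. The image reference must be passed in;
# the volume list is an assumption, not taken from this PR.
cadvisor_run_cmd() {
  image="${1:?usage: cadvisor_run_cmd IMAGE}"
  printf '%s\n' "docker run --detach --name=cadvisor \
--volume=/:/rootfs:ro --volume=/var/run:/var/run:ro --volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--privileged --device=/dev/kmsg \
${image}"
}
```

Printing the command first makes it easy to review the flags before running anything, for example `cadvisor_run_cmd "$CADVISOR_IMAGE"`, where `CADVISOR_IMAGE` is a placeholder for whatever image and tag your deployment pins.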

michaellzc marked this pull request as ready for review April 26, 2022 00:40
michaellzc requested a review from a team April 26, 2022 00:40
caugustus-sourcegraph left a comment

Nice! Does deploy-sourcegraph need a similar change? Or pure-docker?

michaellzc commented Apr 26, 2022

> Nice! Does deploy-sourcegraph need a similar change? Or pure-docker?

Just added the changes for pure-docker as well.

For k8s, we should be able to get it working (see https://stackoverflow.com/a/59291859), but I will wait until later this week or next week.

I suppose this would be useful for https://github.com/sourcegraph/sourcegraph/issues/33438 and https://github.com/sourcegraph/sourcegraph/issues/33437?
