Add support for --user in systemd_units input plugin · Issue #12053 · influxdata/telegraf · GitHub

Add support for --user in systemd_units input plugin #12053

Closed
tsarajar opened this issue Oct 19, 2022 · 7 comments · Fixed by #15458
Labels
feature request Requests for new plugin and for new features to existing plugins help wanted Request for community participation, code, contribution size/m 2-4 day effort

Comments

@tsarajar

Use Case

We want to monitor a service that runs under a user and is not visible to root by default. With systemd 248 or later, one can apparently add --user --machine=<user>@ to see the status of services under a given user. However, Telegraf doesn't support this yet.

Expected behavior

Expecting to be able to monitor the services under a given user.

Actual behavior

This isn't supported yet.

Additional info

No response

@tsarajar tsarajar added the feature request Requests for new plugin and for new features to existing plugins label Oct 19, 2022
@powersj
Contributor
powersj commented Oct 19, 2022

Hi,

--user --machine=<user>@ to see the status of services under a given user.

You wish to do something like this in your Telegraf config:

[[inputs.systemd_units]]
   user = "ubuntu"

And then telegraf would run:

systemctl list-units --all --plain --type=service --machine=ubuntu@.host --user
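
As a rough sketch of that mapping (not Telegraf code), the command line could be derived from an optional user name as shown below. The helper name `build_list_units_cmd` is hypothetical; the flags are the ones from the command above.

```shell
#!/bin/bash
# Sketch only: build_list_units_cmd is a hypothetical helper, not part of Telegraf.
# It prints the systemctl command line for an optional user name.
build_list_units_cmd() {
  local user="$1"
  local cmd="systemctl list-units --all --plain --type=service"
  if [ -n "$user" ]; then
    # Talk to the per-user service manager of <user> on the local host.
    cmd="$cmd --machine=${user}@.host --user"
  fi
  printf '%s\n' "$cmd"
}

build_list_units_cmd ubuntu   # user = "ubuntu"
build_list_units_cmd ""       # no user configured: query the system manager
```

With an empty user the command stays the same as the plain system-wide query, so existing configurations would behave as before.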

Is that in line with your thinking?

However, unless it runs as root or as the user you want to monitor, Telegraf would get "Operation not permitted" errors. That concerns me, because I think most people install Telegraf from a deb/rpm and run it as the telegraf user.

tester@j:~$ whoami
tester
tester@j:~$ systemctl list-units --all --plain --type=service --machine=ubuntu@.host --user
Failed to connect to bus: Operation not permitted (consider using --machine=<user>@.host --user to connect to bus of other user)
Failed to list units: Transport endpoint is not connected

Thoughts?

@powersj powersj added the waiting for response waiting for response from contributor label Oct 19, 2022
@tsarajar
Author

Yeah, that's exactly what I'd want it to run, and yes, you are right that the telegraf user is a problem. It wouldn't work right out of the box. But could the user add that exact line with visudo, allowing telegraf to run the user query with sudo without a password? It would clearly be complicated to document that users need to do manual visudo magic if they want to use the --user parameter, but what else would work? I'd still prefer that to falling back on my own scripts invoked from Telegraf, which would still have to do the sudo magic.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Oct 20, 2022
@tsarajar
Author
tsarajar commented Oct 20, 2022

Configuring this input would have to be more complicated than that. What if you wanted to check two things: a service run by root, like sshd.service, and a service run by a user, like jenkins.service? You couldn't just use 'pattern = "sshd.service jenkins.service"', since you'd need to specify patterns for a given user, something like 'pattern.<user> = "jenkins.service"'.
I have a gut feeling I just have to start working on a script that I launch with exec for this :D
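
To make the per-user pattern idea concrete, here is a minimal sketch of how such a mapping could be expanded into one query per entry. The "user=pattern" syntax and the `plan_queries` helper are assumptions for illustration only; an entry with no user part stands for the system manager.

```shell
#!/bin/bash
# Sketch only: the "user=pattern" syntax and plan_queries are hypothetical.
# An entry without a user part (e.g. "=sshd.service" or "sshd.service")
# is treated as a system-wide unit.
plan_queries() {
  local spec user pattern
  for spec in "$@"; do
    case "$spec" in
      *=*) user="${spec%%=*}"; pattern="${spec#*=}" ;;
      *)   user=""; pattern="$spec" ;;
    esac
    if [ -n "$user" ]; then
      echo "systemctl list-units --user --machine=${user}@ ${pattern} --all --type=service"
    else
      echo "systemctl list-units ${pattern} --all --type=service"
    fi
  done
}

# One root-run service and one service run by the user "jenkins".
plan_queries "=sshd.service" "jenkins=jenkins.service"
```

The sketch only prints the commands it would run, which keeps it safe to try on a machine without those users or units.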

@tsarajar
Author
tsarajar commented Oct 20, 2022

#!/bin/bash

declare -A loadMap
loadMap[loaded]=0
loadMap[stub]=1
loadMap[not-found]=2
loadMap[bad-setting]=3
loadMap[error]=4
loadMap[merged]=5
loadMap[masked]=6

declare -A activeMap
activeMap[active]=0
activeMap[reloading]=1
activeMap[inactive]=2
activeMap[failed]=3
activeMap[activating]=4
activeMap[deactivating]=5

declare -A subMap
subMap[running]=0
subMap[dead]=1
subMap[start-pre]=2
subMap[start]=3
subMap[exited]=4
subMap[reload]=5
subMap[stop]=6
subMap[stop-watchdog]=7
subMap[stop-sigterm]=8
subMap[stop-sigkill]=9
subMap[stop-post]=10
subMap[final-sigterm]=11
subMap[failed]=12
subMap[auto-restart]=13

subMap[waiting]=16

subMap[tentative]=32
subMap[plugged]=33

subMap[mounting]=48
subMap[mounting-done]=49
subMap[mounted]=50
subMap[remounting]=51
subMap[unmounting]=52
subMap[remounting-sigterm]=53
subMap[remounting-sigkill]=54
subMap[unmounting-sigterm]=55
subMap[unmounting-sigkill]=56

subMap[abandoned]=80

subMap[active]=96

subMap[start-chown]=112
subMap[start-post]=113
subMap[listening]=114
subMap[stop-pre]=115
subMap[stop-pre-sigterm]=116
subMap[stop-pre-sigkill]=117
subMap[final-sigkill]=118

subMap[activating]=128
subMap[activating-done]=129
subMap[deactivating]=130
subMap[deactivating-sigterm]=131
subMap[deactivating-sigkill]=132

subMap[elapsed]=160

for service in $SYSTEMD_UNITS_PATTERN; do
  # --plain drops the status bullet that systemctl can print before failed or
  # not-found units, so the columns are always UNIT LOAD ACTIVE SUB.
  if [[ -n "$SYSTEMD_UNITS_USER" ]]; then
    output=$(sudo /bin/systemctl list-units --user --machine="${SYSTEMD_UNITS_USER}@" "$service" --all --type=service --plain --quiet)
  else
    output=$(/bin/systemctl list-units "$service" --all --type=service --plain --quiet)
  fi

  # read consumes one line, so only the first matching unit is reported.
  IFS=' ' read -r -a status <<< "$output"
  [[ -n "${status[1]}" ]] || continue

  fields="load_code=${loadMap[${status[1]}]}i,active_code=${activeMap[${status[2]}]}i,sub_code=${subMap[${status[3]}]}i"
  if [[ -n "$SYSTEMD_UNITS_USER" ]]; then
    # String field values must be double-quoted in line protocol.
    fields="user=\"$SYSTEMD_UNITS_USER\",$fields"
  fi

  # Tag each point with the unit of this iteration, not the whole pattern list.
  echo "systemd_units,host=$HOSTNAME,name=$service,load=${status[1]},active=${status[2]},sub=${status[3]} $fields"
done

That seems to do the trick as a bash script.
The telegraf configuration looks like this:

[[inputs.exec]]
data_format = "influx"
commands = [ "/usr/local/bin/test.sh" ]
timeout = "50s"
interval = "60s"
environment = ["SYSTEMD_UNITS_PATTERN=test-service.service", "SYSTEMD_UNITS_USER=testuser" ]

And I had to add this to /etc/sudoers:

telegraf ALL=(root) NOPASSWD: /bin/systemctl list-units *

It's not pretty, but I got it to work, and I can now show, track, and alert on the state of services running under a user account in Grafana.
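
For anyone adapting the script, the parsing step can be tried without calling systemctl at all. The sketch below converts one line in the `UNIT LOAD ACTIVE SUB DESCRIPTION` layout into InfluxDB line protocol. The function names are hypothetical, the code tables are trimmed to a few entries from the maps in the script, and returning -1 for an unknown state is an assumption of this sketch rather than something the script does.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers with trimmed code tables.
# Unknown states map to -1 here (an assumption, not the script's behavior).
load_code()   { case "$1" in loaded) echo 0 ;; stub) echo 1 ;; not-found) echo 2 ;; *) echo -1 ;; esac; }
active_code() { case "$1" in active) echo 0 ;; reloading) echo 1 ;; inactive) echo 2 ;; failed) echo 3 ;; *) echo -1 ;; esac; }
sub_code()    { case "$1" in running) echo 0 ;; dead) echo 1 ;; exited) echo 4 ;; failed) echo 12 ;; *) echo -1 ;; esac; }

# to_line_protocol HOST USER LINE
# LINE uses the column layout UNIT LOAD ACTIVE SUB DESCRIPTION.
to_line_protocol() {
  local host="$1" user="$2" line="$3"
  local unit load active sub rest fields
  read -r unit load active sub rest <<EOF
$line
EOF
  # Nothing matched: emit no point at all.
  [ -n "$load" ] || return 0
  fields="load_code=$(load_code "$load")i,active_code=$(active_code "$active")i,sub_code=$(sub_code "$sub")i"
  if [ -n "$user" ]; then
    # String field values must be double-quoted in line protocol.
    fields="user=\"$user\",$fields"
  fi
  printf 'systemd_units,host=%s,name=%s,load=%s,active=%s,sub=%s %s\n' \
    "$host" "$unit" "$load" "$active" "$sub" "$fields"
}

to_line_protocol myhost testuser "test-service.service loaded active running Test service"
```

Feeding it canned lines like this makes it easy to check the emitted points before wiring the real script into inputs.exec.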

@powersj
Contributor
powersj commented Oct 20, 2022

But could the user add that exact line to visudo allowing telegraf to run the user query with sudo without password?

Correct, but you would need to specify every user and service, or use wildcards. Something like what you used, /bin/systemctl list-units *, at least limits the action to list-units rather than all of systemctl.

What if you wanted to check for 2 things, a root run service like sshd.service and a user run service like jenkins.service.

You would use two instances of this plugin: one for root and one for the user? We would also need to add the user as a tag to differentiate them.

Still something you are interested in?

@powersj powersj added the waiting for response waiting for response from contributor label Oct 20, 2022
@tsarajar
Author
tsarajar commented Oct 20, 2022

I am. Even though I made a bash script that does what I need, I would switch to a native solution just to avoid having to maintain my own code 😊 Not to mention that my Ansible becomes prettier when it doesn't have to install scripts of my own.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Oct 20, 2022
@powersj powersj added help wanted Request for community participation, code, contribution size/m 2-4 day effort labels Oct 24, 2022
@mvala
mvala commented Sep 21, 2023

+1 for this. Any updates?
