This is a feature request, but it may also help mitigate instances that break due to current and future bugs.
I use a proxy in front of my ollama instances to support multiple users/requests.
But sometimes the ollama server loses its connection to the GPU, and performance then drops a lot.
This sometimes happens due to the cgroup issue, which can be mitigated in docker/daemon.json with:
{
  "exec-opts": [
    "native.cgroupdriver=cgroupfs"
  ]
}
And sometimes it happens for other reasons (GPU hangs, etc.).
So it would be nice to be able to query the instance through the API
to get the node's CPU/GPU performance. That way the proxy can monitor each instance and detect when performance declines. This data could be used by a proxy and/or a client.
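To make the idea concrete, here is a minimal sketch of what the proxy-side check could look like. Everything in it is hypothetical: the /api/metrics path, the gpu_available and tokens_per_second fields, and the threshold are placeholders for whatever the API would actually expose, since no such endpoint exists today.

```python
import requests  # assumes the 'requests' package is installed

# Hypothetical endpoint and fields, for illustration only.
METRICS_URL = "http://ollama-node:11434/api/metrics"
MIN_TOKENS_PER_SECOND = 10.0  # example threshold chosen by the proxy operator


def node_is_healthy() -> bool:
    """Poll the (hypothetical) metrics endpoint and flag degraded nodes."""
    try:
        resp = requests.get(METRICS_URL, timeout=5)
        resp.raise_for_status()
        metrics = resp.json()
    except requests.RequestException:
        # Node unreachable: treat it as unhealthy so the proxy stops routing to it.
        return False

    # Hypothetical fields: whether a GPU is still visible to the server,
    # and a recent throughput figure to compare against a baseline.
    gpu_available = metrics.get("gpu_available", False)
    tokens_per_second = metrics.get("tokens_per_second", 0.0)

    return gpu_available and tokens_per_second >= MIN_TOKENS_PER_SECOND


if __name__ == "__main__":
    print("healthy" if node_is_healthy() else "degraded")
```

With something like this, the proxy could periodically poll each instance and drop nodes whose GPU has disappeared or whose throughput has fallen below a baseline, instead of continuing to send requests to a degraded backend.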
This feature request is on the same theme as #2004, and maybe the two could be combined so there is one place to get information about the node.