Add proper implementation of ollama's /api/chat #13777
Conversation
Could you explain why this API is needed? AFAIK this API is specific to ollama and OAI doesn't have it (they moved to the new Response API instead)
```diff
@@ -77,6 +77,7 @@ enum oaicompat_type {
     OAICOMPAT_TYPE_CHAT,
     OAICOMPAT_TYPE_COMPLETION,
     OAICOMPAT_TYPE_EMBEDDING,
+    OAICOMPAT_TYPE_API_CHAT
```
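For context, a hypothetical sketch of how such an enum value could be branched on when formatting a finished response. The helper name and dispatch below are assumptions for illustration, not code from this PR:

```cpp
#include <string>

// Sketch only: mirrors the enum from the diff above.
enum oaicompat_type {
    OAICOMPAT_TYPE_NONE,
    OAICOMPAT_TYPE_CHAT,
    OAICOMPAT_TYPE_COMPLETION,
    OAICOMPAT_TYPE_EMBEDDING,
    OAICOMPAT_TYPE_API_CHAT, // new: Ollama's /api/chat response shape
};

// Hypothetical dispatch: pick which JSON shape to emit for a chat result.
std::string format_chat_response(oaicompat_type type) {
    switch (type) {
        case OAICOMPAT_TYPE_CHAT:
            return R"({"choices":[{"message":{}}]})";             // OAI /v1/chat/completions shape
        case OAICOMPAT_TYPE_API_CHAT:
            return R"({"message":{"role":"assistant"},"done":true})"; // Ollama /api/chat shape
        default:
            return "{}";
    }
}
```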
If OAI does not support this API, then having the prefix OAICOMPAT here will be very confusing for other contributors who don't know much about the story of ollama.
Tbh, I think this is not very necessary, as most applications nowadays support the OAI-compat API. If they don't, you can use a proxy to convert the API; I bet someone has already made one.
Also, since OAI introduced the new Response API, I think we should keep things simple by only supporting the OAI specs (which have good support for reasoning and multimodal models). The API for ollama can be added if more users ask for it.
You are correct about the name; it should be changed.
If the endpoint is not part of the OAI API, should it instead be removed completely?
This was mostly added for cases where someone wants to swap out ollama for llama-server.
> Could you explain why this API is needed? AFAIK this API is specific to ollama and OAI doesn't have it (they moved to the new Response API instead)
>
> Tbh, I think this is not very necessary, as most applications nowadays will support OAI-compat API.
I second that. The main question that should be asked is whether these ollama APIs enable any sort of new useful functionality, compared to the existing standard APIs. If the answer is no, then these APIs should not exist in the first place and we should not support them.
As an example, we introduced the `/infill` API in `llama-server` because the existing `/v1/completions` spec was not enough to support the needs of advanced local fill-in-the-middle use cases (#9787).
Currently, there is rudimentary support for `/api/show`, `/api/tags` and `/api/chat`, added mainly because VS Code made the mistake of requiring them. As soon as this is fixed (microsoft/vscode#249605), these endpoints should be removed from `llama-server`.
Thank you, that makes it clear why they currently exist.
Since `/api/chat` is already in there, it is most likely expected to work the way Ollama's does, and in cases where a tool or piece of software has implemented only Ollama support (and not the OpenAI API), this endpoint may be used.
I'll close this for now, and re-open if the issue/demand comes up in the future.
Apparently, the `/api/chat` endpoint that Ollama provides is the equivalent of `/chat/completions`. These changes make the response from `/api/chat` behave the same way.
Here is an example of the response from Ollama on `/api/chat` with streaming (the default). This is a sketch of the shape documented in Ollama's API reference; the model name, timestamps and timing values are illustrative:
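```json
{"model":"llama3.2","created_at":"2024-07-22T20:33:28.123648Z","message":{"role":"assistant","content":"Hello"},"done":false}
{"model":"llama3.2","created_at":"2024-07-22T20:33:28.246412Z","message":{"role":"assistant","content":"!"},"done":false}
{"model":"llama3.2","created_at":"2024-07-22T20:33:28.459780Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":4883583458,"load_duration":1334875,"prompt_eval_count":26,"prompt_eval_duration":342546000,"eval_count":282,"eval_duration":4535599000}
```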
And with streaming off (same caveat: a sketch of the documented shape, with illustrative values):
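```json
{
  "model": "llama3.2",
  "created_at": "2023-12-12T14:13:43.416799Z",
  "message": {
    "role": "assistant",
    "content": "Hello! How are you today?"
  },
  "done": true,
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}
```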
With the updated code the responses (both streaming and not) look essentially the same. They are not exactly identical, as the date format is different, but it should be close enough. Since it is a string we could also convert it, or simply use a dummy value like "".
This change can help in cases where the OpenAI-compatible API is not supported, but Ollama's is.