Closed
Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
When I start the server as follows:
./server -m wizardlm-70b-v1.0.q4_K_S.gguf --threads 8 -ngl 100 -c 4096 --embedding
and make a request to /embedding
curl --request POST \
--url http://localhost:8080/embedding \
--header "Content-Type: application/json" \
--data '{"content": "Building a website can be done in 10 simple steps:"}'
I get back the vector of embeddings, as expected. Now, if I make a request to /completion as follows:
curl --request POST \
--url http://localhost:8080/completion \
--header "Content-Type: application/json" \
--data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}'
I'd expect the normal completion to still work, but all I get is the embedding of the prompt. (I tested it with the examples above, and the same vector is returned in both cases.)
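For anyone who wants to check this without comparing curl output by eye, here is a rough Python sketch of the same two requests. It is only an illustration, not part of the server: the helper names (`post_json`, `classify_response`, `reproduce`) are made up for this example, and it assumes that an embedding reply carries an "embedding" list while a completion reply carries a "content" string. Adjust the keys if your build returns something different.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def post_json(path, payload):
    """POST a JSON payload to the server and return the decoded JSON reply."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def classify_response(body):
    """Guess which kind of reply a JSON body is.

    Assumption (not confirmed by this issue): embedding replies contain an
    "embedding" list, completion replies contain a "content" string.
    """
    if isinstance(body.get("embedding"), list):
        return "embedding"
    if isinstance(body.get("content"), str):
        return "completion"
    return "unknown"


def reproduce():
    """Send the two requests from this issue and report what came back."""
    prompt = "Building a website can be done in 10 simple steps:"
    emb = post_json("/embedding", {"content": prompt})
    comp = post_json("/completion", {"prompt": prompt, "n_predict": 128})
    print("/embedding  ->", classify_response(emb))
    print("/completion ->", classify_response(comp))
    if classify_response(comp) == "embedding":
        # The behaviour described above: /completion answers with a vector.
        print("same vector:", comp["embedding"] == emb.get("embedding"))
```

With the server started as shown above, calling `reproduce()` should print "embedding" for both endpoints if the problem is present, and "completion" for the second one if it is not.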
Motivation
I guess having both normal completion and the possibility to just get embeddings from the same server instance makes sense in a lot of applications.