JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length by ochafik · Pull Request #6555 · ggml-org/llama.cpp · GitHub


Merged Apr 12, 2024 (17 commits)

Changes from 1 commit
json: fix server/README (json_schema in /completion vs. result_format in /v1/chat/completions)

Olivier Chafik committed Apr 9, 2024
commit 9c33ee99302caac14c79f12c43e7a61462dc0730
4 changes: 3 additions & 1 deletion examples/server/README.md

@@ -251,7 +251,7 @@ node index.js

`grammar`: Set grammar for grammar-based sampling. Default: no grammar

- `response_format`: Set the response format. Only supports JSON (e.g. `{"type": "json_object"}`), optionally with a schema (e.g. `{"type": "json_object", "schema": {"type": "string", "minLength": 10, "maxLength": 100}}`). See [test-json-schema-to-grammar.cpp](../../tests/test-json-schema-to-grammar.cpp). Default: no response format.
+ `json_schema`: Set a JSON schema for grammar-based sampling (e.g. `{"items": {"type": "string"}, "minItems": 10, "maxItems": 100}` for a list of strings, or `{}` for any JSON). See [tests](../../tests/test-json-schema-to-grammar.cpp) for supported features. Default: no JSON schema.

`seed`: Set the random number generator (RNG) seed. Default: `-1`, which is a random seed.
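As a sketch of how the new `json_schema` field might be used, the following builds a `/completion` request body; the prompt, server address, and transport details are illustrative and not part of this PR:

```python
import json

# Illustrative request body for the server's /completion endpoint, using the
# `json_schema` field documented above. The prompt is a placeholder.
payload = {
    "prompt": "List some fruits as a JSON array of strings:",
    "json_schema": {
        "items": {"type": "string"},
        "minItems": 10,
        "maxItems": 100,
    },
}

body = json.dumps(payload)
# POST `body` with Content-Type: application/json to the server's
# /completion endpoint (e.g. http://localhost:8080/completion by default).
print(body)
```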

@@ -368,6 +368,8 @@ Notice that each `probs` is an array of length `n_probs`.

See [OpenAI Chat Completions API documentation](https://platform.openai.com/docs/api-reference/chat). While some OpenAI-specific features such as function calling aren't supported, llama.cpp `/completion`-specific features such as `mirostat` are supported.

+ The `response_format` parameter supports both plain JSON output (e.g. `{"type": "json_object"}`) and schema-constrained JSON (e.g. `{"type": "json_object", "schema": {"type": "string", "minLength": 10, "maxLength": 100}}`), similar to other OpenAI-inspired API providers.

*Examples:*

You can use either the Python `openai` library with appropriate checkpoints:
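As a hedged sketch, a `/v1/chat/completions` request carrying `response_format` could look like the following; the model name and message are placeholders, and with the `openai` Python package these same fields would be passed to `client.chat.completions.create()` against a `base_url` such as `http://localhost:8080/v1`:

```python
import json

# Illustrative request body for the OpenAI-compatible /v1/chat/completions
# endpoint, using `response_format` as described above. Model name and
# message content are placeholders, not from the PR.
request = {
    "model": "local-model",  # placeholder; llama.cpp serves its loaded model
    "messages": [{"role": "user", "content": "Reply with one short string."}],
    "response_format": {
        "type": "json_object",
        "schema": {"type": "string", "minLength": 10, "maxLength": 100},
    },
}

body = json.dumps(request)
print(body)
```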