Releases · lydakis/alloy · GitHub

Releases: lydakis/alloy

v0.3.1

06 Sep 21:52


Alloy v0.3.1 — Provider‑native structured outputs; Gemini/Ollama fixes; coverage

Release date: 2025-09-06

Highlights

  • Anthropic structured outputs (provider‑native): precise nested‑key guidance derived from your JSON Schema; minimal object prefill ("{") only; no unsupported response_format usage. Single strict finalize turn remains when a typed result is missing.
  • Gemini tool responses: suppress SDK non‑text warnings by reading candidates.content.parts; ensure FunctionResponse.response is a dict (wrap scalars as { "result": ... }).
  • Ollama:
    • Native path unchanged: strict JSON Schema enforcement via format.
    • OpenAI‑compat path adds strict nested‑keys finalize hints where schema cannot be enforced.
  • Tool payload normalization: centralized helper normalizes dataclasses and containers to JSON‑safe structures; providers use consistent serialization.
  • Finalize predicate: detects missing required keys recursively for object schemas to reduce parse failures.
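The recursive required-keys check can be sketched as a small standalone predicate. This is an illustrative reconstruction of the behavior described above, not Alloy's actual internal function or signature:

```python
def missing_required_keys(schema: dict, payload: object) -> bool:
    """Illustrative sketch: True if any `required` key in an object schema
    (checked recursively through nested object properties) is absent."""
    if schema.get("type") != "object":
        return False
    if not isinstance(payload, dict):
        return True
    for key in schema.get("required", []):
        if key not in payload:
            return True
        subschema = schema.get("properties", {}).get(key, {})
        if missing_required_keys(subschema, payload[key]):
            return True
    return False
```

A predicate like this lets the finalize turn fire only when the model's output is actually missing structure, rather than on every imperfect response.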

Scope clarifications (structured outputs)

  • OpenAI Chat Completions supports structured outputs via response_format={"type":"json_schema"} on supported models (e.g., gpt‑4o‑2024‑08‑06).
  • text.format is a Responses API concept (not part of Chat Completions).
  • Ollama’s OpenAI‑compat /v1/chat/completions does not implement OpenAI’s response_format structured outputs; reports show it is ignored.
  • Ollama’s native /api/chat supports schema enforcement via its own format field (JSON / JSON Schema).
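For concreteness, the native-path request body looks roughly like this (model name and schema are illustrative; only the `format` field carries the schema):

```python
# Sketch: request body for Ollama's native /api/chat, where the `format`
# field carries a JSON Schema that Ollama enforces on the reply.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}

body = {
    "model": "llama3.1",  # any locally pulled model (illustrative)
    "messages": [
        {"role": "user", "content": "Largest city in Greece, as JSON."}
    ],
    "format": schema,   # native path: schema enforced server-side
    "stream": False,
}
```

On the OpenAI-compat path no equivalent field is honored, which is why Alloy falls back to strict finalize hints there.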

Tests & coverage

  • Added unit tests for ask context (sync/async), env overrides, and finalize predicate presence checks.
  • Restored async ask context test.
  • Non‑integration coverage ≥ 80%.

Internal

  • Imports: moved non‑optional imports to module scope across backends; kept provider SDK imports lazy/guarded.

No breaking API changes.

v0.3.0

02 Sep 07:09


Alloy v0.3.0 — Provider Parity, Finalize Centralization, Streaming Cleanup

Release date: 2025-09-02

Highlights

  • Finalize centralization: One canonical rule in should_finalize_structured_output handles strings (including wrapped primitives) vs. non-strings. Providers defer to it.
    • For string outputs, finalize only when the text is empty (no extra follow-up turn for non-empty text).
    • For non-strings, finalize when the text is empty or invalid JSON (after stripping code fences).
  • Ollama (headline): Dual API support
    • Native /api/chat (Ollama SDK) and OpenAI‑compatible Chat Completions (openai_chat).
    • Auto‑routes ollama:*gpt-oss* to openai_chat unless overridden via extra["ollama_api"].
    • Tools supported in both paths; native path supports strict format={JSON Schema} for structured outputs.
  • Shared helpers across providers:
    • build_tools_common for uniform tool-schema building.
    • ensure_object_schema to wrap primitives into { "value": <primitive> } consistently.
    • serialize_tool_payload to stringify tool results/errors uniformly.
  • Config extras normalization (generic-first):
    • Use tool_choice, allowed_tools, disable_parallel_tool_use, ollama_api.
    • Provider-prefixed fallbacks remain supported: openai_tool_choice, anthropic_tool_choice, anthropic_disable_parallel_tool_use, gemini_tool_choice, gemini_allowed_tools, ollama_tool_choice.
  • Provider parity & cleanup:
    • Client getters standardized as _get_sync_client / _get_async_client.
    • Streaming: robust resource cleanup (close/aclose) for Gemini and Ollama streams; OpenAI/Anthropic already use context managers.
    • Anthropic: skip JSON prefill after tool errors so plain tool messages surface (e.g., DBC contract messages).
    • Ollama: remove broad exception wrapping in complete(); runtime errors now propagate consistently.
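The canonical finalize rule described in the first bullet can be sketched as a single function. This is an assumed reconstruction for illustration (name and signature are not Alloy's actual `should_finalize_structured_output` API):

```python
import json
import re

def should_finalize(text: str, output_is_string: bool) -> bool:
    """Sketch of the canonical rule: strings finalize only when empty;
    non-strings finalize when empty or not valid JSON after stripping
    Markdown code fences."""
    stripped = text.strip()
    if output_is_string:
        return stripped == ""
    # Non-string outputs: unwrap an optional ```json ... ``` fence first.
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", stripped, re.DOTALL)
    if fenced:
        stripped = fenced.group(1)
    if stripped == "":
        return True
    try:
        json.loads(stripped)
        return False
    except json.JSONDecodeError:
        return True
```

Centralizing the rule this way is what lets every provider defer to one predicate instead of re-implementing (and subtly diverging on) the string/non-string distinction.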

Docs

  • Configuration guide: provider extras presented as tables; clarified finalize behavior and text-only streaming policy.
  • Production guide: clarified error surfaces (ConfigurationError, CommandError, ToolError, ToolLoopLimitExceeded) and retry semantics.
  • What’s New updated with a concise, ordered 0.3.0 section.

Tests

  • Updated provider tests to patch new client getters (_get_sync_client), replacing legacy _get_client patches.
  • Added unit tests for ensure_object_schema and build_tools_common.
  • Preserved OpenAI async-parallel tools behavior: two calls total (function calls, then final text) for output=str.

Breaking or notable changes

  • Internal/test APIs: prefer _get_sync_client / _get_async_client when patching/mocking provider clients.
  • Configuration: prefer generic keys (tool_choice, allowed_tools, disable_parallel_tool_use, ollama_api). Provider-prefixed keys remain as fallbacks but may be removed in a future minor release.
  • Finalize behavior: string outputs across providers no longer trigger a follow-up finalize when non-empty; expect fewer unnecessary follow-up turns.
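The generic-first lookup with provider-prefixed fallback amounts to a two-step dictionary check. A hypothetical helper (not Alloy's internal API) makes the precedence explicit:

```python
def resolve_extra(extras: dict, provider: str, key: str, default=None):
    """Illustrative sketch: a generic key wins; otherwise fall back to the
    provider-prefixed form, e.g. "openai_tool_choice"."""
    if key in extras:
        return extras[key]
    return extras.get(f"{provider}_{key}", default)
```

So `{"tool_choice": "auto"}` applies to every provider, while `{"openai_tool_choice": "required"}` still works as a fallback for OpenAI alone.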

Upgrade notes

  • If you mock providers in tests, update patches to _get_sync_client/_get_async_client.
  • Use generic extras as the primary interface in Config.extra or ALLOY_EXTRA_JSON.
  • For typed outputs that previously returned raw primitives from some providers (e.g., Gemini), outputs now uniformly return JSON objects at the schema level (e.g., { "value": 123 }) and are parsed by Alloy into your requested Python types.
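The primitive-wrapping convention from the last bullet can be sketched in a few lines. Names and the return shape here are illustrative, not Alloy's actual `ensure_object_schema` signature:

```python
def wrap_primitive_schema(schema: dict) -> tuple[dict, bool]:
    """Sketch: wrap a primitive JSON Schema into an object schema of the
    form {"value": <primitive>}; object schemas pass through unchanged.
    Returns (schema, wrapped_flag)."""
    if schema.get("type") == "object":
        return schema, False
    return {
        "type": "object",
        "properties": {"value": schema},
        "required": ["value"],
    }, True

def unwrap(parsed: dict, wrapped: bool):
    """Recover the primitive after parsing the provider's JSON object."""
    return parsed["value"] if wrapped else parsed
```

So a request for `int` is negotiated with the provider as `{"value": 123}` and unwrapped back to `123` before conversion to the requested Python type.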

Acknowledgements

  • Thanks to everyone who reported parity gaps and helped validate the finalize and streaming changes.

v0.2.2 — Shared loop, Gemini alignment, streaming fix

01 Sep 17:27


Highlights:

  • Centralized tool loop via BaseLoopState; OpenAI (Responses), Anthropic, and Gemini migrated. Gemini now always uses the shared loop.
  • ask.stream_async() returns an AsyncIterable[str] directly (no await at call site).
  • Structured outputs: providers finalize only when auto_finalize_missing_output=True (default on). Turn‑limit semantics and text‑only streaming preserved.
  • OpenAI: ignore temperature for reasoning models (gpt‑5, o1, o3); a debug log is emitted when dropped.
  • Fake backend: fills nested required properties for stricter strict‑mode parity.
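The `ask.stream_async()` change means the call site iterates directly, with no `await` before the call. A minimal sketch with a stand-in generator (the stub below is hypothetical; only the call-site shape matches Alloy's described API):

```python
import asyncio
from typing import AsyncIterable

def stream_async(prompt: str) -> AsyncIterable[str]:
    """Stand-in for ask.stream_async(): returns the AsyncIterable
    directly, so the caller does NOT await the call itself."""
    async def gen():
        for chunk in ("Hello", ", ", "world"):
            yield chunk
    return gen()

async def main() -> str:
    pieces = []
    async for chunk in stream_async("greet"):  # no `await stream_async(...)`
        pieces.append(chunk)
    return "".join(pieces)

print(asyncio.run(main()))  # Hello, world
```

Previously the call had to be awaited first to obtain the iterator; returning the `AsyncIterable` directly removes that extra step.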

Docs:

  • Contributor note on the shared loop and BaseLoopState contract (Architecture → Provider Abstraction).
  • Streaming guide warning for text‑only streaming.
  • Configuration: provider extras and finalize behavior clarified.

No breaking API changes.