Releases · lydakis/alloy · GitHub

Releases: lydakis/alloy

v0.3.1

06 Sep 21:52


Alloy v0.3.1 — Provider‑native structured outputs; Gemini/Ollama fixes; coverage

Release date: 2025-09-06

Highlights

  • Anthropic structured outputs (provider‑native): precise nested‑key guidance derived from your JSON Schema; minimal object prefill ("{") only; no unsupported response_format usage. Single strict finalize turn remains when a typed result is missing.
  • Gemini tool responses: suppress SDK non‑text warnings by reading candidates.content.parts; ensure FunctionResponse.response is a dict (wrap scalars as { "result": ... }).
  • Ollama:
    • Native path unchanged: strict JSON Schema enforcement via format.
    • OpenAI‑compat path adds strict nested‑keys finalize hints where schema cannot be enforced.
  • Tool payload normalization: centralized helper normalizes dataclasses and containers to JSON‑safe structures; providers use consistent serialization.
  • Finalize predicate: detects missing required keys recursively for object schemas to reduce parse failures.
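The recursive required-keys check can be sketched as a small standalone predicate. This is an illustrative reconstruction of the behavior described above, not Alloy's actual internal function or signature:

```python
def missing_required_keys(schema: dict, payload: object) -> bool:
    """Illustrative sketch: True if any `required` key in an object schema
    (checked recursively through nested object properties) is absent."""
    if schema.get("type") != "object":
        return False
    if not isinstance(payload, dict):
        return True
    for key in schema.get("required", []):
        if key not in payload:
            return True
        subschema = schema.get("properties", {}).get(key, {})
        if missing_required_keys(subschema, payload[key]):
            return True
    return False
```

A predicate like this lets the finalize turn fire only when the model's output is actually missing structure, rather than on every imperfect response.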

Scope clarifications (structured outputs)

  • OpenAI Chat Completions supports structured outputs via response_format={"type":"json_schema"} on supported models (e.g., gpt‑4o‑2024‑08‑06).
  • text.format is a Responses API concept (not part of Chat Completions).
  • Ollama’s OpenAI‑compat /v1/chat/completions does not implement OpenAI’s response_format structured outputs; reports show it is ignored.
  • Ollama’s native /api/chat supports schema enforcement via its own format field (JSON / JSON Schema).
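For concreteness, the native-path request body looks roughly like this (model name and schema are illustrative; only the `format` field carries the schema):

```python
# Sketch: request body for Ollama's native /api/chat, where the `format`
# field carries a JSON Schema that Ollama enforces on the reply.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}

body = {
    "model": "llama3.1",  # any locally pulled model (illustrative)
    "messages": [
        {"role": "user", "content": "Largest city in Greece, as JSON."}
    ],
    "format": schema,   # native path: schema enforced server-side
    "stream": False,
}
```

On the OpenAI-compat path no equivalent field is honored, which is why Alloy falls back to strict finalize hints there.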

Tests & coverage

  • Added unit tests for ask context (sync/async), env overrides, and finalize predicate presence checks.
  • Restored async ask context test.
  • Non‑integration coverage ≥ 80%.

Internal

  • Imports: moved non‑optional imports to module scope across backends; kept provider SDK imports lazy/guarded.

No breaking API changes.

v0.3.0

02 Sep 07:09


Alloy v0.3.0 — Provider Parity, Finalize Centralization, Streaming Cleanup

Release date: 2025-09-02

Highlights

  • Finalize centralization: One canonical rule in should_finalize_structured_output handles strings (including wrapped primitives) vs. non-strings. Providers defer to it.
    • For string outputs, finalize only when the text is empty (no extra follow-up turn for non-empty text).
    • For non-strings, finalize when the text is empty or invalid JSON (after stripping code fences).
  • Ollama (headline): Dual API support
    • Native /api/chat (Ollama SDK) and OpenAI‑compatible Chat Completions (openai_chat).
    • Auto‑routes ollama:*gpt-oss* to openai_chat unless overridden via extra["ollama_api"].
    • Tools supported in both paths; native path supports strict format={JSON Schema} for structured outputs.
  • Shared helpers across providers:
    • build_tools_common for uniform tool-schema building.
    • ensure_object_schema to wrap primitives into { "value": <primitive> } consistently.
    • serialize_tool_payload to stringify tool results/errors uniformly.
  • Config extras normalization (generic-first):
    • Use tool_choice, allowed_tools, disable_parallel_tool_use, ollama_api.
    • Provider-prefixed fallbacks remain supported: openai_tool_choice, anthropic_tool_choice, anthropic_disable_parallel_tool_use, gemini_tool_choice, gemini_allowed_tools, ollama_tool_choice.
  • Provider parity & cleanup:
    • Client getters standardized as _get_sync_client / _get_async_client.
    • Streaming: robust resource cleanup (close/aclose) for Gemini and Ollama streams; OpenAI/Anthropic already use context managers.
    • Anthropic: skip JSON prefill after tool errors so plain tool messages surface (e.g., DBC contract messages).
    • Ollama: remove broad exception wrapping in complete(); runtime errors now propagate consistently.
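The canonical finalize rule described in the first bullet can be sketched as a single function. This is an assumed reconstruction for illustration (name and signature are not Alloy's actual `should_finalize_structured_output` API):

```python
import json
import re

def should_finalize(text: str, output_is_string: bool) -> bool:
    """Sketch of the canonical rule: strings finalize only when empty;
    non-strings finalize when empty or not valid JSON after stripping
    Markdown code fences."""
    stripped = text.strip()
    if output_is_string:
        return stripped == ""
    # Non-string outputs: unwrap an optional ```json ... ``` fence first.
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", stripped, re.DOTALL)
    if fenced:
        stripped = fenced.group(1)
    if stripped == "":
        return True
    try:
        json.loads(stripped)
        return False
    except json.JSONDecodeError:
        return True
```

Centralizing the rule this way is what lets every provider defer to one predicate instead of re-implementing (and subtly diverging on) the string/non-string distinction.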

Docs

  • Configuration guide: provider extras presented as tables; clarified finalize behavior and text-only streaming policy.
  • Production guide: clarified error surfaces (ConfigurationError, CommandError, ToolError, ToolLoopLimitExceeded) and retry semantics.
  • What’s New updated with a concise, ordered 0.3.0 section.

Tests

  • Updated provider tests to patch new client getters (_get_sync_client), replacing legacy _get_client patches.
  • Added unit tests for ensure_object_schema and build_tools_common.
  • Preserved OpenAI async-parallel tools behavior: two calls total (function calls, then final text) for output=str.

Breaking or notable changes

  • Internal/test APIs: prefer _get_sync_client / _get_async_client when patching/mocking provider clients.
  • Configuration: prefer generic keys (tool_choice, allowed_tools, disable_parallel_tool_use, ollama_api). Provider-prefixed keys remain as fallbacks but may be removed in a future minor release.
  • Finalize behavior: string outputs across providers no longer trigger a follow-up finalize when non-empty; expect fewer unnecessary follow-up turns.
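The generic-first lookup with provider-prefixed fallback amounts to a two-step dictionary check. A hypothetical helper (not Alloy's internal API) makes the precedence explicit:

```python
def resolve_extra(extras: dict, provider: str, key: str, default=None):
    """Illustrative sketch: a generic key wins; otherwise fall back to the
    provider-prefixed form, e.g. "openai_tool_choice"."""
    if key in extras:
        return extras[key]
    return extras.get(f"{provider}_{key}", default)
```

So `{"tool_choice": "auto"}` applies to every provider, while `{"openai_tool_choice": "required"}` still works as a fallback for OpenAI alone.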

Upgrade notes

  • If you mock providers in tests, update patches to _get_sync_client/_get_async_client.
  • Use generic extras as the primary interface in Config.extra or ALLOY_EXTRA_JSON.
  • For typed outputs that previously returned raw primitives from some providers (e.g., Gemini), outputs now uniformly return JSON objects at the schema level (e.g., { "value": 123 }) and are parsed by Alloy into your requested Python types.
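The primitive-wrapping convention from the last bullet can be sketched in a few lines. Names and the return shape here are illustrative, not Alloy's actual `ensure_object_schema` signature:

```python
def wrap_primitive_schema(schema: dict) -> tuple[dict, bool]:
    """Sketch: wrap a primitive JSON Schema into an object schema of the
    form {"value": <primitive>}; object schemas pass through unchanged.
    Returns (schema, wrapped_flag)."""
    if schema.get("type") == "object":
        return schema, False
    return {
        "type": "object",
        "properties": {"value": schema},
        "required": ["value"],
    }, True

def unwrap(parsed: dict, wrapped: bool):
    """Recover the primitive after parsing the provider's JSON object."""
    return parsed["value"] if wrapped else parsed
```

So a request for `int` is negotiated with the provider as `{"value": 123}` and unwrapped back to `123` before conversion to the requested Python type.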

Acknowledgements

  • Thanks to everyone who reported parity gaps and helped validate the finalize and streaming changes.

v0.2.2 — Shared loop, Gemini alignment, streaming fix

01 Sep 17:27


Highlights:

  • Centralized tool loop via BaseLoopState; OpenAI (Responses), Anthropic, and Gemini migrated. Gemini now always uses the shared loop.
  • ask.stream_async() returns an AsyncIterable[str] directly (no await at call site).
  • Structured outputs: providers finalize only when auto_finalize_missing_output=True (default on). Turn‑limit semantics and text‑only streaming preserved.
  • OpenAI: ignore temperature for reasoning models (gpt‑5, o1, o3); a debug log is emitted when dropped.
  • Fake backend: fills nested required properties for stricter strict‑mode parity.
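The `ask.stream_async()` change means the call site iterates directly, with no `await` before the call. A minimal sketch with a stand-in generator (the stub below is hypothetical; only the call-site shape matches Alloy's described API):

```python
import asyncio
from typing import AsyncIterable

def stream_async(prompt: str) -> AsyncIterable[str]:
    """Stand-in for ask.stream_async(): returns the AsyncIterable
    directly, so the caller does NOT await the call itself."""
    async def gen():
        for chunk in ("Hello", ", ", "world"):
            yield chunk
    return gen()

async def main() -> str:
    pieces = []
    async for chunk in stream_async("greet"):  # no `await stream_async(...)`
        pieces.append(chunk)
    return "".join(pieces)

print(asyncio.run(main()))  # Hello, world
```

Previously the call had to be awaited first to obtain the iterator; returning the `AsyncIterable` directly removes that extra step.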

Docs:

  • Contributor note on the shared loop and BaseLoopState contract (Architecture → Provider Abstraction).
  • Streaming guide warning for text‑only streaming.
  • Configuration: provider extras and finalize behavior clarified.

No breaking API changes.