10BC0 Semantic Memory SetID refactor: customization by o-love · Pull Request #882 · MemMachine/MemMachine · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@o-love
Copy link
Contributor
@o-love o-love commented Dec 31, 2025

Adds support for dynamic set id configuration based on metadata and customization of set ids.
Categories are now also configurable.

Copilot main points:

  • Replaces the IsolationType enum (USER/ROLE/SESSION) with a new org/project-based DefaultType system (OrgSet/ProjectSet/OtherSet)
  • Introduces a SemanticConfigStorage layer backed by SQLAlchemy for persisting set-level configuration overrides
  • Updates SemanticSessionManager to derive set IDs from org_id, project_id, and metadata fields rather than fixed user/session/role properties

@o-love o-love changed the title Add set ID management and resource improvements Semantic Memory SetID refactor: customization Dec 31, 2025
@o-love o-love force-pushed the semantic-config branch 5 times, most recently from 3f7eeb3 to bf4aa20 Compare January 2, 2026 20:40
Copy link
Contributor
Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the semantic memory system to support dynamic set ID configuration with metadata-based routing and per-set customization of embedders, LLMs, and categories.

Key Changes:

  • Replaces the IsolationType enum (USER/ROLE/SESSION) with a new org/project-based DefaultType system (OrgSet/ProjectSet/OtherSet)
  • Introduces a SemanticConfigStorage layer backed by SQLAlchemy for persisting set-level configuration overrides
  • Updates SemanticSessionManager to derive set IDs from org_id, project_id, and metadata fields rather than fixed user/session/role properties

Reviewed changes

Copilot reviewed 43 out of 44 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/memmachine/semantic_memory/config_store/config_store.py Defines protocol for semantic configuration storage
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py SQLAlchemy implementation of config storage with tables for set resources, categories, tags, and org tag sets
src/memmachine/semantic_memory/semantic_session_manager.py Refactored to use org/project-based set ID generation and metadata-driven routing
src/memmachine/semantic_memory/semantic_memory.py Updated to support per-set embedder/LLM configuration and dynamic category management
src/memmachine/common/resource_manager/semantic_manager.py Wiring changes to integrate config storage and new resource retrieval patterns
tests/memmachine/semantic_memory/config_store/test_semantic_config_storage.py New tests for config storage CRUD operations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

is_org_level: bool,
metadata_tags: list[str],
) -> str:
semantic_session = await self._resources.get_semantic_session_manager()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When creating a new set, embedder, reranker and prompt can be specified.

category_name: str,
description: str,
) -> CategoryIdT:
stmt = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if set_id exists first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit tricky. I don't want the user to need to call any create_set in order for messages to be ingested.

Instead I built it around an organization admin defining isolation which automatically ingest messages.

Because of that there isn't really a source of truth of what set_ids exist.
It's related to another comment you made as well.

I think the approach I'm goiing to take is the first message to be ingested into a set_id initializes the database with a default config.

@o-love o-love force-pushed the semantic-config branch 3 times, most recently from 4f3edb4 to 7be687c Compare January 8, 2026 20:01
self,
*,
session_data: SessionData,
is_org_level: bool = False,

This comment was marked as outdated.

logger.info("Deleting set ids %s", set_ids)

async with asyncio.TaskGroup() as tg:
tg.create_task(self._semantic_storage.delete_history_set(set_ids=set_ids))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: we may need transaction or garbage collection for this function.

@o-love o-love force-pushed the semantic-config branch 3 times, most recently from 29a0034 to 1ad4b89 Compare January 13, 2026 15:46
@o-love o-love force-pushed the semantic-config branch 2 times, most recently from 2014294 to 0ab6701 Compare January 14, 2026 14:23
@o-love o-love requested a review from Copilot January 14, 2026 15:42
Copy link
Contributor
Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 48 out of 49 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Introduced `SetIdNotEnabledError` and `InvalidSetIdConfigurationError` for enhanced error handling.
- Implemented async functions for managing set IDs, including `reset_set_ids`, `delete_set_id`, and `set_set_id_config`.
- Updated `SemanticService` and `SemanticSessionManager` to utilize new set ID configurations and resource retrieval methods.
- Added support for listing set IDs with a certain prefix and improved filtering capabilities using set IDs.
- Enhanced logging and resource management through new and refactored functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

0