-
Notifications
You must be signed in to change notification settings - Fork 133
Semantic Memory SetID refactor: customization #882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
3f7eeb3 to
bf4aa20
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the semantic memory system to support dynamic set ID configuration with metadata-based routing and per-set customization of embedders, LLMs, and categories.
Key Changes:
- Replaces the
IsolationTypeenum (USER/ROLE/SESSION) with a new org/project-basedDefaultTypesystem (OrgSet/ProjectSet/OtherSet) - Introduces a
SemanticConfigStoragelayer backed by SQLAlchemy for persisting set-level configuration overrides - Updates
SemanticSessionManagerto derive set IDs fromorg_id,project_id, and metadata fields rather than fixed user/session/role properties
Reviewed changes
Copilot reviewed 43 out of 44 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/memmachine/semantic_memory/config_store/config_store.py |
Defines protocol for semantic configuration storage |
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py |
SQLAlchemy implementation of config storage with tables for set resources, categories, tags, and org tag sets |
src/memmachine/semantic_memory/semantic_session_manager.py |
Refactored to use org/project-based set ID generation and metadata-driven routing |
src/memmachine/semantic_memory/semantic_memory.py |
Updated to support per-set embedder/LLM configuration and dynamic category management |
src/memmachine/common/resource_manager/semantic_manager.py |
Wiring changes to integrate config storage and new resource retrieval patterns |
tests/memmachine/semantic_memory/config_store/test_semantic_config_storage.py |
New tests for config storage CRUD operations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py
Outdated
Show resolved
Hide resolved
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py
Outdated
Show resolved
Hide resolved
| is_org_level: bool, | ||
| metadata_tags: list[str], | ||
| ) -> str: | ||
| semantic_session = await self._resources.get_semantic_session_manager() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When creating a new set, embedder, reranker and prompt can be specified.
| category_name: str, | ||
| description: str, | ||
| ) -> CategoryIdT: | ||
| stmt = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if set_id exists first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit tricky. I don't want the user to need to call any create_set in order for messages to be ingested.
Instead I built it around an organization admin defining isolation which automatically ingest messages.
Because of that there isn't really a source of truth of what set_ids exist.
It's related to another comment you made as well.
I think the approach I'm goiing to take is the first message to be ingested into a set_id initializes the database with a default config.
4f3edb4 to
7be687c
Compare
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py
Outdated
Show resolved
Hide resolved
7be687c to
409f865
Compare
409f865 to
6a5ee54
Compare
src/memmachine/main/memmachine.py
Outdated
| self, | ||
| *, | ||
| session_data: SessionData, | ||
| is_org_level: bool = False, |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
| logger.info("Deleting set ids %s", set_ids) | ||
|
|
||
| async with asyncio.TaskGroup() as tg: | ||
| tg.create_task(self._semantic_storage.delete_history_set(set_ids=set_ids)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: we may need transaction or garbage collection for this function.
29a0034 to
1ad4b89
Compare
2014294 to
0ab6701
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 48 out of 49 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/memmachine/semantic_memory/config_store/config_store_sqlalchemy.py
Outdated
Show resolved
Hide resolved
- Introduced `SetIdNotEnabledError` and `InvalidSetIdConfigurationError` for enhanced error handling. - Implemented async functions for managing set IDs, including `reset_set_ids`, `delete_set_id`, and `set_set_id_config`. - Updated `SemanticService` and `SemanticSessionManager` to utilize new set ID configurations and resource retrieval methods. - Added support for listing set IDs with a certain prefix and improved filtering capabilities using set IDs. - Enhanced logging and resource management through new and refactored functions.
67dc2ae to
006926b
Compare
b7ed231 to
7636701
Compare
7636701 to
5f14d63
Compare
Adds support for dynamic set id configuration based on metadata and customization of set ids.
Categories are now also configurable.
Copilot main points:
IsolationTypeenum (USER/ROLE/SESSION) with a new org/project-basedDefaultTypesystem (OrgSet/ProjectSet/OtherSet)SemanticConfigStoragelayer backed by SQLAlchemy for persisting set-level configuration overridesSemanticSessionManagerto derive set IDs fromorg_id,project_id, and metadata fields rather than fixed user/session/role properties