Allow updating inference_id of semantic_text fields #136120
Previously the `inference_id` of `semantic_text` fields was not updatable. This commit allows users to update the `inference_id` of a `semantic_text` field. This is particularly useful for scenarios where the user wants to switch to using the same model but from a different service.

There are two circumstances when the update is allowed.

- No values have been written for the `semantic_text` field. The inference endpoint can be changed freely as there is no need for compatibility between the current and the new endpoint.
- The new inference endpoint is compatible with the previous one. The `model_settings` of the new inference endpoint are compatible with those of the current endpoint, thus the update is allowed.
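The two conditions in the description can be sketched as a small decision function. This is a minimal illustration, not the code in this PR: the class `InferenceIdUpdateSketch`, the record `ModelSettings`, its fields, and the rule that "compatible" means "all recorded settings are equal" are assumptions made for the example.

```java
/** Illustrative sketch only; names and rules are assumptions, not Elasticsearch code. */
final class InferenceIdUpdateSketch {

    /** Hypothetical stand-in for the model settings recorded for a semantic_text field. */
    record ModelSettings(String taskType, Integer dimensions, String similarity, String elementType) {}

    /** Assumed compatibility rule: every recorded setting must match exactly. */
    static boolean compatible(ModelSettings current, ModelSettings updated) {
        return current.equals(updated);
    }

    /**
     * Returns true if the inference_id may be changed.
     * Case 1: no values written yet, so any endpoint is acceptable.
     * Case 2: values exist, so the new endpoint's settings must be compatible.
     */
    static boolean canUpdateInferenceId(boolean hasIndexedValues, ModelSettings current, ModelSettings updated) {
        if (hasIndexedValues == false) {
            return true;
        }
        return current != null && updated != null && compatible(current, updated);
    }

    public static void main(String[] args) {
        ModelSettings dense384 = new ModelSettings("text_embedding", 384, "cosine", "float");
        ModelSettings dense768 = new ModelSettings("text_embedding", 768, "cosine", "float");
        System.out.println(canUpdateInferenceId(false, dense384, dense768)); // no data yet
        System.out.println(canUpdateInferenceId(true, dense384, dense384));  // same settings
        System.out.println(canUpdateInferenceId(true, dense384, dense768));  // dimensions differ
    }
}
```

In this sketch, switching to the same model hosted by a different service corresponds to the second call: the endpoint changes but the recorded settings do not.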
Pinging @elastic/search-relevance (Team:Search - Relevance)
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Hi @dimitris-athanasiou, I've created a changelog YAML for you.
PR is ready for review. However, I intend to add documentation changes soon.
@@ -0,0 +1,5 @@
pr: 136120
summary: Allow updating `inference_id` of `semantic_text` fields
area: "Search"
`Search` or `Mapping`?
I would say mapping
🔍 Preview links for changed docs
ℹ️ Important: Docs version tagging
👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes.
When to use applies_to tags:
✅ At the page level to indicate which products/deployments the content applies to (mandatory)
What NOT to do:
❌ Don't remove or replace information that applies to an older version
🤔 Need help?
Couple of very minor wording suggestions from me :)
docs/reference/elasticsearch/mapping-reference/semantic-text.md
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
endpoint will only be used at index time.

::::{applies-switch}
Forgot to say that I think you can indent this whole applies-switch section a little bit, otherwise LGTM :)
Nice work! I've left a few comments but I think it's pretty close.
You can update the inference endpoint if no values have been indexed or if the new endpoint is compatible with the current one.

::::{warning}
The endpoint is validated for compatibility, but you must verify it produces the correct embeddings for your use case. This typically means using the same underlying model.
I wonder if we should make this warning stronger, e.g.:
- The endpoint is validated for compatibility, but you must verify it produces the correct embeddings for your use case. This typically means using the same underlying model.
+ When updating an `inference_id` it is important to ensure the new {{infer}} endpoint produces the correct embeddings for your use case. This typically means using the same underlying model.
/**
 * A second sparse service allows testing updates from one service to another.
 */
public static class TestInferenceService2 extends AbstractSparseTestInferenceService {
Nitpick: Maybe name `TestAlternateInferenceService`?
+ "] does not exist."
);
}
if (canMergeModelSettings(currentModelSettings, updatedModelSettings, conflicts) == false) {
Right now, the implementation of `canMergeModelSettings` will return true if `previous` is null or if `current` is null, with no other checks. I think there's a potential edge case to consider here, where we might set a dense vector model (previous) and then try to null it out which would then default to ELSER. Let's make sure to test for that edge case?
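To make the edge case in this comment concrete, here is a minimal sketch contrasting the lenient null handling described above with one possible stricter variant. Everything here is hypothetical: `NullModelSettingsSketch`, the reduced `ModelSettings` record, both method names, and the conflict messages are illustrative, and the real `canMergeModelSettings` may differ in signature and behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and rules are assumptions, not Elasticsearch code. */
final class NullModelSettingsSketch {

    /** Hypothetical, reduced stand-in for a field's model settings. */
    record ModelSettings(String taskType, Integer dimensions) {}

    /** Lenient check as described in the review comment: a null on either side always merges. */
    static boolean lenientCanMerge(ModelSettings previous, ModelSettings current, List<String> conflicts) {
        if (previous == null || current == null) {
            return true;
        }
        return sameSettings(previous, current, conflicts);
    }

    /** Assumed stricter variant: settings that were already set may not be nulled out. */
    static boolean strictCanMerge(ModelSettings previous, ModelSettings current, List<String> conflicts) {
        if (previous == null) {
            return true; // nothing recorded yet, so nothing to conflict with
        }
        if (current == null) {
            conflicts.add("cannot remove model settings [" + previous + "]");
            return false;
        }
        return sameSettings(previous, current, conflicts);
    }

    private static boolean sameSettings(ModelSettings previous, ModelSettings current, List<String> conflicts) {
        if (previous.equals(current)) {
            return true;
        }
        conflicts.add("cannot change model settings from [" + previous + "] to [" + current + "]");
        return false;
    }

    public static void main(String[] args) {
        ModelSettings dense = new ModelSettings("text_embedding", 384);
        List<String> conflicts = new ArrayList<>();
        // Dense settings are set, then nulled out: the lenient check lets this through.
        System.out.println(lenientCanMerge(dense, null, conflicts));
        // The stricter variant rejects it and records a conflict.
        System.out.println(strictCanMerge(dense, null, conflicts));
        System.out.println(conflicts);
    }
}
```

A test for the edge case would assert on the second behaviour: moving from set dense settings to null should be reported as a conflict, not silently accepted.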
 * @param modelSettings the new model settings. If null the mapper will be returned unchanged.
 * @return A mapper with the copied settings applied
 */
private SemanticTextFieldMapper copyWithNewModelSettingsIfNotSet(
So we're doing a null check here before we copy the settings, but we silently ignore if that case happens. Should we throw if it's not null and this method is called?
Didn't review this file, assume it's the same as the other yaml test
Could we please add some tests trying to update dense vector to dense vector with different dimensions, etc?
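As a rough idea of the cases such tests might cover, the sketch below compares two sets of dense vector settings field by field and reports which ones differ. The class `DenseSettingsConflictSketch`, the record `DenseSettings`, and the choice of fields (`dimensions`, `similarity`, `element_type`) are assumptions for illustration and are not taken from the PR's test code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Illustrative sketch only; names and fields are assumptions, not Elasticsearch code. */
final class DenseSettingsConflictSketch {

    /** Hypothetical stand-in for the dense vector part of a field's model settings. */
    record DenseSettings(int dimensions, String similarity, String elementType) {}

    /** Returns the names of the settings that differ between the two endpoints. */
    static List<String> conflictingFields(DenseSettings current, DenseSettings updated) {
        List<String> conflicts = new ArrayList<>();
        if (current.dimensions() != updated.dimensions()) {
            conflicts.add("dimensions");
        }
        if (Objects.equals(current.similarity(), updated.similarity()) == false) {
            conflicts.add("similarity");
        }
        if (Objects.equals(current.elementType(), updated.elementType()) == false) {
            conflicts.add("element_type");
        }
        return conflicts;
    }

    public static void main(String[] args) {
        DenseSettings current = new DenseSettings(384, "cosine", "float");
        // Same settings: no conflicts expected, so the update would be allowed.
        System.out.println(conflictingFields(current, new DenseSettings(384, "cosine", "float")));
        // Dense to dense with different dimensions: expected to be rejected.
        System.out.println(conflictingFields(current, new DenseSettings(768, "cosine", "float")));
        // Several settings differ at once.
        System.out.println(conflictingFields(current, new DenseSettings(768, "dot_product", "byte")));
    }
}
```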