InvalidArgument: 400 when using 'keyword_similarity_score' in Discovery Engine custom ranking · Issue #14550 · googleapis/google-cloud-python · GitHub

@lore2601

Description

Determine this is the right repository

  • I determined this is the correct repository in which to report this bug.

Summary of the issue

I'm trying to use SearchServiceClient in Discovery Engine to run a search with a custom ranking expression. According to the documentation, I should be able to use the keyword_similarity_score signal, which is described as implementing the Best Match 25 (BM25) ranking function for keyword matching. However, when I run a search with this signal, the API returns an InvalidArgument: 400 error.
I tried discoveryengine, discoveryengine_v1, discoveryengine_v1alpha, and discoveryengine_v1beta.

Is this an issue with the documentation or a bug in the API?

  • Is the keyword_similarity_score signal no longer supported?
  • Has it been officially replaced by topicality_rank * -1.0?
    If so, the documentation should be updated to reflect this change and clarify the correct way to implement BM25-based ranking. Any guidance on how topicality_rank * -1.0 achieves the same effect as BM25 would also be highly appreciated.

API client name and version

google-cloud-discoveryengine v0.13.12

Reproduction steps: code

from google.api_core.client_options import ClientOptions
from google.cloud import discoveryengine_v1 as discoveryengine

# project_id, datastore_location, app_id, and query are defined elsewhere.
serving_config = (
    f"projects/{project_id}/locations/{datastore_location}"
    f"/collections/default_collection/engines/{app_id}"
    "/servingConfigs/default_search"
)

search_request = discoveryengine.SearchRequest(
    ranking_expression_backend=discoveryengine.SearchRequest.RankingExpressionBackend.RANK_BY_FORMULA,
    ranking_expression="keyword_similarity_score",  # the signal the API rejects
    query=query,
    serving_config=serving_config,
)

client = discoveryengine.SearchServiceClient(client_options=ClientOptions())

response = client.search(search_request)  # raises InvalidArgument: 400

Reproduction steps: supporting files

Reproduction steps: actual results

The API call fails with the following error:
InvalidArgument: 400 Invalid signal: keyword_similarity_score. Please use topicality_rank * -1.0 instead.
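
For comparison, here is a minimal sketch of the substitution the error message suggests. It only swaps the ranking expression for topicality_rank * -1.0 and keeps every other field from the reproduction. The helper names build_search_fields and run_search are hypothetical and not part of the library, and whether this expression is really equivalent to BM25 ranking is exactly what this issue is asking, so treat it as an assumption rather than a confirmed fix.

```python
# Sketch only: swaps in the expression named by the 400 error message.
# build_search_fields and run_search are hypothetical helper names.

SUGGESTED_EXPRESSION = "topicality_rank * -1.0"  # copied verbatim from the error


def build_search_fields(serving_config: str, query: str,
                        ranking_expression: str = SUGGESTED_EXPRESSION) -> dict:
    """Return the SearchRequest fields from the reproduction as a plain dict."""
    return {
        "serving_config": serving_config,
        "query": query,
        "ranking_expression": ranking_expression,
    }


def run_search(fields: dict):
    """Send the request; needs google-cloud-discoveryengine and credentials."""
    # Imported lazily so the helper above can be used without the library.
    from google.cloud import discoveryengine_v1 as discoveryengine

    request = discoveryengine.SearchRequest(
        ranking_expression_backend=(
            discoveryengine.SearchRequest.RankingExpressionBackend.RANK_BY_FORMULA
        ),
        **fields,
    )
    return discoveryengine.SearchServiceClient().search(request)
```

Calling run_search(build_search_fields(serving_config, query)) would send the same request as the reproduction with only the expression changed, which makes it easy to compare both expressions against the same serving config.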

Reproduction steps: expected results

I expected the search to execute successfully, with the results ranked based on the keyword_similarity_score. The documentation implies that this signal is a valid and supported option for custom ranking expressions.

OS & version + platform

No response

Python environment

3.12

Python dependencies

No response

Additional context

No response

Metadata

Assignees

No one assigned

Labels

type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
