Scores to vector search · Issue #136 · pgvector/pgvector-python · GitHub
Scores to vector search #136
Closed
@ottusp

Description

Disclaimer: this is a feature request. I am able to open a PR, but first I want to check whether this makes sense to other people as well.

Problem

When using vector search, I commonly want to know how relevant the results are. Not only do I want to get the most relevant results, but I also want to set a threshold and drop any result that does not surpass it.

Other frameworks such as LangChain support querying with scores (and, therefore, filtering by thresholds) by annotating the query with a distance, then normalizing this value and filtering the results in the Python layer. This is useful because it does not change query complexity: the same indexes can be used, whether the column has an ANN index or no index at all.
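The annotate-then-filter pattern described above can be sketched in plain Python. This is a minimal illustration, not LangChain's or pgvector's code: it assumes the query returns `(item, distance)` pairs, that the distance is a cosine distance in the range [0, 2], and that the score is defined as `1 - distance`. The name `filter_by_score` is hypothetical.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def filter_by_score(
    rows: Iterable[Tuple[T, float]], threshold: float
) -> List[Tuple[T, float]]:
    """Convert cosine distances to scores and keep rows at or above threshold.

    Assumption: score = 1 - cosine_distance, so identical directions score 1.0
    and opposite directions score -1.0.
    """
    scored = [(item, 1.0 - distance) for item, distance in rows]
    # Filtering happens after the query, so the SQL (and any index it uses)
    # is the same as for a plain nearest-neighbour search.
    kept = [(item, score) for item, score in scored if score >= threshold]
    # Highest score first, i.e. most relevant result first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Rows as they might come back from a distance-annotated query.
rows = [("doc-a", 0.05), ("doc-b", 0.45), ("doc-c", 0.20)]
print(filter_by_score(rows, threshold=0.7))
```

Because the threshold is applied to rows that were already fetched, a query with a `LIMIT` may return fewer rows than the limit once low-scoring results are dropped.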

My use cases involve querying through Django. While a Django-specific solution would help, this support could be considered for the other connectors as well.

Proposition

Add a model QuerySet that automatically annotates results with a distance and filters them by a threshold. The interface could look like this:

from django.db import models
from pgvector.django import VectorField, PgVectorModelMixin

# Model definition
class MyModel(PgVectorModelMixin, models.Model):
    embedding = VectorField(dimensions=3)

# database entries
good_match_embedding = MyModel(embedding=[1, 2, 3])
bad_match_embedding = MyModel(embedding=[-100, -100, -100])

good_match_embedding.save()
bad_match_embedding.save()

# embedding similar to `good_match_embedding`
query_embedding = [1, 2, 4]

best_matches = MyModel.objects.similarity(embedding=query_embedding, threshold=0.7)

print(best_matches)  # A list with only good_match_embedding
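
One way the proposed `similarity()` call might be approximated with what `pgvector.django` already ships is to annotate with `CosineDistance` and filter on the annotation. The sketch below is an illustration under assumptions, not the library's API: `similarity` and `max_distance_for` are hypothetical names, the score is assumed to be `1 - cosine_distance`, and the filter runs in SQL rather than in the Python layer.

```python
def max_distance_for(threshold: float) -> float:
    """Translate a minimum similarity score into a maximum cosine distance.

    Assumption: score = 1 - cosine_distance.
    """
    return 1.0 - threshold


def similarity(queryset, embedding, threshold, field="embedding"):
    """Hypothetical helper: rows whose similarity to `embedding` is >= threshold."""
    # Imported lazily so the pure helper above stays usable without Django installed.
    from pgvector.django import CosineDistance

    return (
        queryset
        .annotate(distance=CosineDistance(field, embedding))
        .filter(distance__lte=max_distance_for(threshold))
        .order_by("distance")  # closest (most similar) first
    )
```

Under that scoring assumption, the query `[1, 2, 4]` has a cosine similarity of roughly 0.99 with `[1, 2, 3]` and roughly -0.88 with `[-100, -100, -100]`, so `similarity(MyModel.objects.all(), [1, 2, 4], threshold=0.7)` would be expected to keep only the first row.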
