Replies: 2 comments
hey @Greatz08, Qdrant is agnostic to the choice of models. If bigger models don't give you much of a precision boost on your own benchmarks, you can stick with the smaller one. If you don't have enough data to evaluate each model yourself, you can look up model performance on public benchmarks.
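For the "evaluate on your own benchmarks" part, here's a minimal sketch of what that could look like. It assumes a hypothetical `embed(model_name, texts)` helper wrapping whichever Qwen3 variant you're testing, plus a small hand-labelled set of (query, relevant chunk) pairs; it uses qdrant-client's in-memory mode so nothing touches a real collection.

```python
# Rough recall@k comparison between embedding models, using Qdrant's
# in-memory mode. embed() is a hypothetical helper you would implement
# around your own Qwen3 setup (sentence-transformers, an API, etc.).
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct


def embed(model_name: str, texts: list[str]) -> list[list[float]]:
    raise NotImplementedError("wrap your Qwen3 embedding call here")


def recall_at_k(model_name: str, dim: int, chunks: list[str],
                labelled: list[tuple[str, int]], k: int = 5) -> float:
    """labelled = [(query, index of the relevant chunk), ...]"""
    client = QdrantClient(":memory:")
    client.create_collection(
        collection_name="eval",
        vectors_config=VectorParams(size=dim, distance=Distance.COSINE),
    )
    # Index every chunk once with the model under test.
    vectors = embed(model_name, chunks)
    client.upsert(
        collection_name="eval",
        points=[PointStruct(id=i, vector=v) for i, v in enumerate(vectors)],
    )
    # Count how often the known-relevant chunk shows up in the top k.
    hits = 0
    for query, relevant_id in labelled:
        results = client.search(
            collection_name="eval",
            query_vector=embed(model_name, [query])[0],
            limit=k,
        )
        hits += any(r.id == relevant_id for r in results)
    return hits / len(labelled)
```

If the models land within a point or two of each other on recall@k over your own notes, the smaller and faster one is usually the better trade-off.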
Great question — and welcome to the Temple of Too Many Dimensions™.

So here's the thing: higher-dimensional embeddings don't magically mean better results. If your use case (markdown notes, as you said) has low entropy or semantic diversity, adding 1024D to a 384D job is like bringing a rocket launcher to a sandwich party. Most retrieval problems suffer more from bad preprocessing or weak chunking than from "not enough embedding width".

Bigger isn't better unless:

- Your downstream search requires ultra-fine-grained disambiguation (like differentiating "neural pruning" from "neural gardening").
- Your documents have very rich internal structure or compositionality.
- You have enough samples to justify the added sparsity and compute.

Oh — and if you're using Qwen3, note that different quantizations can shift the relative weight of token influence. So sometimes it's not the dimension count, but how it's packed and projected.

TL;DR: if the smaller model performs about the same on your notes, keep it and spend the effort on chunking and preprocessing instead.
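To make the chunking point concrete, here's a minimal sketch assuming a hypothetical `embed_chunks()` helper around whichever Qwen3 variant you settle on and a placeholder `example_note.md` path. The Qdrant collection's vector size simply has to match the model's output dimension; the markdown chunking is where most of the retrieval quality actually gets decided.

```python
# Sketch: chunk markdown by heading, then size the Qdrant collection to
# whatever embedding model you picked. embed_chunks() is a hypothetical
# helper around your chosen Qwen3 model; the file path is a placeholder.
import re

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct


def chunk_markdown(text: str) -> list[str]:
    # Split on level-1/level-2 headings so each chunk is one coherent
    # section instead of an arbitrary window of characters.
    parts = re.split(r"\n(?=#{1,2} )", text)
    return [p.strip() for p in parts if p.strip()]


def embed_chunks(chunks: list[str]) -> list[list[float]]:
    raise NotImplementedError("call your Qwen3 embedding model here")


# Set this to the output dimension of the model you chose (e.g. 384 or 1024).
EMBEDDING_DIM = 1024

client = QdrantClient(":memory:")
client.create_collection(
    collection_name="notes",
    vectors_config=VectorParams(size=EMBEDDING_DIM, distance=Distance.COSINE),
)

note = open("example_note.md").read()
chunks = chunk_markdown(note)
client.upsert(
    collection_name="notes",
    points=[
        PointStruct(id=i, vector=vec, payload={"text": chunk})
        for i, (chunk, vec) in enumerate(zip(chunks, embed_chunks(chunks)))
    ],
)
```

Storing the chunk text in the payload keeps the retrieval result self-describing, so you can eyeball whether bad hits come from the embedding model or from how the notes were split.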
I recently downloaded the latest Qwen3 embedding models: 0.6B, 4B quantized to Q4, and Qwen3-Embedding-8B:Q5_K_M. These three models are the newest from the Qwen team and they generate embeddings of different dimensions. I wanted to build a better vector database for my markdown notes and I chose Qdrant as the vector DB. In my tests, all three embedding models performed almost the same. My question is: does it matter if I use a bigger embedding model that supports more dimensions?