---
title: How Does Qdrant Work with Embedding Models?
description: Qdrant is an open-source vector database that stores embedding vectors alongside payload metadata and enables fast approximate nearest neighbour (ANN) search, filtered search, and hybrid (dense + sparse) search. It works with embedding models by receiving the vectors they produce — generated by SIE — and indexing th...
canonical_url: https://superlinked.com/glossary/how-does-qdrant-work-with-embedding-models
last_updated: 2026-06-02
---

# How Does Qdrant Work with Embedding Models?

Qdrant is an open-source vector database that stores embedding vectors alongside payload metadata and enables fast approximate nearest neighbour (ANN) search, filtered search, and hybrid (dense + sparse) search. It works with embedding models by receiving the vectors they produce (in this guide, generated by SIE) and indexing them in an HNSW graph for millisecond-latency retrieval at scale.

---

## Why Qdrant?

Qdrant is a strong default choice for production semantic search and RAG pipelines because:

- **Written in Rust** — low latency, high throughput, predictable performance under load
- **Native hybrid search** — combines dense vector search with sparse BM25-style search in one query
- **Multi-vector support** — stores ColBERT-style token vectors for late interaction retrieval
- **Filterable ANN** — filter by metadata without sacrificing recall (adaptive strategy selection)
- **Open source + cloud** — run self-hosted or use Qdrant Cloud
- **Active development** — among the fastest-evolving vector databases in the ecosystem

---

## How Qdrant and SIE work together

SIE handles the encoding; Qdrant handles the storage and retrieval:

```
Documents → [SIE: BGE-M3] → vectors → [Qdrant: HNSW index] → stored
Query     → [SIE: BGE-M3] → vector  → [Qdrant: ANN search] → results
```

Full pipeline example:

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

sie = SIEClient("http://localhost:8080")
qdrant = QdrantClient("http://localhost:6333")

# 1. Create collection
qdrant.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE)
)

# 2. Encode and index documents
encode_results = sie.encode("BAAI/bge-m3", [Item(text=c) for c in document_chunks])
vectors = [r["dense"] for r in encode_results]

qdrant.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=i,
            vector=v.tolist(),
            payload={"text": chunk, "source": source, "date": date}
        )
        for i, (v, chunk, source, date) in enumerate(zip(vectors, document_chunks, sources, dates))
    ]
)

# 3. Search (query_points supersedes the deprecated search method)
query_vector = sie.encode("BAAI/bge-m3", Item(text=user_query), is_query=True)["dense"]

results = qdrant.query_points(
    collection_name="documents",
    query=query_vector,
    query_filter={"must": [{"key": "date", "range": {"gte": "2024-01-01"}}]},
    limit=20
)  # the scored hits are in results.points
```
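
To make the search step concrete, the sketch below computes what an ANN index approximates: an exact, brute-force cosine-similarity ranking over a few toy vectors. It is an illustration only. The vectors, ids, and the helper names `cosine` and `exact_top_k` are made up here and are not part of Qdrant or SIE.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, points, k):
    """Score every stored vector against the query and keep the k best (id, score) pairs."""
    scored = [(point_id, cosine(query, vec)) for point_id, vec in points.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" (the dense vectors in this guide have 1024 dimensions)
points = {
    0: [1.0, 0.0, 0.0],
    1: [0.9, 0.1, 0.0],
    2: [0.0, 1.0, 0.0],
}
top = exact_top_k([1.0, 0.0, 0.0], points, k=2)
print(top)
```

A brute-force scan like this touches every vector on every query. An HNSW index aims to return approximately the same ranking while visiting only a small part of the collection.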

---

## Hybrid search with Qdrant and SIE

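BGE-M3 produces both dense and sparse vectors. Qdrant's hybrid search retrieves candidates with each and fuses the two ranked lists in a single query. The query below expects a collection with a named dense vector (`dense`) and a named sparse vector (`sparse`); the single unnamed vector created in the pipeline example above is not enough.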

```python
from sie_sdk.types import Item
from qdrant_client.models import Fusion, FusionQuery, Prefetch, SparseVector

# Encode with both dense and sparse outputs
query_result = sie.encode(
    "BAAI/bge-m3",
    Item(text=user_query),
    output_types=["dense", "sparse"],
    is_query=True,
)
sparse = query_result["sparse"]

# Retrieve candidates from each named vector, then fuse the two lists
results = qdrant.query_points(
    collection_name="documents",
    prefetch=[
        # Dense retrieval (converted to a plain list, as in the upsert above)
        Prefetch(query=query_result["dense"].tolist(), using="dense", limit=50),
        # Sparse retrieval (convert to plain lists first if the SDK returns arrays)
        Prefetch(
            query=SparseVector(indices=sparse["indices"], values=sparse["values"]),
            using="sparse",
            limit=50,
        ),
    ],
    query=FusionQuery(fusion=Fusion.RRF),  # Reciprocal Rank Fusion
    limit=20,
)
```
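
For intuition about what the `rrf` fusion step does with the two prefetch lists, here is a minimal pure-Python sketch of Reciprocal Rank Fusion. The function name, the toy ids, and the constant `k=60` (taken from the common formulation of RRF) are assumptions for illustration; Qdrant's internal constant and tie-breaking may differ.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores = {}
    for ranking in ranked_lists:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)

dense_hits = ["a", "b", "c"]    # hypothetical ids ranked by dense similarity
sparse_hits = ["c", "a", "d"]   # hypothetical ids ranked by sparse similarity

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['a', 'c', 'b', 'd']
```

Only ranks enter the formula, so the dense and sparse scores never need to be on a comparable scale. Ids that appear in both lists (`a` and `c` here) rise above ids found by only one retriever.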

---

## Multi-vector (ColBERT) with Qdrant

Qdrant supports multi-vector storage for ColBERT-style late interaction retrieval:

```python
from sie_sdk.types import Item
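from qdrant_client.models import (
    Distance,
    MultiVectorComparator,
    MultiVectorConfig,
    VectorParams,
)

# Create collection with multi-vector support
qdrant.create_collection(
    collection_name="documents_colbert",
    vectors_config={
        "colbert": VectorParams(
            size=128,  # per-token dimension; must match what the encoder returns
            distance=Distance.COSINE,
            # MaxSim comparison is configured inside VectorParams
            multivector_config=MultiVectorConfig(
                comparator=MultiVectorComparator.MAX_SIM
            ),
        )
    },
)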

# Index ColBERT token vectors
colbert_results = sie.encode(
    "BAAI/bge-m3",
    [Item(text=d) for d in documents],
    output_types=["multivector"],
)
colbert_mvs = [r["multivector"] for r in colbert_results]
# Upsert token vectors per document
```
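
The `MAX_SIM` comparator refers to late-interaction scoring: each query token is matched with its most similar document token, and those best matches are summed. The sketch below shows that calculation in plain Python on made-up 2-dimensional unit vectors. The function names are illustrative and not part of Qdrant or SIE.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def max_sim(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best similarity against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Toy token vectors (all unit length)
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
doc_b = [[1.0, 0.0], [0.6, 0.8]]

score_a = max_sim(query_tokens, doc_a)
score_b = max_sim(query_tokens, doc_b)
print(score_a > score_b)  # True: doc_a has an exact match for both query tokens
```

Because every query token is compared with every document token, multi-vector collections store and compare far more vectors per document than a single dense embedding does.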

---

## Qdrant configuration for production

Key settings to tune for production deployments:

```python
# Collection with tuned HNSW parameters
qdrant.create_collection(
    collection_name="production",
    vectors_config=VectorParams(
        size=1024,
        distance=Distance.COSINE,
        hnsw_config={"m": 16, "ef_construct": 128},
        quantization_config={"scalar": {"type": "int8", "quantile": 0.99}}
    )
)

# Set search ef at query time (query_points supersedes the deprecated search method)
results = qdrant.query_points(
    collection_name="production",
    query=query_vector,
    search_params={"hnsw_ef": 128, "exact": False},
    limit=20
)
```

**Quantisation** (INT8) stores each vector component in 1 byte instead of the 4 bytes of a float32, which reduces vector memory by ~4× with minimal recall loss. It is recommended for large corpora.
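
As a rough back-of-the-envelope check of the ~4× figure, the sketch below compares raw vector storage at 4 bytes per component (float32) with 1 byte per component (int8). The corpus size of one million vectors is a hypothetical number chosen for illustration, and the estimate ignores the HNSW graph, payloads, and any original vectors kept alongside the quantised copies.

```python
def vector_memory_bytes(num_vectors, dims, bytes_per_component):
    """Raw storage for the vector components only."""
    return num_vectors * dims * bytes_per_component

num_vectors = 1_000_000   # hypothetical corpus size
dims = 1024               # dense vector size used in this guide

float32_bytes = vector_memory_bytes(num_vectors, dims, 4)
int8_bytes = vector_memory_bytes(num_vectors, dims, 1)

print(f"float32: {float32_bytes / 1e9:.2f} GB")        # float32: 4.10 GB
print(f"int8:    {int8_bytes / 1e9:.2f} GB")           # int8:    1.02 GB
print(f"ratio:   {float32_bytes / int8_bytes:.0f}x")   # ratio:   4x
```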

---

## Frequently asked questions

**Does Qdrant support real-time updates?**
Yes. Qdrant's HNSW index supports incremental inserts and deletes. New vectors are immediately searchable after insertion.

**What is Qdrant's payload filtering performance like?**
Qdrant uses an adaptive strategy that selects between pre-filtering and post-filtering based on filter selectivity. This typically maintains 95%+ recall even with highly selective filters.
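
The toy sketch below shows why the choice of strategy matters when a filter is selective. It is a simplified illustration with made-up ids and helper names, not Qdrant's actual query planner.

```python
def post_filter(ranked_ids, allowed, k):
    """Take the top-k by score first, then drop ids that fail the filter."""
    return [pid for pid in ranked_ids[:k] if pid in allowed]

def pre_filter(ranked_ids, allowed, k):
    """Drop ids that fail the filter first, then take the top-k of what remains."""
    return [pid for pid in ranked_ids if pid in allowed][:k]

ranked_ids = [1, 2, 3, 4, 5, 6, 7, 8]   # already sorted by similarity, best first
allowed = {4, 7, 8}                     # ids whose payload matches a selective filter

post = post_filter(ranked_ids, allowed, k=3)
pre = pre_filter(ranked_ids, allowed, k=3)
print(post)  # []
print(pre)   # [4, 7, 8]
```

With a selective filter, post-filtering can return fewer than `k` results (here none), while pre-filtering finds every match at the cost of checking the filter against more points.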

**Can I run Qdrant alongside SIE on the same infrastructure?**
Yes. Both can run in the same Kubernetes cluster. SIE handles the GPU workloads; Qdrant runs on CPU nodes. They communicate over the cluster's internal network.

---

## Related resources

- [SIE + Qdrant full integration guide](/docs/integrations/qdrant)
- [What is a vector database?](/glossary/what-is-a-vector-database)
- [What is hybrid search?](/glossary/what-is-hybrid-search)
- [What is BGE-M3?](/glossary/what-is-bge-m3)
- [What is approximate nearest neighbour search?](/glossary/what-is-approximate-nearest-neighbour-search)
