---
title: What is a Vector Index?
description: A vector index is a data structure that organises high-dimensional vectors to enable fast approximate nearest neighbour (ANN) search. Instead of comparing a query vector to every stored vector, the index groups or graphs vectors in ways that allow search to skip irrelevant regions of the vector space — reducing quer...
canonical_url: https://superlinked.com/glossary/what-is-a-vector-index
last_updated: 2026-06-02
---

# What is a Vector Index?

A vector index is a data structure that organises high-dimensional vectors to enable fast approximate nearest neighbour (ANN) search. Instead of comparing a query vector to every stored vector, the index groups or graphs vectors in ways that allow search to skip irrelevant regions of the vector space — reducing query time from linear to sub-linear. HNSW and IVF are the two most widely used vector index types in production systems.

---

## Why do vector indexes matter?

Without an index, searching for the nearest vector among 10 million 768-dimensional vectors requires computing 10 million dot products per query (about 7.7 billion multiply-adds), which is too slow for real-time search at scale. A vector index pre-organises the vectors so that search can focus on the most promising region of the space, typically reducing comparisons by 99%+ while still finding 95–99% of the true nearest neighbours.

Choosing and configuring the right index directly affects:
- **Query latency**: how fast each search request completes
- **Recall**: the fraction of the true nearest neighbours that the index actually returns
- **Memory usage**: how much RAM the index requires
- **Index build time**: how long it takes to index new vectors
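
To make latency and recall concrete, here is a minimal sketch of the exact (brute-force) search that an index approximates, together with a recall measurement. The function names and the toy data are illustrative assumptions, not part of any library.

```python
import heapq

def exact_top_k(query, vectors, k):
    """Brute-force search: score every stored vector, keep the k best."""
    scored = (
        (sum(q * x for q, x in zip(query, vec)), idx)
        for idx, vec in enumerate(vectors)
    )
    return [idx for _, idx in heapq.nlargest(k, scored)]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result contains."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Toy data, made up for illustration.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.5, 0.5)]
query = (1.0, 0.0)

truth = exact_top_k(query, vectors, k=2)
# Pretend an approximate index returned ids 0 and 3 instead of 0 and 1.
recall = recall_at_k([0, 3], truth)
print(truth, recall)
```

The cost of `exact_top_k` grows linearly with the number of stored vectors, which is the work an index is built to avoid. Running it on a sample of queries gives the ground truth needed to measure the recall of any approximate index.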

---

## HNSW: the dominant production index

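HNSW (Hierarchical Navigable Small World) builds a multi-layer proximity graph in which each vector is a node connected by edges to nearby vectors. Every vector appears in the bottom layer, and a progressively smaller random subset is promoted to each higher layer. Search navigates from the sparse top layer down to the dense bottom layer: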

```
Layer 2:  sparse graph — few nodes, long-range connections
Layer 1:  medium graph — more nodes, medium connections
Layer 0:  dense graph — all nodes, short-range connections

Search: enter at layer 2, greedily navigate to nearest node,
        drop to layer 1, repeat, drop to layer 0, find k-NN
```
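
The descent above can be sketched in a few lines of Python. This is a simplified illustration, not a production implementation: the graph is built by hand rather than by the HNSW insertion algorithm, and the names `greedy_descend`, `beam_search` and `hnsw_search` are hypothetical. The upper layers use a plain greedy walk, and layer 0 uses a best-first search that keeps the `ef` closest nodes seen so far.

```python
import heapq
import math

def greedy_descend(query, entry, neighbours, vectors):
    """Upper layers: hop to a closer neighbour until none is closer."""
    current, current_d = entry, math.dist(query, vectors[entry])
    moved = True
    while moved:
        moved = False
        for n in neighbours.get(current, ()):
            d = math.dist(query, vectors[n])
            if d < current_d:
                current, current_d, moved = n, d, True
    return current

def beam_search(query, entry, neighbours, vectors, ef, k):
    """Layer 0: best-first search keeping the ef closest nodes seen so far."""
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    best = [(-d0, entry)]        # max-heap (negated distances) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                # nothing left can improve the result set
        for n in neighbours.get(node, ()):
            if n in visited:
                continue
            visited.add(n)
            dn = math.dist(query, vectors[n])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, n))
                heapq.heappush(best, (-dn, n))
                if len(best) > ef:
                    heapq.heappop(best)   # drop the current worst result
    return [n for _, n in sorted((-nd, n) for nd, n in best)[:k]]

def hnsw_search(query, layers, vectors, entry, ef, k):
    """layers is ordered from the top (sparse) to the bottom (dense)."""
    for neighbours in layers[:-1]:
        entry = greedy_descend(query, entry, neighbours, vectors)
    return beam_search(query, entry, layers[-1], vectors, ef, k)

# Hand-built toy graph: 8 points on a line (illustrative only).
vectors = {i: (float(i), 0.0) for i in range(8)}
layers = [
    {0: [4], 4: [0]},                                                  # layer 2
    {0: [2], 2: [0, 4], 4: [2, 6], 6: [4]},                            # layer 1
    {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)},  # layer 0
]
result = hnsw_search((5.2, 0.0), layers, vectors, entry=0, ef=3, k=2)
print(result)
```

Starting from node 0, the search jumps to node 4 on layer 2, to node 6 on layer 1, and then explores locally on layer 0, so it reaches the two nearest points without visiting most of the left half of the line.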

**When to use HNSW:**
- Production semantic search with real-time inserts
- Corpus size up to ~100M vectors
- When recall > 95% is required
- When you need to add new vectors without rebuilding the index

**Key parameters:**
- `M`: maximum number of bidirectional edges per node on each layer (16–32 typical). Higher = better recall, more memory.
- `ef_construction`: size of the candidate list (beam width) during index build (128–200 typical). Higher = better index quality, slower build.
- `ef`: size of the candidate list (beam width) during search; it should be at least `k`, the number of results requested. Higher = better recall, slower query.
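
The memory effect of `M` can be estimated with some back-of-envelope arithmetic. The sketch below is a rough lower bound under stated assumptions: float32 vectors, 4-byte neighbour ids, and up to 2×`M` links per node on layer 0 (the setting used by common implementations such as hnswlib). It ignores upper layers and per-node overhead, and the function name is hypothetical.

```python
def hnsw_memory_bytes(n_vectors, dim, m, bytes_per_float=4, bytes_per_id=4):
    """Rough lower bound: raw vectors plus layer-0 neighbour lists."""
    vector_bytes = n_vectors * dim * bytes_per_float
    # Assumption: layer 0 stores up to 2*M neighbour ids per node.
    link_bytes = n_vectors * 2 * m * bytes_per_id
    return vector_bytes, link_bytes

vec_b, link_b = hnsw_memory_bytes(10_000_000, 768, m=16)
print(f"vectors: {vec_b / 1e9:.2f} GB, links: {link_b / 1e9:.2f} GB")
```

For high-dimensional embeddings the raw vectors dominate: in this example the graph links are about 4% of the total, so doubling `M` adds comparatively little memory next to the vectors themselves.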

---

## IVF: cluster-based indexing

IVF (Inverted File Index) partitions vectors into clusters using k-means, then at query time only searches the nearest clusters:

```
Build: k-means clustering → nlist centroids + inverted lists
Query: find nprobe nearest centroids → search their inverted lists
```
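
The two steps above can be sketched in pure Python. This is a toy illustration, assuming a few fixed iterations of plain k-means and Euclidean distance; real libraries use optimised training and storage. The names `build_ivf`, `search_ivf` and `closest_centroids`, and all data, are hypothetical.

```python
import math
import random

def closest_centroids(centroids, v, n=1):
    """Indices of the n centroids closest to v."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(centroids[c], v))
    return order[:n]

def assign(vectors, centroids):
    """Inverted lists: for each centroid, the ids of the vectors nearest to it."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[closest_centroids(centroids, v)[0]].append(i)
    return lists

def build_ivf(vectors, nlist, iters=10, seed=0):
    """k-means with a fixed number of iterations, then a final assignment."""
    centroids = random.Random(seed).sample(vectors, nlist)
    dim = len(vectors[0])
    for _ in range(iters):
        for c, members in enumerate(assign(vectors, centroids)):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(vectors[i][d] for i in members) / len(members)
                    for d in range(dim)
                )
    return centroids, assign(vectors, centroids)

def search_ivf(query, vectors, centroids, lists, nprobe, k):
    """Scan only the inverted lists of the nprobe nearest centroids."""
    candidates = [i for c in closest_centroids(centroids, query, nprobe) for i in lists[c]]
    return sorted(candidates, key=lambda i: math.dist(vectors[i], query))[:k]

# Toy corpus: 200 random 2-D points (illustrative only).
rng = random.Random(42)
vectors = [(rng.random(), rng.random()) for _ in range(200)]
centroids, lists = build_ivf(vectors, nlist=14)

query = (0.5, 0.5)
exact = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))[:5]
fast = search_ivf(query, vectors, centroids, lists, nprobe=3, k=5)
full = search_ivf(query, vectors, centroids, lists, nprobe=14, k=5)
print(len(set(fast) & set(exact)) / len(exact))  # recall of the nprobe=3 search
```

Probing every list (`nprobe == nlist`) degenerates into an exact scan, so `full` always matches `exact`. Lower `nprobe` values trade recall for fewer distance computations.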

**When to use IVF:**
- Very large corpora (100M–1B+ vectors)
- Batch indexing workflows (corpus changes infrequently)
- When memory is constrained (IVF uses less memory than HNSW)

**Key parameters:**
- `nlist`: number of clusters. A common starting point is between sqrt(n) and 4×sqrt(n), where n is the number of indexed vectors.
- `nprobe`: clusters searched per query. Higher = better recall, slower queries.
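
A rough cost model shows how these two parameters interact. The sketch below assumes perfectly balanced clusters, so each inverted list holds about n / `nlist` vectors; the value `nprobe=16` and the function name are illustrative assumptions.

```python
import math

def ivf_comparisons(n, nlist, nprobe):
    """Approximate distance computations per query, assuming balanced clusters."""
    # nlist centroid comparisons, then nprobe lists of about n / nlist vectors each.
    return nlist + nprobe * n / nlist

n = 10_000_000
nlist = math.isqrt(n)  # the sqrt(n) rule of thumb
cost = ivf_comparisons(n, nlist, nprobe=16)
print(f"{cost:,.0f} comparisons, {100 * (1 - cost / n):.2f}% fewer than a full scan")
```

With these settings a query touches well under 1% of the corpus. Real clusters are rarely balanced, so measured cost and recall should always be checked on your own data.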

---

## Product Quantisation (PQ): memory compression

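PQ compresses vectors by dividing them into sub-vectors and replacing each sub-vector with the index of its nearest centroid in a learned codebook. A 768-dimensional float32 vector (3 KB) can be compressed to 96 bytes, a 32× reduction, for example by using 96 sub-vectors of 8 dimensions with a one-byte code each.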

Often combined with IVF as **IVF-PQ** for billion-scale retrieval where memory is the primary constraint.

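**Trade-off:** distances computed on PQ codes are approximations, so recall is noticeably lower than with full-precision HNSW unless the top candidates are re-ranked against the original vectors. Use PQ only when memory constraints make uncompressed indexing infeasible.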
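
The compression arithmetic and the encoding step can be sketched as follows. This is an illustration under assumptions: 96 sub-vectors with 8-bit codes is one configuration that yields 96 bytes, and the toy codebook is written by hand rather than learned with k-means. The names `pq_sizes` and `pq_encode` are hypothetical.

```python
import math

def pq_sizes(dim, n_subvectors, bits_per_code=8, bytes_per_float=4):
    """Bytes per vector before and after PQ, plus the compression ratio."""
    raw = dim * bytes_per_float
    compressed = n_subvectors * bits_per_code // 8
    return raw, compressed, raw / compressed

def pq_encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest codebook entry."""
    sub_dim = len(vector) // len(codebooks)
    codes = []
    for s, book in enumerate(codebooks):
        sub = vector[s * sub_dim:(s + 1) * sub_dim]
        codes.append(min(range(len(book)), key=lambda c: math.dist(book[c], sub)))
    return codes

sizes = pq_sizes(768, 96)

# Toy codebook shared by both sub-vectors (hand-written, not trained).
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
codes = pq_encode((0.9, 0.1, 0.2, 0.8), [corners, corners])
print(sizes, codes)
```

The 4-dimensional toy vector is stored as two small integers. Decoding would return `(1.0, 0.0, 0.0, 1.0)` rather than the original values, which is the source of the accuracy loss.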

---

## Index types by use case

| Scenario | Recommended index | Why |
|---|---|---|
| General production search | HNSW | Best recall-latency balance, incremental inserts |
| Real-time inserts | HNSW | Supports incremental updates |
| 100M–1B vectors | IVF-HNSW or IVF-PQ | HNSW memory too high at this scale |
| Memory-constrained | IVF-PQ | 32× compression |
| Development / small corpus | Flat (exact) | No approximation needed under ~100K vectors |

---

## How vector indexes are managed in SIE pipelines

SIE produces the vectors; your vector database manages the index. When setting up a new collection:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, HnswConfigDiff, VectorParams

qdrant = QdrantClient(url="http://localhost:6333")

# Create a collection with an explicit HNSW configuration
qdrant.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=1024,                  # BGE-M3 dense output dimension
        distance=Distance.COSINE,
        hnsw_config=HnswConfigDiff(
            m=16,                   # edges per node
            ef_construct=128,       # build-time beam width
        ),
    ),
)
```

At query time, set `ef` (the search beam width, exposed in Qdrant as the `hnsw_ef` search parameter) to balance recall against latency for your specific requirements.

---

## Frequently asked questions

**Do I need to rebuild the index when adding new vectors?**
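With HNSW, no: new vectors are inserted into the graph incrementally. With IVF, new vectors can be appended to the inverted list of their nearest existing centroid, but the centroids are fixed at training time, so recall degrades as the data distribution drifts and the index eventually has to be retrained and rebuilt. Some systems instead hold new vectors in a small flat fallback index until the next rebuild. This is one reason HNSW dominates production systems with frequent updates.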

**What happens to index performance as the corpus grows?**
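HNSW query time grows slowly, roughly logarithmically, as corpus size increases. IVF scans `nprobe` lists of about n / `nlist` vectors each, so with fixed parameters its query time grows linearly with the corpus. Keeping latency flat means increasing `nlist` as the corpus grows, and `nprobe` may then need to rise as well to hold recall.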

**How do I choose between cosine similarity and dot product distance?**
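Cosine similarity equals the dot product when vectors are L2-normalised, so the two produce identical rankings. SIE normalises output vectors by default, so either metric works. If vector magnitudes may vary (for example, when mixing in vectors from another source), cosine is the safer choice because it ignores magnitude.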
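
The equivalence is easy to check numerically. The sketch below uses two made-up vectors and hypothetical helper names.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalise(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = (3.0, 4.0), (4.0, 3.0)
cos_raw = cosine(a, b)                            # cosine on the raw vectors
dot_unit = dot(l2_normalise(a), l2_normalise(b))  # dot product after normalising
dot_raw = dot(a, b)                               # dot product on the raw vectors
print(cos_raw, dot_unit, dot_raw)
```

The first two values agree up to floating-point rounding, while the raw dot product also reflects the vectors' lengths, which is why the two metrics diverge on unnormalised vectors.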

---

## Related resources

- [What is approximate nearest neighbour search?](/glossary/what-is-approximate-nearest-neighbour-search)
- [What is a vector database?](/glossary/what-is-a-vector-database)
- [SIE + Qdrant integration](/docs/integrations/qdrant)
- [What is semantic search?](/glossary/what-is-semantic-search)
- [Browse embedding models on SIE](/models)
