---
title: Multi-Vector Reranking
description: ColBERT MaxSim scoring for multi-vector reranking.
canonical_url: https://superlinked.com/docs/score/multivector
last_updated: 2026-05-20
---

ColBERT-style models rerank candidates with MaxSim scoring, which compares pre-computed multi-vector (per-token) embeddings instead of running a cross-encoder forward pass for every query-document pair.

## MaxSim Scoring

Source: [packages/sie_sdk/src/sie_sdk/scoring.py](https://github.com/superlinked/sie/blob/main/packages/sie_sdk/src/sie_sdk/scoring.py)

For each query token embedding, MaxSim takes the maximum similarity over all document token embeddings, then sums these maxima across query tokens. This gives a fine-grained relevance score without requiring a cross-encoder forward pass per document.
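
To make the definition concrete, here is a minimal NumPy sketch of the computation. It is an illustration, not the SDK's implementation: the function name `maxsim_score` is hypothetical, and it assumes dot-product similarity (which equals cosine similarity when token embeddings are L2-normalized). The `maxsim` helper in `sie_sdk.scoring` may differ in details.

```python
import numpy as np


def maxsim_score(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """MaxSim between one query and one document (illustrative sketch).

    query_tokens: array of shape (num_query_tokens, dim)
    doc_tokens:   array of shape (num_doc_tokens, dim)
    """
    # Pairwise dot products: sim[i, j] = <query token i, document token j>
    sim = query_tokens @ doc_tokens.T
    # Best-matching document token per query token, summed over query tokens
    return float(sim.max(axis=1).sum())


# Toy 2-dimensional token embeddings (made up for illustration)
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
doc_b = np.array([[0.5, 0.5]])

print(maxsim_score(query, doc_a))  # 2.0: each query token has an exact match
print(maxsim_score(query, doc_b))  # 1.0: each query token only partially matches
```

Because each query token picks its own best document token, a document scores highly only when it covers all query tokens well, which is where the token-level granularity comes from.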

#### Python

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item
from sie_sdk.scoring import maxsim

client = SIEClient("http://localhost:8080")

# Encode query and documents with multivector output
query_result = client.encode(
    "jinaai/jina-colbert-v2",
    Item(text="What is ColBERT?"),
    output_types=["multivector"],
    is_query=True,
)

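# Example documents to rerank (replace with your own candidates)
documents = [
    Item(text="ColBERT is a late-interaction retrieval model."),
    Item(text="Paris is the capital of France."),
]

doc_results = client.encode(
    "jinaai/jina-colbert-v2",
    documents,
    output_types=["multivector"],
)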

# Score with MaxSim
query_mv = query_result["multivector"]
doc_mvs = [r["multivector"] for r in doc_results]
scores = maxsim(query_mv, doc_mvs)

# Rank by score: (document index, score) pairs, highest score first
ranked = sorted(enumerate(scores), key=lambda x: -x[1])
```

#### TypeScript

```typescript
import { SIEClient, maxsim } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");

// Encode query and documents with multivector output
const queryResult = await client.encode(
  "jinaai/jina-colbert-v2",
  { text: "What is ColBERT?" },
  { outputTypes: ["multivector"], isQuery: true }
);

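// Example documents to rerank (replace with your own candidates)
const documents = [
  { text: "ColBERT is a late-interaction retrieval model." },
  { text: "Paris is the capital of France." },
];

const docResults = await client.encode(
  "jinaai/jina-colbert-v2",
  documents,
  { outputTypes: ["multivector"] }
);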

// Score with MaxSim using SDK helper
const queryMv = queryResult.multivector!;
const scores = docResults.map((r) => maxsim(queryMv, r.multivector!));

// Rank by score
const ranked = scores
  .map((score, idx) => ({ idx, score }))
  .sort((a, b) => b.score - a.score);

await client.close();
```

## When to Use MaxSim vs Cross-Encoders

| Factor | MaxSim (ColBERT) | Cross-Encoder |
|--------|-----------------|---------------|
| **Speed** | Fast - reuses cached embeddings | Slower - forward pass per pair |
| **Pre-computation** | Document embeddings can be stored | Nothing to store - recomputed for each query-document pair |
| **Quality** | Strong token-level matching | Deeper cross-attention |
| **Best for** | Large candidate sets, real-time | Small candidate sets, max quality |

Use MaxSim when you already have multi-vector embeddings stored (e.g., from indexing with ColBERT). Use cross-encoders when you need the highest possible quality on a small candidate set.
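
The reuse of stored embeddings can be sketched as follows. This is a hedged illustration under stated assumptions: `doc_store` is a hypothetical in-memory dictionary standing in for whatever storage holds your document token matrices, `rerank_stored` is a made-up helper name, and similarity is assumed to be a dot product. Only the query needs to be encoded at request time; the document side is read from storage.

```python
import numpy as np

# Hypothetical store: document id -> (num_tokens, dim) token matrix,
# computed once at indexing time and reused for every query.
doc_store = {
    "doc-a": np.array([[1.0, 0.0]]),
    "doc-b": np.array([[0.0, 1.0]]),
    "doc-c": np.array([[0.5, 0.5]]),
}


def rerank_stored(query_tokens, candidate_ids, store):
    """Rank candidates by MaxSim using only stored document embeddings."""
    scored = []
    for doc_id in candidate_ids:
        sim = query_tokens @ store[doc_id].T  # (query tokens, doc tokens)
        scored.append((doc_id, float(sim.max(axis=1).sum())))
    # Highest score first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two different queries score against the same stored matrices;
# no document is re-encoded between them.
query_1 = np.array([[1.0, 0.0]])
query_2 = np.array([[0.0, 1.0]])

ranking_1 = rerank_stored(query_1, list(doc_store), doc_store)
ranking_2 = rerank_stored(query_2, list(doc_store), doc_store)
```

A cross-encoder has no equivalent of `doc_store`: it has to process each query-document pair jointly, so per-query cost grows with the number of candidates.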

See [Multi-vector embeddings](/docs/encode/multivector/) for details on encoding with ColBERT models.

## What's Next

- [Reranker models](/docs/score/models/) - model selection guide
- [Multi-vector embeddings](/docs/encode/multivector/) - encoding with ColBERT models
- [Full model catalog](/models#task=score) - all supported models
