---
title: Fastembed → SIE
description: Move from in-process Fastembed (ONNX) to out-of-process SIE serving. Same checkpoint, GPU-shared, multi-model.
canonical_url: https://superlinked.com/docs/migrate/fastembed
last_updated: 2026-05-07
---

[Fastembed](https://github.com/qdrant/fastembed) is the
Qdrant-maintained library that runs ONNX embedding models in-process
via onnxruntime. SIE serves the same models out-of-process over HTTP.

## Why migrate

- **Out-of-process serving.** Every Python worker that imports
  Fastembed gets its own copy of the model in RAM. SIE loads the
  weights once per serving pod, regardless of how many app processes
  connect.
- **Shared GPU.** Fastembed is CPU-only by default (GPU support
  requires a separate ONNX runtime build). SIE serves on CPU, MPS, or
  CUDA without changing client code.
- **Multi-model.** SIE can serve dense, sparse, ColBERT, rerankers, and
  vision models from one cluster. Fastembed does not cover sparse or
  cross-encoder rerankers natively.

## What stays the same

- Model checkpoint (e.g. `BAAI/bge-small-en-v1.5`).
- Vector dimension.
- Cosine semantics. Embeddings from Fastembed (ONNX) and SIE
  (PyTorch) agree to within ~1e-3 in cosine distance (cosine
  similarity ≥ ~0.999), so existing indexes do not need re-embedding.

## Before

```python
from fastembed import TextEmbedding

encoder = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
[vec] = list(encoder.embed(["The mitochondrion is the powerhouse of the cell."]))
```

## After

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
result = client.encode(
    "BAAI/bge-small-en-v1.5",
    Item(text="The mitochondrion is the powerhouse of the cell."),
)
vec = result["dense"]  # np.ndarray, shape [384]
```
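A migration pattern that keeps the rest of the codebase untouched is to route all embedding through one function and swap only the backend callable behind it. A minimal sketch (the `embed_texts` name and `encode_fn` parameter are illustrative, not part of either API):

```python
import numpy as np

def embed_texts(encode_fn, texts):
    """Stack per-text vectors from any backend into an [n, dim] float32 matrix.

    encode_fn maps one string to a 1-D vector, e.g.:
      Fastembed: lambda t: next(encoder.embed([t]))
      SIE:       lambda t: client.encode("BAAI/bge-small-en-v1.5", Item(text=t))["dense"]
    """
    return np.stack([np.asarray(encode_fn(t), dtype=np.float32) for t in texts])

# Demo with a toy backend, just to show the shape contract:
demo = embed_texts(lambda t: [float(len(t)), 0.0], ["ab", "abcd"])
print(demo.shape)  # (2, 2)
```

Swapping backends then means changing one lambda at the call site rather than touching every caller.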

## Re-embed required?

**No** if you keep the same checkpoint. **Yes** if you take the
migration as a chance to upgrade to a stronger model.

## Run it yourself

`sentence-transformers/all-MiniLM-L6-v2` is the common-denominator
small model that both Fastembed and SIE ship by default.

```bash
mise run serve -- -m sentence-transformers/all-MiniLM-L6-v2
uv add fastembed
```

Run the 'before' and 'after' snippets from this page. Expected:
identical dimension (384) and cosine similarity at or above 0.999.
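To check the parity claim numerically, compare the two vectors with a plain cosine helper (pure NumPy; plug in the outputs of the 'before' and 'after' snippets):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# With the vectors from the snippets above:
#   cosine(vec_fastembed, vec_sie)  -> expect >= 0.999
print(cosine([1.0, 0.0], [1.0, 1e-3]))  # near-identical vectors score ~1.0
```

Anything below 0.999 usually points at a pooling or normalization mismatch, not a model difference.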

### Using `BAAI/bge-small-en-v1.5` (or any other model)

Most Fastembed users actually run `BAAI/bge-small-en-v1.5`. SIE doesn't
ship that bundle by default, so add one. Migration mechanics are
otherwise identical. Drop a YAML file at
`packages/sie_server/models/BAAI__bge-small-en-v1.5.yaml`:

```yaml
sie_id: BAAI/bge-small-en-v1.5
hf_id: BAAI/bge-small-en-v1.5
inputs:
  text: true
  image: false
  audio: false
  video: false
tasks:
  encode:
    dense:
      dim: 384
    sparse: null
    multivector: null
  score: null
  extract: null
max_sequence_length: 512
profiles:
  default:
    max_batch_tokens: 16384
    compute_precision: null
    adapter_path: sie_server.adapters.sentence_transformer:SentenceTransformerDenseAdapter
    adapter_options:
      loadtime:
        trust_remote_code: false
      runtime:
        pooling: cls
        normalize: true
```

Then `mise run serve -- -m BAAI/bge-small-en-v1.5`. The
`sentence-transformers__all-MiniLM-L6-v2.yaml` bundle is the closest
working reference; copy it and adjust `dim`, `max_sequence_length`,
and `pooling`.
