---
title: Migrate to SIE
description: Switch from a managed embedding API or single-model server to a self-hosted SIE cluster. Working before/after code per provider.
canonical_url: https://superlinked.com/docs/migrate
last_updated: 2026-05-08
---

If you're already running embedding inference somewhere else, these
guides give you the shortest path to running it on SIE. Each guide
includes working before/after code for its provider, plus a mapping
from every provider concept to its SIE counterpart.

Pick your starting point:

<CardGrid>
  <LinkCard title="OpenAI" description="Drop-in via OpenAI-compatible endpoint, or the native SIE SDK. Eliminate per-token cost and rate limits." href="/docs/migrate/openai/" />
  <LinkCard title="Cohere" description="Replace embed-v3 and rerank-v3.5 with catalog embedding models and bge-reranker-v2-m3. Self-hosted, no rate limits." href="/docs/migrate/cohere/" />
  <LinkCard title="TEI (HuggingFace)" description="Replace N single-model TEI containers with one SIE cluster. Typed sparse and multivector outputs in one call." href="/docs/migrate/tei/" />
  <LinkCard title="Infinity" description="Same OpenAI-compatible API, multi-model in one process, with managed deployment tooling." href="/docs/migrate/infinity/" />
  <LinkCard title="Fastembed" description="Move from in-process ONNX to out-of-process serving. Share GPU memory across app processes." href="/docs/migrate/fastembed/" />
  <LinkCard title="Modal" description="Consolidate N Modal @app.function endpoints into one SIE deployment. Flat cost, no cold starts." href="/docs/migrate/modal/" />
</CardGrid>

## How to verify a migration

Each guide ships before/after code on the page. Run both legs on a
small corpus from your own domain, print the embeddings, and check
they look sane. That is a sanity check, not sign-off.
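
A minimal sketch of that first pass, assuming your current leg speaks
the OpenAI embeddings API and your SIE deployment exposes its
OpenAI-compatible endpoint; the base URL, port, and model names are
placeholders, not values from any specific guide:

```python
# Sanity check: embed the same texts against both legs and eyeball the output.
# Base URL, port, and model names are placeholders; take the real values from
# your own deployment and from the provider guide you are following.
from openai import OpenAI

texts = [
    "return policy for damaged items",
    "how do I reset my password",
    "invoice shows the wrong billing address",
]

# Leg A: the system you are migrating from (here, hosted OpenAI).
old_client = OpenAI()  # reads OPENAI_API_KEY from the environment
old = old_client.embeddings.create(model="text-embedding-3-small", input=texts)

# Leg B: SIE via its OpenAI-compatible endpoint.
sie_client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
new = sie_client.embeddings.create(model="intfloat/e5-base-v2", input=texts)

for text, o, n in zip(texts, old.data, new.data):
    print(text)
    print("  old:", len(o.embedding), "dims", o.embedding[:4])
    print("  new:", len(n.embedding), "dims", n.embedding[:4])
```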

For sign-off, run your own retrieval eval against both legs (both
checks are sketched after the list):

- **Same checkpoint** (Fastembed, TEI, Infinity, Modal-with-same-model).
  Cosine similarity between the two legs should sit at 0.999 or higher
  for every item. If it doesn't, the config differs (pooling,
  normalization, dtype); the guide's caveats section calls out where.
- **Different model** (OpenAI → E5, Cohere → Stella or E5). Absolute
  cosine values carry no signal across embedding spaces. Run recall@k
  on a labeled set you trust, or on a BEIR/MTEB slice that resembles
  your domain.
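
Both checks fit in a few lines of NumPy. The sketch below assumes you
already hold both legs' vectors in memory (for example, from the
snippet above) and, for the different-model case, a labeled set
mapping each query to the indices of its relevant documents; embed
queries and documents once per leg and compare the recall@k numbers.

```python
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Same checkpoint: every item should clear 0.999.
def same_checkpoint_ok(old_vecs, new_vecs, threshold=0.999):
    sims = [cosine(o, n) for o, n in zip(old_vecs, new_vecs)]
    print("min cosine:", min(sims))
    return min(sims) >= threshold

# Different model: mean recall@k, given the relevant doc indices per query.
def recall_at_k(query_vecs, doc_vecs, relevant_ids, k=10):
    docs = np.asarray(doc_vecs, dtype=float)
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)
    recalls = []
    for q_vec, rel in zip(query_vecs, relevant_ids):
        q = np.asarray(q_vec, dtype=float)
        q /= np.linalg.norm(q)
        top_k = set(np.argsort(docs @ q)[::-1][:k].tolist())
        recalls.append(len(top_k & set(rel)) / len(rel))
    return float(np.mean(recalls))
```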

## Choosing a target model

| Source model                          | Closest SIE model                              | Re-embed needed |
|---------------------------------------|------------------------------------------------|-----------------|
| `text-embedding-3-small`              | `intfloat/e5-base-v2`                          | yes      |
| `text-embedding-3-large`              | `Alibaba-NLP/gte-Qwen2-1.5B-instruct`          | yes      |
| `embed-english-v3.0` (Cohere)         | `NovaSearch/stella_en_400M_v5`                 | yes      |
| `rerank-v3.0` (Cohere)                | `BAAI/bge-reranker-v2-m3`                      | n/a      |
| TEI / Infinity / Fastembed `bge-*`    | same checkpoint on SIE                         | no       |
| sentence-transformers on Modal        | same checkpoint on SIE                         | no       |

Browse the [full model catalog](/docs/models/) for everything SIE
serves out of the box.
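
For the rows marked yes, the old vectors live in a different space, so
the corpus has to be regenerated with the target model before query
traffic switches over. A rough batched sketch, again assuming the
OpenAI-compatible endpoint; the URL, model name, and batch size are
placeholders for whatever your pipeline actually uses:

```python
from openai import OpenAI

# Placeholder endpoint and model; substitute your own deployment's values.
sie = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

def reembed(docs, model="intfloat/e5-base-v2", batch_size=64):
    """Re-embed a corpus in batches, returning vectors in input order."""
    vectors = []
    for i in range(0, len(docs), batch_size):
        batch = docs[i : i + batch_size]
        resp = sie.embeddings.create(model=model, input=batch)
        vectors.extend(item.embedding for item in resp.data)
    return vectors

# Upsert the returned vectors into your vector store, then cut queries over
# to the new model so documents and queries share one embedding space.
```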

## Need a guide that isn't here?

Open an issue at
[`superlinked/sie`](https://github.com/superlinked/sie/issues),
or send a PR adding a new page to the `migrate/` directory in
[`superlinked/sie-web`](https://github.com/superlinked/sie-web).
