---
title: OpenAI → SIE
description: Migrate from OpenAI Embeddings to a self-hosted SIE cluster. Drop-in via the OpenAI-compatible endpoint, or use the native SIE SDK.
canonical_url: https://superlinked.com/docs/migrate/openai
last_updated: 2026-05-08
---

OpenAI Embeddings is a paid managed API. SIE is a self-hosted inference
engine. There are two migration paths:

1. **Drop-in shim.** Point the OpenAI SDK at SIE's `/v1/embeddings`
   endpoint. Two-line change, every other call site untouched.
2. **Native SDK.** Use `sie_sdk.SIEClient` directly to access sparse,
   multivector, ColBERT, rerankers, and extraction.

## Why migrate

- **Cost crosses over at moderate volume.** OpenAI bills per token;
  SIE has flat hourly cost. The right answer depends on your workload;
  see the [worked example](#worked-cost-example) below. Plug in your
  own numbers before quoting either side.
- **Data residency.** Embeddings of your text never leave your network.
- **No rate limits.** Your ceiling is whatever GPU capacity you
  provision. You also own the SLA: OpenAI's published 99.9% becomes
  whatever your platform team operates.
- **Model breadth.** OpenAI ships 3 embedding models. SIE serves 100+
  out of the box (109 bundle configs at the time of writing) across dense,
  sparse, ColBERT/multivector, vision, and rerankers.

## TL;DR

```diff
- client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
+ client = OpenAI(api_key="not-needed", base_url="http://sie:8080/v1")
```

…and `model="text-embedding-3-small"` becomes `model="intfloat/e5-base-v2"` (or
whichever SIE model you pick).

## Before

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["The mitochondrion is the powerhouse of the cell."],
)
vector = resp.data[0].embedding  # 1536-dim
```

## After

### Drop-in (OpenAI SDK + SIE)

```python
from openai import OpenAI

client = OpenAI(api_key="not-needed", base_url="http://localhost:8080/v1")
resp = client.embeddings.create(
    model="intfloat/e5-base-v2",
    input=["The mitochondrion is the powerhouse of the cell."],
)
vector = resp.data[0].embedding  # 768-dim
```

### Native SIE SDK

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
result = client.encode(
    "intfloat/e5-base-v2",
    Item(text="The mitochondrion is the powerhouse of the cell."),
)
vector = result["dense"].tolist()
```
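One behavioral difference worth flagging: E5-family models are trained with `query: ` and `passage: ` text prefixes, which OpenAI models don't use, so check the model card for whichever replacement you pick. Once both sides hand back vectors, similarity is plain client-side math. A minimal sketch — the commented `client.encode` calls assume the SIE instance from above:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Against a live SIE instance you would compare, e.g.:
#   q = client.encode("intfloat/e5-base-v2", Item(text="query: cell energy"))["dense"]
#   p = client.encode("intfloat/e5-base-v2",
#                     Item(text="passage: The mitochondrion is the powerhouse."))["dense"]
#   print(cosine(q.tolist(), p.tolist()))

print(cosine([1.0, 0.0], [0.6, 0.8]))  # 0.6
```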

## Mapping

| OpenAI                                | SIE equivalent                                                   |
|---------------------------------------|------------------------------------------------------------------|
| `text-embedding-3-small` (1536)       | `intfloat/e5-base-v2` (768)                                      |
| `text-embedding-3-large` (3072)       | `Alibaba-NLP/gte-Qwen2-1.5B-instruct` or `NovaSearch/stella_en_1.5B_v5` |
| `dimensions=N` truncation             | Slice client-side, or pick a smaller-dim model                   |
| `encoding_format="base64"`            | Supported on `/v1/embeddings` with the same field                |
| `user="..."` for abuse tracking       | Use SIE telemetry / your own tracing                             |
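On the `dimensions=N` row: OpenAI's `text-embedding-3` models are Matryoshka-trained, so truncate-then-renormalize is well-founded there, while slicing an arbitrary open model degrades quality unpredictably. If you do slice, a client-side sketch of what `dimensions=N` does:

```python
import math

def truncate_and_renormalize(vec: list[float], dims: int) -> list[float]:
    """Mimic OpenAI's `dimensions=N`: keep the first N components,
    then rescale back to unit L2 norm. Only well-founded for models
    trained with a Matryoshka-style objective."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

print(truncate_and_renormalize([3.0, 4.0, 12.0], 2))  # [0.6, 0.8] -- unit norm again
```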

## Worked cost example

Public list prices, no negotiated discounts, single region:

| Workload          | OpenAI (`text-embedding-3-small` @ $0.02 / 1M tokens) | SIE on one `g5.xlarge` (1× A10G, ~$1.00/hr on-demand) |
|-------------------|-------------------------------------------------------|--------------------------------------------------------|
| 10M tokens / day  | ~$0.20 / day · ~$6 / month                            | ~$24 / day · ~$730 / month                             |
| 100M tokens / day | ~$2 / day · ~$60 / month                              | ~$24 / day · ~$730 / month                             |
| 1B tokens / day   | ~$20 / day · ~$600 / month                            | ~$24 / day · ~$730 / month                             |
| 5B tokens / day   | ~$100 / day · ~$3,000 / month                         | likely needs multiple GPUs; still ~$730–$2,200 / month |

Crossover sits around 1.2B tokens / day on this size class. Below that, OpenAI is cheaper *and* you don't run a
GPU. Above that, SIE wins on cost and the gap widens linearly. **None
of this counts the engineering cost of operating a GPU pool**, which
is the actually-load-bearing variable for most teams. Plug in your
own utilization, GPU size, and reserved-instance discount before
quoting a number to your manager.
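The crossover falls straight out of the two pricing models. A sketch you can rerun with your own numbers — the defaults are the table's assumptions, not quotes:

```python
def openai_monthly(tokens_per_day: float, usd_per_million: float = 0.02) -> float:
    """Per-token pricing, ~30 billing days per month."""
    return tokens_per_day / 1e6 * usd_per_million * 30

def sie_monthly(gpus: int = 1, usd_per_gpu_hour: float = 1.00) -> float:
    """Flat hourly pricing, ~730 hours per month per always-on GPU."""
    return gpus * usd_per_gpu_hour * 730

def crossover_tokens_per_day(usd_per_gpu_hour: float = 1.00,
                             usd_per_million: float = 0.02) -> float:
    """Daily volume at which one always-on GPU matches OpenAI spend."""
    return usd_per_gpu_hour * 24 / usd_per_million * 1e6

print(crossover_tokens_per_day())  # 1.2e9 tokens/day, as in the table
```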

## Re-embed required?

**Yes.** Different model → different vector space. Even
`text-embedding-3-small` truncated to 1024 dims is not interchangeable
with any open-source model. Plan a re-embed window before cutting over.
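The re-embed itself is an offline batch job: read rows, embed in chunks, write back. The chunking is the only moving part worth showing; batch size, client wiring, and `store` below are placeholders to adapt:

```python
from typing import Iterator

def batches(items: list[str], size: int = 128) -> Iterator[list[str]]:
    """Chunk a corpus so each /v1/embeddings call carries one batch;
    tune `size` to your GPU memory and average document length."""
    for i in range(0, len(items), size):
        yield items[i : i + size]

# Hypothetical re-embed loop against the drop-in client from above:
#   for chunk in batches(corpus_texts):
#       resp = client.embeddings.create(model="intfloat/e5-base-v2", input=chunk)
#       store(chunk, [d.embedding for d in resp.data])

print([len(b) for b in batches(["x"] * 10, size=3)])  # [3, 3, 3, 1]
```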

## Run it yourself

```bash
# Install the OpenAI SDK and set the key for the 'before' script.
uv add openai
export OPENAI_API_KEY=sk-...

# Start SIE with E5 loaded.
mise run serve -- -m intfloat/e5-base-v2

# Run the OpenAI 'before' script and the SIE 'after' script
# from this page. Compare the printed embeddings.
```

Cosine similarity across different embedding spaces carries no signal
(here the two vectors don't even share a dimensionality), so don't
expect 1.0. For sign-off, run your retrieval eval against both legs.
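A minimal version of that eval is recall@k over a labeled query set, computed once per leg. A sketch — your harness, `search_*` functions, and labels are placeholders:

```python
def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 10) -> float:
    """Fraction of the relevant docs that appear in the top-k ranking."""
    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)

# Run the same labeled queries through both legs and compare the means:
#   old = mean(recall_at_k(search_openai(q), labels[q]) for q in queries)
#   new = mean(recall_at_k(search_sie(q), labels[q]) for q in queries)

print(recall_at_k(["d1", "d3", "d7"], {"d3", "d9"}, k=2))  # 0.5
```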
