---
title: Quickstart
description: First embeddings in 2 minutes.
canonical_url: https://superlinked.com/docs/quickstart
last_updated: 2026-05-22
---

## Start the Server

Source: [packages/sie_server/src/sie_server/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_server/src/sie_server/cli.py)

SIE's primary target is x86 Linux nodes with NVIDIA GPUs. The CPU image lets you try everything locally; for production deployments with autoscaling and multi-GPU support, see [Deployment](/docs/deployment/).

#### macOS (Apple Silicon)

```bash
docker run --platform linux/amd64 -p 8080:8080 \
  -v sie-hf-cache:/app/.cache/huggingface \
  ghcr.io/superlinked/sie-server:latest-cpu-default
```

#### Linux (CPU)

```bash
docker run -p 8080:8080 \
  -v sie-hf-cache:/app/.cache/huggingface \
  ghcr.io/superlinked/sie-server:latest-cpu-default
```

#### Linux (NVIDIA GPU)

```bash
docker run --gpus all -p 8080:8080 \
  -v sie-hf-cache:/app/.cache/huggingface \
  ghcr.io/superlinked/sie-server:latest-cuda12-default
```

The server listens on port 8080 with every model available. Models load lazily on first request, so the first call to a given model takes longer than later ones.
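
A script that launches the container and sends requests right away may find the port not yet accepting connections. The sketch below polls a TCP port using only the Python standard library. `wait_for_port` is a hypothetical helper written for this page, not part of the SIE SDK, and it only confirms that something is listening on the port, not that any model has loaded.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 8080)` before creating a client matches the `-p 8080:8080` mapping used in the commands above.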

> **Caution — Apple Silicon is emulated:**
>
> The `--platform linux/amd64` flag runs the image under QEMU/Rosetta emulation, which is significantly slower than native execution. This is sufficient for trying SIE; for real workloads, run on x86 Linux with CUDA. Native arm64 support is tracked in [superlinked/sie#177](https://github.com/superlinked/sie/issues/177).

> **Note — Kubernetes clusters:**
>
> When connecting to a Kubernetes cluster with scale-to-zero, first requests may return `202 Accepted` while workers provision, and cold starts can take 5-7 minutes. Set `wait_for_capacity=True` (Python) or `waitForCapacity: true` (TypeScript) in the SDK to wait for capacity. See [Scale-from-Zero](/docs/deployment/autoscaling/).
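
To illustrate what waiting for capacity involves, the sketch below repeats a request until it stops returning `202 Accepted`. This is not the SDK's implementation: `retry_until_ready` and the `send` callable are hypothetical names, and `send` stands in for whatever issues the HTTP request and returns a `(status, body)` pair. The default timeout of 600 seconds is an assumption chosen to cover the 5-7 minute cold starts mentioned above.

```python
import time
from typing import Any, Callable, Tuple


def retry_until_ready(
    send: Callable[[], Tuple[int, Any]],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it returns a status other than 202, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status, body = send()
        if status != 202:
            # Any other status (success or error) is handed back to the caller.
            return status, body
        if time.monotonic() >= deadline:
            raise TimeoutError("workers were still provisioning when the timeout expired")
        sleep(interval)
```

In most cases the SDK flag is the simpler choice; a loop like this is only relevant when calling the server without the SDK.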

## Install the SDK

#### Python

```bash
pip install sie-sdk
```

#### TypeScript

```bash
pnpm add @superlinked/sie-sdk
```

## Generate Embeddings

Source: [packages/sie_sdk/src/sie_sdk/client/sync.py](https://github.com/superlinked/sie/blob/main/packages/sie_sdk/src/sie_sdk/client/sync.py)

#### Python

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Single item
result = client.encode("BAAI/bge-m3", Item(text="Hello world"))
print(result["dense"].shape)  # (1024,)

# Batch
results = client.encode("BAAI/bge-m3", [
    Item(text="First document"),
    Item(text="Second document"),
])
print(len(results))  # 2
```
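
A common next step is comparing two embeddings. The sketch below computes cosine similarity in plain Python. `cosine_similarity` is a helper written for this example, not an SDK function, and the short vectors are made-up stand-ins for the 1024-dimensional `result["dense"]` arrays.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors standing in for real embeddings.
doc_a = [1.0, 0.0, 0.0]
doc_b = [1.0, 1.0, 0.0]
print(round(cosine_similarity(doc_a, doc_b), 3))  # 0.707
```

With the batch above, `cosine_similarity(results[0]["dense"], results[1]["dense"])` would compare the two documents, assuming each batch entry exposes `"dense"` the same way the single-item result does.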

#### TypeScript

```typescript
import { SIEClient } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");

// Single item
const result = await client.encode("BAAI/bge-m3", { text: "Hello world" });
console.log(result.dense?.length);  // 1024

// Batch
const results = await client.encode("BAAI/bge-m3", [
  { text: "First document" },
  { text: "Second document" },
]);
console.log(results.length);  // 2
```

## Rerank Search Results

Source: [packages/sie_sdk/src/sie_sdk/client/sync.py](https://github.com/superlinked/sie/blob/main/packages/sie_sdk/src/sie_sdk/client/sync.py)

#### Python

```python
# Reuses `client` and `Item` from the Generate Embeddings example.
query = Item(text="What is machine learning?")
items = [
    Item(text="Machine learning uses algorithms to learn from data."),
    Item(text="The weather is sunny today."),
]

result = client.score("BAAI/bge-reranker-v2-m3", query, items)

for entry in result["scores"]:
    print(f"Rank {entry['rank']}: score={entry['score']:.3f}")
# Rank 0: score=0.998
# Rank 1: score=0.012
```

#### TypeScript

```typescript
// Reuses `client` from the Generate Embeddings example.
const query = { text: "What is machine learning?" };
const items = [
  { text: "Machine learning uses algorithms to learn from data." },
  { text: "The weather is sunny today." },
];

const result = await client.score("BAAI/bge-reranker-v2-m3", query, items);

for (const entry of result.scores) {
  console.log(`Rank ${entry.rank}: score=${entry.score.toFixed(3)}`);
}
// Rank 0: score=0.998
// Rank 1: score=0.012
```

## Extract Entities

Source: [packages/sie_sdk/src/sie_sdk/client/sync.py](https://github.com/superlinked/sie/blob/main/packages/sie_sdk/src/sie_sdk/client/sync.py)

#### Python

```python
# Reuses `client` and `Item` from the Generate Embeddings example.
result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="Tim Cook is the CEO of Apple."),
    labels=["person", "organization"]
)

for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']}")
# person: Tim Cook
# organization: Apple
```
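
When a text contains several entities, grouping them by label is often handy. The sketch below assumes entities are dicts with `label` and `text` keys, as in the loop above. `group_by_label` is a hypothetical helper, and the sample list is hand-written to mirror the output shown, not captured from a server.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_label(entities: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Collect entity texts under their label, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample mirroring the example output above.
sample = [
    {"label": "person", "text": "Tim Cook"},
    {"label": "organization", "text": "Apple"},
]
print(group_by_label(sample))
# {'person': ['Tim Cook'], 'organization': ['Apple']}
```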

#### TypeScript

```typescript
// Reuses `client` from the Generate Embeddings example.
const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Tim Cook is the CEO of Apple." },
  { labels: ["person", "organization"] }
);

for (const entity of result.entities ?? []) {
  console.log(`${entity.label}: ${entity.text}`);
}
// person: Tim Cook
// organization: Apple
```

## What's Next

- [Choosing a Model](/docs/choosing/) - pick the right model for your use case
- [Dense Embeddings](/docs/encode/) - output types, query vs document encoding
- [Model Catalog](/models) - all 85+ supported models
- [Integrations](/docs/integrations/) - LangChain, LlamaIndex, Haystack, and more
- [Deployment](/docs/deployment/) - Docker, Kubernetes, cloud deployment
- [Sparse / Hybrid Search](/docs/encode/sparse/) - combine dense and sparse for better retrieval