---
title: How to generate embeddings with SIE
description: "Dense embeddings are fixed-dimension float vectors that capture semantic meaning. SIE's encode primitive converts text or images into vectors for semantic search, RAG, and recommendation pipelines."
canonical_url: https://superlinked.com/docs/encode
last_updated: 2026-05-18
---

**Dense embeddings are fixed-dimension float vectors that capture semantic meaning.** SIE's `encode` primitive converts text or images into these vectors using any of 85+ supported models. The resulting embeddings power semantic search, RAG retrieval, and recommendation systems.

#### Python

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
result = client.encode("BAAI/bge-m3", Item(text="Your text here"))
print(f"Dimensions: {len(result['dense'])}")  # 1024
```

#### TypeScript

```typescript
import { SIEClient } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");
const result = await client.encode("BAAI/bge-m3", { text: "Your text here" });
console.log(`Dimensions: ${result.dense?.length}`); // 1024

await client.close();
```

Not sure which model to use? See the [Model Selection Guide](https://superlinked.com/docs/choosing/) or the full [model catalog](https://superlinked.com/models#task=encode).

---

## When Should I Use Dense Embeddings?

**Use dense embeddings when:**
- You need semantic similarity matching, not just exact keyword matching
- Your vector database supports ANN search (Qdrant, Weaviate, Chroma, and LanceDB all work with SIE)
- You want a simple, fast retrieval baseline to start with

**Consider a different approach when:**
- You need hybrid search (keyword and semantic combined): see [Sparse and Hybrid Search](https://superlinked.com/docs/encode/sparse/)
- You need maximum retrieval accuracy: see [Multi-vector and ColBERT](https://superlinked.com/docs/encode/multivector/)
- You are searching over images: see [Multimodal Embeddings](https://superlinked.com/docs/encode/multimodal/)

---

## Basic Usage

### Single Item

#### Python

```python
result = client.encode("BAAI/bge-m3", Item(text="Hello world"))
print(result["dense"][:5])  # First 5 dimensions
```

#### TypeScript

```typescript
const result = await client.encode("BAAI/bge-m3", { text: "Hello world" });
console.log(result.dense?.slice(0, 5)); // First 5 dimensions
```

### Batch Encoding

Pass a list of items for efficient GPU-batched processing:

#### Python

```python
items = [
    Item(text="First document"),
    Item(text="Second document"),
    Item(text="Third document"),
]
results = client.encode("BAAI/bge-m3", items)
```

#### TypeScript

```typescript
const results = await client.encode("BAAI/bge-m3", [
  { text: "First document" },
  { text: "Second document" },
  { text: "Third document" },
]);
```

The server batches requests automatically. You do not need to manage batch sizes manually.

### Tracking Items by ID

#### Python

```python
items = [
    Item(id="doc-1", text="First document"),
    Item(id="doc-2", text="Second document"),
]
results = client.encode("BAAI/bge-m3", items)
for result in results:
    print(f"{result['id']}: {len(result['dense'])} dims")
```

#### TypeScript

```typescript
const results = await client.encode("BAAI/bge-m3", [
  { id: "doc-1", text: "First document" },
  { id: "doc-2", text: "Second document" },
]);
for (const result of results) {
  console.log(`${result.id}: ${result.dense?.length} dims`);
}
```
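
Because each result carries the ID you supplied, the output can be turned into a lookup table before it is written to a vector database. The sketch below uses a hand-written `results` list as a stand-in for the return value of `client.encode`; the vectors are shortened, made-up values and `vectors_by_id` is a hypothetical name.

```python
# Stand-in for the list returned by client.encode(...). The vectors are
# shortened, made-up values, not real model output.
results = [
    {"id": "doc-1", "dense": [0.12, -0.03, 0.88]},
    {"id": "doc-2", "dense": [0.45, 0.10, -0.27]},
]

# Map each item ID to its dense vector for later lookup or upsert.
vectors_by_id = {result["id"]: result["dense"] for result in results}

print(sorted(vectors_by_id))        # ['doc-1', 'doc-2']
print(len(vectors_by_id["doc-1"]))  # 3
```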

---

## Should I Encode Queries and Documents Differently?

Yes, for asymmetric models. Queries tend to be short and question-like, while documents are longer passages. Many models are trained to treat the two differently and retrieve better when told which one they are encoding.

#### Python

```python
# Encode a search query
query = client.encode(
    "BAAI/bge-m3",
    Item(text="What is machine learning?"),
    is_query=True,
)

# Encode documents (default, no is_query flag needed)
documents = client.encode(
    "BAAI/bge-m3",
    [Item(text="Machine learning is..."), Item(text="Deep learning uses...")],
)
```

#### TypeScript

```typescript
// Encode a search query
const query = await client.encode(
  "BAAI/bge-m3",
  { text: "What is machine learning?" },
  { isQuery: true },
);

// Encode documents (default, no isQuery flag needed)
const documents = await client.encode(
  "BAAI/bge-m3",
  [{ text: "Machine learning is..." }, { text: "Deep learning uses..." }],
);
```

For instruction-tuned models like `Alibaba-NLP/gte-Qwen2-1.5B-instruct`, pass an explicit instruction string to guide embedding behaviour:

#### Python

```python
result = client.encode(
    "Alibaba-NLP/gte-Qwen2-1.5B-instruct",
    Item(text="What is Python?"),
    instruction="Represent this query for retrieving programming tutorials:"
)
```

#### TypeScript

```typescript
const result = await client.encode(
  "Alibaba-NLP/gte-Qwen2-1.5B-instruct",
  { text: "What is Python?" },
  { instruction: "Represent this query for retrieving programming tutorials:" },
);
```
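
Once a query and its documents are encoded, retrieval comes down to ranking the document vectors by similarity to the query vector. The following is a minimal sketch using cosine similarity in plain Python. The three-dimensional vectors and document IDs are made up for illustration and stand in for the `dense` field of real results. In production a vector database would normally do this ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-ins for result["dense"]; real vectors have hundreds of dimensions.
query_vec = [1.0, 0.0, 0.0]
doc_vecs = {
    "doc-ml": [0.9, 0.1, 0.0],
    "doc-cooking": [0.0, 1.0, 0.0],
    "doc-dl": [0.6, 0.0, 0.8],
}

# Sort document IDs from most to least similar to the query.
ranked = sorted(
    doc_vecs,
    key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
    reverse=True,
)
print(ranked)  # ['doc-ml', 'doc-dl', 'doc-cooking']
```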

---

## What Output Types Are Available?

By default, `encode` returns dense embeddings. Models that support it (such as `BAAI/bge-m3`) can return sparse and multi-vector outputs in a single call:

#### Python

```python
result = client.encode(
    "BAAI/bge-m3",
    Item(text="text"),
    output_types=["dense", "sparse", "multivector"]
)

print(result["dense"])        # 1024-dim float array
print(result["sparse"])       # {"indices": [...], "values": [...]}
print(result["multivector"])  # [num_tokens, 1024] array
```

#### TypeScript

```typescript
const result = await client.encode(
  "BAAI/bge-m3",
  { text: "text" },
  { outputTypes: ["dense", "sparse", "multivector"] },
);

console.log(result.dense);        // Float32Array, 1024 elements
console.log(result.sparse);       // { indices: Int32Array, values: Float32Array }
console.log(result.multivector);  // Float32Array[], [num_tokens][1024]
```

Not all models support every output type. `BAAI/bge-m3` is the main model that supports all three; most models return dense embeddings only.
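
To illustrate how the sparse output is typically consumed, the sketch below scores two sparse vectors with a dot product over their shared indices. It assumes only the `{"indices": [...], "values": [...]}` shape shown in the comments above. The index and weight values are made up, and `sparse_dot` is a hypothetical helper, not part of the SDK.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors given as indices/values pairs."""
    b_weights = dict(zip(b["indices"], b["values"]))
    # Only indices present in both vectors contribute to the score.
    return sum(
        value * b_weights.get(index, 0.0)
        for index, value in zip(a["indices"], a["values"])
    )


# Made-up token indices and weights, shaped like result["sparse"].
query_sparse = {"indices": [3, 17, 42], "values": [0.5, 1.0, 2.0]}
doc_sparse = {"indices": [17, 42, 99], "values": [2.0, 0.5, 4.0]}

score = sparse_dot(query_sparse, doc_sparse)
print(score)  # 3.0
```

Indices 17 and 42 overlap, so the score is `1.0 * 2.0 + 2.0 * 0.5`.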

### Response Fields

| Field | Python type | Description |
| --- | --- | --- |
| `id` | `str or None` | Item ID if provided |
| `dense` | `NDArray[float32]` | Dense embedding vector |
| `sparse` | `SparseResult or None` | Sparse indices and values |
| `multivector` | `NDArray[float32] or None` | Per-token embeddings (ColBERT) |
| `timing` | `TimingInfo` | Request timing breakdown |
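
Per-token `multivector` output is usually scored with ColBERT-style late interaction (MaxSim): each query token is matched to its most similar document token, and those maxima are summed. The sketch below shows the idea in plain Python with made-up two-dimensional token vectors. `maxsim` is a hypothetical helper, and whether SIE or your vector database performs this step for you is not covered here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Made-up token embeddings shaped like result["multivector"]: [num_tokens][dims].
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 0.5]]
doc_b = [[0.5, 0.5]]

print(maxsim(query_tokens, doc_a))  # 1.5
print(maxsim(query_tokens, doc_b))  # 1.0
```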

---

## Good Starting Models

| Model | Dims | Max Length | Notes |
| --- | --- | --- | --- |
| `BAAI/bge-m3` | 1024 | 8192 | Multilingual; supports dense, sparse, multivector |
| `NovaSearch/stella_en_400M_v5` | 1024 | 512 | Best English quality per GB of VRAM |
| `intfloat/e5-base-v2` | 768 | 512 | Solid all-rounder |
| `sentence-transformers/all-MiniLM-L6-v2` | 384 | 256 | Fastest and most lightweight |

See [How do I choose the right model?](https://superlinked.com/docs/choosing/) or the [model catalog](https://superlinked.com/models#task=encode).

---

## HTTP API

The server defaults to msgpack for efficient transport of NumPy arrays. To send and receive plain JSON instead, set the `Content-Type` and `Accept` headers:

```bash
curl -X POST http://localhost:8080/v1/encode/BAAI/bge-m3 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"items": [{"text": "Your text here"}]}'
```
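
The same request can be built from Python without the SDK. This is a minimal sketch using only the standard library. It mirrors the URL, headers, and body of the `curl` call above, and `build_encode_request` is a hypothetical helper name. It constructs the request but does not send it, so it runs without a server.

```python
import json
import urllib.request


def build_encode_request(base_url, model, texts):
    """Build a JSON POST request equivalent to the curl example."""
    payload = {"items": [{"text": text} for text in texts]}
    return urllib.request.Request(
        url=f"{base_url}/v1/encode/{model}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


req = build_encode_request("http://localhost:8080", "BAAI/bge-m3", ["Your text here"])
print(req.get_method(), req.full_url)
# POST http://localhost:8080/v1/encode/BAAI/bge-m3
```

With a server running, `urllib.request.urlopen(req)` sends the request. The shape of the JSON response is not shown on this page, so consult the API reference before parsing it.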

See the full [HTTP API Reference](https://superlinked.com/docs/reference/api/).

---

## Frequently Asked Questions

**What is the difference between dense and sparse embeddings?**
Dense embeddings represent meaning as a fixed-length float vector (for example, 1024 numbers). Sparse embeddings represent text as a weighted set of vocabulary tokens, which is useful for keyword matching. Most use cases start with dense. Add sparse when you need hybrid search. See [Sparse and Hybrid Search](https://superlinked.com/docs/encode/sparse/).

**What embedding dimensions should I use?**
Higher dimensions capture more nuance but use more memory and slow down ANN search. 384-dim models like `sentence-transformers/all-MiniLM-L6-v2` are fast but less precise. 1024-dim models like `BAAI/bge-m3` and `NovaSearch/stella_en_400M_v5` are the standard production choice. 4096-dim models like `NV-Embed-v2` give the best quality at high memory cost. Start at 1024.
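
To make the memory trade-off concrete, the sketch below estimates raw storage for one million vectors at each dimension, assuming 4-byte `float32` values and ignoring index overhead, metadata, and any quantization your vector database applies. `index_size_gb` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4


def index_size_gb(num_vectors, dims):
    """Raw float32 storage in gigabytes (10^9 bytes), excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9


for dims in (384, 1024, 4096):
    print(f"{dims} dims: {index_size_gb(1_000_000, dims):.2f} GB")
# 384 dims: 1.54 GB
# 1024 dims: 4.10 GB
# 4096 dims: 16.38 GB
```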

**Can SIE generate image embeddings?**
Yes. SIE supports multimodal models like `google/siglip-so400m-patch14-384` that embed both text and images into the same vector space. See [Multimodal Embeddings](https://superlinked.com/docs/encode/multimodal/).

**Does SIE integrate with LangChain, LlamaIndex, or Haystack?**
Yes. SIE has first-class integrations with LangChain, LlamaIndex, Haystack, Qdrant, Weaviate, Chroma, LanceDB, and more. See [Integrations](https://superlinked.com/docs/integrations/).