---
title: TypeScript SDK Reference
description: Complete reference for the SIE TypeScript SDK.
canonical_url: https://superlinked.com/docs/reference/typescript-sdk
last_updated: 2026-05-20
---

The TypeScript SDK provides an async client for interacting with the SIE server from Node.js and browser environments.

## Installation

```bash
pnpm add @superlinked/sie-sdk
```

Or with npm:

```bash
npm install @superlinked/sie-sdk
```

## SIEClient

Source: [packages/sie_ts_sdk/src/client.ts](https://github.com/superlinked/sie/blob/main/packages/sie_ts_sdk/src/client.ts)

Async client for the SIE server. All methods return Promises.

### Constructor

```typescript
new SIEClient(
  baseUrl: string,                    // Server URL (e.g., "http://localhost:8080")
  options?: {
    timeout?: number,                 // Request timeout in milliseconds (default: 30000)
    apiKey?: string,                  // API key for authentication
    gpu?: string,                     // Default GPU type for routing
    pool?: PoolSpec,                  // Resource pool configuration
    waitForCapacity?: boolean,        // Auto-retry on 202 (default: false)
    provisionTimeout?: number,        // Max wait for provisioning in ms (default: 300000)
  }
)
```

### Methods

#### encode()

Generate embeddings.

```typescript
async encode(
  model: string,                      // Model name
  items: Item | Item[],               // Items to encode
  options?: {
    outputTypes?: OutputType[],       // ["dense", "sparse", "multivector"]
    instruction?: string,             // Task instruction for instruction-tuned models
    outputDtype?: DType,              // e.g., "float32", "float16", "int8", "binary" (see DType)
    isQuery?: boolean,                // Query vs document encoding
    gpu?: string,                     // GPU routing
    waitForCapacity?: boolean,        // Wait for scale-up
  }
): Promise<EncodeResult | EncodeResult[]>
```

**Returns:** A single `EncodeResult` if a single item is passed; otherwise an array of results.

**Example:**

```typescript
// Single item
const result = await client.encode("BAAI/bge-m3", { text: "Hello" });
console.log(result.dense?.slice(0, 5)); // Float32Array

// Batch
const results = await client.encode("BAAI/bge-m3", [
  { text: "First" },
  { text: "Second" },
]);
```
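
Because the return shape depends on the input shape, downstream code sometimes wants a single code path. The sketch below shows one way to normalize a "one or many" value into an array. `asArray` is a hypothetical helper, not part of the SDK, and the sample objects are stand-ins for real results.

```typescript
// Hypothetical helper (not part of the SDK): normalize a "one or many"
// value into an array so callers can always iterate.
function asArray<T>(value: T | T[]): T[] {
  return Array.isArray(value) ? value : [value];
}

// Stand-in values shaped like encode() results, for illustration only.
const single = { id: "doc-0" };
const batch = [{ id: "doc-0" }, { id: "doc-1" }];

console.log(asArray(single).length); // 1
console.log(asArray(batch).length); // 2
```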

#### score()

Rerank items against a query using a cross-encoder or late interaction model. Returns items sorted by relevance score (highest first).

```typescript
async score(
  model: string,                      // Model name (e.g., "BAAI/bge-reranker-v2-m3")
  query: Item,                        // Query item with text or multivector
  items: Item[],                      // Items to score against query
  options?: {
    topK?: number,                    // Return only top K results
    gpu?: string,
    waitForCapacity?: boolean,
  }
): Promise<ScoreResult>
```

**Example:**

```typescript
const result = await client.score(
  "BAAI/bge-reranker-v2-m3",
  { text: "What is Python?" },
  [{ text: "Python is..." }, { text: "Java is..." }]
);

// Scores are sorted by relevance (rank 0 = most relevant)
for (const entry of result.scores) {
  console.log(`Rank ${entry.rank}: ${entry.score.toFixed(3)}`);
}
```

**Note:** For ColBERT-style models, you can pass pre-computed multivectors to score client-side without a server round-trip. See [Scoring Utilities](#scoring-utilities).

:::note
The `ScoreOptions` type does not currently declare `instruction` (for instruction-tuned cross-encoders) or `options` (e.g., `{ profile: "..." }`). The underlying wire request supports both; pass them via an `as never` or `as ScoreOptions` cast until the SDK types are updated.
:::

#### extract()

Extract entities or structured data from text. Supports Named Entity Recognition (NER) models like GLiNER.

```typescript
async extract(
  model: string,                      // Model name (e.g., "urchade/gliner_multi-v2.1")
  items: Item | Item[],               // Items to extract from
  options: {
    labels: string[],                 // Entity types to extract (e.g., ["person", "org"])
    threshold?: number,               // Minimum confidence (0-1)
    adapterOptions?: Record<string, unknown>,  // Adapter knobs (e.g., { overflow_policy: "truncate_text" })
    gpu?: string,
    waitForCapacity?: boolean,
  }
): Promise<ExtractResult | ExtractResult[]>
```

**Returns:** A single `ExtractResult` if a single item is passed; otherwise an array of results.

:::note
`ExtractOptions` does not currently declare `instruction` (for Donut Document-QA prompts) or `options` (for `{ task: "..." }` modes on PaddleOCR/Florence-2). The wire request supports both; pass them via a cast until the SDK types are updated. See `extract/vision.mdx` for examples.
:::

**Example:**

```typescript
const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Tim Cook leads Apple." },
  { labels: ["person", "organization"] }
);

for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text} (score: ${entity.score.toFixed(2)})`);
}
// Output:
// person: Tim Cook (score: 0.95)
// organization: Apple (score: 0.92)
```

#### listModels()

Get available models.

```typescript
async listModels(): Promise<ModelInfo[]>
```

**Example:**

```typescript
const models = await client.listModels();
for (const model of models) {
  console.log(`${model.name}: ${model.outputs.join(", ")}`);
}
```

#### getCapacity()

Get cluster capacity information.

```typescript
async getCapacity(gpu?: string): Promise<CapacityInfo>
```

**Example:**

```typescript
const capacity = await client.getCapacity();
console.log(`Workers: ${capacity.workerCount}, GPUs: ${capacity.liveGpuTypes.join(", ")}`);

// Check if L4 GPUs are available
const l4Capacity = await client.getCapacity("l4");
if (l4Capacity.workerCount > 0) {
  console.log("L4 workers available");
}
```

#### waitForCapacity()

Wait for GPU capacity to become available. This is useful for pre-warming the cluster before running benchmarks.

```typescript
async waitForCapacity(
  gpu: string,
  options?: {
    model?: string,                   // If provided, sends a warmup encode request
    timeout?: number,                 // Default: 300000ms
    pollInterval?: number,            // Default: 5000ms
  }
): Promise<CapacityInfo>
```

**Example:**

```typescript
// Wait for L4 capacity before running benchmarks
const capacity = await client.waitForCapacity("l4", { timeout: 300000 });
console.log(`Ready with ${capacity.workerCount} L4 workers`);

// Wait and pre-load a model
const capacityWithModel = await client.waitForCapacity("l4", { model: "BAAI/bge-m3" });
```
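
To make the interaction between `timeout` and `pollInterval` concrete, here is a minimal sketch of a generic polling loop. It is an illustration under assumptions, not the SDK's implementation: `pollUntil`, `check`, `timeoutMs`, and `pollIntervalMs` are hypothetical names, and the real method may retry or report errors differently.

```typescript
// Illustrative polling loop; NOT the SDK's implementation.
// `check` is a hypothetical probe that resolves to a value once ready,
// or to undefined while capacity is still missing.
async function pollUntil<T>(
  check: () => Promise<T | undefined>,
  options: { timeoutMs?: number; pollIntervalMs?: number } = {}
): Promise<T> {
  const { timeoutMs = 300000, pollIntervalMs = 5000 } = options;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const value = await check();
    if (value !== undefined) return value;

    // Give up if the next sleep would run past the deadline.
    if (Date.now() + pollIntervalMs > deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```

With this shape, a short `pollIntervalMs` notices new capacity sooner at the cost of more requests, while `timeoutMs` bounds the total wait.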

#### close()

Close the client and clean up resources.

```typescript
async close(): Promise<void>
```

---

## Types

Source: [packages/sie_ts_sdk/src/types.ts](https://github.com/superlinked/sie/blob/main/packages/sie_ts_sdk/src/types.ts)

### Item

Input item for encode, score, and extract operations.

```typescript
interface Item {
  id?: string;                        // Client-provided ID (echoed in response)
  text?: string;                      // Text content
  images?: Uint8Array[];              // Image data as byte arrays (for multimodal models)
  multivector?: Float32Array[];       // Pre-computed vectors (for client-side MaxSim)
  metadata?: Record<string, unknown>; // Custom metadata
}
```

**Common patterns:**

```typescript
// Simple text
{ text: "Hello world" }

// With ID for tracking
{ id: "doc-1", text: "Document text" }

// Multimodal (for CLIP, ColPali, etc.)
{ text: "Description", images: [imageBytes] }
```

### EncodeResult

```typescript
interface EncodeResult {
  id?: string;                        // Echoed item ID
  dense?: Float32Array;               // Dense embedding
  sparse?: SparseResult;              // Sparse embedding
  multivector?: Float32Array[];       // Per-token embeddings
  timing?: TimingInfo;                // Timing breakdown
}
```
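
Dense embeddings are usually compared with cosine similarity. The sketch below is one way to do that on two `Float32Array` vectors. `cosineSimilarity` is a hypothetical helper, not an SDK export, and it assumes both vectors come from the same model and therefore share a dimension.

```typescript
// Hypothetical helper (not part of the SDK): cosine similarity of two dense vectors.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity instead of dividing by zero.
  return denom === 0 ? 0 : dot / denom;
}

console.log(cosineSimilarity(Float32Array.from([1, 0]), Float32Array.from([1, 0]))); // 1
console.log(cosineSimilarity(Float32Array.from([1, 0]), Float32Array.from([0, 1]))); // 0
```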

### SparseResult

```typescript
interface SparseResult {
  indices: Int32Array;                // Token IDs
  values: Float32Array;               // Token weights
}
```
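
The parallel `indices` and `values` arrays can be scored against each other with a sparse dot product. The following is a sketch under assumptions: `SparseVector` is a local stand-in for the shape above, `sparseDot` is a hypothetical helper, and each index is assumed to appear at most once per vector.

```typescript
// Local stand-in mirroring the SparseResult shape documented above.
interface SparseVector {
  indices: Int32Array;
  values: Float32Array;
}

// Hypothetical helper (not part of the SDK): dot product of two sparse vectors.
// Assumes each index appears at most once per vector; order does not matter.
function sparseDot(a: SparseVector, b: SparseVector): number {
  const weights = new Map<number, number>();
  for (let i = 0; i < a.indices.length; i++) {
    weights.set(a.indices[i], a.values[i]);
  }
  let sum = 0;
  for (let j = 0; j < b.indices.length; j++) {
    const weight = weights.get(b.indices[j]);
    if (weight !== undefined) sum += weight * b.values[j];
  }
  return sum;
}

// Made-up vectors that share token IDs 17 and 42.
const sparseQuery: SparseVector = {
  indices: Int32Array.from([3, 17, 42]),
  values: Float32Array.from([0.5, 1.0, 2.0]),
};
const sparseDoc: SparseVector = {
  indices: Int32Array.from([17, 42, 99]),
  values: Float32Array.from([2.0, 0.5, 4.0]),
};
console.log(sparseDot(sparseQuery, sparseDoc)); // 3 (1.0 * 2.0 + 2.0 * 0.5)
```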

### ScoreResult

```typescript
interface ScoreResult {
  model?: string;                     // Model used for scoring
  queryId?: string;                   // Query ID (if provided in request)
  scores: ScoreEntry[];               // Sorted by score descending
}
```

### ScoreEntry

```typescript
interface ScoreEntry {
  itemId: string;                     // ID of the item
  score: number;                      // Relevance score
  rank: number;                       // Position (0 = most relevant)
}
```
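
Score entries carry only an `itemId`, so getting back to the original text means joining on the IDs you supplied. Here is a minimal sketch, assuming every item was sent with an explicit `id`. `ScoredEntry`, `SourceDoc`, and `joinScores` are hypothetical names for illustration, and the scores are made up.

```typescript
// Local stand-ins mirroring the documented shapes.
interface ScoredEntry {
  itemId: string;
  score: number;
  rank: number;
}
interface SourceDoc {
  id: string;
  text: string;
}

// Hypothetical helper (not part of the SDK): attach the original text to each
// score entry by ID, preserving the ranked order of `entries`.
function joinScores(
  entries: ScoredEntry[],
  docs: SourceDoc[]
): Array<ScoredEntry & { text: string }> {
  const byId = new Map<string, SourceDoc>();
  for (const doc of docs) byId.set(doc.id, doc);

  const joined: Array<ScoredEntry & { text: string }> = [];
  for (const entry of entries) {
    const doc = byId.get(entry.itemId);
    // Entries whose ID is unknown are skipped rather than guessed at.
    if (doc) joined.push({ ...entry, text: doc.text });
  }
  return joined;
}

const sourceDocs: SourceDoc[] = [
  { id: "doc-0", text: "First" },
  { id: "doc-1", text: "Second" },
];
// Made-up scores, already sorted by rank.
const rankedEntries: ScoredEntry[] = [
  { itemId: "doc-1", score: 0.9, rank: 0 },
  { itemId: "doc-0", score: 0.1, rank: 1 },
];
console.log(joinScores(rankedEntries, sourceDocs)[0].text); // Second
```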

### ExtractResult

```typescript
interface ExtractResult {
  id?: string;                        // Echoed item ID
  entities: Entity[];                 // Extracted entities
}
```

### Entity

```typescript
interface Entity {
  text: string;                       // Extracted span
  label: string;                      // Entity type
  score: number;                      // Confidence (0-1)
  start?: number;                     // Start character offset
  end?: number;                       // End character offset
  bbox?: number[];                    // Bounding box [x, y, width, height] for vision models
}
```
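
When `start` and `end` are present, they can be used to mark up the source text. The sketch below wraps each span in `[text](label)` markers. It assumes `end` is exclusive, that offsets index directly into the JavaScript string, and that spans do not overlap; none of these are confirmed here. `EntitySpan` and `annotate` are hypothetical names.

```typescript
// Local stand-in for the fields of Entity used here.
interface EntitySpan {
  text: string;
  label: string;
  start?: number;
  end?: number;
}

// Hypothetical helper (not part of the SDK): wrap each located span in
// [text](label) markers. Assumes exclusive `end` and non-overlapping spans.
function annotate(source: string, spans: EntitySpan[]): string {
  const located = spans
    .filter(
      (span): span is EntitySpan & { start: number; end: number } =>
        span.start !== undefined && span.end !== undefined
    )
    // Work right to left so earlier offsets stay valid after each insertion.
    .sort((a, b) => b.start - a.start);

  let out = source;
  for (const span of located) {
    const before = out.slice(0, span.start);
    const inside = out.slice(span.start, span.end);
    const after = out.slice(span.end);
    out = `${before}[${inside}](${span.label})${after}`;
  }
  return out;
}

const annotated = annotate("Tim Cook leads Apple.", [
  { text: "Tim Cook", label: "person", start: 0, end: 8 },
  { text: "Apple", label: "organization", start: 15, end: 20 },
]);
console.log(annotated); // [Tim Cook](person) leads [Apple](organization).
```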

### ModelInfo

```typescript
interface ModelInfo {
  name: string;                       // Model name/identifier
  loaded: boolean;                    // Whether model weights are in memory
  inputs: string[];                   // Input types: ["text"], ["text", "image"], etc.
  outputs: string[];                  // Output types: ["dense"], ["dense", "sparse"], etc.
  dims?: ModelDims;                   // Dimension info for each output type
  maxSequenceLength?: number;         // Maximum input sequence length
}
```

### CapacityInfo

```typescript
interface CapacityInfo {
  status: string;                     // "healthy", "degraded", "no_workers"
  workerCount: number;                // Number of healthy workers
  gpuCount: number;                   // Number of GPUs available
  modelsLoaded: number;               // Unique models loaded across workers
  configuredGpuTypes: string[];       // GPU types configured in cluster
  liveGpuTypes: string[];             // GPU types currently running
  workers: WorkerInfo[];              // Worker details
}
```

### TimingInfo

```typescript
interface TimingInfo {
  totalMs?: number;                   // Total request time
  queueMs?: number;                   // Time waiting in queue
  tokenizationMs?: number;            // Tokenization time
  inferenceMs?: number;               // Model inference time
}
```

### OutputType

```typescript
type OutputType = "dense" | "sparse" | "multivector";
```

### DType

```typescript
type DType = "float32" | "float16" | "bfloat16" | "int8" | "uint8" | "binary" | "ubinary";
```

### Utility Functions

```typescript
// Convert typed arrays to regular number arrays (for JSON serialization)
function toNumberArray(arr: Float32Array | Int32Array): number[];

// Convert number array to Float32Array
function toFloat32Array(arr: number[]): Float32Array;
```
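
The conversion matters because `JSON.stringify` serializes a typed array as an index-keyed object rather than a JSON array. The sketch below shows the effect using plain JavaScript equivalents (`Array.from` and `Float32Array.from`) as stand-ins; it does not call the SDK functions themselves.

```typescript
const vector = Float32Array.from([0.5, -1.25, 2]);

// Serializing a typed array directly yields an object keyed by index.
const direct = JSON.stringify(vector);
console.log(direct); // {"0":0.5,"1":-1.25,"2":2}

// Plain-JavaScript stand-ins for the conversions (not the SDK functions).
const asNumbers: number[] = Array.from(vector);
const json = JSON.stringify(asNumbers);
console.log(json); // [0.5,-1.25,2]

// Parse back into a typed array on the receiving side.
const restored = Float32Array.from(JSON.parse(json) as number[]);
console.log(restored.length); // 3
```

Values that are not exactly representable in 32-bit floats will show rounding once converted to regular numbers, so compare restored vectors with a tolerance instead of strict equality.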

---

## Scoring Utilities

Source: [packages/sie_ts_sdk/src/scoring.ts](https://github.com/superlinked/sie/blob/main/packages/sie_ts_sdk/src/scoring.ts)

Client-side scoring for multi-vector embeddings.

### maxsim()

Compute MaxSim scores for ColBERT-style retrieval. MaxSim finds the maximum similarity between each query token and any document token, then sums these maximums.

```typescript
function maxsim(
  query: Float32Array[],              // [numQueryTokens][dim]
  document: Float32Array[]            // [numDocTokens][dim]
): number
```
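
The computation described above can be sketched as a plain reference function. This is not the SDK's implementation: `maxsimReference` is a hypothetical name, and the sketch assumes dot-product similarity and that all vectors share one dimension.

```typescript
// Reference sketch of MaxSim; NOT the SDK's implementation.
// Assumes dot-product similarity and equal dimensions across all vectors.
function maxsimReference(query: Float32Array[], document: Float32Array[]): number {
  // An empty document has nothing to match against.
  if (document.length === 0) return 0;

  let total = 0;
  for (const queryToken of query) {
    let best = -Infinity;
    for (const docToken of document) {
      let dot = 0;
      for (let i = 0; i < queryToken.length; i++) {
        dot += queryToken[i] * docToken[i];
      }
      if (dot > best) best = dot;
    }
    // Sum the best match found for each query token.
    total += best;
  }
  return total;
}

const queryTokens = [Float32Array.from([1, 0]), Float32Array.from([0, 1])];
const docTokens = [Float32Array.from([1, 0]), Float32Array.from([0.5, 0.5])];
console.log(maxsimReference(queryTokens, docTokens)); // 1.5 (1 + 0.5)
```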

**Example:**

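```typescript
// maxsim is assumed to be exported from the package root alongside SIEClient
import { SIEClient, maxsim } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");

// Sample documents to rank against the query
const documents = [
  { text: "ColBERT is a late interaction retrieval model." },
  { text: "Python is a popular programming language." },
];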

// Encode query with isQuery=true for ColBERT models
const queryResult = await client.encode(
  "jinaai/jina-colbert-v2",
  { text: "What is ColBERT?" },
  { outputTypes: ["multivector"], isQuery: true }
);

// Encode documents (no isQuery needed for documents)
const docResults = await client.encode(
  "jinaai/jina-colbert-v2",
  documents,
  { outputTypes: ["multivector"] }
);

// Compute MaxSim scores client-side
const queryMv = queryResult.multivector!;
const scores = docResults.map((r) => maxsim(queryMv, r.multivector!));

// Rank by score (higher is more relevant)
const ranked = scores
  .map((score, idx) => ({ score, idx }))
  .sort((a, b) => b.score - a.score);
```

### maxsimDocuments()

Score a query against multiple documents.

```typescript
function maxsimDocuments(
  query: Float32Array[],
  documents: Float32Array[][]
): number[]
```

### maxsimBatch()

Batch version for multiple queries against multiple documents.

```typescript
function maxsimBatch(
  queries: Float32Array[][],
  documents: Float32Array[][]
): Float32Array  // Flattened [numQueries * numDocuments]
```
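
Reading a flattened result requires knowing its layout. The sketch below assumes row-major order (all document scores for query 0, then query 1, and so on), which the `[numQueries * numDocuments]` comment suggests but does not confirm. `scoreAt` and `toRows` are hypothetical helpers, and the scores are made up.

```typescript
// Hypothetical helper: read one score from a flattened score array.
// Assumes row-major layout: index = queryIdx * numDocuments + docIdx.
function scoreAt(
  flat: Float32Array,
  numDocuments: number,
  queryIdx: number,
  docIdx: number
): number {
  return flat[queryIdx * numDocuments + docIdx];
}

// Hypothetical helper: split the flat array into one row of scores per query.
function toRows(flat: Float32Array, numDocuments: number): Float32Array[] {
  if (numDocuments <= 0) throw new Error("numDocuments must be positive");
  const rows: Float32Array[] = [];
  for (let offset = 0; offset < flat.length; offset += numDocuments) {
    // subarray creates a view, so no scores are copied.
    rows.push(flat.subarray(offset, offset + numDocuments));
  }
  return rows;
}

// Made-up scores for 2 queries x 3 documents.
const flatScores = Float32Array.from([1, 2, 3, 4, 5, 6]);
console.log(scoreAt(flatScores, 3, 1, 0)); // 4
console.log(toRows(flatScores, 3).length); // 2
```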

---

## Errors

Source: [packages/sie_ts_sdk/src/errors.ts](https://github.com/superlinked/sie/blob/main/packages/sie_ts_sdk/src/errors.ts)

Exception hierarchy for SDK errors.

### SIEError

Base class for all SDK errors.

```typescript
class SIEError extends Error {
  name: "SIEError";
}
```

### SIEConnectionError

Cannot connect to server.

```typescript
class SIEConnectionError extends SIEError {
  name: "SIEConnectionError";
}
```

### RequestError

Invalid request (4xx responses).

```typescript
class RequestError extends SIEError {
  name: "RequestError";
  code?: string;
  statusCode?: number;
}
```

### InputTooLongError

Extraction input exceeds the model's context. Subclass of `RequestError`, so existing 4xx handlers continue to work; new code can branch on `InputTooLongError` for tailored handling. Thrown on HTTP `400 INPUT_TOO_LONG` from `/v1/extract` for the `gliclass-*` family.

```typescript
class InputTooLongError extends RequestError {
  name: "InputTooLongError";
  code?: string;        // "INPUT_TOO_LONG"
  statusCode?: number;  // 400
  model?: string;
}
```

Pass `adapterOptions: { overflow_policy: "truncate_text" }` to `extract()` to truncate the input server-side instead. See [Relations & Classification → Overflow policy](/docs/extract/relations/#overflow-policy).
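
Because `InputTooLongError` is a subclass of `RequestError`, the order of `instanceof` checks matters: the more specific class has to be tested first. The sketch below demonstrates this with local stand-in classes that mirror the documented hierarchy. The `Sketch*` classes and `classify` are hypothetical and exist only so the example runs without the SDK.

```typescript
// Local stand-ins mirroring the documented hierarchy; the real classes come from the SDK.
class SketchSIEError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working when compiling to older (ES5) targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class SketchRequestError extends SketchSIEError {}
class SketchInputTooLongError extends SketchRequestError {}

// Check the most specific class first; a RequestError check would also
// match InputTooLongError and hide the tailored branch.
function classify(error: unknown): string {
  if (error instanceof SketchInputTooLongError) return "input-too-long";
  if (error instanceof SketchRequestError) return "bad-request";
  if (error instanceof SketchSIEError) return "sdk-error";
  return "unknown";
}

console.log(classify(new SketchInputTooLongError("too long"))); // input-too-long
console.log(classify(new SketchRequestError("bad input"))); // bad-request
// Existing 4xx handlers still match the subclass:
console.log(new SketchInputTooLongError("too long") instanceof SketchRequestError); // true
```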

### ServerError

Server error (5xx responses).

```typescript
class ServerError extends SIEError {
  name: "ServerError";
  code?: string;
  statusCode?: number;
}
```

### ProvisioningError

No capacity available or timeout waiting for scale-up.

```typescript
class ProvisioningError extends SIEError {
  name: "ProvisioningError";
  gpu?: string;
  retryAfter?: number;
}
```

### PoolError

Resource pool operation failed.

```typescript
class PoolError extends SIEError {
  name: "PoolError";
  poolName?: string;
  state?: string;
}
```

### LoraLoadingError

LoRA adapter loading timeout.

```typescript
class LoraLoadingError extends SIEError {
  name: "LoraLoadingError";
  lora?: string;
  model?: string;
}
```

### Handling Errors

```typescript
import {
  SIEClient,
  InputTooLongError,
  RequestError,
  ProvisioningError,
} from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");

try {
  const result = await client.encode("unknown-model", { text: "test" });
} catch (error) {
  if (error instanceof InputTooLongError) {
    console.log(`Input too long for ${error.model}: ${error.message}`);
  } else if (error instanceof RequestError) {
    console.log(`Invalid request: ${error.code} (${error.statusCode})`);
  } else if (error instanceof ProvisioningError) {
    console.log(`No capacity for GPU ${error.gpu}, retry after ${error.retryAfter}ms`);
  } else {
    // Not an error this handler knows about: rethrow instead of swallowing it
    throw error;
  }
}
```

---

## GPU Routing

For cluster deployments with multiple GPU types, specify the target GPU:

```typescript
// Default GPU for all requests made by this client
const client = new SIEClient("http://gateway.example.com", {
  gpu: "l4"
});

// Per-request GPU selection
const result = await client.encode(
  "BAAI/bge-m3",
  items,
  { gpu: "a100-80gb" }
);
```

Available GPU types depend on your cluster configuration.

---

## Resource Pools

Source: [packages/sie_ts_sdk/src/client.ts](https://github.com/superlinked/sie/blob/main/packages/sie_ts_sdk/src/client.ts)

Create isolated worker sets for testing or tenant isolation:

```typescript
import { SIEClient } from "@superlinked/sie-sdk";

const client = new SIEClient("http://gateway.example.com");
await client.createPool("my-test-pool", { l4: 2, "a100-40gb": 1 });

// Route requests to the pool
const result = await client.encode(
  "BAAI/bge-m3",
  items,
  { gpu: "my-test-pool/l4" }
);

// Check pool status
const pool = await client.getPool("my-test-pool");
console.log(`Pool state: ${pool?.status.state}`);
console.log(`Workers: ${pool?.status.assignedWorkers.length}`);

// Clean up
await client.deletePool("my-test-pool");
await client.close();
```

---

## Complete Example

```typescript
import { SIEClient } from "@superlinked/sie-sdk";

// Initialize client
const client = new SIEClient("http://localhost:8080", { timeout: 60000 });

// Dense embeddings
const documents = [
  "Machine learning is a subset of artificial intelligence.",
  "Python is a popular programming language.",
  "Neural networks are inspired by the human brain.",
];

const embeddings = await client.encode(
  "BAAI/bge-m3",
  documents.map((text, i) => ({ id: `doc-${i}`, text }))
);

// Store in vector database
for (const result of embeddings) {
  if (result.dense) {
    // vectorDb.insert(result.id, result.dense);
    console.log(`Stored ${result.id}: ${result.dense.length} dimensions`);
  }
}

// Query with reranking
const query = { text: "What is machine learning?" };

// Stage 1: Vector search
const queryEmb = await client.encode("BAAI/bge-m3", query, { isQuery: true });
// const candidates = await vectorDb.search(queryEmb.dense, { topK: 100 });

// Stage 2: Rerank (using documents directly for this example)
const rerankResult = await client.score(
  "BAAI/bge-reranker-v2-m3",
  query,
  documents.map((text, i) => ({ id: `doc-${i}`, text }))
);

// Top results
console.log("\nTop results:");
for (const entry of rerankResult.scores.slice(0, 3)) {
  console.log(`  ${entry.rank + 1}. ${entry.itemId} (score: ${entry.score.toFixed(3)})`);
}

// Entity extraction
const extractResult = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Elon Musk founded SpaceX and leads Tesla." },
  { labels: ["person", "organization"] }
);

console.log("\nExtracted entities:");
for (const entity of extractResult.entities) {
  console.log(`  ${entity.label}: ${entity.text}`);
}

// Clean up
await client.close();
```