---
title: CLI Reference
description: Complete reference for all SIE command-line tools.
canonical_url: https://superlinked.com/docs/reference/cli
last_updated: 2026-05-20
---

SIE provides CLI tools for different roles: server operation, benchmarking, administration, monitoring, and gateway operation. Python tools use [typer](https://typer.tiangolo.com/) for argument parsing; `sie-gateway` is a Rust binary using `clap`.

## sie-server

Source: [packages/sie_server/src/sie_server/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_server/src/sie_server/cli.py)

The inference server. Start with `sie-server serve`.

### serve

```bash
sie-server serve [OPTIONS]
```

Start the SIE inference server.

| Option | Default | Description |
|--------|---------|-------------|
| `--port`, `-p` | `8080` | Port to listen on |
| `--host` | `0.0.0.0` | Host to bind to |
| `--device`, `-d` | `auto` | Device for inference: `auto` (detect GPU), `cuda`, `mps`, `cpu` |
| `--models-dir` | `./models` | Models config directory (local path, `s3://`, or `gs://`) |
| `--bundle`, `-b` | None | Bundle name to load from `bundles/` dir (e.g., `default`, `legacy`) |
| `--models`, `-m` | None | Comma-separated model names to load (mutually exclusive with `--bundle`) |
| `--local-cache` | `HF_HOME` | Local cache directory for model weights |
| `--cluster-cache` | None | Cluster cache URL for model weights (`s3://` or `gs://`) |
| `--hf-fallback`/`--no-hf-fallback` | `true` | Enable/disable HuggingFace Hub fallback for weight downloads |
| `--instrumentation`, `-i` | `false` | Enable batch instrumentation logging |
| `--reload` | `false` | Enable auto-reload for development (uses uvicorn reload) |
| `--tracing` | `false` | Enable OpenTelemetry tracing (exports to `localhost:4317`) |
| `--verbose`, `-v` | `false` | Enable verbose logging |
| `--json-logs` | `false` | Enable structured JSON logging (for Loki compatibility) |

**Examples:**

```bash
# Start with defaults (auto-detect GPU, port 8080)
sie-server serve

# Specific port and device
sie-server serve --port 8081 --device cuda

# Load specific bundle
sie-server serve --bundle legacy

# Load specific models only
sie-server serve --models BAAI/bge-m3,BAAI/bge-reranker-v2-m3

# Use cloud model configs
sie-server serve --models-dir s3://my-bucket/sie-models/

# Development mode with auto-reload
sie-server serve --reload --verbose
```

### resolve-deps

```bash
sie-server resolve-deps [OPTIONS]
```

Resolve and print dependencies for a bundle or model list. Used by deployment scripts.

| Option | Description |
|--------|-------------|
| `--bundle`, `-b` | Bundle name to resolve deps for |
| `--models`, `-m` | Comma-separated model names |
| `--models-dir` | Models directory |
| `--json` | Output as JSON |
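
Because this command is intended for deployment scripts, a script will usually want to fail early when the CLI is missing. The sketch below shows one way to wrap it. `resolve_bundle_deps` is a hypothetical helper name, and it uses only the `--bundle` and `--json` flags from the table above. The shape of the JSON output is not documented here, so the sketch does not parse it.

```bash
# Hypothetical helper (not part of SIE): print a bundle's dependencies as JSON.
resolve_bundle_deps() {
  bundle="$1"
  # Fail early with a clear message if the CLI is not installed.
  if ! command -v sie-server >/dev/null 2>&1; then
    echo "sie-server not found on PATH" >&2
    return 127
  fi
  sie-server resolve-deps --bundle "$bundle" --json
}

# Example call from a deployment script:
# resolve_bundle_deps default > deps.json
```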

---

## sie-bench

Source: [packages/sie_bench/src/sie_bench/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_bench/src/sie_bench/cli.py)

Evaluation and benchmarking CLI. Runs quality and performance evaluations.

### eval

```bash
sie-bench eval MODEL --task TASK --type TYPE [OPTIONS]
```

Run an evaluation against one or more sources.

| Argument/Option | Description |
|-----------------|-------------|
| `MODEL` | Model name (e.g., `BAAI/bge-m3`) |
| `--task`, `-t` | Namespaced task (e.g., `mteb/NFCorpus`, `beir/SciFact`) |
| `--type` | Evaluation type: `quality` or `perf` |
| `--sources`, `-s` | Comma-separated sources: `sie`, `tei`, `infinity`, `fastembed`, `benchmark`, `targets`, `measurements`, or a URL (default: `sie`) |
| `--batch-size`, `-b` | Batch size for performance evaluation (default: 1) |
| `--concurrency`, `-c` | Concurrency level (default: 16) |
| `--device`, `-d` | Device for inference (default: `cuda:0`) |
| `--output`, `-o` | Output format: `table`, `json`, `md` (default: `table`) |
| `--profile`, `-p` | Named profile from model config (e.g., `sparse`, `muvera`). Controls runtime options including output types. |
| `--lang` | Language filter (ISO 639-3, e.g., `eng` for English only). For multilingual tasks. |
| `--timeout` | Request timeout in seconds (default: 120, use 600+ for VLMs) |
| `--verbose`, `-v` | Enable verbose logging |

**Target management:**

| Option | Description |
|--------|-------------|
| `--save-targets SOURCE` | Save results from SOURCE (e.g., `tei`, `benchmark`) as targets in model config |
| `--save-measurements SOURCE` | Save results from SOURCE (e.g., `sie`) as measurements in model config |
| `--check-targets` | Exit non-zero if SIE results are below targets. Requires `targets` in `--sources`. |
| `--check-measurements` | Exit non-zero if SIE results are below past measurements. Requires `measurements` in `--sources`. |
| `--print` | Print summary table of all targets and measurements from model configs |
| `--print-json` | Print JSON with task metadata and model results for website integration |
| `--models-dir` | Path to models directory (for target/measurement operations) |
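
Since `--check-targets` and `--check-measurements` report regressions through the exit status, a CI step can branch on it. The following is a minimal sketch. `run_gate` is a hypothetical wrapper name, and it works with any command, not only `sie-bench`.

```bash
# Hypothetical CI wrapper (not part of SIE): run a command and report
# PASS or FAIL based on its exit status.
run_gate() {
  if "$@"; then
    echo "PASS"
  else
    rc=$?
    echo "FAIL (exit $rc)"
    return "$rc"
  fi
}

# In CI this would wrap the documented regression check, for example:
# run_gate sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type quality -s sie,targets --check-targets
```

The wrapper returns the original exit status, so the CI job still fails when the check fails.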

**Cluster options:**

| Option | Description |
|--------|-------------|
| `--cluster` | Cluster gateway URL for elastic cloud deployments (e.g., `https://gateway.example.com`) |
| `--gpu` | Target GPU type for cluster routing (e.g., `l4`, `a100-80gb`). Requires `--cluster`. |
| `--provision` | Wait for GPU capacity if not immediately available. Requires `--cluster`. |
| `--provision-timeout` | Max seconds to wait for GPU provisioning (default: 300) |
| `--wait-ready` | Wait for cluster GPU capacity before starting benchmark. Requires `--cluster`. |

**Experiment tracking:**

| Option | Description |
|--------|-------------|
| `--wandb-project` | W&B project name |
| `--wandb-entity` | W&B entity/team name |
| `--mlflow-experiment` | MLflow experiment name |
| `--mlflow-uri` | MLflow tracking URI |

**Examples:**

```bash
# Quality evaluation
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type quality

# Compare SIE vs TEI vs published benchmark
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type quality -s sie,tei,benchmark

# Performance benchmark
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type perf -s sie

# Save results as targets
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type quality --save-targets sie

# CI regression check
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type quality -s sie,targets --check-targets

# Print summary of all configured targets
sie-bench eval --print --type quality

# Evaluate on cluster with specific GPU
sie-bench eval BAAI/bge-m3 -t mteb/NFCorpus --type perf \
  --cluster http://gateway:8080 --gpu l4 --provision
```

### matrix

```bash
sie-bench matrix CONFIG --cluster URL [OPTIONS]
```

Run a matrix evaluation across models, profiles, tasks, and GPUs.

| Argument/Option | Description |
|-----------------|-------------|
| `CONFIG` | Path to matrix config YAML |
| `--cluster`, `-c` | Cluster gateway URL (required) |
| `--workers`, `-w` | Number of parallel workers per GPU type (default: 1) |
| `--pool-timeout` | Timeout waiting for pools to become active, in seconds (default: 300) |
| `--models-dir` | Path to models directory |
| `--save-measurements`/`--no-save-measurements` | Save results to model configs (default: enabled) |
| `--output`, `-o` | Output format: `table`, `json`, `md` (default: `table`) |
| `--verbose`, `-v` | Enable verbose logging |

**Example:**

```bash
sie-bench matrix configs/eval-matrix.yaml --cluster http://gateway:8080 --workers 2
```

### loadtest

```bash
sie-bench loadtest SCENARIO --cluster URL [OPTIONS]
```

Run a load test scenario against a SIE cluster.

| Argument/Option | Description |
|-----------------|-------------|
| `SCENARIO` | Path to load test scenario YAML |
| `--cluster`, `-c` | Cluster gateway URL |
| `--duration`, `-d` | Override scenario duration (seconds) |
| `--output`, `-o` | Output directory for reports |
| `--verbose`, `-v` | Verbose output |

**Example:**

```bash
sie-bench loadtest scenario.yaml --cluster http://gateway:8080 --duration 300
```

---

## sie-admin

> **Caution — Work in progress:**
>
> `sie-admin` is not yet released. The CLI and package name may change before general availability.

Source: [packages/sie_admin/src/sie_admin/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_admin/src/sie_admin/cli.py)

Cluster administration and cache management. It has three subcommand groups: `cache`, `cluster`, and `models`.

### cache populate

Source: [packages/sie_admin/src/sie_admin/commands/cache.py](https://github.com/superlinked/sie/blob/main/packages/sie_admin/src/sie_admin/commands/cache.py)

```bash
sie-admin cache populate [MODEL] [OPTIONS]
```

Download model weights to local cache or cluster cache.

| Argument/Option | Description |
|-----------------|-------------|
| `MODEL` | Model ID to populate (e.g., `BAAI/bge-m3`) |
| `--bundle`, `-b` | Bundle name; populates every model in the bundle |
| `--target`, `-t` | Target S3/GCS URL for cluster cache |

**Examples:**

```bash
# Download single model to local cache
sie-admin cache populate BAAI/bge-m3

# Download all models in a bundle
sie-admin cache populate --bundle default

# Download and upload to cluster cache
sie-admin cache populate BAAI/bge-m3 --target s3://my-bucket/sie-cache/
```

### cache sync

```bash
sie-admin cache sync PATH --target URL [OPTIONS]
```

Sync model configs from a local path to cluster storage.

| Argument/Option | Description |
|-----------------|-------------|
| `PATH` | Local path to model configs |
| `--target`, `-t` | Target S3/GCS URL |
| `--dry-run`, `-n` | Show what would be synced |

**Examples:**

```bash
# Sync configs to S3
sie-admin cache sync ./models --target s3://my-bucket/sie-models/

# Dry run to preview
sie-admin cache sync ./models -t s3://bucket/configs --dry-run
```

### cache status

```bash
sie-admin cache status
```

Show cache status including local and cluster cache contents, with model sizes and download status.

### cluster status

Source: [packages/sie_admin/src/sie_admin/commands/cluster.py](https://github.com/superlinked/sie/blob/main/packages/sie_admin/src/sie_admin/commands/cluster.py)

```bash
sie-admin cluster status GATEWAY [OPTIONS]
```

Show cluster status (workers, GPUs, models).

| Argument/Option | Description |
|-----------------|-------------|
| `GATEWAY` | Gateway URL (e.g., `gateway.example.com:8080`) |
| `--json`, `-j` | Output as JSON |

**Example:**

```bash
sie-admin cluster status gateway:8080
```

### cluster models

```bash
sie-admin cluster models GATEWAY [OPTIONS]
```

Show model availability across workers.

| Argument/Option | Description |
|-----------------|-------------|
| `GATEWAY` | Gateway URL |
| `--json`, `-j` | Output as JSON |

### models validate

```bash
sie-admin models validate PATH
```

Validate model config YAML files against the schema.

| Argument/Option | Description |
|-----------------|-------------|
| `PATH` | Path to model config(s) - supports glob patterns, local dirs, or cloud URLs |

**Examples:**

```bash
# Validate all models in a directory
sie-admin models validate ./models/

# Validate a single config
sie-admin models validate ./models/BAAI__bge-m3.yaml

# Validate configs in S3
sie-admin models validate s3://my-bucket/models/
```

### models list

```bash
sie-admin models list PATH [OPTIONS]
```

List models in a directory or bucket with their metadata.

| Argument/Option | Description |
|-----------------|-------------|
| `PATH` | Path to model configs (local or S3/GCS) |
| `--json`, `-j` | Output as JSON |

**Examples:**

```bash
# List models in local directory
sie-admin models list ./models

# List models in S3 bucket
sie-admin models list s3://my-bucket/models/

# Output as JSON for scripting
sie-admin models list ./models --json
```

---

## sie-top

Source: [packages/sie_admin/src/sie_admin/top/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_admin/src/sie_admin/top/cli.py)

Real-time TUI monitor for SIE servers and clusters.

```bash
sie-top [HOST:PORT] [OPTIONS]
```

| Argument/Option | Default | Description |
|-----------------|---------|-------------|
| `HOST:PORT` | `localhost:8080` | Server address |
| `--cluster`, `-c` | - | Force cluster mode (connect to gateway) |
| `--worker`, `-w` | - | Force worker mode (connect to single server) |

When neither flag is given, the mode is auto-detected by probing the gateway `/health` endpoint. If the endpoint is unavailable, `sie-top` falls back to worker mode.

**Examples:**

```bash
# Monitor local server (auto-detect mode)
sie-top

# Monitor specific server
sie-top localhost:8080

# Force cluster mode (connect to gateway)
sie-top --cluster gateway.example.com:8080

# Force worker mode
sie-top --worker worker-0:8080
```

**Installation:**

The TUI requires optional dependencies:

```bash
pip install 'sie-admin[top]'
```

---

## sie-gateway

Source: [packages/sie_gateway/src/main.rs](https://github.com/superlinked/sie/blob/main/packages/sie_gateway/src/main.rs)

Stateless Rust request gateway for elastic cloud deployments.

### serve

```bash
sie-gateway serve [OPTIONS]
```

Start the SIE Gateway server.

| Option | Default | Description |
|--------|---------|-------------|
| `--port`, `-p` | `8080` | Port to listen on |
| `--host` | `0.0.0.0` | Host to bind to |
| `--worker`, `-w` | None | Worker URL (can be specified multiple times) |
| `--kubernetes` | `false` | Use Kubernetes service discovery |
| `--k8s-namespace` | `default` | Kubernetes namespace for discovery |
| `--k8s-service` | `sie-worker` | Kubernetes service name to discover |
| `--k8s-port` | `8080` | Worker port for K8s-discovered endpoints |
| `--log-level`, `-l` | `info` | Log level: `debug`, `info`, `warning`, `error` |
| `--json-logs` | `false` | Enable structured JSON logging (for Loki compatibility) |
| `--health-mode` | `ws` | Worker health mode (`ws` for WebSocket health, `nats` for NATS status) |
| `--bundles-dir` | None | Optional filesystem seed for bundle configs |
| `--models-dir` | None | Optional filesystem seed for model configs |

**Examples:**

```bash
# Static worker discovery
sie-gateway serve -w http://worker-0:8080 -w http://worker-1:8080

# Kubernetes discovery
sie-gateway serve --kubernetes --k8s-service sie-worker

# Development with a single local worker
sie-gateway serve -w http://localhost:8080
```

### version

```bash
sie-gateway version
```

Show version information.

---

## Environment Variables

Many CLI options can be set via environment variables. CLI arguments override environment variables, which override defaults.
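
The precedence rule can be illustrated with a small shell function. This is a sketch of the rule only, not SIE's actual option handling. `resolve_option` is a hypothetical helper, and the values stand in for `--device`, `SIE_DEVICE`, and the `auto` default.

```bash
# Illustration only: mirrors the documented precedence
# (CLI argument > environment variable > built-in default).
# resolve_option is a hypothetical helper, not SIE code.
resolve_option() {
  cli_value="$1"
  env_value="$2"
  default_value="$3"
  if [ -n "$cli_value" ]; then
    echo "$cli_value"
  elif [ -n "$env_value" ]; then
    echo "$env_value"
  else
    echo "$default_value"
  fi
}

resolve_option ""     ""    "auto"   # nothing set: default applies
resolve_option ""     "cpu" "auto"   # only SIE_DEVICE=cpu is set
resolve_option "cuda" "cpu" "auto"   # --device cuda overrides SIE_DEVICE
```

The three calls print `auto`, `cpu`, and `cuda`, in that order.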

**Server (sie-server):**

| Variable | CLI Equivalent | Description |
|----------|----------------|-------------|
| `SIE_DEVICE` | `--device` | Inference device (`cuda`, `mps`, `cpu`) |
| `SIE_MODELS_DIR` | `--models-dir` | Models config directory |
| `SIE_MODEL_FILTER` | `--models` | Comma-separated model names to load |
| `SIE_LOCAL_CACHE` | `--local-cache` | Local cache directory for weights |
| `SIE_CLUSTER_CACHE` | `--cluster-cache` | Cluster cache URL (`s3://` or `gs://`) |
| `SIE_HF_FALLBACK` | `--hf-fallback` | Enable HF Hub fallback (`true`/`false`) |
| `SIE_LOG_JSON` | `--json-logs` | Enable JSON logging (`true`/`false`) |
| `SIE_TRACING_ENABLED` | `--tracing` | Enable OpenTelemetry tracing |
| `SIE_GPU_TYPE` | - | Override detected GPU type |
| `SIE_MEMORY_PRESSURE_THRESHOLD_PERCENT` | - | GPU memory pressure threshold (0-100) |
| `SIE_IMAGE_WORKERS` | - | Image preprocessing worker count (default: 4) |
| `SIE_INSTRUMENTATION` | - | Enable detailed instrumentation |

**Gateway (sie-gateway):**

| Variable | CLI Equivalent | Description |
|----------|----------------|-------------|
| `SIE_GATEWAY_WORKERS` | `--worker` | Comma-separated worker URLs |
| `SIE_GATEWAY_KUBERNETES` | `--kubernetes` | Enable K8s discovery (`true`/`false`) |
| `SIE_GATEWAY_K8S_NAMESPACE` | `--k8s-namespace` | K8s namespace |
| `SIE_GATEWAY_K8S_SERVICE` | `--k8s-service` | K8s service name |
| `SIE_GATEWAY_K8S_PORT` | `--k8s-port` | K8s worker port |
| `SIE_GATEWAY_ENABLE_POOLS` | - | Enable resource pools (`true`/`false`) |
| `SIE_GATEWAY_CONFIGURED_GPUS` | - | Comma-separated configured GPU types |
| `SIE_CONFIG_SERVICE_URL` | - | Config service URL for bootstrap and drift polling |
| `SIE_LOG_JSON` | `--json-logs` | Enable JSON logging |
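
The table lists `SIE_GATEWAY_WORKERS` as the comma-separated counterpart of the repeatable `--worker` flag. Assuming the two forms are interchangeable, the sketch below expands a comma-separated value into the equivalent command line and prints it rather than running it. The worker URLs are placeholders.

```bash
SIE_GATEWAY_WORKERS="http://worker-0:8080,http://worker-1:8080"

# Split on commas and emit one --worker flag per URL.
worker_flags=""
old_ifs=$IFS
IFS=','
for url in $SIE_GATEWAY_WORKERS; do
  worker_flags="$worker_flags --worker $url"
done
IFS=$old_ifs

# Print the equivalent invocation instead of starting the gateway.
gateway_cmd="sie-gateway serve$worker_flags"
echo "$gateway_cmd"
```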

See [Configuration](/docs/reference/configuration/) for the complete list.