
Config API

Add models to a running SIE cluster with a single API call. If the model’s adapter is already in the bundle, no image rebuild or restart is needed. The change propagates to all workers within milliseconds via NATS.


```shell
# Add a model at runtime
curl -X POST http://localhost:8080/v1/configs/models \
  -H "Content-Type: application/x-yaml" \
  -H "Authorization: Bearer $SIE_ADMIN_TOKEN" \
  -d '
sie_id: intfloat/multilingual-e5-base
profiles:
  default:
    adapter_path: sie_server.adapters.sentence_transformer:SentenceTransformerAdapter
    max_batch_tokens: 8192
    adapter_options:
      loadtime: {}
      runtime:
        pooling: mean
        normalize: true
'
```

Response:

```json
{
  "model_id": "intfloat/multilingual-e5-base",
  "created_profiles": ["default"],
  "existing_profiles_skipped": [],
  "warnings": [],
  "routable_bundles_by_profile": {"default": ["default"]},
  "worker_ack_pending": false,
  "eligible_bundles_count": 1,
  "eligible_bundles_with_workers_count": 1,
  "acked_workers": 3,
  "total_eligible": 3,
  "pending_workers": 0,
  "router_id": "router-abc123"
}
```

The model is immediately available for inference. The first request triggers weight download and loading.


```text
POST /v1/configs/models
     │
┌────▼────┐     ┌───────────────┐     ┌──────────┐
│ Router  │────▶│ Config Store  │     │   NATS   │
│ (any)   │     │ (S3/GCS/local)│     │ (pub/sub)│
└────┬────┘     └───────────────┘     └────┬─────┘
     │                                     │
     │ publish notification                │
     ├────────────────────────────────────▶│
     │                                     │
     │              ┌──────────────────────┤
     │              │                      │
     ▼              ▼                      ▼
┌─────────┐    ┌─────────┐            ┌──────────┐
│ Worker 1│    │ Worker 2│            │ Router 2 │
│ (NATS   │    │ (NATS   │            │ (NATS    │
│  sub)   │    │  sub)   │            │  sub)    │
└─────────┘    └─────────┘            └──────────┘
```
  1. Client sends POST /v1/configs/models to any router
  2. Router validates the adapter is in at least one known bundle
  3. Router persists the config to the store (S3, GCS, or local filesystem)
  4. Router publishes a NATS notification to the affected bundle’s subject
  5. Workers subscribed to that bundle receive the notification and update their catalog
  6. Workers report the updated config hash in their next WebSocket status message
  7. Router confirms serving readiness and returns the response
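Steps 5 and 6 on the worker side can be sketched as follows. `ConfigCatalog` and the SHA-256-over-canonical-JSON hashing scheme are illustrative assumptions, not SIE's actual implementation:

```python
import hashlib
import json

class ConfigCatalog:
    """Illustrative worker-side catalog: stores model configs and exposes
    the hash a worker would report in its WebSocket status message."""

    def __init__(self):
        self.models = {}

    def apply(self, model_config: dict) -> str:
        # Step 5: merge the notified config into the local catalog.
        self.models[model_config["sie_id"]] = model_config
        # Step 6: the hash reported in the next status message. A canonical
        # JSON dump keeps the hash stable regardless of key order.
        canonical = json.dumps(self.models, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

catalog = ConfigCatalog()
h1 = catalog.apply({"sie_id": "intfloat/multilingual-e5-base", "profiles": {"default": {}}})
h2 = catalog.apply({"sie_id": "intfloat/multilingual-e5-base", "profiles": {"default": {}}})
assert h1 == h2  # re-applying the same config is idempotent
```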

| Scenario | Use Config API? | Alternative |
|---|---|---|
| Add a model with an existing adapter | Yes | - |
| Add a new profile to an existing model | Yes | - |
| Add a model that needs a new adapter | No | Create adapter, rebuild bundle image |
| Add a new bundle | No | Define in repo, rebuild images |
| Change a model's adapter_path | No | Append-only; create a new profile instead |

The Config API is append-only. You can add models and profiles, but not modify or delete existing ones.


```shell
curl http://localhost:8080/v1/configs/models
```

```json
{
  "models": [
    {
      "model_id": "BAAI/bge-m3",
      "profiles": ["default", "sparse"],
      "source": "filesystem"
    },
    {
      "model_id": "intfloat/multilingual-e5-base",
      "profiles": ["default"],
      "source": "api"
    }
  ]
}
```

The `source` field indicates whether the model was loaded from the filesystem (`filesystem`) or added via the Config API (`api`).
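For example, a caller could pick out the API-added models from this listing. A minimal sketch using the response shown above:

```python
import json

# Response body from GET /v1/configs/models, as shown above.
response = json.loads("""{
  "models": [
    {"model_id": "BAAI/bge-m3", "profiles": ["default", "sparse"], "source": "filesystem"},
    {"model_id": "intfloat/multilingual-e5-base", "profiles": ["default"], "source": "api"}
  ]
}""")

# Keep only models that were added at runtime via the Config API.
api_added = [m["model_id"] for m in response["models"] if m["source"] == "api"]
print(api_added)  # → ['intfloat/multilingual-e5-base']
```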

```shell
curl http://localhost:8080/v1/configs/models/BAAI/bge-m3
```

Returns the model config as YAML.

```shell
curl -X POST http://localhost:8080/v1/configs/models \
  -H "Content-Type: application/x-yaml" \
  -H "Authorization: Bearer $SIE_ADMIN_TOKEN" \
  -d @model-config.yaml
```
| Status | Meaning |
|---|---|
| 201 | Model or profiles created |
| 200 | All profiles already existed (idempotent) |
| 400 | Invalid YAML |
| 409 | Profile exists with different content (content-equality check) |
| 422 | Validation failed (unroutable adapter, missing fields) |
| 503 | NATS unavailable or config store unavailable |
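The 200/201/409 distinction can be sketched as a pure decision over the existing and incoming profile maps. This is an illustration of the content-equality rule, not SIE's actual code:

```python
def profile_post_status(existing: dict, incoming: dict) -> int:
    """Illustrative choice among 200/201/409 for a profile POST,
    mirroring the content-equality rule in the table above."""
    new = [name for name in incoming if name not in existing]
    conflicting = [name for name in incoming
                   if name in existing and existing[name] != incoming[name]]
    if conflicting:
        return 409  # same profile name, different content
    if new:
        return 201  # at least one profile created
    return 200      # everything already existed, content-identical

existing = {"default": {"max_batch_tokens": 8192}}
assert profile_post_status(existing, {"default": {"max_batch_tokens": 8192}}) == 200
assert profile_post_status(existing, {"medical": {"max_batch_tokens": 4096}}) == 201
assert profile_post_status(existing, {"default": {"max_batch_tokens": 4096}}) == 409
```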
```shell
curl http://localhost:8080/v1/configs/bundles
```

```json
{
  "bundles": [
    {
      "bundle_id": "default",
      "priority": 10,
      "adapter_count": 18,
      "source": "filesystem",
      "connected_workers": 3
    }
  ]
}
```
```shell
curl http://localhost:8080/v1/configs/bundles/default
```

Returns bundle metadata as YAML including the adapter list.


The model config format is the same as for static model configs, with one difference: the Config API requires only `sie_id` and `profiles`. Fields like `tasks`, `hf_id`, and `inputs` are optional when adding via the API - the worker fills them in from the adapter.

A minimal config:

```yaml
sie_id: intfloat/multilingual-e5-base
profiles:
  default:
    adapter_path: sie_server.adapters.sentence_transformer:SentenceTransformerAdapter
    max_batch_tokens: 8192
```

A full config with the optional fields spelled out:

```yaml
sie_id: intfloat/multilingual-e5-base
hf_id: intfloat/multilingual-e5-base
inputs:
  text: true
tasks:
  encode:
    dense:
      dim: 768
profiles:
  default:
    adapter_path: sie_server.adapters.sentence_transformer:SentenceTransformerAdapter
    max_batch_tokens: 8192
    adapter_options:
      loadtime: {}
      runtime:
        pooling: mean
        normalize: true
  financial:
    extends: default
    adapter_options:
      runtime:
        pooling: mean
        normalize: true
        instruction: "Retrieve financial documents"
```
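A plausible reading of `extends` is a deep merge of the parent profile with the child's overrides. The resolver below is a sketch of those semantics, not SIE's implementation:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into base; nested dicts merge, scalars replace."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out

def resolve_profile(profiles: dict, name: str) -> dict:
    """Resolve a profile by recursively applying its extends chain."""
    prof = dict(profiles[name])
    parent = prof.pop("extends", None)
    return deep_merge(resolve_profile(profiles, parent), prof) if parent else prof

profiles = {
    "default": {
        "adapter_path": "sie_server.adapters.sentence_transformer:SentenceTransformerAdapter",
        "max_batch_tokens": 8192,
        "adapter_options": {"runtime": {"pooling": "mean", "normalize": True}},
    },
    "medical": {
        "extends": "default",
        "adapter_options": {"runtime": {"instruction": "Retrieve medical literature"}},
    },
}
resolved = resolve_profile(profiles, "medical")
assert resolved["max_batch_tokens"] == 8192                      # inherited
assert resolved["adapter_options"]["runtime"]["pooling"] == "mean"
assert resolved["adapter_options"]["runtime"]["instruction"] == "Retrieve medical literature"
```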

POST the same sie_id with additional profiles. Existing profiles are skipped; new ones are created.

```yaml
sie_id: intfloat/multilingual-e5-base
profiles:
  default:
    adapter_path: sie_server.adapters.sentence_transformer:SentenceTransformerAdapter
    max_batch_tokens: 8192
  medical:
    extends: default
    adapter_options:
      runtime:
        instruction: "Retrieve medical literature"
```

Response: `201` with `created_profiles: ["medical"]` and `existing_profiles_skipped: ["default"]`.


The POST endpoint waits up to 3 seconds for workers to acknowledge the new config before returning. The response includes readiness metadata:

| Field | Description |
|---|---|
| worker_ack_pending | false if at least one worker per eligible bundle confirmed the config; true if the timeout expired or no workers are connected |
| eligible_bundles_count | Number of bundles whose adapter list matches the model |
| total_eligible | Number of healthy workers on this router in eligible bundles |
| acked_workers | Workers that confirmed the updated config hash within the timeout |
| pending_workers | total_eligible - acked_workers |
| router_id | Which router processed this request |

worker_ack_pending: true does not mean failure. The model is persisted and will propagate. It means the model may not be immediately servable - the first inference request may return 503 until workers catch up.
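How a client might interpret these fields can be sketched as follows. The `readiness` helper and its labels are illustrative, not part of the API:

```python
def readiness(resp: dict) -> str:
    """Classify a POST /v1/configs/models response by its readiness
    metadata. Field relationships follow the table above; the label
    names are illustrative."""
    # Invariant from the table: pending = total_eligible - acked.
    assert resp["pending_workers"] == resp["total_eligible"] - resp["acked_workers"]
    if not resp["worker_ack_pending"]:
        return "servable"      # at least one worker per eligible bundle acked
    if resp["total_eligible"] == 0:
        return "no-workers"    # persisted; waits for workers to connect
    return "propagating"       # persisted; first requests may briefly 503

resp = {"worker_ack_pending": False, "acked_workers": 3,
        "total_eligible": 3, "pending_workers": 0}
assert readiness(resp) == "servable"
```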


API-added models are persisted to a config store. On router restart, persisted models are restored automatically.

| Backend | Config | CAS Mechanism | Use Case |
|---|---|---|---|
| Local filesystem | SIE_CONFIG_STORE_DIR=/data/config | fcntl file locking | Single router, development |
| S3 | SIE_CONFIG_STORE_DIR=s3://bucket/prefix | ETag conditional writes | Multi-router production |
| GCS | SIE_CONFIG_STORE_DIR=gs://bucket/prefix | Generation-based preconditions | Multi-router production (GCP) |

For multi-router deployments, use S3 or GCS. The local filesystem backend only works for a single router instance.
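The flavor of a lock-guarded compare-and-swap, as in the local-filesystem backend, can be sketched like this. `cas_write` is a hypothetical helper (POSIX-only, due to fcntl), not SIE's code:

```python
import fcntl
import os
import tempfile
from typing import Optional

def cas_write(path: str, expected: Optional[bytes], new: bytes) -> bool:
    """Compare-and-swap write guarded by an exclusive fcntl lock.
    expected=None means 'the file must not exist yet'. Illustrative."""
    lock_path = path + ".lock"
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # advisory exclusive lock
        current = None
        if os.path.exists(path):
            with open(path, "rb") as f:
                current = f.read()
        if current != expected:
            return False                   # lost the race: content changed
        with open(path, "wb") as f:
            f.write(new)
        return True

path = os.path.join(tempfile.mkdtemp(), "model.yaml")
assert cas_write(path, None, b"v1")        # create succeeds
assert not cas_write(path, None, b"v2")    # stale expectation is rejected
assert cas_write(path, b"v1", b"v2")       # correct expectation succeeds
```

S3 and GCS achieve the same effect without local locks, via ETag conditional writes and generation preconditions respectively, which is why they are safe across routers.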

| Variable | Default | Description |
|---|---|---|
| SIE_CONFIG_STORE_DIR | None | Config store path. If unset, API-added models are in-memory only (lost on restart). |
| SIE_CONFIG_RESTORE | false | Set to true to restore API-added models from the store on startup. |
| SIE_NATS_URL | None | NATS server URL for config distribution (e.g., nats://nats:4222). |

Config changes are distributed to workers and other routers via NATS pub/sub. Each worker subscribes to its bundle’s subject. Each router subscribes to a global subject.

| Subject | Subscribers | Purpose |
|---|---|---|
| sie.config.models.{bundle_id} | Workers in that bundle | Per-bundle config notifications |
| sie.config.models._all | All routers | Cross-router catalog sync |
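Subject construction is plain string formatting. The fan-out below (per-bundle subjects plus `_all`) is inferred from the subscriber table and may differ from SIE's actual publish logic:

```python
def model_config_subject(bundle_id: str) -> str:
    """Per-bundle subject from the table above."""
    return f"sie.config.models.{bundle_id}"

# Subject every router subscribes to for cross-router catalog sync.
CROSS_ROUTER_SUBJECT = "sie.config.models._all"

def publish_subjects(eligible_bundles):
    """Subjects a router would publish a config notification to:
    one per eligible bundle, plus the cross-router sync subject.
    (Assumed fan-out, inferred from the subscriber table.)"""
    return [model_config_subject(b) for b in eligible_bundles] + [CROSS_ROUTER_SUBJECT]

assert publish_subjects(["default"]) == [
    "sie.config.models.default",
    "sie.config.models._all",
]
```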

If SIE_NATS_URL is not set:

  • Config API still works for the local router (in-memory + config store)
  • Workers do not receive runtime config changes
  • Other routers do not receive cross-router sync
  • This is fine for single-server deployments

If NATS is configured but temporarily unavailable:

  • Inference continues normally (NATS is not in the request path)
  • POST /v1/configs/models returns 503 with "error": "nats_unavailable"
  • On NATS reconnect, the router reconciles from the config store

Config API uses the same auth tokens as the rest of the SIE API:

| Operation | Token Required |
|---|---|
| GET /v1/configs/* | SIE_AUTH_TOKEN or SIE_ADMIN_TOKEN |
| POST /v1/configs/models | SIE_ADMIN_TOKEN only |

If neither token is configured, all endpoints are open (development mode).
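A client might choose its token like this. `auth_header` is a hypothetical helper; it assumes the tokens are exposed through the environment variables named above:

```python
import os

def auth_header(method: str, path: str) -> dict:
    """Illustrative client-side token choice matching the table above.
    Returns an empty dict when no token is configured (dev mode)."""
    admin = os.environ.get("SIE_ADMIN_TOKEN")
    auth = os.environ.get("SIE_AUTH_TOKEN")
    if method == "POST" and path.startswith("/v1/configs/"):
        token = admin            # writes require the admin token
    else:
        token = auth or admin    # reads accept either token
    return {"Authorization": f"Bearer {token}"} if token else {}

os.environ["SIE_ADMIN_TOKEN"] = "admin-secret"
os.environ.pop("SIE_AUTH_TOKEN", None)
hdr = auth_header("POST", "/v1/configs/models")
assert hdr == {"Authorization": "Bearer admin-secret"}
```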


Enable NATS-based config distribution in Kubernetes:

```yaml
nats:
  enabled: true
  url: "nats://nats.sie.svc.cluster.local:4222"
```

This sets SIE_NATS_URL, SIE_CONFIG_STORE_DIR, and SIE_CONFIG_RESTORE on both router and worker pods.

For production, override the config store to use S3 or GCS:

```yaml
# In your Helm values override
router:
  extraEnv:
    - name: SIE_CONFIG_STORE_DIR
      value: "s3://my-bucket/sie/configs"
```

  • Append-only: Models and profiles cannot be modified or deleted after creation.
  • Adapter must be bundled: The model’s adapter_path must exist in at least one known bundle. Adding models that require new adapters still requires an image rebuild.
  • Bundles are build-time only: Bundles cannot be created or modified via API.
  • Local config store is per-pod: The default /tmp store does not survive pod restarts. Use S3 or GCS for durable persistence.
