---
title: Offline / Air-Gapped Deployment
description: Run SIE in clusters with no public internet access. Mirror weights and images to private storage.
canonical_url: https://superlinked.com/docs/deployment/offline
last_updated: 2026-05-06
---

Bring SIE up in a cluster with no public internet access. The worker pods normally pull model weights from HuggingFace and container images from GHCR; in an air-gapped cluster, both must be served from inside your network instead.

This guide covers a typical air-gapped flow:

1. Snapshot model weights on a workstation that has internet access.
2. Mirror the snapshot to private S3-compatible storage reachable from the cluster.
3. Configure the chart to read weights from that store and skip HuggingFace.
4. Mirror the SIE container images to a private registry.
5. Verify first inference with no egress.

The same pattern works for "restricted egress" clusters that allow private object storage but block public HuggingFace.

## 1. Snapshot model weights

The simplest tool is `huggingface-cli`, which is already installed on any SIE workstation:

```bash
export HF_HUB_CACHE=./offline-weights

# One model
huggingface-cli download BAAI/bge-m3 --cache-dir ./offline-weights

# A bundle's worth of models — run once per model in the bundle
huggingface-cli download intfloat/e5-base-v2 --cache-dir ./offline-weights
huggingface-cli download mixedbread-ai/mxbai-rerank-large-v1 --cache-dir ./offline-weights
```

The result is a directory in HuggingFace cache layout (`./offline-weights/models--BAAI--bge-m3/snapshots/<sha>/...`) that the chart can mount as `HF_HUB_CACHE`.

For gated models, export `HF_TOKEN` (or run `huggingface-cli login`) before downloading.
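Before mirroring, it is worth sanity-checking that every model you expect actually made it into the snapshot. A minimal sketch (the `expected` set below is illustrative) that maps `models--ORG--NAME` cache directories back to repo ids and reports anything missing:

```python
from pathlib import Path

def snapshotted_repos(cache_dir: str) -> set[str]:
    """Map `models--ORG--NAME` cache directories back to `ORG/NAME` repo ids."""
    repos = set()
    for entry in Path(cache_dir).glob("models--*"):
        if (entry / "snapshots").is_dir():  # only count completed snapshots
            repos.add(entry.name.removeprefix("models--").replace("--", "/"))
    return repos

expected = {"BAAI/bge-m3", "intfloat/e5-base-v2",
            "mixedbread-ai/mxbai-rerank-large-v1"}
missing = expected - snapshotted_repos("./offline-weights")
print("missing from snapshot:", sorted(missing) or "none")
```

Note the `--` ↔ `/` mapping is the standard HuggingFace cache convention; a repo whose name itself contains `--` would need special handling.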

## 2. Mirror to private storage

Push the snapshot to S3-compatible storage that the cluster can reach. AWS S3, GCS, MinIO, and Ceph all work; the chart treats them the same.

```bash
# AWS S3
aws s3 sync ./offline-weights s3://sie-models-private/weights/

# MinIO (in-cluster or on-prem)
mc mirror ./offline-weights minio/sie-models-private/weights/

# GCS
gsutil -m rsync -r ./offline-weights gs://sie-models-private/weights/
```

Whatever you choose, the URL handed to the chart in the next step must be reachable from worker pods.
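If you want an integrity check after the sync, a manifest of content hashes on the workstation side is enough to diff against a listing generated from the bucket. A sketch (the manifest filename is arbitrary; the output format matches `sha256sum -c`):

```python
import hashlib
from pathlib import Path

def write_manifest(root: str, out: str = "weights.sha256") -> int:
    """Write `<sha256>  <relative path>` lines for every file under root."""
    root_path = Path(root)
    lines = []
    for f in sorted(root_path.rglob("*")):
        if f.is_file():
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            lines.append(f"{digest}  {f.relative_to(root_path)}")
    Path(out).write_text("\n".join(lines) + "\n")
    return len(lines)

n = write_manifest("./offline-weights")
print(f"hashed {n} files")
```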

## 3. Configure the cluster cache

Point the chart's `workers.common.clusterCache` at the mirrored bucket. Workers will read weights from there instead of HuggingFace.

```yaml
# values-offline.yaml
workers:
  common:
    clusterCache:
      enabled: true
      url: s3://sie-models-private/weights/   # or gs:// for GCS

    # Disable HuggingFace fallback so workers fail fast if the cache is incomplete
    hfCache:
      home: /models/huggingface
      tokenSecret: ""

# Skip HF token wiring entirely in air-gapped clusters
hfToken:
  create: false
```

For S3, the workers authenticate via IRSA (EKS) or static credentials supplied through `extraEnv`. For GCS, they use Workload Identity (GKE). For MinIO or other S3-compatibles, mount credentials via a secret and pass them through `workers.common.extraEnv`.
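For the MinIO case, the wiring might look like the following. The secret name, key names, and the `AWS_ENDPOINT_URL` variable are assumptions for illustration; check which variables the S3 client inside your workers actually reads:

```yaml
# values-offline.yaml (continued) — hypothetical MinIO credential wiring
workers:
  common:
    extraEnv:
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: minio-creds        # created out of band
            key: access-key
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: minio-creds
            key: secret-key
      - name: AWS_ENDPOINT_URL       # point the S3 client at MinIO
        value: http://minio.minio.svc:9000
```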

## 4. Mirror container images

The chart pulls these public images from GHCR by default:

| Image | Where it's set |
|-------|----------------|
| `ghcr.io/superlinked/sie-server` | `workers.common.image.repository` |
| `ghcr.io/superlinked/sie-gateway` | `gateway.image.repository` |
| `ghcr.io/superlinked/sie-config` | `config.image.repository` |

For air-gapped clusters, mirror them to a private registry once:

```bash
# Replace with your version tag
TAG=v0.3.1

for img in sie-server sie-gateway sie-config; do
  docker pull ghcr.io/superlinked/$img:$TAG
  docker tag  ghcr.io/superlinked/$img:$TAG private-registry.example.com/sie/$img:$TAG
  docker push private-registry.example.com/sie/$img:$TAG
done
```
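If the mirroring host has no Docker daemon, `skopeo copy` does the same job registry-to-registry without pulling images locally. A sketch that prints the commands first so you can review them before piping the loop through `sh`:

```shell
TAG=v0.3.1
DEST=private-registry.example.com/sie

# Print one copy command per image; append "| sh" to the loop to execute.
for img in sie-server sie-gateway sie-config; do
  echo "skopeo copy docker://ghcr.io/superlinked/$img:$TAG docker://$DEST/$img:$TAG"
done
```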

Then point the chart at your registry:

```yaml
# values-offline.yaml (continued)
gateway:
  image:
    repository: private-registry.example.com/sie/sie-gateway
    tag: v0.3.1

config:
  image:
    repository: private-registry.example.com/sie/sie-config
    tag: v0.3.1

workers:
  common:
    image:
      repository: private-registry.example.com/sie/sie-server
      tag: v0.3.1

global:
  imagePullSecrets:
    - name: regcred
```

If your registry needs auth, create the `regcred` Docker secret in the `sie` namespace before installing the chart:

```bash
kubectl create secret docker-registry regcred \
  --docker-server=private-registry.example.com \
  --docker-username=... \
  --docker-password=... \
  -n sie
```

## 5. Install and verify

Install the chart with the offline values. Note that this command still fetches the chart itself from GHCR, so it needs egress to ghcr.io; everything the chart deploys afterwards runs without internet access:

```bash
helm upgrade --install sie oci://ghcr.io/superlinked/charts/sie-cluster \
  --version 0.3.1 \
  -f values-offline.yaml \
  -n sie --create-namespace
```

If you also mirrored the chart itself (recommended for fully air-gapped), pull it once with `helm pull oci://ghcr.io/superlinked/charts/sie-cluster --version 0.3.1` and install from the local `.tgz`:

```bash
helm pull oci://ghcr.io/superlinked/charts/sie-cluster --version 0.3.1
# Move sie-cluster-0.3.1.tgz onto the air-gapped workstation, then:
helm upgrade --install sie ./sie-cluster-0.3.1.tgz \
  -f values-offline.yaml \
  -n sie --create-namespace
```

Verify first inference exactly like the [GCP](/docs/deployment/cloud-gcp/) or [AWS](/docs/deployment/cloud-aws/) guides:

```bash
kubectl -n sie port-forward svc/sie-sie-cluster-gateway 8080:8080 &

python3 -c "
from sie_sdk import SIEClient

client = SIEClient('http://localhost:8080')
result = client.encode('BAAI/bge-m3', {'text': 'hello world'},
                       gpu='l4', wait_for_capacity=True, provision_timeout_s=600)
print(result['dense'].shape)  # (1024,)
"
```

The first request still pays the cold-start cost, but the weight load now comes from your private store rather than HuggingFace.

## Troubleshooting

| Symptom | Likely cause |
|---------|--------------|
| Worker pod stuck in `Init` with `403 Forbidden` from S3/GCS | IRSA/Workload Identity missing the bucket-read permission |
| `ImagePullBackOff` on a worker pod | Registry credentials missing, or `imagePullSecrets` not wired |
| Worker logs show `OSError: Couldn't reach huggingface.co` | `clusterCache` URL typo or bucket missing the requested model |
| Chart install hangs on dependency download | Sub-charts (KEDA, kube-prometheus-stack, DCGM) trying to fetch from public Artifact Hub. Use `helm pull` with `--untar` and install the local copy. |

## What's Next

- [Kubernetes in GCP](/docs/deployment/cloud-gcp/) for the online quickstart this builds on
- [Kubernetes in AWS](/docs/deployment/cloud-aws/) for the EKS counterpart
- [Config GitOps Workflow](/docs/deployment/config-gitops/) for managing model configs without redeploying the chart
- [Upgrade Runbook](/docs/deployment/upgrades/) for rolling updates and rollback
