
SIE Telemetry

SIE (Search Inference Engine) collects anonymous usage telemetry to help us understand adoption: how many workers are running, which versions, on which hardware. Telemetry is on by default. You can disable it completely; see below.

What we collect

Every running sie-server worker sends a JSON heartbeat to telemetry.superlinked.com roughly once per hour. The payload contains exactly these fields:

| Field | Type | Example | Purpose |
| --- | --- | --- | --- |
| worker_id | UUID | a1b2c3d4-... | Identifies this worker process. Persisted to disk so it survives restarts on the same volume. |
| sent_at | ISO 8601 UTC | 2026-04-15T12:00:00+00:00 | When the worker sent this heartbeat. |
| event | enum | init, update, terminate | Lifecycle marker: startup, hourly check-in, graceful shutdown. |
| sie_version | string | 0.1.10 | Installed sie-server package version. |
| variant | string or null | cuda12-default | Build variant (platform + bundle). Null when running outside Helm. |
| os | string | linux | Operating system. |
| arch | string | amd64 | CPU architecture. |
| gpus | list of strings | ["NVIDIA L4"] | GPU model names detected via NVML. Empty on CPU-only machines. |
| deployment_env | string | production | Environment label (production, staging, development, ci). |
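Put together, a serialized heartbeat might look like the following. This is an illustrative sketch built from the field table; the values are examples, not output captured from a real worker:

```python
import json

# Illustrative heartbeat payload assembled from the documented fields.
# Values are examples only, not taken from a running sie-server worker.
heartbeat = {
    "worker_id": "a1b2c3d4-0000-0000-0000-000000000000",
    "sent_at": "2026-04-15T12:00:00+00:00",
    "event": "update",  # hourly check-in
    "sie_version": "0.1.10",
    "variant": "cuda12-default",
    "os": "linux",
    "arch": "amd64",
    "gpus": ["NVIDIA L4"],
    "deployment_env": "production",
}

print(json.dumps(heartbeat, indent=2))
```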

At the receiver, Vercel adds coarse geography from the request's egress IP: country, region, city. This reflects the egress point, not necessarily the data center location; cloud workloads behind NAT or VPN report the gateway's geography.

What we do not collect

  • IP addresses (the receiver reads Vercel geo headers; the raw IP is never stored)
  • Hostnames, cluster names, namespace names
  • API keys, tokens, or authentication material
  • Cloud account identifiers (AWS account ID, GCP project, etc.)
  • Request data, model inputs, or inference results
  • Which models are loaded

Why we collect it

We track version adoption rates, GPU hardware distribution, and rough geographic spread. This tells us which versions need backported fixes, which GPU types to optimize for, and where to focus documentation and support. Nothing more.

How to opt out

Any of these disables telemetry completely. No background task starts, no outbound requests, no file writes.

Environment variable (any deployment)

SIE_TELEMETRY_DISABLED=1

Accepts 1, true, or yes (case-insensitive).

DO_NOT_TRACK convention

DO_NOT_TRACK=1

Honors the consoledonottrack.com convention.
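A minimal sketch of how a sender can honor both switches. The helper name is hypothetical and the actual logic lives in observability/telemetry.py; only the variable names and accepted values come from the docs above:

```python
import os

# Accepted opt-out values, compared case-insensitively.
_TRUTHY = {"1", "true", "yes"}


def telemetry_disabled(env=os.environ):
    """Hypothetical helper: True if either opt-out variable is set."""
    for var in ("SIE_TELEMETRY_DISABLED", "DO_NOT_TRACK"):
        if env.get(var, "").strip().lower() in _TRUTHY:
            return True
    return False
```

When this returns True, nothing else runs: no background task, no outbound requests, no file writes.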

Helm chart

telemetry:
  enabled: false

This sets SIE_TELEMETRY_DISABLED=1 on the container.

Endpoint override

Enterprise customers can route telemetry through their own collectors:

SIE_TELEMETRY_URL=https://your-collector.example.com/api/telemetry

Or in the Helm chart:

telemetry:
  url: "https://your-collector.example.com/api/telemetry"

Data retention

Heartbeat data is retained for 24 months, then deleted by an automated monthly cleanup job.

Questions

Open an issue at github.com/superlinked/sie/issues or email support@superlinked.com.

The sender source code is at observability/telemetry.py.
