Self-hosted inference
for search & document processing
# Configure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
> terraform apply
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
# Configure
module "sie" {
  source = "superlinked/sie/google"
  region = "us-central1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
> terraform apply
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
# Run
> docker run -p 8080:8080 ghcr.io/superlinked/sie-server

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
Works with your favorite tools
Browse integrations
"Chroma makes context engineering simple. SIE adds instruction-following rerankers and relationship extractors for even more precise retrieval."
"LanceDB centralizes multi-modal training datasets and with SIE you can self-host inference for all the required data transformations."
"Modern search systems compose the best indexing, scoring, filtering and ranking models. With SIE you can self-host them all in one cluster."
"Weaviate's Query Agent unlocks natural language search and with SIE you can pre-process your query and data for better latency."
Benefits of self-hosted inference
Pay for your own GPUs instead of per-token API pricing. Improve GPU utilization and stability vs. custom TEI/Infinity deployments.
Boost accuracy with the latest task-specific open-source models: embeddings, rerankers, and extractors, including multi-modal and multi-vector.
Data never leaves your AWS/GCP environment. You pick the models and configurations. SOC 2 Type 2 certified. Apache 2.0 licensed.
Learn from our example apps
Browse examples
SIE: Superlinked Inference Engine
Run all your search & document processing inference in one centralized cluster, across teams and workloads.
Build your apps
> pip install sie-sdk
> npm install @superlinked/sie-sdk
and 5+ framework integrations
Manage models & configurations via SDK
client.list_models()
Deploy the cluster
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster
Observe with cloud-native tools, Grafana, and
> sie-top
Create the infrastructure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}
Deploy
> terraform apply
Plan your self-deployment
How SIE fits in your stack
See where SIE sits in a typical retrieval pipeline alongside vector databases, orchestration frameworks, and your application layer.
Cost comparison
Compare across models, GPU types, and cloud providers.
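As a back-of-the-envelope illustration of GPU-vs-API pricing, the break-even point is just a throughput calculation. All figures below are placeholder assumptions for illustration, not real quotes:

```python
def breakeven_tokens_per_hour(gpu_cost_per_hour: float,
                              api_cost_per_million_tokens: float) -> float:
    """Tokens/hour above which a dedicated GPU beats per-token API pricing."""
    return gpu_cost_per_hour / api_cost_per_million_tokens * 1_000_000


# Placeholder figures: a $1.20/hr GPU vs. a $0.02-per-million-token
# embedding API works out to roughly 60 million tokens/hour.
tokens = breakeven_tokens_per_hour(1.20, 0.02)
```

Below that throughput the API is cheaper; above it, a reserved GPU wins — which is why utilization (batching, centralizing workloads across teams) dominates the comparison.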