How Does Haystack Work with Embedding Models?
Haystack is an open-source framework for building NLP and RAG pipelines by connecting modular components (document stores, retrievers, readers, and generators) into composable directed acyclic graphs. It works with embedding models through embedder components that call the model at index time and at query time, while retrievers search over the resulting vectors. SIE integrates with Haystack as a self-hosted embedding backend, replacing managed API calls with GPU inference in your own cloud.
Why Haystack?
Haystack is the right choice when:
- You want a structured pipeline framework rather than writing retrieval logic from scratch
- You need to compose complex multi-hop or multi-stage pipelines (retrieve → rerank → generate → verify)
- You’re building production RAG systems and want pre-built components for evaluation, caching, and monitoring
- You want to swap components (different retrievers, LLMs, document stores) without rewriting pipeline logic
- You need multi-modal pipelines handling both text and images
Haystack’s component abstraction means you can prototype with OpenAI embeddings and switch to SIE self-hosted inference for production without changing pipeline logic.
Core Haystack concepts
Components: individual pipeline steps with typed inputs and outputs (InMemoryEmbeddingRetriever, SentenceTransformersTextEmbedder, OpenAIGenerator, etc.)
Pipeline: a directed acyclic graph of components connected by their input/output types
Document Store: the storage layer (InMemoryDocumentStore, QdrantDocumentStore, WeaviateDocumentStore, etc.)
Documents: Haystack’s data type representing a piece of text with metadata and an optional embedding vector
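To make the Document and Document Store concepts concrete, here is a minimal, dependency-free sketch of what embedding retrieval does: each stored document carries a vector, and a query vector is compared against those vectors by cosine similarity. The `Doc` class and the `cosine` and `top_k_by_cosine` functions are hypothetical names used for illustration only. This is not Haystack's implementation, and the 3-dimensional vectors are made up.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document: text, metadata, optional embedding."""
    content: str
    meta: dict = field(default_factory=dict)
    embedding: Optional[list] = None


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(query_embedding, docs, top_k=2):
    """Return the top_k documents most similar to the query vector."""
    # Documents that have not been embedded yet cannot be retrieved.
    scored = [
        (cosine(query_embedding, d.embedding), d)
        for d in docs
        if d.embedding is not None
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


# A toy "document store": a plain list of documents with made-up embeddings.
store = [
    Doc("Termination requires 30 days notice.", {"source": "contract"}, [0.9, 0.1, 0.0]),
    Doc("Payment is due within 14 days.", {"source": "contract"}, [0.1, 0.9, 0.0]),
    Doc("Draft without an embedding yet."),
]

hits = top_k_by_cosine([1.0, 0.0, 0.0], store, top_k=1)
print(hits[0].content)
```

In a real pipeline the embedder produces the vectors and the document store's retriever performs this search, typically with an index rather than a linear scan.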
How SIE integrates with Haystack
SIE acts as the embedding backend. Define two custom embedder components that call SIE, one for queries and one for documents:
    from haystack import component, Document, Pipeline
    from haystack.document_stores.in_memory import InMemoryDocumentStore
    from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
    from sie_sdk import SIEClient
    from sie_sdk.types import Item

    # Custom SIE embedder component
    @component
    class SIETextEmbedder:
        def __init__(self, model: str = "BAAI/bge-m3"):
            self.client = SIEClient("http://localhost:8080")
            self.model = model

        @component.output_types(embedding=list[float])
        def run(self, text: str):
            result = self.client.encode(self.model, Item(text=text), is_query=True)
            return {"embedding": result["dense"].tolist()}

    @component
    class SIEDocumentEmbedder:
        def __init__(self, model: str = "BAAI/bge-m3"):
            self.client = SIEClient("http://localhost:8080")
            self.model = model

        @component.output_types(documents=list[Document])
        def run(self, documents: list[Document]):
            texts = [doc.content for doc in documents]
            encode_results = self.client.encode(
                self.model,
                [Item(text=t) for t in texts],
            )
            for doc, res in zip(documents, encode_results):
                doc.embedding = res["dense"].tolist()
            return {"documents": documents}

Building an indexing pipeline with SIE + Haystack
    from haystack import Pipeline
    from haystack.components.writers import DocumentWriter

    document_store = InMemoryDocumentStore()

    # Indexing pipeline
    indexing_pipeline = Pipeline()
    indexing_pipeline.add_component("embedder", SIEDocumentEmbedder())
    indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
    indexing_pipeline.connect("embedder.documents", "writer.documents")

    # Index documents
    # chunks and sources are parallel lists (text chunks and their origins) prepared beforehand
    raw_docs = [Document(content=chunk, meta={"source": src}) for chunk, src in zip(chunks, sources)]
    indexing_pipeline.run({"embedder": {"documents": raw_docs}})

Building a RAG query pipeline with SIE + Haystack
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.builders import PromptBuilder

    PROMPT = """Answer the question using the provided context.
    Context:
    {% for doc in documents %}
    {{ doc.content }}
    {% endfor %}
    Question: {{ query }}"""

    rag_pipeline = Pipeline()
    rag_pipeline.add_component("query_embedder", SIETextEmbedder())
    rag_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=5))
    rag_pipeline.add_component("prompt_builder", PromptBuilder(template=PROMPT))
    rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))

    rag_pipeline.connect("query_embedder.embedding", "retriever.query_embedding")
    rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
    rag_pipeline.connect("prompt_builder.prompt", "llm.prompt")

    result = rag_pipeline.run({
        "query_embedder": {"text": "What are the termination conditions?"},
        "prompt_builder": {"query": "What are the termination conditions?"},
    })

    print(result["llm"]["replies"][0])

Using Haystack with Qdrant and SIE
For production, swap InMemoryDocumentStore for QdrantDocumentStore:
    from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
    from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever

    document_store = QdrantDocumentStore(
        url="http://localhost:6333",
        index="documents",
        embedding_dim=1024,  # BGE-M3 output dim
    )

    retriever = QdrantEmbeddingRetriever(document_store=document_store, top_k=20)

Haystack vs LangChain vs LlamaIndex
| | Haystack | LangChain | LlamaIndex |
|---|---|---|---|
| Pipeline model | Typed DAG | Chain / Agent | Query engine |
| Component typing | Strict | Loose | Medium |
| RAG focus | ✓ | General | ✓ |
| Evaluation tooling | Strong | Growing | Good |
| Production maturity | High | High | High |
| Best for | Production RAG, evaluation | Agents, diverse tasks | Document QA |
All three integrate well with SIE; the choice comes down to team familiarity and pipeline complexity.
Frequently asked questions
Does Haystack have a built-in SIE integration? The examples above call the SIE SDK from custom components. A native Haystack SIE integration is also available; see the SIE + Haystack integration guide for the current implementation.
Can Haystack pipelines be serialised and deployed? Yes. Haystack pipelines serialise to YAML, enabling reproducible deployments and version-controlled pipeline definitions.
Does Haystack support reranking with SIE? Yes. Add a custom reranker component between the retriever and the prompt builder that calls client.score() to re-order the retrieved documents.
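One possible shape for the reranking logic is sketched below, without Haystack or SIE imports so that it runs on its own. The `rerank` function and its `score_fn` parameter are hypothetical: `score_fn` stands in for a call such as `client.score()`, whose exact signature and return format are not shown in this article, so a toy word-overlap scorer (`overlap_score`, also hypothetical) is used in its place.

```python
from typing import Callable, Sequence


def rerank(
    query: str,
    texts: Sequence[str],
    score_fn: Callable[[str, Sequence[str]], Sequence[float]],
    top_k: int = 3,
) -> list:
    """Re-order candidate texts by relevance score and keep the best top_k."""
    scores = score_fn(query, texts)
    if len(scores) != len(texts):
        raise ValueError("score_fn must return one score per text")
    # Sort by score only; ties keep their original retrieval order.
    ranked = sorted(zip(scores, texts), key=lambda pair: pair[0], reverse=True)
    return [text for _, text in ranked[:top_k]]


def overlap_score(query: str, texts: Sequence[str]) -> list:
    """Toy scorer (assumption): number of query words that appear in each text."""
    query_words = set(query.lower().split())
    return [float(len(query_words & set(t.lower().split()))) for t in texts]


candidates = [
    "Payment is due within 14 days.",
    "Either party may terminate with 30 days notice.",
    "The termination conditions are listed in section 9.",
]

best = rerank("what are the termination conditions", candidates, overlap_score, top_k=2)
print(best[0])
```

In a pipeline, this logic would presumably live in the `run` method of a `@component` class that takes the query and the retriever's documents and returns `{"documents": ...}`, with `retriever.documents` connected to it and its output connected to `prompt_builder.documents`. Retrieving a larger candidate set (for example `top_k=20`) and reranking down to a handful is a common arrangement.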