How Does Chroma Work with Embedding Models?

Chroma is an open-source vector database designed for simplicity. It runs in-process with Python, requires no infrastructure setup for development, and can persist to disk or run as a server for production. Chroma does not have to generate embeddings itself: you compute vectors with SIE, pass them in at insert time, and then run cosine similarity search with metadata filtering. That makes Chroma one of the quickest ways to prototype a RAG pipeline before graduating to a more scalable vector database.

Why Chroma?

Chroma’s key advantage is minimal friction:

Zero-config local setup: pip install chromadb and you’re running
In-process or client-server: run embedded for scripts, as a server for applications
Simple, Pythonic API: designed to be readable and easy to understand
LangChain and LlamaIndex native: first-class integration in both frameworks
Good for prototyping: get a RAG pipeline working in an afternoon

The trade-off: Chroma lacks the performance, scalability, and advanced features (native hybrid search, multi-vector) of Qdrant or Weaviate at production scale.

How Chroma and SIE work together

import chromadb
from sie_sdk import SIEClient
from sie_sdk.types import Item

sie = SIEClient("http://localhost:8080")
chroma = chromadb.Client()  # in-memory for dev; chromadb.PersistentClient() for disk

# Create the collection (get_or_create_collection avoids an error if it already exists)
collection = chroma.get_or_create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"}
)

# Encode documents with SIE (document_chunks, sources, and dates are parallel lists from your own ingestion step)
encode_results = sie.encode("BAAI/bge-m3", [Item(text=c) for c in document_chunks])
vectors = [r["dense"] for r in encode_results]

# Insert: pass precomputed vectors directly, so Chroma never calls its own embedding function.
# Dates are stored as YYYYMMDD integers because Chroma's range operators
# ($gt, $gte, $lt, $lte) accept only numeric values; dates is assumed to hold ISO strings.
collection.add(
    ids=[str(i) for i in range(len(document_chunks))],
    embeddings=[v.tolist() for v in vectors],
    documents=document_chunks,
    metadatas=[{"source": s, "date": int(d.replace("-", ""))} for s, d in zip(sources, dates)]
)

# Search
query_vector = sie.encode("BAAI/bge-m3", Item(text=user_query), is_query=True)["dense"]

results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={"date": {"$gte": 20240101}}  # metadata filter: dated 2024-01-01 or later
)
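
Chroma’s query() returns a dict of parallel lists with one inner list per query embedding, so the hits for the single query above sit at index 0. The following is a minimal sketch of flattening that structure into one record per hit. The helper name flatten_hits is hypothetical, and the sample dict is hand-written data in the shape of a response, not real output.

```python
from typing import Any, Dict, List


def flatten_hits(results: Dict[str, Any], query_index: int = 0) -> List[Dict[str, Any]]:
    """Turn one query's slice of a Chroma-style result dict into a list of hit records."""
    ids = results["ids"][query_index]

    def column(key: str) -> List[Any]:
        # Fields that were not requested via `include` may be missing or None.
        values = results.get(key)
        return values[query_index] if values is not None else [None] * len(ids)

    return [
        {"id": i, "document": doc, "metadata": meta, "distance": dist}
        for i, doc, meta, dist in zip(
            ids, column("documents"), column("metadatas"), column("distances")
        )
    ]


# Hand-written sample in the shape of a query() response (illustrative values only).
sample = {
    "ids": [["3", "7"]],
    "documents": [["First matching chunk", "Second matching chunk"]],
    "metadatas": [[{"source": "legal-contracts"}, {"source": "legal-contracts"}]],
    "distances": [[0.12, 0.31]],
}
hits = flatten_hits(sample)
for hit in hits:
    print(hit["id"], hit["distance"], hit["metadata"]["source"])
```

With the cosine space configured on the collection, the reported distance is 1 minus the cosine similarity, so smaller values mean closer matches.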

Chroma in persistent / server mode

For applications (not just scripts), use persistent or server mode:

# Persistent (single process, data saved to disk)
chroma = chromadb.PersistentClient(path="/path/to/chroma-db")

# Server mode (multi-process, suitable for production)
# Start server: chroma run --path /path/to/db
chroma = chromadb.HttpClient(host="localhost", port=8000)

Both modes use the same API, and the collection operations are identical.
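
Because every client exposes the same collection API, the choice of mode can live in configuration rather than in code. The sketch below shows one way to do that. The environment variable names (CHROMA_MODE, CHROMA_PATH, CHROMA_HOST, CHROMA_PORT) and both function names are assumptions made for this example, not Chroma conventions.

```python
import os
from typing import Any, Dict, Mapping, Tuple


def resolve_client_config(env: Mapping[str, str]) -> Tuple[str, Dict[str, Any]]:
    """Map environment-style settings to a chromadb client constructor name and its kwargs."""
    mode = env.get("CHROMA_MODE", "memory")
    if mode == "memory":
        return "Client", {}
    if mode == "persistent":
        return "PersistentClient", {"path": env.get("CHROMA_PATH", "./chroma-db")}
    if mode == "server":
        return "HttpClient", {
            "host": env.get("CHROMA_HOST", "localhost"),
            "port": int(env.get("CHROMA_PORT", "8000")),
        }
    raise ValueError(f"unknown CHROMA_MODE: {mode!r}")


def make_client(env: Mapping[str, str] = os.environ):
    """Build the selected client. chromadb is imported lazily so the
    configuration logic can be checked on machines without it installed."""
    import chromadb

    constructor_name, kwargs = resolve_client_config(env)
    return getattr(chromadb, constructor_name)(**kwargs)
```

A script would call make_client() once at startup and use the returned client exactly as in the examples above, whichever mode is selected.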

Metadata filtering in Chroma

Chroma uses a MongoDB-style filter syntax:

# Filter by exact match
results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={"source": "legal-contracts"}
)

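# Filter with operators
# Range operators ($gt, $gte, $lt, $lte) need numeric values, so this assumes
# dates are stored as YYYYMMDD integers rather than ISO strings.
results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={
        "$and": [
            {"date": {"$gte": 20240101}},
            {"category": {"$in": ["contract", "agreement"]}}
        ]
    }
)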
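
Two details are easy to trip over when building filters programmatically: range operators accept only numeric operands, and $and expects at least two sub-expressions. The sketch below handles both. It assumes dates are stored as YYYYMMDD integers, and the helper names date_to_int and build_where are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List, Optional, Sequence


def date_to_int(value: str) -> int:
    """Convert an ISO date such as '2024-01-01' to the sortable integer 20240101."""
    parsed = date.fromisoformat(value)
    return parsed.year * 10000 + parsed.month * 100 + parsed.day


def build_where(
    source: Optional[str] = None,
    since: Optional[str] = None,
    categories: Optional[Sequence[str]] = None,
) -> Optional[Dict[str, Any]]:
    """Assemble a Chroma `where` filter from optional conditions."""
    clauses: List[Dict[str, Any]] = []
    if source is not None:
        clauses.append({"source": source})  # exact match
    if since is not None:
        clauses.append({"date": {"$gte": date_to_int(since)}})  # numeric range
    if categories:
        clauses.append({"category": {"$in": list(categories)}})
    if not clauses:
        return None  # no filter at all
    if len(clauses) == 1:
        return clauses[0]  # a lone clause must not be wrapped in $and
    return {"$and": clauses}


where = build_where(since="2024-01-01", categories=["contract", "agreement"])
print(where)
```

The returned value can be passed directly as the where argument of collection.query(); a result of None means no metadata filter is applied.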

Chroma vs Qdrant vs Weaviate

	Chroma	Qdrant	Weaviate
Setup friction	Minimal	Low	Medium
Production scale	Limited	High	High
Hybrid search	✗	✓	✓
Multi-vector	✗	✓	✓
Performance	Good for small corpora	Excellent	Good
Persistent mode	✓	✓	✓
Best for	Prototyping, small projects	Production search	Structured data, GraphQL

Recommendation: start with Chroma for prototyping, migrate to Qdrant for production. The SIE SDK and embedding pipeline remain identical; only the vector DB client changes.

Migrating from Chroma to Qdrant

When your corpus outgrows Chroma or you need hybrid search, migration is straightforward because SIE produces the same vectors regardless of which vector DB you use:

# Re-encode is usually not needed if you stored the vectors
# Just re-insert into Qdrant with the same vectors

from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

qdrant = QdrantClient("http://localhost:6333")
qdrant.create_collection("documents", vectors_config=VectorParams(size=1024, distance=Distance.COSINE))

# Fetch from Chroma, insert into Qdrant
chroma_data = collection.get(include=["embeddings", "documents", "metadatas"])
qdrant.upsert("documents", points=[
    # Convert each embedding to plain floats (Chroma may return NumPy arrays)
    # and tolerate records stored without metadata.
    PointStruct(id=i, vector=[float(x) for x in emb], payload={"text": doc, **(meta or {})})
    for i, (emb, doc, meta) in enumerate(zip(
        chroma_data["embeddings"], chroma_data["documents"], chroma_data["metadatas"]
    ))
])
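
The snippet above uses list positions as Qdrant IDs and sends everything in one request, which suits the small, numerically keyed collection built in this article. Qdrant point IDs must be unsigned integers or UUIDs, so arbitrary Chroma string IDs need a mapping, and larger collections are better sent in batches. The sketch below shows one possible approach. The names to_point_id, build_points, iter_batches, and MIGRATION_NAMESPACE are hypothetical, and the namespace value and batch size are arbitrary choices.

```python
import uuid
from typing import Any, Dict, Iterator, List, Sequence

# Fixed namespace so the same Chroma ID always maps to the same Qdrant ID (arbitrary value).
MIGRATION_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def to_point_id(chroma_id: str) -> str:
    """Derive a deterministic UUID string from an arbitrary Chroma string ID."""
    return str(uuid.uuid5(MIGRATION_NAMESPACE, chroma_id))


def build_points(chroma_data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build plain-dict point records from the result of collection.get()."""
    return [
        {
            "id": to_point_id(cid),
            "vector": [float(x) for x in emb],
            # Keep the original ID in the payload so records stay traceable.
            "payload": {"text": doc, "chroma_id": cid, **(meta or {})},
        }
        for cid, emb, doc, meta in zip(
            chroma_data["ids"],
            chroma_data["embeddings"],
            chroma_data["documents"],
            chroma_data["metadatas"],
        )
    ]


def iter_batches(items: Sequence[Any], size: int = 256) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hand-written sample in the shape of a get() response (illustrative values only).
sample = {
    "ids": ["doc-a", "doc-b", "doc-c"],
    "embeddings": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "documents": ["a", "b", "c"],
    "metadatas": [{"source": "x"}, None, {"source": "y"}],
}
points = build_points(sample)
batches = list(iter_batches(points, size=2))

# With a live Qdrant client, each batch would be sent like this:
# for batch in iter_batches(build_points(chroma_data)):
#     qdrant.upsert("documents", points=[PointStruct(**p) for p in batch])
```

Deterministic IDs make the migration safe to re-run: upserting the same record twice overwrites the point instead of duplicating it.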

Frequently asked questions

Is Chroma suitable for production? For small corpora (<1M vectors) and internal tools, yes. For high-traffic search with large corpora or hybrid search requirements, Qdrant or Weaviate are better choices.

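Does Chroma support hybrid search? Not natively. Chroma ranks results by dense vector similarity only; its where_document argument can filter on document text, but that filter does not contribute a keyword relevance score. For hybrid search (dense + BM25), use Qdrant or Weaviate.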

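Can I use Chroma with LangChain and SIE? Yes. LangChain’s Chroma vectorstore takes an embedding_function argument: supply an object that implements LangChain’s Embeddings interface (embed_documents and embed_query) and calls SIE internally, and LangChain will use it for both indexing and querying.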
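
A minimal sketch of such an adapter follows. The class name SIEEmbeddings is hypothetical, and the class is duck-typed to keep the example free of dependencies; in a real project it would more likely subclass LangChain’s Embeddings base class. It takes two callables so the SIE calls can be wired in exactly as they appear earlier in this article.

```python
from typing import Callable, Iterable, List, Sequence


class SIEEmbeddings:
    """Exposes the two methods LangChain calls on an embedding object."""

    def __init__(
        self,
        encode_documents: Callable[[List[str]], Sequence[Iterable[float]]],
        encode_query: Callable[[str], Iterable[float]],
    ) -> None:
        self._encode_documents = encode_documents
        self._encode_query = encode_query

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Convert every vector to a plain list of floats, whatever array type comes back.
        return [[float(x) for x in vec] for vec in self._encode_documents(list(texts))]

    def embed_query(self, text: str) -> List[float]:
        return [float(x) for x in self._encode_query(text)]


# Wiring with the SIE calls used in this article (commented out so the sketch runs standalone):
# embeddings = SIEEmbeddings(
#     encode_documents=lambda texts: [
#         r["dense"] for r in sie.encode("BAAI/bge-m3", [Item(text=t) for t in texts])
#     ],
#     encode_query=lambda text: sie.encode("BAAI/bge-m3", Item(text=text), is_query=True)["dense"],
# )
# vectorstore = Chroma(collection_name="documents", embedding_function=embeddings)  # langchain_chroma

# Stand-in encoders that show the adapter's behaviour without a running SIE server.
fake = SIEEmbeddings(
    encode_documents=lambda texts: [(len(t), 0) for t in texts],
    encode_query=lambda text: (len(text), 1),
)
doc_vectors = fake.embed_documents(["ab", "abcd"])
query_vector = fake.embed_query("abc")
```

Keeping the query path separate from the document path preserves the is_query=True distinction that the SIE examples above rely on.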