How Does Chroma Work with Embedding Models?
Chroma is an open-source vector database designed for simplicity. It runs in-process in Python, needs no infrastructure setup for development, and can persist to disk or run as a server for production. Chroma does not have to compute embeddings itself: it accepts precomputed vectors at insert time (here generated by SIE) and serves cosine similarity search with metadata filtering. That makes it one of the fastest ways to prototype a RAG pipeline before graduating to a more scalable vector database.
Why Chroma?
Chroma’s key advantage is minimal friction:
- Zero-config local setup: `pip install chromadb` and you’re running
- In-process or client-server: run embedded for scripts, as a server for applications
- Simple, Pythonic API: designed to be readable and easy to understand
- LangChain and LlamaIndex native: first-class integration in both frameworks
- Good for prototyping: get a RAG pipeline working in an afternoon
The trade-off: Chroma lacks the performance, scalability, and advanced features (native hybrid search, multi-vector) of Qdrant or Weaviate at production scale.
How Chroma and SIE work together
```python
import chromadb
from sie_sdk import SIEClient
from sie_sdk.types import Item

sie = SIEClient("http://localhost:8080")
chroma = chromadb.Client()  # in-memory for dev; chromadb.PersistentClient() for disk

# Create collection
collection = chroma.create_collection(
    name="documents",
    metadata={"hnsw:space": "cosine"},
)

# Encode documents with SIE
encode_results = sie.encode("BAAI/bge-m3", [Item(text=c) for c in document_chunks])
vectors = [r["dense"] for r in encode_results]

# Insert: pass vectors directly, so Chroma never calls an embedding function.
# Dates are stored as integers (e.g. 20240101) because Chroma's range
# operators ($gt, $gte, $lt, $lte) compare numbers, not strings.
collection.add(
    ids=[str(i) for i in range(len(document_chunks))],
    embeddings=[v.tolist() for v in vectors],
    documents=document_chunks,
    metadatas=[{"source": s, "date": d} for s, d in zip(sources, dates)],
)

# Search
query_vector = sie.encode("BAAI/bge-m3", Item(text=user_query), is_query=True)["dense"]

results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={"date": {"$gte": 20240101}},  # metadata filter
)
```
Chroma in persistent / server mode
For applications (not just scripts), use persistent or server mode:
```python
# Persistent (single process, data saved to disk)
chroma = chromadb.PersistentClient(path="/path/to/chroma-db")

# Server mode (multi-process, suitable for production)
# Start server: chroma run --path /path/to/db
chroma = chromadb.HttpClient(host="localhost", port=8000)
```
Both modes use the same API, and the collection operations are identical.
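Because the collection operations are the same in every mode, the value returned by `collection.query` has the same nested shape everywhere: each key holds one list per query vector. The sketch below flattens that structure into one record per hit. The helper name `iter_hits` is hypothetical and not part of Chroma; it only assumes the `ids`, `documents`, `metadatas`, and `distances` keys of a query result.

```python
from typing import Any, Dict, Iterator


def iter_hits(results: Dict[str, Any], query_index: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per hit for a single query vector.

    Hypothetical helper: `results` is the dict returned by collection.query,
    where every populated key holds a list of per-query lists.
    """
    ids = results["ids"][query_index]

    def column(key: str) -> list:
        # Keys that were not requested via `include` may be missing or None.
        values = results.get(key)
        return values[query_index] if values is not None else [None] * len(ids)

    for doc_id, doc, meta, dist in zip(
        ids, column("documents"), column("metadatas"), column("distances")
    ):
        yield {"id": doc_id, "document": doc, "metadata": meta, "distance": dist}
```

With the collection configured for cosine space, `distance` is a cosine distance, so smaller values mean closer matches.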
Metadata filtering in Chroma
Chroma uses a MongoDB-style filter syntax:
```python
# Filter by exact match
results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={"source": "legal-contracts"},
)

# Filter with operators
# Assumes `date` was stored as an integer such as 20240101:
# Chroma's range operators ($gt, $gte, $lt, $lte) accept only numeric values.
results = collection.query(
    query_embeddings=[query_vector.tolist()],
    n_results=10,
    where={
        "$and": [
            {"date": {"$gte": 20240101}},
            {"category": {"$in": ["contract", "agreement"]}},
        ]
    },
)
```
Chroma vs Qdrant vs Weaviate
| | Chroma | Qdrant | Weaviate |
|---|---|---|---|
| Setup friction | Minimal | Low | Medium |
| Production scale | Limited | High | High |
| Hybrid search | ✗ | ✓ | ✓ |
| Multi-vector | ✗ | ✓ | ✓ |
| Performance | Good for small corpora | Excellent | Good |
| Persistent mode | ✓ | ✓ | ✓ |
| Best for | Prototyping, small projects | Production search | Structured data, GraphQL |
Recommendation: start with Chroma for prototyping, migrate to Qdrant for production. The SIE SDK and embedding pipeline remain identical; only the vector DB client changes.
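One way to keep that swap cheap is to hide the vector DB behind a small interface from the start. The sketch below is illustrative only: `VectorStore` and `InMemoryStore` are hypothetical names, and the in-memory backend exists just to exercise the interface. A Chroma-backed or Qdrant-backed class would implement the same two methods.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    """Hypothetical minimal interface the rest of the pipeline depends on."""

    def add(self, ids: Sequence[str], vectors: Sequence[Sequence[float]],
            texts: Sequence[str]) -> None: ...

    def search(self, vector: Sequence[float], k: int) -> List[Tuple[str, float]]: ...


@dataclass
class InMemoryStore:
    """Stand-in backend (brute-force cosine similarity) for demos and tests."""

    rows: Dict[str, Tuple[List[float], str]] = field(default_factory=dict)

    def add(self, ids, vectors, texts) -> None:
        for doc_id, vector, text in zip(ids, vectors, texts):
            self.rows[doc_id] = (list(vector), text)

    def search(self, vector, k):
        def cosine(a, b) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        scored = [(doc_id, cosine(vector, stored)) for doc_id, (stored, _) in self.rows.items()]
        # Highest similarity first, truncated to the k best matches.
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Application code that only accepts a `VectorStore` does not need to change when the backend does; the migration is then confined to writing one new adapter class.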
Migrating from Chroma to Qdrant
When your corpus outgrows Chroma or you need hybrid search, migration is straightforward because SIE produces the same vectors regardless of which vector DB you use:
```python
# Re-encoding is usually not needed if you stored the vectors.
# Just re-insert into Qdrant with the same vectors.

from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

qdrant = QdrantClient("http://localhost:6333")
qdrant.create_collection(
    "documents",
    # bge-m3 dense vectors are 1024-dimensional
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE),
)

# Fetch from Chroma, insert into Qdrant
chroma_data = collection.get(include=["embeddings", "documents", "metadatas"])
qdrant.upsert(
    "documents",
    points=[
        PointStruct(
            id=i,
            # plain floats, whether Chroma returns lists or NumPy arrays
            vector=[float(x) for x in emb],
            # meta is None for entries stored without metadata
            payload={"text": doc, **(meta or {})},
        )
        for i, (emb, doc, meta) in enumerate(zip(
            chroma_data["embeddings"], chroma_data["documents"], chroma_data["metadatas"]
        ))
    ],
)
```
Frequently asked questions
Is Chroma suitable for production?
For small corpora (<1M vectors) and internal tools, yes. For high-traffic search with large corpora or hybrid search requirements, Qdrant or Weaviate are better choices.
Does Chroma support hybrid search?
Not natively. Chroma only supports dense vector search. For hybrid search (dense + BM25), use Qdrant or Weaviate.
Can I use Chroma with LangChain and SIE?
Yes. Wrap SIE in a custom embedding class and pass it to LangChain’s Chroma vectorstore as its embedding_function; LangChain then calls SIE whenever it needs to embed documents or queries.
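A minimal sketch of such a wrapper follows. It exposes the two methods LangChain's embeddings interface calls, `embed_documents` and `embed_query`, and forwards them to the `sie.encode` calls shown earlier. The class name `SIEEmbeddings` and the `make_item` parameter are hypothetical; `make_item` is assumed to turn a string into an SIE `Item`, for example `lambda t: Item(text=t)`.

```python
from typing import Any, Callable, List


class SIEEmbeddings:
    """Hypothetical adapter exposing SIE through LangChain's two embedding methods."""

    def __init__(self, client: Any, make_item: Callable[[str], Any],
                 model: str = "BAAI/bge-m3") -> None:
        self._client = client        # e.g. SIEClient("http://localhost:8080")
        self._make_item = make_item  # e.g. lambda t: Item(text=t)
        self._model = model

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Batch-encode documents and keep only the dense vectors.
        results = self._client.encode(self._model, [self._make_item(t) for t in texts])
        return [[float(x) for x in r["dense"]] for r in results]

    def embed_query(self, text: str) -> List[float]:
        # Queries are encoded with is_query=True, as in the search example.
        result = self._client.encode(self._model, self._make_item(text), is_query=True)
        return [float(x) for x in result["dense"]]
```

An instance can then be handed to LangChain's Chroma vectorstore through its `embedding_function` argument. If your LangChain version insists on its own base class, subclassing its `Embeddings` interface with the same two methods should be a small change.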