What is a Vector Database?

A vector database is a database purpose-built for storing, indexing, and querying high-dimensional numerical vectors. Unlike traditional databases that query by exact value or keyword, vector databases find the nearest vectors to a query vector using approximate nearest neighbour (ANN) algorithms — enabling semantic similarity search at scale. They are the storage and retrieval layer in semantic search and RAG systems.


Why do vector databases exist?

Standard databases (PostgreSQL, MongoDB, Elasticsearch) were not originally designed for similarity search over millions of high-dimensional vectors. Exact nearest neighbour search over 768-dimensional vectors becomes prohibitively slow at scale, because each query vector must be compared against every stored vector and the cost grows linearly with corpus size.
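
To make the cost of exact search concrete, here is a minimal sketch in plain Python. The function names (`cosine_similarity`, `exact_search`) and the toy 3-dimensional corpus are illustrative assumptions, not part of any library. Every query touches every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_search(query, corpus, k):
    """Score the query against every stored vector: O(N * d) work per query."""
    scored = [(cosine_similarity(query, vec), idx) for idx, vec in enumerate(corpus)]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]

# Toy corpus (hypothetical values); real embeddings have hundreds of dimensions.
corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
top2 = exact_search([1.0, 0.05, 0.0], corpus, k=2)
```

With four vectors this is instant. With hundreds of millions, the same loop runs hundreds of millions of times per query, which is the work an ANN index is built to avoid.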

Vector databases solve this with specialised ANN index structures (HNSW, IVF), often combined with vector compression such as product quantisation (PQ), that trade a small amount of accuracy for orders-of-magnitude faster search. They also provide:

  • Filtered search — combine vector similarity with metadata filters (e.g. “find similar documents from the last 30 days”)
  • Hybrid search — combine dense vector search with sparse BM25-style search in one query
  • Scalar storage — store the original text and metadata alongside the vectors
  • CRUD operations — update and delete vectors as your corpus changes
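
Hybrid search, from the list above, needs some way to merge a dense ranking with a sparse one. A minimal sketch of reciprocal rank fusion (RRF), one common merging method, is shown here. This is an illustration only: the document IDs are hypothetical, and any given database may fuse results differently.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

dense_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search ranking
sparse_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Documents that rank well in both lists rise to the top, and RRF uses only ranks, so the two scoring scales never need to be normalised against each other.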

How does a vector database work?

At index time:

  1. Receive vectors (produced by an embedding model like BGE-M3 via SIE)
  2. Build an ANN index over the vectors (HNSW is most common)
  3. Store vectors alongside the original text and metadata

At query time:

  1. Receive a query vector (encoded by the same embedding model)
  2. Traverse the ANN index to find the approximate k nearest vectors
  3. Return the matching text, metadata, and similarity scores
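
    from qdrant_client import models

    # Indexing: one point per chunk, carrying its vector and payload
    qdrant.upsert(
        collection_name="documents",
        points=[
            models.PointStruct(id=doc_id, vector=vector, payload={"text": chunk, "date": date})
            for doc_id, vector, chunk, date in zip(ids, vectors, chunks, dates)
        ],
    )

    # Querying: top 20 neighbours dated on or after 2024-01-01.
    # Assumes "date" payloads are RFC 3339 strings and a client recent enough
    # to offer query_points (older clients expose search(query_vector=...)).
    results = qdrant.query_points(
        collection_name="documents",
        query=query_vector,
        query_filter=models.Filter(must=[
            models.FieldCondition(
                key="date", range=models.DatetimeRange(gte="2024-01-01T00:00:00Z")
            ),
        ]),
        limit=20,
    ).points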
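
To show what "traverse the ANN index" means in practice, here is a toy inverted-file (IVF) index in plain Python. IVF is used instead of HNSW only because it fits in a few lines. The class name `ToyIVF`, the hand-picked centroids, and the 2-D points are all hypothetical. Real systems learn centroids with k-means and work in far higher dimensions.

```python
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVF:
    """Inverted-file index: vectors are bucketed by their nearest centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = [[] for _ in centroids]  # one posting list per centroid

    def _nearest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda c: l2(vec, self.centroids[c]))
        return order[:n]

    def add(self, vec_id, vec):
        bucket = self._nearest_centroids(vec, 1)[0]
        self.lists[bucket].append((vec_id, vec))

    def search(self, query, k, nprobe=1):
        """Scan only the nprobe closest buckets instead of the whole corpus."""
        candidates = []
        for bucket in self._nearest_centroids(query, nprobe):
            candidates.extend(self.lists[bucket])
        candidates.sort(key=lambda item: l2(query, item[1]))
        return [vec_id for vec_id, _ in candidates[:k]]

index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
points = {1: [0.5, 0.2], 2: [1.0, 1.5], 3: [9.0, 9.5], 4: [10.5, 10.0], 5: [4.9, 4.9]}
for vec_id, vec in points.items():
    index.add(vec_id, vec)

fast = index.search([5.2, 5.2], k=2, nprobe=1)   # scans one bucket
exact = index.search([5.2, 5.2], k=2, nprobe=2)  # scans every bucket
```

Here `fast` returns vectors 3 and 4, while `exact` returns 5 and 2. The true nearest neighbour (vector 5) sits just across a bucket boundary, so probing a single bucket misses it. Raising `nprobe` recovers it at the cost of scanning more vectors, which is the accuracy-for-speed trade that ANN indexes make.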

Major vector databases compared

Database | Open source | Hybrid search | Multi-vector | Managed cloud | Best for
Qdrant   |             |               |              |               | Performance, Rust-based
Weaviate |             |               |              |               | GraphQL API, modules
Chroma   |             | Limited       |              |               | Simplicity, local dev
Pinecone |             |               |              | ✓ (only)      | Managed, easy setup
Milvus   |             |               |              |               | Large scale, enterprise
pgvector |             | Limited       |              | ✓ (via RDS)   | Existing PostgreSQL users

SIE has integration guides for Qdrant, Weaviate, and Chroma.


Vector database vs traditional database + pgvector

pgvector is a PostgreSQL extension that adds vector similarity search. It’s a good starting point but has limitations at scale:

                | pgvector            | Purpose-built vector DB
Setup           | Easy (existing PG)  | Separate deployment
Scale           | Millions of vectors | Hundreds of millions+
ANN performance | Good (HNSW support) | Optimised, faster
Hybrid search   | Limited             | Native
Filtering       | Full SQL            | Purpose-built
For prototyping or small corpora (<1M vectors), pgvector is practical. For production search systems at larger scale, a purpose-built vector DB generally offers faster ANN search, native hybrid search, and more room to grow.
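
For a sense of what the pgvector route looks like, here is a hedged sketch of the SQL involved, held in Python strings. The table name `chunks`, its columns, and the helper `to_pgvector_literal` are hypothetical. The sketch assumes a pgvector version with HNSW support and a driver such as psycopg to execute the statements, which is not shown.

```python
def to_pgvector_literal(vec):
    """Format a Python list as pgvector's text input, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

# Hypothetical schema: table and column names are placeholders.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    body text,
    created_at date,
    embedding vector(768)
);
CREATE INDEX IF NOT EXISTS chunks_embedding_idx
    ON chunks USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine-distance operator; the WHERE clause is ordinary SQL.
SEARCH_SQL = """
SELECT id, body, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM chunks
WHERE created_at >= %(since)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s;
"""

params = {
    "query": to_pgvector_literal([0.1, 0.2, 0.3]),  # real vectors must match vector(768)
    "since": "2024-01-01",
    "limit": 10,
}
# With a driver such as psycopg: cursor.execute(SEARCH_SQL, params)
```

The metadata filter is just a `WHERE` clause, which is the "Full SQL" filtering advantage noted in the comparison above.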


How does SIE work with vector databases?

SIE produces the vectors; the vector database stores and retrieves them. They’re complementary:

    from qdrant_client import QdrantClient, models
    from sie_sdk import SIEClient
    from sie_sdk.types import Item

    sie = SIEClient("http://localhost:8080")
    qdrant = QdrantClient("http://localhost:6333")

    # Encode documents with SIE
    encode_results = sie.encode("BAAI/bge-m3", [Item(text=c) for c in document_chunks])
    vectors = [r["dense"] for r in encode_results]

    # Store in Qdrant (the "docs" collection must already exist)
    qdrant.upsert(
        collection_name="docs",
        points=[
            models.PointStruct(id=i, vector=v.tolist(), payload={"text": c})
            for i, (v, c) in enumerate(zip(vectors, document_chunks))
        ],
    )

    # Search: encode the query with the same model, then fetch the 10 nearest chunks
    # (older qdrant-client versions expose search(query_vector=...) instead)
    query_vector = sie.encode("BAAI/bge-m3", Item(text=user_query), is_query=True)["dense"]
    results = qdrant.query_points("docs", query=query_vector.tolist(), limit=10).points

Choosing the right vector database

Key questions to guide your decision:

  • Scale — how many vectors now, and in 12 months?
  • Filtering needs — do you need complex metadata filters alongside vector search?
  • Hybrid search — do you need BM25 + vector combined?
  • Deployment — self-hosted or managed cloud?
  • Multi-vector — do you need ColBERT-style token-level retrieval?
  • Existing stack — does your team already use PostgreSQL (pgvector) or Elasticsearch?
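
For the scale question, a back-of-the-envelope memory estimate is a useful first step. The sketch below counts only the raw float32 vectors. Index structures, payloads, and replication add more on top, and the corpus sizes are hypothetical.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Bytes needed to hold the raw vectors alone (float32 by default)."""
    return num_vectors * dims * bytes_per_value

def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / (1024 ** 3)

# Hypothetical corpus sizes with 1024-dimensional float32 vectors.
for n in (1_000_000, 100_000_000):
    print(f"{n:>11,} vectors: {to_gib(raw_vector_bytes(n, 1024)):.1f} GiB")
```

One million 1024-dimensional vectors come to roughly 3.8 GiB, which fits comfortably on a single machine. One hundred million come to roughly 381.5 GiB, which is where quantisation, disk-backed indexes, or sharding start to matter.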

For new production deployments with SIE, Qdrant is the most commonly recommended choice: open source, high performance, native hybrid search, and multi-vector support.
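
Multi-vector support matters because ColBERT-style retrieval scores a document by its token-level vectors, not by a single embedding. A minimal sketch of that late-interaction (MaxSim) scoring follows. The 2-D token embeddings are hypothetical, and a database with native multi-vector support would compute this internally.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: for each query token vector, take its best
    match among the document's token vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Hypothetical 2-D token embeddings; real models emit one vector per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.1, 0.8]]   # has a good match for both query tokens
doc_b = [[1.0, 0.0], [0.9, 0.0]]   # matches only the first query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

`doc_a` outscores `doc_b` because it matches both query tokens reasonably well, even though `doc_b` contains a perfect match for one of them.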


Frequently asked questions

Is a vector database the same as a vector store?
The terms are often used interchangeably. “Vector store” is sometimes used informally for simpler, in-memory implementations (like FAISS). “Vector database” implies a production system with persistence, CRUD, and querying capabilities.

Can I use a vector database without an embedding model?
You need vectors to populate it. You can generate vectors using any embedding model, such as SIE, OpenAI, or Cohere. The vector DB is agnostic to how the vectors were generated.

Do vector databases replace traditional search engines like Elasticsearch?
For semantic search use cases, yes. But many teams use both: Elasticsearch for keyword/structured search and a vector DB for semantic search, combining results via hybrid retrieval.

