Search & Retrieval

What is Semantic Search?

Semantic search is a retrieval technique that finds results based on the meaning of a query rather than exact keyword matches. Instead of matching words character-by-character, it converts text into vector embeddings and retrieves items whose embeddings are closest to the query’s embedding in high-dimensional space.

Why does semantic search matter?

Traditional keyword search breaks down when users phrase queries differently from how content is written. A search for “cheap flights” won’t match a document that says “affordable airfare”, even though they mean the same thing.

Semantic search addresses this by working at the level of meaning. Both “cheap flights” and “affordable airfare” map to nearby regions of embedding space, so they retrieve similar results. This makes search more robust to paraphrasing and synonyms and, with a multilingual model, to queries written in a different language from the content.
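
To make "similar regions in embedding space" concrete, here is a minimal sketch that compares toy vectors with cosine similarity. Cosine similarity is one common closeness measure and is an assumption here, not something the text above prescribes. The three-dimensional vectors are hand-made for illustration and were not produced by any embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (assumption: real embeddings have hundreds of dimensions)
query = [0.9, 0.1, 0.0]        # stands in for "cheap flights"
doc_airfare = [0.8, 0.2, 0.1]  # stands in for "affordable airfare"
doc_recipe = [0.0, 0.1, 0.9]   # stands in for an unrelated document

sim_airfare = cosine_similarity(query, doc_airfare)
sim_recipe = cosine_similarity(query, doc_recipe)

# The paraphrase scores far higher than the unrelated document
print(f"airfare: {sim_airfare:.3f}, recipe: {sim_recipe:.3f}")
```

A score near 1 means the two vectors point in almost the same direction; a score near 0 means they are unrelated.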

How does semantic search work?

1. Encoding: a text embedding model (e.g. BGE-M3, E5-large) converts your corpus of documents into dense vectors at index time.
2. Query encoding: at search time, the user’s query is encoded into a vector using the same model.
3. Nearest neighbour retrieval: an approximate nearest neighbour (ANN) algorithm (e.g. HNSW) finds the corpus vectors closest to the query vector.
4. Optional reranking: a cross-encoder reranker rescores the top-k results for higher precision.
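
The retrieval and reranking steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses exact brute-force search where a production system would use an ANN index such as HNSW, the vectors are hand-made rather than model output, and the rerank scores are hypothetical placeholders for what a cross-encoder would return.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, corpus_vecs, k):
    """Exact nearest-neighbour search: score every corpus vector, keep the k best."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def rerank(candidate_ids, score_fn):
    """Re-order candidates using a second, more expensive scoring function."""
    return sorted(candidate_ids, key=score_fn, reverse=True)

# Toy corpus vectors (assumption: computed at index time by an embedding model)
corpus_vecs = [
    [0.8, 0.2, 0.1],  # doc 0
    [0.0, 0.1, 0.9],  # doc 1
    [0.7, 0.3, 0.0],  # doc 2
]
query_vec = [0.9, 0.1, 0.0]  # toy query vector

# Step 3: retrieve the two closest documents
hits = top_k(query_vec, corpus_vecs, k=2)
candidate_ids = [doc_id for doc_id, _ in hits]

# Step 4: hypothetical scores standing in for a cross-encoder's output
placeholder_rerank_scores = {0: 0.4, 2: 0.9}
reranked = rerank(candidate_ids, lambda doc_id: placeholder_rerank_scores[doc_id])
```

Brute-force search scores every document, so its cost grows linearly with corpus size. ANN indexes trade a small amount of recall for much faster lookups, which is why they are the usual choice for large corpora.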

The quality of semantic search depends heavily on the choice of embedding model. Models trained specifically on retrieval pairs (queries matched with relevant passages) generally outperform general-purpose models on retrieval tasks.

How do you implement semantic search with SIE?

SIE provides self-hosted inference for the embedding and reranking steps, giving you full control over model selection and keeping data within your own cloud.

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

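# Example corpus (placeholder documents; substitute your own)
documents = [
    "Find affordable airfare to Europe this summer.",
    "How to bake sourdough bread at home.",
]

# Encode documents at index time
doc_vectors = [
    r["dense"]
    for r in client.encode("BAAI/bge-m3", [Item(text=d) for d in documents])
]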

# Encode query at search time
query_vector = client.encode(
    "BAAI/bge-m3",
    Item(text="what is affordable airfare?"),
    is_query=True,
)["dense"]

# Pass vectors to your vector DB for ANN retrieval

SIE supports 100+ embedding models, including multilingual, multi-vector, and instruction-following variants, so you can pick the right model for your retrieval task without changing your infrastructure.

Semantic search vs keyword search vs hybrid search

| | Keyword search | Semantic search | Hybrid search |
|---|---|---|---|
| Matches on | Exact terms | Meaning / intent | Both |
| Handles synonyms | ✗ | ✓ | ✓ |
| Handles rare terms | ✓ | ✗ | ✓ |
| Requires embedding model | ✗ | ✓ | ✓ |
| Best for | Exact lookups | Natural language queries | Most production use cases |

For most production systems, hybrid search combining BM25 and semantic retrieval outperforms either approach alone.
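
One way to combine a BM25 ranking with a semantic ranking is reciprocal rank fusion (RRF). The sketch below is an illustration only: the text above does not say which fusion method to use, the document IDs are made up, and `k=60` is the constant commonly used with RRF rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical result lists, best match first
bm25_ranking = ["doc_sku", "doc_a", "doc_b"]      # keyword retriever
semantic_ranking = ["doc_a", "doc_c", "doc_sku"]  # embedding retriever

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

RRF uses only rank positions, so it needs no calibration between BM25 scores and vector similarities, which sit on different scales.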

Frequently asked questions

What’s the difference between semantic search and vector search? Vector search is the retrieval mechanism (searching by vector similarity). Semantic search is the broader capability; it uses vector search as its engine, with text embedding models to generate the vectors.

Does semantic search work in languages other than English? Yes. Multilingual models such as BGE-M3 support 100+ languages. SIE lets you self-host these models, so multilingual queries are handled without data leaving your infrastructure.

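How accurate is semantic search compared to keyword search? On natural language queries, semantic search typically achieves higher recall than keyword search, because it can match paraphrases and synonyms that share no terms with the query. For highly specific technical terms or product codes, where exact matching matters, hybrid search is recommended.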
