From BM25 to Mixture of Encoders: The Evolution of Search & Retrieval Systems

At Haystack EU 2025 in Berlin, Superlinked’s Filip Makraduli presented “From BM25 to Mixture of Encoders: Evaluations for Next-Gen Search and Retrieval Systems.” The session explored how modern queries demand a deeper understanding of both structured and unstructured data, and how Superlinked’s mixture of encoders approach addresses this shift.

Key Takeaways

The Limits of Traditional Keyword Search

Many production systems still rely on keyword-based methods such as BM25. While effective for pure text queries, they struggle when a user’s intent includes multiple data types. Consider the query “5 guests under $200 with 4.8+ rating.” It blends numerical constraints, categorical filters, and descriptive text. Traditional text embeddings fail to fully capture that context.
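To make the mix of data types concrete, here is a toy sketch that splits that example query into the numeric, categorical, and textual signals it blends. The regex patterns and attribute names (`guests`, `max_price`, `min_rating`) are illustrative assumptions, not part of any real parser; production systems would use learned query understanding rather than regexes.

```python
import re

def parse_query(query: str) -> dict:
    """Naive sketch: pull the numeric constraints out of a mixed query.
    A single text embedding would blur these into one vector instead."""
    constraints = {}
    if m := re.search(r"(\d+)\s+guests?", query):
        constraints["guests"] = int(m.group(1))        # numeric constraint
    if m := re.search(r"under \$(\d+)", query):
        constraints["max_price"] = int(m.group(1))     # numeric constraint
    if m := re.search(r"([\d.]+)\+\s*rating", query):
        constraints["min_rating"] = float(m.group(1))  # numeric constraint
    constraints["text"] = query                        # descriptive remainder
    return constraints

print(parse_query("5 guests under $200 with 4.8+ rating"))
```

Even this crude decomposition shows why one dense text vector is a poor fit: each extracted constraint is a hard filter or a scaled quantity, not a fuzzy semantic neighbour.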

For a deeper explanation of how mixed data types can be handled in vector space, read Multi-Attribute Semantic Search on VectorHub.

Evaluating Keyword, Vector, Hybrid, and Late-Interaction Models

Filip compared retrieval paradigms including BM25, dense vector search, hybrid approaches, and late-interaction models. He explained how evaluation must reflect real-world intent rather than simple keyword matching, since real user queries often involve multiple attributes or filters.
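One common way to combine the paradigms above is Reciprocal Rank Fusion, which merges the ranked lists from a keyword retriever and a vector retriever without needing their scores to be comparable. The document IDs below are made up for illustration; the talk did not prescribe RRF specifically, it is just a standard hybrid baseline.

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank)
    per document; documents ranked well by several retrievers rise."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits  = ["d3", "d1", "d7"]   # keyword (BM25) ranking
dense_hits = ["d1", "d4", "d3"]   # dense vector ranking
print(rrf([bm25_hits, dense_hits]))  # "d1" wins: strong in both lists
```

The constant `k` damps the advantage of a single top rank, which is why RRF is robust even when the underlying retrievers disagree.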

If you’re exploring how to benchmark and evaluate retrieval methods in practice, check out Retrieval Augmented Generation Evaluation, which details metrics, evaluation setups, and common pitfalls.

Introducing the Mixture of Encoders

The heart of the talk focused on Superlinked’s mixture of encoders architecture, which uses dedicated encoders for text, numbers, categories, and temporal signals. These are combined into a unified embedding space to interpret complex user intent.

This approach improves retrieval performance in multi-faceted queries, particularly for industries like e-commerce and travel. For hands-on examples, see Superlinked LangChain Retriever and Custom Retriever with LlamaIndex.
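As a minimal sketch of the idea (not Superlinked's actual implementation), each attribute type gets its own small encoder, and the outputs are concatenated into one weighted vector. The toy encoders, field names, and weights below are all assumptions for illustration.

```python
def encode_text(t):
    """Stand-in for a real sentence-embedding model (2 toy dims)."""
    return [len(t) % 7 / 7.0, t.count("e") / 10.0]

def encode_number(x, lo, hi):
    """Min-max scale a numeric attribute into [0, 1]."""
    return [(x - lo) / (hi - lo)]

def encode_category(c, vocab):
    """One-hot encode a categorical attribute."""
    return [1.0 if c == v else 0.0 for v in vocab]

def mixture_embed(item, weights=(1.0, 1.0, 1.0)):
    """Concatenate per-type encoder outputs, weighted per space,
    into one unified embedding for multi-attribute retrieval."""
    parts = [
        [weights[0] * v for v in encode_text(item["description"])],
        [weights[1] * v for v in encode_number(item["price"], 0, 500)],
        [weights[2] * v for v in encode_category(item["type"],
                                  ["apartment", "house", "hotel"])],
    ]
    return [v for part in parts for v in part]
```

Because each attribute occupies its own sub-space, the per-space weights let the system tune how much price or category matters relative to text at query time.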

Productionising Advanced Retrieval

Filip also discussed the challenges of taking these models into production, covering latency budgets, embedding version control, schema design for multi-attribute indexing, and maintaining recall while controlling cost.
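Embedding version control, one of the production concerns above, can be handled by tagging every stored vector with the encoder version that produced it. The pattern below is a hypothetical minimal sketch, not a specific library's API; field names like `encoder_version` are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class IndexedVector:
    doc_id: str
    vector: list = field(default_factory=list)
    encoder_version: str = "text-enc-v1"  # model that produced the vector

def needs_reindex(entry: IndexedVector, current_version: str) -> bool:
    """Flag vectors from an older encoder: querying a new-model query
    vector against old-model document vectors silently degrades recall."""
    return entry.encoder_version != current_version
```

Tracking the version per vector makes encoder upgrades an explicit, auditable reindexing job instead of a silent relevance regression.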

If you’re assessing when to rely on reranking or how to simplify your pipeline, read Why You Do Not Need Re-Ranking, which outlines when strong first-pass retrieval can outperform complex rerankers.

Why It Matters

User expectations for search are rising. They no longer accept rigid filters or text-only semantic search. Instead, they expect systems that understand intent across text, numbers, and structured metadata.

Key lessons include:

  • Retrieval must incorporate numeric, categorical, temporal, and behavioral data alongside text.
  • The candidate set before reranking determines much of the outcome quality.
  • Architecture, indexing strategy, and schema design all shape production retrieval performance.
  • Real-world systems need to manage latency, embedding drift, and model versioning at scale.

At Superlinked, our open-source framework is designed precisely for this: enabling structured and unstructured retrieval, human and agent use cases, and first-pass relevance that actually scales.


If these ideas resonate with your work in recommendations, conversational search, or agent workflows, we’d love to connect.

➡️ Book a call with us to explore your use case and learn how Superlinked can help you build production-ready, intelligent retrieval systems.
