At Haystack EU 2025 in Berlin, Superlinked’s Filip Makraduli presented “From BM25 to Mixture of Encoders: Evaluations for Next-Gen Search and Retrieval Systems.” The session explored how modern queries demand a deeper understanding of both structured and unstructured data, and how Superlinked’s mixture-of-encoders approach addresses this shift.
Many production systems still rely on keyword-based methods such as BM25. While effective for pure text queries, they struggle when a user’s intent includes multiple data types. Consider the query “5 guests under $200 with 4.8+ rating.” It blends numerical constraints, categorical filters, and descriptive text. Traditional text embeddings fail to fully capture that context.
For a deeper explanation of how mixed data types can be handled in vector space, read Multi-Attribute Semantic Search on VectorHub.
Filip compared retrieval paradigms including BM25, dense vector search, hybrid approaches, and late-interaction models. He explained how evaluation must reflect real-world intent rather than simple keyword matching, since real user queries often involve multiple attributes or filters.
If you’re exploring how to benchmark and evaluate retrieval methods in practice, check out Retrieval Augmented Generation Evaluation, which details metrics, evaluation setups, and common pitfalls.
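As a rough illustration of intent-aware evaluation, the sketch below scores two hypothetical result lists with recall@k and nDCG@k. These metrics are assumed here as common choices; this recap does not say which ones the talk used. The document IDs and the graded relevance labels (how many of the query's constraints a result satisfies) are invented for the example.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, gains, k):
    """Normalized discounted cumulative gain with graded relevance."""
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical judgments: gain = number of query constraints a listing meets
# (guest count, price ceiling, rating floor). Unlisted documents have gain 0.
gains = {"a": 3, "c": 2, "b": 1}
fully_relevant = {"a", "c"}

keyword_run = ["b", "d", "a"]  # hypothetical keyword-only ranking
mixed_run = ["a", "c", "b"]    # hypothetical multi-attribute ranking

for name, run in [("keyword", keyword_run), ("mixed", mixed_run)]:
    print(name, recall_at_k(run, fully_relevant, 3), round(ndcg_at_k(run, gains, 3), 3))
```

Grading relevance by satisfied constraints, rather than by keyword overlap, is one way to make the benchmark reward results that match the full intent of a query.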
The heart of the talk was Superlinked’s mixture-of-encoders architecture, which uses dedicated encoders for text, numbers, categories, and temporal signals. Their outputs are combined into a unified embedding space to interpret complex user intent.
This approach improves retrieval performance in multi-faceted queries, particularly for industries like e-commerce and travel. For hands-on examples, see Superlinked LangChain Retriever and Custom Retriever with LlamaIndex.
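To make the idea of dedicated encoders concrete, here is a minimal, library-free sketch. It is not Superlinked's implementation or API: the encoder functions, the hashed bag-of-words stand-in for a text model, the value ranges, and the listing fields are all assumptions chosen for illustration. Each attribute is encoded into its own unit-length vector, the vectors are concatenated, and query-side weights control how much each attribute contributes to the dot-product score.

```python
import math
import zlib

CATEGORIES = ["apartment", "house", "cabin"]  # hypothetical vocabulary

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def encode_text(text, dims=16):
    """Stand-in for a real text embedding model: hashed bag of words."""
    vec = [0.0] * dims
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dims] += 1.0
    return l2_normalize(vec)

def encode_number(value, lo, hi):
    """Place a number on a quarter circle so close values have high cosine similarity."""
    clipped = min(max(value, lo), hi)
    angle = (clipped - lo) / (hi - lo) * (math.pi / 2)
    return [math.cos(angle), math.sin(angle)]

def encode_category(value):
    return [1.0 if value == c else 0.0 for c in CATEGORIES]

def embed(item, weights):
    """Concatenate per-attribute vectors, scaling each by its weight (default 1.0)."""
    parts = {
        "text": encode_text(item["description"]),
        "price": encode_number(item["price"], 0, 500),
        "rating": encode_number(item["rating"], 1, 5),
        "category": encode_category(item["category"]),
    }
    vec = []
    for name, part in parts.items():
        vec.extend(weights.get(name, 1.0) * x for x in part)
    return vec

def search(query, items, weights):
    """Rank item IDs by dot product between the weighted query and each item."""
    q = embed(query, weights)
    scored = [
        (sum(a * b for a, b in zip(q, embed(item, {}))), item["id"])
        for item in items
    ]
    return [item_id for _, item_id in sorted(scored, reverse=True)]

listings = [
    {"id": "a", "description": "cozy cabin near lake", "price": 150, "rating": 4.9, "category": "cabin"},
    {"id": "b", "description": "cozy cabin near lake", "price": 450, "rating": 3.2, "category": "cabin"},
]
query = {"description": "cozy cabin", "price": 150, "rating": 5.0, "category": "cabin"}
ranked = search(query, listings, {"text": 1.0, "price": 2.0, "rating": 2.0})
print(ranked)
```

Both listings have identical descriptions, so a text-only encoder could not separate them; in this sketch the price and rating encoders break the tie. Note that the sketch treats price as a target value, whereas a hard constraint such as "under $200" would need a filter or a different encoding.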
Filip also discussed the challenges of taking these models into production, covering latency budgets, embedding version control, schema design for multi-attribute indexing, and maintaining recall while keeping costs under control.
If you’re assessing when to rely on reranking or how to simplify your pipeline, read Why You Do Not Need Re-Ranking, which outlines when strong first-pass retrieval can outperform complex rerankers.
User expectations for search are rising. Users no longer accept rigid filters or text-only semantic search. Instead, they expect systems that understand intent across text, numbers, and structured metadata.
Key lessons include:
At Superlinked, our open-source framework is designed precisely for this: enabling structured and unstructured retrieval, human and agent use cases, and first-pass relevance that actually scales.
If these ideas resonate with your work in recommendations, conversational search, or agent workflows, we’d love to connect.
➡️ Book a call with us to explore your use case and learn how Superlinked can help you build production-ready, intelligent retrieval systems.