AI Search and Matching for semi-structured data

From e-commerce products with behavioral data to support tickets with metadata & attachments,
our models help you organize it all.

Partners
Education & Events
Natural Language Search on Semi-Structured Data
with Jason Liu, Staff MLE, ex-StitchFix
Cheat at Search with LLMs
with Doug Turnbull, Principal Eng, ex-Reddit
Optimize Structured Data Retrieval With Evals
with Hamel Husain, Staff MLE & Educator
From BM25 to Mixture of Encoders
at Haystack 2025
Testimonial
"We are building hyper-targeted pricing and marketing products with TBs of data & Superlinked."
Aniket Mane
VP Data at ThredUp
"We search through millions of hotels, reviews & behavioral signals with Superlinked."
James Callaghan
Head of Search at Trivago
"We match 100,000 Jira issues with multi-modal attachments to root causes with Superlinked."
Juraj Kabzan
VP Eng at Skydio

No.1 on Semi-structured Retrieval Benchmark

NDCG@10
68.78%
Superlinked (Mixture of Encoders)
Description
Uses Mixture of Encoders with Qwen3-0.6B for product description and category encoding, and numerical encoders for product ratings, rating counts and prices. Can also generate query-specific filter predicates against materials, colors and style properties. Configured with GPT-4o for the query understanding module.
No re-ranking or metadata boosting.
61.67%
Azure AI Search (with Semantic Ranker)
Description
Azure AI Search with Semantic Re-ranker and the built-in query understanding functionality powered by the OpenAI LLM API.
We implemented multiple configurations available in the Azure AI ecosystem and took the best results provided by each. For details, see the link below.
57.13%
Vertex AI Search (Hybrid & Rerank)
Description
Vertex AI Search with built-in Hybrid Search, configured with the gemini-embedding-001 dense embedding model, the splade-v3 sparse embedding model and the Vertex AI Ranking API.
The Superlinked configuration performs significantly better despite gemini-embedding-001 outperforming Qwen3-0.6B by 3 percentage points on MTEB.
51.96%
Vertex AI Discovery Engine
Description
Uses SearchRequest from the Vertex AI Discovery Engine with Boost, QueryExpansion and built-in re-ranking. Tested with and without query parsing using the built-in Gemini feature; the best results were taken.
34.75%
State-of-the-art text embedding model
Description
The indexed JSON objects were "stringified" and embedded with Qwen3-0.6B. The same model is used to encode the queries. The single dense query vector is used to retrieve the relevant results, without re-ranking.
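For readers unfamiliar with the metric, here is a minimal sketch of how an NDCG@10 score is typically computed for a single query. It assumes graded relevance labels and the linear-gain form of DCG; the benchmark's exact gain function and labels are not stated on this page, and the relevance values below are made up for illustration.

```python
import math
from typing import Optional, Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked: Sequence[float],
    judged: Optional[Sequence[float]] = None,
    k: int = 10,
) -> float:
    """NDCG@k for one query.

    ranked: relevance labels of the returned results, in returned order.
    judged: labels of all judged documents for the query, used to build the
            ideal ordering. Falls back to `ranked` if not given.
    """
    pool = judged if judged is not None else ranked
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal if ideal > 0 else 0.0


# Hypothetical labels for one query: 3 = perfect match, 0 = irrelevant.
ranked_labels = [3, 2, 0, 1, 0]
score = ndcg_at_k(ranked_labels)
```

A benchmark figure such as the ones above would then be the mean of this per-query score across all test queries.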
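The gap between the "stringified" baseline and a per-field approach can be illustrated with a toy sketch. This is not Superlinked's implementation or API: `toy_text_encoder`, `encode_number`, the field names, value ranges and weights are all hypothetical, and the hashed bag-of-words encoder merely stands in for a real text model such as Qwen3-0.6B.

```python
import hashlib
import json
import math


def toy_text_encoder(text, dim=8):
    """Hypothetical stand-in for a text model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def encode_number(value, lo, hi):
    """Map a number in [lo, hi] onto a quarter circle, so close values get similar vectors."""
    t = (min(max(value, lo), hi) - lo) / (hi - lo)
    angle = t * math.pi / 2
    return [math.cos(angle), math.sin(angle)]


def stringified_embedding(product):
    """Baseline: serialize the whole record to text and embed it as one string."""
    return toy_text_encoder(json.dumps(product, sort_keys=True))


def mixture_embedding(product, weights):
    """Per-field sketch: encode each field with its own encoder, weight, and concatenate."""
    parts = [
        (weights["description"], toy_text_encoder(product["description"])),
        (weights["rating"], encode_number(product["rating"], 1.0, 5.0)),
        (weights["price"], encode_number(product["price"], 0.0, 500.0)),
    ]
    return [w * v for w, part in parts for v in part]


# Hypothetical product record and field weights.
product = {"description": "oak dining table", "rating": 4.6, "price": 249.0}
weights = {"description": 1.0, "rating": 0.5, "price": 0.5}
vector = mixture_embedding(product, weights)
```

In the stringified baseline, a price like 249.0 is just another token to the text model, whereas the per-field vector keeps numerically close prices close in vector space and lets each field's weight be tuned separately.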
Why did we create a new Information Retrieval benchmark?

Use cases

Drive impact across your organization with Superlinked

Beyond multi-modal

Represent everything you know about your users, documents, products or Jira issues with unified "omni-modal" embeddings for maximum real-world retrieval relevance & control.

The AI Search Stack

Use our open source framework and server to build more reliable AI systems and onboard to Superlinked Cloud once you are ready to scale to TBs of data & millions of queries.
Funded by
Let's launch AI Search and Matching to production together.
Talk to Engineer