
Qwen/Qwen3-Reranker-4B

The Qwen3 Embedding model series is the latest model series in the Qwen family, designed specifically for text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, it offers text embedding and reranking models in three sizes (0.6B, 4B, and 8B).

Overview

Architecture: Qwen3
Parameters: 4.0B
Tasks: Score
Outputs: Score
Max Sequence Length: 32,768 tokens
License: apache-2.0

Benchmarks

AskUbuntuDupQuestions

technology reranking en

Duplicate question detection from AskUbuntu

Corpus: 6,743 Queries: 360
Quality
NDCG@10: 0.6953
MAP@10: 0.5480
MRR@10: 0.7743
Performance L4 b1 c16
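For readers unfamiliar with the quality metrics reported for this benchmark (NDCG, MAP, and MRR at a cutoff of 10), the following is a minimal sketch of how such metrics are commonly computed for a single query. It assumes the textbook definitions and a relevance list already sorted by the reranker's scores; the function names are illustrative, and the benchmark harness that produced the numbers above may differ in details such as tie handling or normalization.

```python
import math


def mrr_at_k(ranked_relevance, k=10):
    """Reciprocal rank of the first relevant item in the top k (0.0 if none)."""
    for rank, rel in enumerate(ranked_relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_relevance, k=10):
    """Average precision over the top k.

    Assumption: normalized by min(number of relevant items, k);
    other conventions exist.
    """
    total_relevant = sum(1 for rel in ranked_relevance if rel > 0)
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(total_relevant, k)


def ndcg_at_k(ranked_relevance, k=10):
    """Discounted cumulative gain in the top k, divided by the ideal DCG."""

    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(ranked_relevance, reverse=True)[:k])
    return dcg(ranked_relevance[:k]) / ideal if ideal > 0 else 0.0


def mean_over_queries(metric, per_query_relevance, k=10):
    """Average a per-query metric over all queries (the 'M' in MAP and MRR)."""
    return sum(metric(rels, k) for rels in per_query_relevance) / len(per_query_relevance)


if __name__ == "__main__":
    # Hypothetical ranking for one query: 1 = relevant, 0 = not relevant,
    # listed in the order the reranker returned the candidates.
    example = [0, 1, 0, 1, 0]
    print("MRR@10 ", mrr_at_k(example))
    print("AP@10  ", ap_at_k(example))
    print("NDCG@10", ndcg_at_k(example))
```

Each benchmark figure is then the mean of the per-query value across all queries, which `mean_over_queries` illustrates.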
