Qwen/Qwen3-Reranker-0.6B

Primitive: /score · Score · Qwen3

The Qwen3 Embedding model series is the latest model series in the Qwen family designed specifically for text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, it provides text embedding and reranking models in three sizes (0.6B, 4B, and 8B).

Long context

Fine-tuned from Qwen/Qwen3-0.6B-Base

Overview

Hardware: — drives latency, throughput & cost

Size	596M params
Tasks	/score
License	apache-2.0
Latency	65 ms
Throughput	1.5K tok/s
Cost	$0.151 /1M tok

Cost is approximate — computed from list GPU prices; your actual price depends on the provider you deploy SIE with.
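The relationship between GPU price, throughput, and cost per million tokens can be sketched as below. This is a minimal illustration, assuming cost is simply the hourly GPU price divided by the tokens processed in an hour; the page does not state the exact formula, and the function names are hypothetical. Because the 1.5K tok/s figure is rounded, any hourly price backed out of it is only approximate.

```python
def cost_per_million_tokens(gpu_price_per_hour: float, tokens_per_second: float) -> float:
    """Assumed formula: dollars per hour divided by tokens per hour, scaled to 1M tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_price_per_hour / tokens_per_hour * 1_000_000


def implied_gpu_price_per_hour(cost_per_million: float, tokens_per_second: float) -> float:
    """Invert the same assumed formula to recover the hourly GPU price."""
    return cost_per_million * tokens_per_second * 3600 / 1_000_000


if __name__ == "__main__":
    # Figures from the overview table: $0.151 per 1M tokens at roughly 1.5K tok/s.
    hourly = implied_gpu_price_per_hour(0.151, 1500)
    print(f"implied GPU price: ${hourly:.4f}/hour")
    print(f"round trip: ${cost_per_million_tokens(hourly, 1500):.3f} per 1M tokens")
```

Plugging in a different provider's hourly rate for the same hardware gives a quick estimate of how the listed cost would shift.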

Scoring

Inputs	text
Max sequence length	32,768
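To make the scoring step concrete, here is a rough sketch of how one (query, document) pair could be turned into a relevance score. It assumes the prompt template and yes/no readout described on the upstream Hugging Face model card; this page does not say how the /score primitive is implemented, so treat every name here (`format_pair`, `yes_no_score`, `rerank`) as hypothetical. No model is loaded: the logits are placeholder inputs, and the chat-template wrapper the upstream card adds around the prompt is omitted.

```python
import math

# Assumption: default instruction text as given on the upstream model card.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_pair(query: str, document: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Build the text for one pair (assumed template, chat wrapper omitted)."""
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}"


def yes_no_score(yes_logit: float, no_logit: float) -> float:
    """Softmax over the two logits, returning the probability mass on 'yes'.

    For two classes this equals a sigmoid of the logit difference.
    """
    return 1.0 / (1.0 + math.exp(no_logit - yes_logit))


def rerank(documents: list[str], scores: list[float]) -> list[tuple[str, float]]:
    """Sort documents by score, highest first."""
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = ["Install packages with apt.", "The weather is nice today."]
    # Placeholder logits standing in for real model outputs.
    logits = [(3.2, -1.0), (-2.5, 1.5)]
    scores = [yes_no_score(y, n) for y, n in logits]
    for doc, score in rerank(docs, scores):
        print(f"{score:.3f}  {doc}")
```

In a real deployment the two logits would come from the model's next-token distribution, and prompts longer than the 32,768-token limit would need truncation.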

Benchmarks

AskUbuntuDupQuestions

technology reranking en

Duplicate question detection from AskUbuntu

Corpus: 6,743 Queries: 360

Quality

NDCG@10	0.6536
MAP@10	0.4986
MRR@10	0.7642
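The quality figures above are rank-based metrics cut off at position 10. The sketch below shows one common way to compute them for a single query, given the relevance labels of the documents in ranked order. It is an illustration under assumptions: binary or graded labels, a linear-gain DCG, and average precision normalized by the number of relevant documents (capped at k). The benchmark harness behind this page may use slightly different definitions, and the reported numbers are averages over all queries.

```python
import math


def mrr_at_k(ranked_rels: list[float], k: int = 10) -> float:
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, rel in enumerate(ranked_rels[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_rels: list[float], k: int = 10) -> float:
    """Average precision over the top k (assumed normalization: min(#relevant, k))."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_rels[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    n_relevant = min(sum(1 for rel in ranked_rels if rel > 0), k)
    return precision_sum / n_relevant if n_relevant else 0.0


def ndcg_at_k(ranked_rels: list[float], k: int = 10) -> float:
    """DCG of the given order divided by DCG of the ideal order (linear gain)."""
    def dcg(rels: list[float]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels[:k], start=1))

    ideal = dcg(sorted(ranked_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal else 0.0


if __name__ == "__main__":
    # Relevance labels in the order the reranker returned the documents.
    labels = [0, 1, 0, 1]
    print(mrr_at_k(labels), ap_at_k(labels), ndcg_at_k(labels))
```

Averaging each function's output across all queries in a benchmark yields the corresponding MRR@10, MAP@10, and NDCG@10.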

Performance L4 b1 c16

Corpus throughput	1.5K tok/s
Corpus p50 latency	60.5 ms
Query throughput	1.5K tok/s
Query p50 latency	60.5 ms

MMarcoReranking

general reranking zh

Multilingual MARCO passage reranking (Chinese)

Quality

NDCG@10	0.0858
MAP@10	0.0576
MRR@10	0.8158

Performance L4 b1 c16

Corpus throughput	18.7K tok/s
Corpus p50 latency	69.8 ms