
cross-encoder/ms-marco-MiniLM-L-6-v2

This cross-encoder model was trained on the MS MARCO Passage Ranking task.

Overview

Architecture: BERT
Parameters: 23M
Tasks: Score
Outputs: Score
Max Sequence Length: 512 tokens
License: apache-2.0
Languages: en
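
Since the model takes a query-passage pair and outputs a single score, a typical use is reranking candidate passages for a query. The following is a minimal sketch, assuming the `sentence-transformers` package is installed and the model can be downloaded by name; the helpers `rerank`, `load_score_fn`, and `demo` are hypothetical names introduced here for illustration, not part of any library.

```python
from typing import Callable, List, Sequence, Tuple

MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"
MAX_LENGTH = 512  # max sequence length listed in the overview

# A scoring function maps (query, passage) pairs to one relevance score each.
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, passages: Sequence[str], score_fn: ScoreFn) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted best-first."""
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def load_score_fn() -> ScoreFn:
    """Load the cross-encoder lazily (assumes sentence-transformers is installed)."""
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(MODEL_NAME, max_length=MAX_LENGTH)
    return model.predict  # returns one score per input pair


def demo() -> None:
    """Illustrative call; downloads the model on first use."""
    score_fn = load_score_fn()
    passages = [
        "Berlin is the capital of Germany.",
        "Ubuntu is a Linux distribution based on Debian.",
    ]
    for passage, score in rerank("What is Ubuntu?", passages, score_fn):
        print(f"{score:.4f}  {passage}")
```

Calling `demo()` prints each passage with its score, highest first. Keeping the scoring function as a parameter means the ranking logic can be exercised without loading the model.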

Benchmarks

AskUbuntuDupQuestions

technology reranking en

Duplicate question detection from AskUbuntu

Corpus: 6,743 Queries: 360
Quality
nDCG@10: 0.6027
MAP@10: 0.4439
MRR@10: 0.6776
Performance L4 b1 c16
Query 827 tok/s
Query p50 411.2ms
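
The Quality figures on this page are ranking metrics cut off at rank 10. As a rough sketch of what such metrics measure, the functions below compute MRR, average precision, and nDCG at a cutoff for one query, given the relevance labels of its results in ranked order. This is an illustration under stated assumptions, not the evaluation code behind the numbers above: the function names are hypothetical, and average precision is normalized by the number of relevant items found in the top k, a choice that differs between implementations.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k (0 if none)."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Average precision over the top k, normalized by the relevant hits found there."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def ndcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order, both cut at k."""
    def dcg(labels: Sequence[float]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels[:k], start=1))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal > 0 else 0.0


def mean_over_queries(per_query_scores: Sequence[float]) -> float:
    """Benchmark-level score: the mean of the per-query values."""
    return sum(per_query_scores) / len(per_query_scores) if per_query_scores else 0.0
```

For a query whose ranked results carry the labels `[0, 1, 0, 1]`, the first relevant item sits at rank 2, so `mrr_at_k` returns 0.5.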

CMedQAv1Reranking

medical reranking zh

Chinese medical question answering reranking (v1)

Corpus: 100,000 Queries: 2,000
Quality
MAP@10: 0.0835
MRR@10: 0.1371

CMedQAv2Reranking

medical reranking zh

Chinese medical question answering reranking (v2)

Corpus: 108,000 Queries: 4,000
Quality
MAP@10: 0.0926
MRR@10: 0.1425

CQADupstackPhysicsRetrieval (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 44.3K tok/s
Query p50 44.6ms

CosQA (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 20.5K tok/s
Query p50 43.6ms

FiQA2018 (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 51.1K tok/s
Query p50 43.4ms

LegalBenchConsumerContractsQA (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 91.7K tok/s
Query p50 45.6ms

MMarcoReranking

general reranking zh

Multilingual MARCO passage reranking (Chinese)

Quality
MAP@10: 0.0543
MRR@10: 0.0544
Performance L4 b1 c16

NFCorpus (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 70.8K tok/s
Query p50 45.9ms

NanoFiQA2018Retrieval

finance retrieval en

Smaller subset of the FiQA financial QA dataset

Performance L4 b1 c16
Query 7.5K tok/s
Query p50 388.1ms

SCIDOCS (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 53.7K tok/s
Query p50 42.5ms

SciFact (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 67.4K tok/s
Query p50 42.1ms

StackOverflowQA (candidates_model: Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 98.6K tok/s
Query p50 47.2ms

T2Reranking

general reranking zh

Chinese passage ranking benchmark

Quality
MAP@10: 0.4714
MRR@10: 0.7102
