
mixedbread-ai/mxbai-colbert-large-v1 (Encode)

Architecture

Parameters: 435M
Tasks: Encode
Outputs: Multi-Vec
Dimensions (Multi-Vec): 128
Max Sequence Length: 512 tokens
License
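The Multi-Vec output means each text is encoded as one 128-dimensional vector per token rather than a single pooled vector, and query–document relevance is scored by late interaction (MaxSim). A minimal sketch of MaxSim scoring, assuming hypothetical per-token embedding matrices (random arrays stand in for real model output here; in practice you would encode texts with the model first):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (MaxSim) relevance score.

    query_vecs: (n_query_tokens, 128) per-token query embeddings
    doc_vecs:   (n_doc_tokens, 128) per-token document embeddings
    """
    # L2-normalize rows so the dot product becomes cosine similarity
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    # For each query token, keep only its best-matching document token,
    # then sum those maxima over all query tokens.
    return float((q @ d.T).max(axis=1).sum())

# Toy ranking example: random arrays standing in for encoded texts
rng = np.random.default_rng(0)
query = rng.normal(size=(8, 128))           # 8 query tokens
docs = [rng.normal(size=(200, 128)) for _ in range(3)]  # 3 documents
scores = [maxsim_score(query, d) for d in docs]
best = int(np.argmax(scores))               # index of top-ranked document
```

Because each query token contributes at most a cosine similarity of 1, the score is bounded by the number of query tokens; ranking candidates by this score is what the per-query latency figures below measure.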

Benchmarks

Performance measured at L4 b1 c16.

Benchmark                      Category                    Corpus TPS  Corpus p50  Query TPS  Query p50
CQADupstackPhysicsRetrieval    scientific retrieval (en)        32.3K     65.1 ms       3.9K    44.7 ms
CosQA                          technology retrieval (en)        16.3K     51.4 ms       2.4K    40.1 ms
FiQA2018                       finance retrieval (en)           38.0K     66.6 ms       4.5K    41.8 ms
LegalBenchConsumerContractsQA  legal retrieval (en)             79.7K     98.6 ms       6.3K    42.1 ms
NFCorpus                       medical retrieval (en)           46.5K     95.7 ms       1.9K    42.8 ms
SCIDOCS                        scientific retrieval (en)        40.1K     75.6 ms       4.6K    39.8 ms
SciFact                        scientific retrieval (en)        48.5K     87.4 ms       6.3K    41.8 ms
StackOverflowQA                technology retrieval (en)        49.7K     74.1 ms      76.4K    60.7 ms

Quality (where reported):

Benchmark                      Category                    NDCG@10  MAP@10  MRR@10
NFCorpus                       medical retrieval (en)       0.3467  0.1321  0.5620
NanoFiQA2018Retrieval          finance retrieval (en)       0.4833  0.4103  0.5605
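The quality figures above are standard ranked-retrieval metrics. A minimal sketch of how NDCG@10 and MRR@10 are computed from a ranked list of relevance labels (binary relevance assumed for simplicity; benchmark harnesses also handle graded relevance, and function names here are illustrative):

```python
import math

def ndcg_at_k(ranked_rels, all_rels, k=10):
    """NDCG@k: DCG of the ranking divided by the DCG of an ideal ranking."""
    def dcg(rels):
        # Gain discounted by log2 of (1-based rank + 1)
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(all_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0

def mrr_at_k(ranked_rels, k=10):
    """MRR@k: reciprocal rank of the first relevant result in the top k."""
    for i, r in enumerate(ranked_rels[:k]):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0
```

Both metrics are averaged over all queries in a benchmark to produce the reported numbers; a perfect ranking yields 1.0 for each.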
