mixedbread-ai/mxbai-colbert-large-v1

Architecture

Parameters: 435M
Tasks: Encode
Outputs: Multi-Vec
Dimensions (Multi-Vec): 128
Max Sequence Length: 512 tokens
License:
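The model is a ColBERT-style encoder: instead of one vector per text, it emits one 128-dimensional vector per token ("Multi-Vec" output), and query/document relevance is computed by late interaction (MaxSim). A minimal sketch of that scoring step, using random NumPy arrays as stand-ins for the model's actual token embeddings (the shapes and helper names here are illustrative, not part of Mixedbread's API):

```python
# Illustration of ColBERT-style late-interaction (MaxSim) scoring.
# Real embeddings would come from mxbai-colbert-large-v1, which emits one
# 128-dimensional vector per token; random vectors stand in for them here.
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    # L2-normalize each row so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical shapes: a 5-token query and a 12-token document,
# each token mapped to a 128-d vector (the model's Multi-Vec output).
query_vecs = normalize(rng.standard_normal((5, 128)))
doc_vecs = normalize(rng.standard_normal((12, 128)))

def maxsim_score(q, d):
    """For each query token, take the max cosine similarity against
    any document token, then sum over query tokens."""
    sim = q @ d.T                 # (num_query_tokens, num_doc_tokens)
    return sim.max(axis=1).sum()  # MaxSim per query token, then sum

score = maxsim_score(query_vecs, doc_vecs)
```

Because every similarity is at most 1, the score is bounded by the number of query tokens; ranking documents by this score is what "late interaction" means in practice.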
Benchmarks

All entries are English-language retrieval tasks; performance was measured at L4 b1 c16.

Benchmark                      Domain                  Corpus TPS  Corpus p50  Query TPS  Query p50
CQADupstackPhysicsRetrieval    scientific retrieval    30.0K       68.7 ms     3.3K       49.0 ms
CosQA                          technology retrieval    12.5K       62.5 ms     2.2K       43.6 ms
FiQA2018                       finance retrieval       35.0K       71.8 ms     4.2K       44.7 ms
LegalBenchConsumerContractsQA  legal retrieval         77.3K       101.4 ms    5.9K       45.0 ms
NFCorpus                       medical retrieval       51.3K       92.4 ms     1.8K       44.1 ms
SCIDOCS                        scientific retrieval    36.2K       81.3 ms     3.8K       46.5 ms
SciFact                        scientific retrieval    46.8K       89.5 ms     5.7K       46.2 ms
StackOverflowQA                technology retrieval    45.2K       78.9 ms     69.9K      64.7 ms
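The performance figures above pair a throughput number (TPS, items processed per second) with a median latency (p50). These two can be collected for any encode function; a minimal single-worker sketch (the harness below is illustrative, not Mixedbread's benchmark code, and ignores the concurrency and batching their setup uses):

```python
# Sketch of collecting TPS and p50 latency for an encode function.
import statistics
import time

def benchmark(encode, items):
    """Run `encode` over each item once; return (items/sec, median ms)."""
    latencies = []
    start = time.perf_counter()
    for item in items:
        t0 = time.perf_counter()
        encode(item)
        latencies.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    tps = len(items) / elapsed
    p50 = statistics.median(latencies)
    return tps, p50

# Stand-in workload: a trivial "encoder" over 100 dummy documents.
tps, p50 = benchmark(lambda doc: doc.lower(),
                     ["Document %d" % i for i in range(100)])
```

With concurrent workers (the c16 in the table), throughput scales beyond 1000/p50 per second because many requests are in flight at once, which is why e.g. a 49.0 ms query p50 can coexist with 3.3K query TPS.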
