
vidore/colqwen2.5-v0.2

Architecture: Qwen2
Parameters: 7.0B
Tasks: Encode
Outputs: Multi-Vec
Dimensions: Multi-Vec: 128
Max Sequence Length: 2,048 tokens
License:
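The "Multi-Vec" output means the model emits one 128-dimensional vector per query token and per document patch, and query–document relevance is computed by late interaction (MaxSim-style) scoring. A minimal sketch with NumPy, using random stand-in vectors; the shapes, variable names, and token counts here are illustrative assumptions, not the model's actual outputs:

```python
import numpy as np

DIM = 128  # multi-vector dimension reported above

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: for each query vector, take the max cosine
    similarity against all document vectors, then sum over query vectors."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T                      # (n_query_tokens, n_doc_patches)
    return float(sim.max(axis=1).sum())

rng = np.random.default_rng(0)
query = rng.normal(size=(12, DIM))    # e.g. 12 query tokens (made-up count)
doc_a = rng.normal(size=(700, DIM))   # e.g. 700 image patches (made-up count)
# A document whose vectors include exact copies of the query vectors:
doc_b = np.concatenate([query, rng.normal(size=(688, DIM))])

print(maxsim_score(query, doc_b) > maxsim_score(query, doc_a))  # → True
```

Because each query vector in `doc_b` matches itself with cosine similarity 1.0, its score is (up to float error) the number of query tokens, while an unrelated random document scores far lower.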

Benchmarks

Vidore3ComputerScienceRetrieval (technology · retrieval · English)

Quality
  nDCG@10: 0.7680
  MAP@10: 0.6543
  MRR@10: 0.8726

Performance (L4, b1, c4)
  Corpus TPS: 2
  Corpus p50: 1.9 s
  Query TPS: 139
  Query p50: 527.1 ms
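The Quality rows report standard ranking metrics. As a reference for how the headline nDCG@10 figure is defined, here is a hedged single-query sketch with binary relevance labels; the ranking below is made up for illustration:

```python
import math

def ndcg_at_k(relevances, k=10):
    """nDCG@k for one query. `relevances` is the graded relevance of each
    retrieved result in rank order. Note: a full evaluation computes the
    ideal DCG over all relevant documents in the corpus, not only the
    retrieved list; this sketch sorts the retrieved list for simplicity."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Single relevant document retrieved at rank 2:
print(round(ndcg_at_k([0, 1, 0, 0, 0]), 4))  # → 0.6309
```

The per-dataset scores above are the mean of such per-query values over the benchmark's query set.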

Vidore3FinanceEnRetrieval (finance · retrieval · English)

Quality
  nDCG@10: 0.6207
  MAP@10: 0.5008
  MRR@10: 0.7416

Performance (L4, b1, c4)
  Corpus TPS: 2
  Corpus p50: 1.9 s
  Query TPS: 150
  Query p50: 547.2 ms

Vidore3HrRetrieval (general · retrieval · English)

Quality
  nDCG@10: 0.6034
  MAP@10: 0.4666
  MRR@10: 0.7046

Performance (L4, b1, c4)
  Corpus TPS: 2
  Corpus p50: 2.0 s
  Query TPS: 110
  Query p50: 769.7 ms

Vidore3PharmaceuticalsRetrieval (medical · retrieval · English)

Quality
  nDCG@10: 0.6274
  MAP@10: 0.5173
  MRR@10: 0.7234

Performance (L4, b1, c4)
  Corpus TPS: 2
  Corpus p50: 1.9 s
  Query TPS: 138
  Query p50: 565.3 ms

Self-hosted inference for search & document processing

Cut API costs by 50x, boost quality with 85+ SOTA models, and keep your data in your own cloud.
