
google/siglip-so400m-patch14-384

Architecture
  Parameters: 400M
  Tasks: Encode
  Outputs: Dense
  Dimensions: 1,152 (dense)
  Max Sequence Length: 64 tokens
  License: Apache 2.0
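
The card above describes a dual encoder: images and short texts (at most 64 tokens) are mapped into the same 1,152-dimensional dense space, so image-text retrieval reduces to nearest-neighbour search over those vectors. Below is a minimal local sketch using the Hugging Face transformers checkpoint; it is illustrative only (the file name and caption are assumptions) and does not show the hosted engine's own API.

import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "google/siglip-so400m-patch14-384"
model = AutoModel.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("photo.jpg").convert("RGB")        # hypothetical local image
texts = ["a photo of a dog playing in the snow"]      # illustrative caption

# SigLIP was trained with max-length padding; text is capped at 64 tokens.
inputs = processor(
    text=texts,
    images=image,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    outputs = model(**inputs)

image_emb = outputs.image_embeds   # shape (1, 1152)
text_emb = outputs.text_embeds     # shape (1, 1152)

# Cosine similarity between the image and the caption embeddings.
score = torch.nn.functional.cosine_similarity(image_emb, text_emb)
print(score.item())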

Benchmarks

Flickr30kI2TRetrieval (general retrieval, English)

Quality
  NDCG@10: 0.9001
  MAP@10:  0.8364
  MRR@10:  0.9663

Performance (L4-SPOT, b1 c8)
  Corpus TPS: 202    Corpus p50 latency: 523.6 ms
  Query TPS:  10     Query p50 latency:  711.3 ms

Performance (L4, b1 c16)
  Corpus TPS: 508    Corpus p50 latency: 452.9 ms
  Query TPS:  18     Query p50 latency:  551.4 ms
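
For reference, the quality numbers are standard ranking metrics over the top 10 retrieved results per query, averaged across queries, while TPS and p50 report throughput and median latency for the corpus and query sides on each GPU configuration. The sketch below gives per-query definitions of the ranking metrics with binary relevance; it is an illustrative re-implementation, not the benchmark harness, and conventions (e.g. the AP@k normalizer) can differ slightly between harnesses.

import math

def ndcg_at_k(rels, k=10):
    # rels: binary relevance of results in ranked order (1 = relevant).
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(sorted(rels, reverse=True)[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def ap_at_k(rels, k=10):
    # Average precision: mean of precision@i taken at each relevant rank i.
    hits, score = 0, 0.0
    for i, r in enumerate(rels[:k]):
        if r:
            hits += 1
            score += hits / (i + 1)
    return score / hits if hits else 0.0

def mrr_at_k(rels, k=10):
    # Reciprocal rank of the first relevant result within the top k.
    for i, r in enumerate(rels[:k]):
        if r:
            return 1.0 / (i + 1)
    return 0.0

# Example: relevant results at ranks 1 and 4; corpus-level scores are the
# means of these per-query values.
rels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(ndcg_at_k(rels), ap_at_k(rels), mrr_at_k(rels))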

