google/siglip2-base-patch16-224
SigLIP 2 extends the SigLIP pretraining objective with several prior, independently developed techniques, combining them into a unified recipe that improves semantic understanding, localization, and dense features.
Benchmarks
Flickr30kI2TRetrieval
Image-to-text retrieval: retrieve captions from images
Corpus: 31,783 Queries: 1,000
Quality
nDCG@10: 0.8157
MAP@10: 0.7255
MRR@10: 0.9302
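The three quality figures are ranking metrics computed per query over the top 10 retrieved captions and then averaged across queries. The sketch below shows one common way to compute them for a single query, assuming binary relevance (a caption is either relevant or not) and assuming MAP@10 is normalized by the total number of relevant items; conventions vary between evaluation toolkits, and this is not necessarily the exact code behind the numbers above. The function names and the example IDs are hypothetical.

```python
import math


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant item in the top k, else 0."""
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_ids, relevant_ids, k=10):
    """Average precision truncated at k.

    Assumption: normalized by the total number of relevant items.
    """
    if not relevant_ids:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_ids)


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked_ids[:k], start=1)
        if item in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical single query: caption IDs ranked by similarity to one image.
ranked = ["c3", "c7", "c1", "c9", "c2"]
relevant = {"c7", "c2", "c8"}

mrr = mrr_at_k(ranked, relevant)
ap = ap_at_k(ranked, relevant)
ndcg = ndcg_at_k(ranked, relevant)
print(f"MRR@10={mrr:.4f}  AP@10={ap:.4f}  nDCG@10={ndcg:.4f}")
```

Averaging each function's output over all queries gives the benchmark-level MRR@10, MAP@10, and nDCG@10. MRR only rewards the first relevant hit, which is why it can be much higher than MAP when each image has several relevant captions.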
Performance (L4, b1, c8)
Corpus throughput: 1.6K tok/s
Corpus p50 latency: 68.5 ms
Query throughput: 13.0 mpix/s
Query p50 latency: 99.0 ms
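For readers unfamiliar with these units, the sketch below shows how a p50 (median) latency and a megapixel-per-second throughput could be derived from raw timings. It assumes 224x224 input images, as the model name suggests; the timing samples and helper names are hypothetical and are not the measurements reported above.

```python
import statistics


def p50_ms(latencies_ms):
    """Median (50th percentile) of per-request latencies in milliseconds."""
    return statistics.median(latencies_ms)


def mpix_per_second(num_images, width, height, elapsed_seconds):
    """Image throughput in megapixels per second."""
    total_mpix = num_images * width * height / 1e6
    return total_mpix / elapsed_seconds


# Hypothetical samples, chosen only to illustrate the arithmetic.
sample_latencies_ms = [10.0, 12.0, 11.0, 50.0, 13.0]
median_latency = p50_ms(sample_latencies_ms)

# Assumption: 100 images at 224x224 processed in 2.0 seconds.
image_throughput = mpix_per_second(100, 224, 224, 2.0)

print(f"p50 latency: {median_latency} ms")
print(f"throughput: {image_throughput:.4f} mpix/s")
```

The median is less sensitive to occasional slow requests than the mean, which is one reason latency is often reported as a percentile.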