nvidia/llama-nemoretriever-colembed-3b-v1
nvidia/llama-nemoretriever-colembed-3b-v1 is a late-interaction embedding model fine-tuned for query-document retrieval. It accepts two kinds of input: `queries`, which are text, and `documents`, which are page images.
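To illustrate what "late interaction" usually means, here is a minimal sketch of ColBERT-style MaxSim scoring in plain Python. It assumes the query and each page are represented as lists of per-token (or per-patch) embedding vectors. The function names and the toy 2-D vectors are hypothetical, and the sketch is not taken from this model's code, so the model's actual scoring may differ in its details.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def maxsim_score(query_vectors, doc_vectors):
    """Generic late-interaction score: for every query vector, take the best
    cosine similarity against any document vector, then sum those maxima."""
    q = [normalize(v) for v in query_vectors]
    d = [normalize(v) for v in doc_vectors]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )

# Toy 2-D embeddings, made up for illustration only.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # covers both query directions
page_b = [[1.0, 1.0]]                          # only partially matches each

scores = {
    "page_a": maxsim_score(query, page_a),
    "page_b": maxsim_score(query, page_b),
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because every query vector is matched against all document vectors at scoring time, a page can rank highly when different regions of it each match a different part of the query, which a single pooled embedding per document cannot capture.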
Benchmarks
Vidore3ComputerScienceRetrieval
Visual document retrieval on computer science papers and slides
Performance L4 b1 c4
Corpus throughput: 0.6 img/s
Corpus p50 latency: 6.2 s
Query throughput: 400 tok/s
Query p50 latency: 184.7 ms
Vidore3FinanceEnRetrieval
Visual document retrieval on financial reports
Performance L4 b1 c4
Corpus throughput: 0.6 img/s
Corpus p50 latency: 6.1 s
Query throughput: 502 tok/s
Query p50 latency: 152.7 ms
Vidore3HrRetrieval
Visual document retrieval on HR-related documents
Quality
nDCG@10: 0.6513
MAP@10: 0.5053
MRR@10: 0.7844
Performance L4 b1 c16
Corpus throughput: 0.9 img/s
Corpus p50 latency: 17.9 s
Query throughput: 689 tok/s
Query p50 latency: 740.7 ms
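The quality figures reported for this benchmark (nDCG, MAP, and MRR, each cut off at rank 10) are standard ranking metrics. The sketch below shows one common way to compute them for a single query, assuming binary relevance labels. The benchmark's own evaluator may use graded relevance or a different average-precision denominator, and the reported numbers are averages over many queries. All function names and the toy ranking are hypothetical.

```python
import math

def mrr_at_k(ranked, relevant, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ap_at_k(ranked, relevant, k=10):
    """Average precision over the top k (assumed denominator: min(|relevant|, k))."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0

def ndcg_at_k(ranked, relevant, k=10):
    """nDCG with binary gains and a log2 rank discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0

# Toy example: page ids returned for one query, and the pages judged relevant.
ranked_pages = ["p3", "p1", "p7", "p2"]
relevant_pages = {"p1", "p2"}

mrr = mrr_at_k(ranked_pages, relevant_pages)    # first hit is at rank 2
ap = ap_at_k(ranked_pages, relevant_pages)      # hits at ranks 2 and 4
ndcg = ndcg_at_k(ranked_pages, relevant_pages)
print(mrr, ap, ndcg)
```

Averaging `ap_at_k` over all queries gives MAP@10, and averaging the other two gives MRR@10 and nDCG@10.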
Vidore3PharmaceuticalsRetrieval
Visual document retrieval on pharmaceutical documents
Performance L4 b1 c4
Corpus throughput: 0.7 img/s
Corpus p50 latency: 6.0 s
Query throughput: 420 tok/s
Query p50 latency: 185.5 ms