Alibaba-NLP/gte-Qwen2-7B-instruct

gte-Qwen2-7B-instruct is the latest model in the gte (General Text Embedding) model family. It ranked No. 1 in both the English and Chinese evaluations of the Massive Text Embedding Benchmark (MTEB) as of June 16, 2024.

Overview

Architecture: Qwen2
Parameters: 7.6B
Tasks: Encode
Outputs: Dense
Dimensions: 3,584 (dense)
Max Sequence Length: 32,000 tokens
License: apache-2.0
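
The dense output listed above is typically used for retrieval by comparing a query vector against document vectors. The following is a minimal sketch of that ranking step, assuming cosine similarity as the scoring function. The 4-dimensional toy vectors stand in for the model's 3,584-dimensional embeddings, and the names `cosine`, `rank`, and the document IDs are hypothetical; no model is called here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return document IDs sorted from most to least similar to the query."""
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy vectors (assumption): real embeddings from this model would have 3,584 dimensions.
query = [1.0, 0.0, 1.0, 0.0]
docs = {
    "a": [0.9, 0.1, 0.8, 0.0],
    "b": [0.0, 1.0, 0.0, 1.0],
    "c": [0.5, 0.5, 0.5, 0.5],
}

ranking = rank(query, docs)
print(ranking)
```

The ranked list produced this way is the input to the quality metrics reported in the benchmarks.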

Benchmarks

NFCorpus

Tags: medical, retrieval, en

Biomedical literature search from NutritionFacts.org

Corpus: 3,593
Queries: 323

Quality
nDCG@10: 0.4040
MAP@10: 0.1548
MRR@10: 0.6133

Performance (L4 b1 c16)
Corpus throughput: 3.7K tok/s
Corpus p50 latency: 1.1 s
Query throughput: 228 tok/s
Query p50 latency: 361.2 ms
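
The three quality metrics reported for each benchmark (nDCG, MAP, and MRR, each cut off at rank 10) can be sketched for a single query as follows. This is an illustrative implementation under stated assumptions, not the evaluation code behind these numbers: it uses linear gain for nDCG and divides average precision by the total number of relevant documents, and conventions differ between evaluation toolkits. The function names and the toy ranking are hypothetical. Reported scores are averages of these per-query values over all queries.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k: DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

def average_precision_at_k(ranked_ids, relevance, k=10):
    """AP@k: precision at each relevant hit, divided by the number of relevant docs."""
    relevant = {doc_id for doc_id, rel in relevance.items() if rel > 0}
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

def reciprocal_rank_at_k(ranked_ids, relevance, k=10):
    """RR@k: 1 / rank of the first relevant document, or 0 if none is in the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0

# Toy example (assumption): one query, four retrieved documents, two relevant.
ranked = ["d3", "d1", "d7", "d2"]
judgments = {"d1": 1, "d2": 1}

ndcg = ndcg_at_k(ranked, judgments)
ap = average_precision_at_k(ranked, judgments)
rr = reciprocal_rank_at_k(ranked, judgments)
```

Because average precision under this convention is divided by the full count of relevant documents, a query with many relevant documents can score high on MRR@10 while scoring low on MAP@10.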

NanoFiQA2018Retrieval

Tags: finance, retrieval, en

Smaller subset of the FiQA financial QA dataset

Quality
nDCG@10: 0.6902
MAP@10: 0.6156
MRR@10: 0.7338

Performance (L4 b1 c16)
Corpus throughput: 3.3K tok/s
Corpus p50 latency: 594.5 ms
Query throughput: 500 tok/s
Query p50 latency: 221.7 ms
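
Throughput in tokens per second and p50 (median) latency are summary statistics over many requests. The sketch below shows one way such figures could be derived from per-request measurements; it is not the benchmark harness used for the numbers above. The `summarize` function, the sample timings, and the wall-clock duration are all hypothetical. Wall-clock time is passed separately because concurrent requests overlap, so it is usually shorter than the sum of individual latencies.

```python
import statistics

def summarize(requests, wall_seconds):
    """Summarize a run given (token_count, latency_seconds) pairs and total wall time."""
    total_tokens = sum(tokens for tokens, _ in requests)
    latencies = [latency for _, latency in requests]
    return {
        "tok_per_s": total_tokens / wall_seconds,  # aggregate throughput
        "p50_s": statistics.median(latencies),     # median per-request latency
    }

# Hypothetical measurements: four requests finishing within 0.5 s of wall time.
sample = [(512, 0.20), (480, 0.25), (530, 0.22), (498, 0.30)]
stats = summarize(sample, wall_seconds=0.5)
print(stats)
```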
