
Cohere → SIE

Cohere offers embed-v3 (asymmetric query/document embeddings) and rerank-v3.5 (cross-encoder reranker). Both have direct SIE equivalents.

  • Self-hosted. No API key, no per-call cost, no rate limits, no data egress.
  • Embedding + reranking from one stack. Cohere bills both separately; SIE serves both from one cluster.
  • Open weights. SIE serves permissive-licensed embedding models and the BAAI/bge-reranker-v2-m3 reranker.
  • Asymmetric query/document hint. Cohere uses input_type="search_query" vs "search_document". SIE uses is_query=True vs is_query=False.
  • Rerank shape. Query + list of docs in, sorted scores out.
  • Multilingual coverage. If you picked Cohere for embed-multilingual-v3.0, use a multilingual SIE model such as intfloat/multilingual-e5-large-instruct and validate on your own labeled multilingual set before cutting over; model rankings shift by language.
Before (Cohere):

import os

import cohere

docs = [
    "Mitochondria are the powerhouse of the cell.",
    "The Eiffel Tower is in Paris.",
]

client = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])

# Embed on the document side (input_type="search_document")
resp = client.embed(
    texts=["The mitochondrion is the powerhouse of the cell."],
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float"],
)
vectors = resp.embeddings.float_

# Rerank
resp = client.rerank(
    model="rerank-v3.5",
    query="What is the powerhouse of the cell?",
    documents=docs,
    top_n=10,
)
After (SIE):

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

texts = ["The mitochondrion is the powerhouse of the cell."]
docs = [
    "Mitochondria are the powerhouse of the cell.",
    "The Eiffel Tower is in Paris.",
]

# Document side: is_query=False (or omitted)
docs_result = client.encode(
    "NovaSearch/stella_en_400M_v5",
    [Item(text=t) for t in texts],
    is_query=False,
)
vectors = [r["dense"].tolist() for r in docs_result]

# Rerank
result = client.score(
    "BAAI/bge-reranker-v2-m3",
    Item(text="What is the powerhouse of the cell?"),
    [Item(text=d) for d in docs],
)
# result["scores"] is sorted by relevance desc; each entry has item_id and score.
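The 'after' snippet covers only the document side. At query time, pass is_query=True so the model applies its query-side treatment, then rank by similarity against the stored document vectors. A minimal sketch, assuming the SIEClient and encode API shown above and reusing vectors from that snippet:

import numpy as np

# Query side: is_query=True
query_result = client.encode(
    "NovaSearch/stella_en_400M_v5",
    [Item(text="What is the powerhouse of the cell?")],
    is_query=True,
)
query_vec = np.asarray(query_result[0]["dense"], dtype=float)

# Cosine similarity against the document vectors from the snippet above
doc_matrix = np.asarray(vectors, dtype=float)
sims = doc_matrix @ query_vec / (
    np.linalg.norm(doc_matrix, axis=1) * np.linalg.norm(query_vec)
)
ranked = sims.argsort()[::-1]  # document indices, most similar first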
Cohere                                SIE equivalent
embed-english-v3.0                    NovaSearch/stella_en_400M_v5
embed-multilingual-v3.0               intfloat/multilingual-e5-large-instruct
embed-english-light-v3.0              intfloat/e5-small-v2
input_type="search_query"             client.encode(..., is_query=True)
input_type="search_document"          client.encode(..., is_query=False)
rerank-v3.5 / rerank-english-v3.0     BAAI/bge-reranker-v2-m3 or jinaai/jina-reranker-v2-base-multilingual
rerank-multilingual-v3.0              BAAI/bge-reranker-v2-m3
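If you have existing call sites built around Cohere's input_type strings, a thin adapter can keep that vocabulary while routing to SIE. A sketch under the encode API shown above; embed_texts and the mapping dict are names invented here for illustration:

from sie_sdk import SIEClient
from sie_sdk.types import Item

# Hypothetical adapter: translate Cohere's input_type hint into SIE's is_query flag.
_INPUT_TYPE_TO_IS_QUERY = {"search_query": True, "search_document": False}

def embed_texts(client: SIEClient, model: str, texts: list[str], input_type: str):
    is_query = _INPUT_TYPE_TO_IS_QUERY[input_type]
    results = client.encode(model, [Item(text=t) for t in texts], is_query=is_query)
    return [r["dense"].tolist() for r in results]

# Usage mirrors the Cohere call:
# embed_texts(client, "NovaSearch/stella_en_400M_v5", docs, input_type="search_document")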

Cohere returns calibrated relevance scores in roughly the [0, 1] range. SIE’s score() returns raw model-native scores from the cross-encoder, which for bge-reranker-v2-m3 are logits (any real number; sigmoid to get a probability if you need [0,1]). The ordering is what matters for reranking; don’t compare absolute scores across the two providers.
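If downstream code expects Cohere-style scores in [0, 1], squash the reranker logits with a sigmoid before thresholding; the ordering is unchanged. A small sketch using the result object from the 'after' snippet:

import math

def to_probability(logit: float) -> float:
    # Sigmoid maps a bge-reranker-v2-m3 logit into (0, 1) without reordering anything.
    return 1.0 / (1.0 + math.exp(-logit))

probs = [to_probability(entry["score"]) for entry in result["scores"]]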

Do you need to re-embed your corpus? Yes. Cohere’s embed-v3 and SIE’s catalog embedding models produce vectors in unrelated spaces, even at matched dimensions, so document vectors from one cannot be searched with query vectors from the other.
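In practice that means re-embedding every document with the SIE model and rebuilding the vector index before cutting query traffic over. A minimal batched sketch, assuming the encode API above; upsert_to_index stands in for whatever write path your vector store provides:

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
BATCH_SIZE = 64

def reembed_corpus(corpus: list[str]) -> None:
    for start in range(0, len(corpus), BATCH_SIZE):
        batch = corpus[start:start + BATCH_SIZE]
        results = client.encode(
            "NovaSearch/stella_en_400M_v5",
            [Item(text=t) for t in batch],
            is_query=False,
        )
        vectors = [r["dense"].tolist() for r in results]
        # upsert_to_index is a placeholder for your vector store's write path.
        upsert_to_index(ids=list(range(start, start + len(batch))), vectors=vectors)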

# Start SIE with the embedding and reranker models
mise run serve -- -m NovaSearch/stella_en_400M_v5,BAAI/bge-reranker-v2-m3

# Cohere side, for the 'before' snippet
export COHERE_API_KEY=...
uv add cohere

Run the Cohere ‘before’ and SIE ‘after’ snippets from this page on a small slice of your own corpus. Compare the retrieval hits and the rerank ordering rather than the raw embedding values, which live in unrelated spaces.
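A quick way to compare is top-k agreement between the two rerank orderings. A sketch, assuming the rerank responses resp (Cohere) and result (SIE) from the snippets above, and assuming SIE's item_id is the zero-based index into the input document list:

def top_k_overlap(cohere_resp, sie_result, k: int = 5) -> float:
    cohere_order = [r.index for r in cohere_resp.results]             # already sorted by relevance
    sie_order = [entry["item_id"] for entry in sie_result["scores"]]  # already sorted by relevance
    shared = set(cohere_order[:k]) & set(sie_order[:k])
    return len(shared) / min(k, len(cohere_order), len(sie_order))

print(top_k_overlap(resp, result))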
