
LangChain

The sie-langchain package (Python) and @superlinked/sie-langchain package (TypeScript) provide drop-in components for LangChain. Both languages support embeddings, sparse search, reranking, and extraction (entities, relations, classifications, and object detection).

pip install sie-langchain

This installs sie-sdk and langchain-core as dependencies.

# Docker (recommended)
docker run -p 8080:8080 ghcr.io/superlinked/sie-server:default
# Or with GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:default

SIEEmbeddings implements LangChain’s Embeddings interface. Use it with any vector store.

from sie_langchain import SIEEmbeddings

embeddings = SIEEmbeddings(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3"
)

# Embed documents
vectors = embeddings.embed_documents([
    "Machine learning uses algorithms to learn from data.",
    "The weather is sunny today."
])
print(len(vectors))  # 2

# Embed a query
query_vector = embeddings.embed_query("What is machine learning?")
print(len(query_vector))  # 1024
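A vector store compares these embeddings with a similarity metric, most commonly cosine similarity. As a rough illustration of what happens under the hood (toy 4-dimensional vectors standing in for the real 1024-dimensional ones; no SIE server required):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.1, 0.9, 0.2, 0.0]
doc_ml = [0.2, 0.8, 0.1, 0.1]        # semantically close to the query
doc_weather = [0.9, 0.0, 0.1, 0.8]   # unrelated

print(cosine_similarity(query, doc_ml) > cosine_similarity(query, doc_weather))  # True
```

Vector stores like Chroma or Pinecone perform this comparison (at scale, with approximate nearest-neighbor indexes) so you never compute it by hand.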

Any model SIE supports for dense embeddings works; just change the model parameter:

# Stella (1024-dim, strong quality)
embeddings = SIEEmbeddings(model="NovaSearch/stella_en_400M_v5")
# Nomic MoE (768-dim)
embeddings = SIEEmbeddings(model="nomic-ai/nomic-embed-text-v2-moe")
# E5 (1024-dim) - SIE handles query vs document encoding automatically
embeddings = SIEEmbeddings(model="intfloat/e5-large-v2")

See the Model Catalog for all 85+ supported models.

from langchain_chroma import Chroma
from sie_langchain import SIEEmbeddings

embeddings = SIEEmbeddings(model="BAAI/bge-m3")

# Create vector store
vectorstore = Chroma.from_texts(
    texts=["Document one", "Document two"],
    embedding=embeddings
)

# Search
results = vectorstore.similarity_search("query", k=2)

Both sync and async methods are available:

# Sync
vectors = embeddings.embed_documents(texts)
query_vec = embeddings.embed_query(text)
# Async
vectors = await embeddings.aembed_documents(texts)
query_vec = await embeddings.aembed_query(text)

SIEReranker implements BaseDocumentCompressor. Use it to rerank retrieved documents. It works with both cross-encoder models (e.g., jinaai/jina-reranker-v2-base-multilingual) and ColBERT/late-interaction models (e.g., jinaai/jina-colbert-v2); just change the model name.

from langchain_core.documents import Document
from sie_langchain import SIEReranker

reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
    top_k=3
)

documents = [
    Document(page_content="Machine learning is a subset of AI."),
    Document(page_content="The weather is sunny today."),
    Document(page_content="Deep learning uses neural networks."),
]

reranked = reranker.compress_documents(documents, "What is ML?")
for doc in reranked:
    score = doc.metadata.get("relevance_score", 0)
    print(f"{score:.3f}: {doc.page_content[:50]}")
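Since scores land in metadata under the relevance_score key, you can post-filter reranked documents by a quality threshold. A hypothetical helper (not part of sie-langchain), using a minimal stand-in for LangChain's Document so the sketch is self-contained:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Minimal stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def filter_by_score(docs, min_score=0.5):
    """Keep only documents whose reranker relevance_score meets the threshold."""
    return [d for d in docs if d.metadata.get("relevance_score", 0.0) >= min_score]

reranked = [
    Doc("Machine learning is a subset of AI.", {"relevance_score": 0.92}),
    Doc("The weather is sunny today.", {"relevance_score": 0.08}),
]
kept = filter_by_score(reranked)
print(len(kept))  # 1
```

The 0.5 threshold is arbitrary; calibrate it per model, since score distributions differ between cross-encoders and late-interaction models.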
In a retrieval pipeline, wrap the reranker in a ContextualCompressionRetriever:

from langchain.retrievers import ContextualCompressionRetriever
from sie_langchain import SIEReranker

reranker = SIEReranker(model="jinaai/jina-reranker-v2-base-multilingual", top_k=5)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 20})
)

# Retrieves 20 docs, reranks, returns top 5
results = compression_retriever.invoke("What is machine learning?")

Use SIESparseEncoder with SIEEmbeddings for hybrid dense+sparse search.

from langchain_pinecone import PineconeHybridSearchRetriever
from sie_langchain import SIEEmbeddings, SIESparseEncoder

retriever = PineconeHybridSearchRetriever(
    embeddings=SIEEmbeddings(model="BAAI/bge-m3"),
    sparse_encoder=SIESparseEncoder(model="BAAI/bge-m3"),
    index=pinecone_index
)
results = retriever.invoke("hybrid search query")
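The retriever fuses the dense and sparse result lists for you (Pinecone's hybrid search uses a weighted score combination). One common alternative fusion scheme, shown here purely for illustration, is Reciprocal Rank Fusion, which combines ranked lists without needing comparable scores:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs; score = sum of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc_a", "doc_b", "doc_c"]    # order from dense retrieval
sparse_hits = ["doc_b", "doc_d", "doc_a"]   # order from sparse retrieval
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents ranked highly by both retrievers (doc_b here) rise to the top, which is the intuition behind any hybrid fusion.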

Complete example combining embeddings, reranking, and LLM generation:

from langchain_chroma import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain.retrievers import ContextualCompressionRetriever
from sie_langchain import SIEEmbeddings, SIEReranker

# 1. Create embeddings and vector store
embeddings = SIEEmbeddings(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3"
)
documents = [
    "Machine learning is a branch of artificial intelligence.",
    "Neural networks are inspired by biological neurons.",
    "Deep learning uses multiple layers of neural networks.",
    "Python is popular for machine learning development.",
]
vectorstore = Chroma.from_texts(texts=documents, embedding=embeddings)

# 2. Create two-stage retriever with reranking
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
    top_k=2
)
retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 10})
)

# 3. Build RAG chain
template = """Answer based on the context:
Context: {context}
Question: {question}"""
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(model="gpt-4o-mini")

def format_docs(docs):
    return "\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# 4. Query
answer = chain.invoke("What is deep learning?")
print(answer)

SIEExtractor provides zero-shot extraction as a LangChain Tool. It supports all extraction types: entities (GLiNER), relations (GLiREL), classifications (GLiClass), and object detection (GroundingDINO/OWL-v2). The tool name is "sie_extract" and it returns a dict with keys entities, relations, classifications, and objects.

from sie_langchain import SIEExtractor

extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"]
)

result = extractor.invoke("Tim Cook announced new products at Apple Park in Cupertino.")
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']} ({entity['score']:.2f})")
# person: Tim Cook (0.96)
# organization: Apple (0.91)
# location: Cupertino (0.88)
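Downstream code often wants entities keyed by type rather than as a flat list. A small stdlib helper over the result shape shown above (hypothetical, not part of sie-langchain):

```python
from collections import defaultdict

def group_by_label(entities):
    """Group extracted entity texts by their label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)

entities = [
    {"label": "person", "text": "Tim Cook", "score": 0.96},
    {"label": "organization", "text": "Apple", "score": 0.91},
    {"label": "location", "text": "Cupertino", "score": 0.88},
]
print(group_by_label(entities))
# {'person': ['Tim Cook'], 'organization': ['Apple'], 'location': ['Cupertino']}
```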

Extract relationships between entities using GLiREL:

from sie_langchain import SIEExtractor

extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="jackboyla/glirel-large-v0",
    labels=["works_for", "ceo_of", "founded"]
)

result = extractor.invoke("Tim Cook is the CEO of Apple Inc.")
for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
# Tim Cook --ceo_of--> Apple Inc.

Classify text into categories using GLiClass:

from sie_langchain import SIEExtractor

extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="knowledgator/gliclass-base-v1.0",
    labels=["positive", "negative", "neutral"]
)

result = extractor.invoke("I absolutely loved this movie! The acting was superb.")
for classification in result["classifications"]:
    print(f"{classification['label']}: {classification['score']:.2f}")
# positive: 0.94
# neutral: 0.04
# negative: 0.02

LangChain’s Embeddings interface is text-only (embed_documents(texts) / embed_query(text)), so there is no native way to pass images through the integration. For image embedding with models like CLIP, SigLIP, or ColPali, use the SIE SDK directly:

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Embed an image
result = client.encode(
    "openai/clip-vit-large-patch14",
    Item(images=["photo.jpg"]),
    output_types=["dense"]
)
image_embedding = result["dense"].tolist()

# Embed text+image together (for models that support it)
result = client.encode(
    "openai/clip-vit-large-patch14",
    Item(text="A photo of a cat", images=["cat.jpg"]),
    output_types=["dense"]
)

See Encode for full SDK documentation and the Model Catalog for supported vision models.

SIEEmbeddings parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | BAAI/bge-m3 | Model to use |
| instruction | str | None | Instruction prefix for encoding |
| output_dtype | str | None | Output dtype: float32, float16, int8, binary |
| gpu | str | None | Target GPU type for routing |
| timeout_s | float | 180.0 | Request timeout in seconds |

SIEReranker parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | jinaai/jina-reranker-v2-base-multilingual | Reranker model |
| top_k | int | None | Number of documents to return |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |

The extraction model determines which result types are populated. Use GLiNER models for entities, GLiREL for relations, GLiClass for classifications, and GroundingDINO/OWL-v2 for object detection.
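Because every call returns the same four keys regardless of model, a small helper can pick out whichever result types the chosen model actually populated. A hypothetical sketch over the result shape described above (pure dict logic, no server required):

```python
def populated_results(result):
    """Return only the non-empty result types from an SIEExtractor result dict."""
    keys = ("entities", "relations", "classifications", "objects")
    return {k: result[k] for k in keys if result.get(k)}

# A GLiNER model fills only the entities key
result = {
    "entities": [{"label": "person", "text": "Tim Cook", "score": 0.96}],
    "relations": [],
    "classifications": [],
    "objects": [],
}
print(list(populated_results(result)))  # ['entities']
```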

SIEExtractor parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | urchade/gliner_multi-v2.1 | Extraction model (GLiNER, GLiREL, GLiClass, GroundingDINO, OWL-v2) |
| labels | list[str] | ["person", "organization", "location"] | Labels for extraction (entity types, relation types, or classification categories) |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
