
CrewAI

The sie-crewai package provides CrewAI tools and embedders: SIERerankerTool for reranking, SIEExtractorTool for extraction (entities, relations, classifications, and object detection), and SIESparseEmbedder for hybrid search.

pip install sie-crewai

This installs sie-sdk and crewai as dependencies.

Start an SIE server:
# Docker (recommended)
docker run -p 8080:8080 ghcr.io/superlinked/sie-server:default
# Or with GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:default

SIE integrates with CrewAI through two embedding approaches:

  1. Dense embeddings - Use SIE’s OpenAI-compatible API with CrewAI’s built-in embedder config
  2. Sparse embeddings - Use SIESparseEmbedder for hybrid search workflows

Configure CrewAI to use SIE’s OpenAI-compatible endpoint:

from crewai import Crew

crew = Crew(
    agents=[...],
    tasks=[...],
    embedder={
        "provider": "openai",
        "config": {
            "api_base": "http://localhost:8080/v1",
            "model": "BAAI/bge-m3"
        }
    }
)

Use SIESparseEmbedder for sparse vectors in hybrid search:

from sie_crewai import SIESparseEmbedder

sparse_embedder = SIESparseEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3"
)

# Embed documents
sparse_vectors = sparse_embedder.embed_documents([
    "Machine learning uses algorithms to learn from data.",
    "The weather is sunny today."
])
print(sparse_vectors[0].keys())  # dict_keys(['indices', 'values'])

# Embed a query (uses is_query=True for asymmetric models)
query_vector = sparse_embedder.embed_query("What is machine learning?")
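Each sparse vector is a dict of parallel `indices` and `values` lists, as the `keys()` call above shows. If you score documents yourself rather than delegating to a vector DB, query-document relevance reduces to a sparse dot product over shared indices. A minimal sketch (the `sparse_dot` helper is illustrative, not part of sie-crewai):

```python
def sparse_dot(query_vec: dict, doc_vec: dict) -> float:
    """Dot product of two sparse vectors shaped {'indices': [...], 'values': [...]}."""
    doc_map = dict(zip(doc_vec["indices"], doc_vec["values"]))
    # Only indices present in both vectors contribute to the score
    return sum(v * doc_map.get(i, 0.0) for i, v in zip(query_vec["indices"], query_vec["values"]))

doc = {"indices": [3, 17, 42], "values": [0.5, 1.2, 0.8]}
query = {"indices": [17, 99], "values": [1.0, 0.3]}
print(sparse_dot(query, doc))  # 1.2 -- only index 17 overlaps
```

In practice a vector DB with sparse-vector support does this for you; the sketch is only to make the data shape concrete.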

Complete example using SIE embeddings with a CrewAI agent for hybrid search:

from crewai import Agent, Crew, Task
from sie_crewai import SIESparseEmbedder

# 1. Configure dense embeddings via OpenAI-compatible API
embedder_config = {
    "provider": "openai",
    "config": {
        "api_base": "http://localhost:8080/v1",
        "model": "BAAI/bge-m3"
    }
}

# 2. Set up sparse embedder for hybrid search
sparse_embedder = SIESparseEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3"
)

# 3. Prepare your corpus with both dense and sparse embeddings
corpus = [
    "Machine learning is a branch of artificial intelligence.",
    "Neural networks are inspired by biological neurons.",
    "Deep learning uses multiple layers of neural networks.",
]

# Get sparse embeddings for your vector database
sparse_vectors = sparse_embedder.embed_documents(corpus)
# Store sparse_vectors in your vector DB (Qdrant, Weaviate, etc.)

# 4. Create a research agent
researcher = Agent(
    role="Research Analyst",
    goal="Find and analyze information from the knowledge base",
    backstory="Expert at finding relevant information using semantic search.",
    verbose=True
)

# 5. Define the research task
research_task = Task(
    description="Search the knowledge base for information about deep learning.",
    expected_output="A summary of findings about deep learning.",
    agent=researcher
)

# 6. Create and run the crew
crew = Crew(
    agents=[researcher],
    tasks=[research_task],
    embedder=embedder_config,
    verbose=True
)
result = crew.kickoff()
print(result)
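The example above stores sparse vectors in an external vector DB and leaves result fusion to it. If you need to merge the dense and sparse result lists yourself, reciprocal rank fusion (RRF) is a simple, score-free option. A sketch assuming each retriever returns document IDs in relevance order (`rrf_fuse` and the constant `k=60` are illustrative choices, not sie-crewai API):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists with reciprocal rank fusion: score = sum of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc3", "doc1", "doc2"]    # from dense embedding search
sparse_hits = ["doc3", "doc1", "doc4"]   # from sparse embedding search
print(rrf_fuse([dense_hits, sparse_hits]))  # doc3 first: top-ranked in both lists
```

RRF avoids normalizing dense and sparse scores against each other, which is why it is a common default for hybrid search.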

SIERerankerTool is a CrewAI BaseTool that reranks documents by relevance to a query. Agents can use it to improve search quality.

from crewai import Agent, Crew, Task
from sie_crewai import SIERerankerTool

reranker = SIERerankerTool(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

researcher = Agent(
    role="Research Analyst",
    goal="Find the most relevant information",
    backstory="Expert at surfacing the most relevant documents for a query.",
    tools=[reranker],
)

task = Task(
    description="Rerank these documents for the query 'What is deep learning?'",
    expected_output="The most relevant documents.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
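As a rough mental model, reranking reorders candidates by a query-conditioned relevance score. The toy stand-in below uses term overlap in place of the actual model's scoring (purely illustrative; the real scoring happens on the SIE server):

```python
import re

def toy_rerank(query: str, docs: list[str]) -> list[str]:
    """Order docs by naive term overlap with the query (a stand-in for real reranker scoring)."""
    q_terms = set(re.findall(r"\w+", query.lower()))

    def overlap(doc: str) -> int:
        return len(q_terms & set(re.findall(r"\w+", doc.lower())))

    return sorted(docs, key=overlap, reverse=True)

docs = [
    "The weather is sunny today.",
    "Deep learning is a subfield of machine learning.",
]
print(toy_rerank("What is deep learning?", docs)[0])
# Deep learning is a subfield of machine learning.
```

A cross-encoder reranker scores each query-document pair jointly, which is slower than embedding similarity but typically much more accurate, hence the retrieve-then-rerank pattern.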

SIEExtractorTool is a CrewAI BaseTool that extracts structured data from text. It supports all extraction types: entities (GLiNER), relations (GLiREL), classifications (GLiClass), and object detection (GroundingDINO/OWL-v2). Its _run() method formats all four result types into the string output, with separate sections for entities, relations, classifications, and objects.

from crewai import Agent, Crew, Task
from sie_crewai import SIEExtractorTool

extractor = SIEExtractorTool(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

analyst = Agent(
    role="Data Analyst",
    goal="Extract key entities from documents",
    backstory="Specialist in turning unstructured text into structured data.",
    tools=[extractor],
)

task = Task(
    description="Extract all people, organizations, and locations from: 'Tim Cook announced new products at Apple Park in Cupertino.'",
    expected_output="A list of extracted entities.",
    agent=analyst,
)

crew = Crew(agents=[analyst], tasks=[task])
result = crew.kickoff()

Extract relationships between entities using GLiREL:

from sie_crewai import SIEExtractorTool

extractor = SIEExtractorTool(
    base_url="http://localhost:8080",
    model="jackboyla/glirel-large-v0",
    labels=["works_for", "ceo_of", "founded"],
)

# Use with an agent, or call directly:
result = extractor._run("Tim Cook is the CEO of Apple Inc.")
print(result)
# Relations:
# Tim Cook --ceo_of--> Apple Inc. (score: 0.92)
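If you consume the tool's string output programmatically rather than letting an agent read it, relation lines can be parsed back into tuples. A sketch assuming the `A --label--> B (score: X)` line format shown in the example above (the parser is ours, not part of sie-crewai):

```python
import re

# Matches lines like: Tim Cook --ceo_of--> Apple Inc. (score: 0.92)
REL_LINE = re.compile(r"^(.+?) --(\w+)--> (.+?) \(score: ([0-9.]+)\)$")

def parse_relations(output: str) -> list[tuple[str, str, str, float]]:
    """Turn 'A --label--> B (score: X)' lines into (head, label, tail, score) tuples."""
    triples = []
    for line in output.splitlines():
        m = REL_LINE.match(line.strip())
        if m:
            head, label, tail, score = m.groups()
            triples.append((head, label, tail, float(score)))
    return triples

out = "Relations:\nTim Cook --ceo_of--> Apple Inc. (score: 0.92)"
print(parse_relations(out))  # [('Tim Cook', 'ceo_of', 'Apple Inc.', 0.92)]
```

If the formatted output changes between package versions, adjust the regex accordingly.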

Classify text into categories using GLiClass:

from sie_crewai import SIEExtractorTool

extractor = SIEExtractorTool(
    base_url="http://localhost:8080",
    model="knowledgator/gliclass-base-v1.0",
    labels=["positive", "negative", "neutral"],
)

result = extractor._run("I absolutely loved this movie! The acting was superb.")
print(result)
# Classifications:
# positive (score: 0.94)
# neutral (score: 0.04)
# negative (score: 0.02)

SIESparseEmbedder parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | BAAI/bge-m3 | Model to use for sparse embeddings |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
SIERerankerTool parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | jinaai/jina-reranker-v2-base-multilingual | Reranker model |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |

The extraction model determines which result types are included in the output. Use GLiNER models for entities, GLiREL for relations, GLiClass for classifications, and GroundingDINO/OWL-v2 for object detection.

SIEExtractorTool parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | urchade/gliner_multi-v2.1 | Extraction model (GLiNER, GLiREL, GLiClass, GroundingDINO, OWL-v2) |
| labels | list[str] | ["person", "organization", "location"] | Labels for extraction (entity types, relation types, or classification categories) |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
