How to extract entities and structured data with SIE

SIE’s extract primitive pulls structured information from unstructured content. It handles named entity recognition (NER), relation extraction, text classification, and vision tasks including captioning and OCR. Models run on your own infrastructure with zero per-call API costs.

Python

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

text = Item(text="Apple CEO Tim Cook announced the iPhone 16 in Cupertino.")
result = client.extract(
    "urchade/gliner_multi-v2.1",
    text,
    labels=["person", "organization", "product", "location"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']} (score: {entity['score']:.2f})")
# organization: Apple (score: 0.95)
# person: Tim Cook (score: 0.93)
# product: iPhone 16 (score: 0.89)
# location: Cupertino (score: 0.87)

TypeScript

import { SIEClient } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");

const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Apple CEO Tim Cook announced the iPhone 16 in Cupertino." },
  { labels: ["person", "organization", "product", "location"] },
);
for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text} (score: ${entity.score.toFixed(2)})`);
}
// organization: Apple (score: 0.95)
// person: Tim Cook (score: 0.93)
// product: iPhone 16 (score: 0.89)
// location: Cupertino (score: 0.87)

await client.close();

For model recommendations, see the full model catalog.

Input Types

Item accepts three input modes depending on the model:

text: plain string. Used by GLiNER, GLiREL, GLiClass, and the rest of the text-only extractors.
images: list of image bytes (or {data, format} dicts in Python). Used by Florence-2, Donut, GroundingDINO, OWLv2, and image-input OCR models like zai-org/GLM-OCR, lightonai/LightOnOCR-2-1B, and PaddlePaddle/PaddleOCR-VL-1.5. See Vision Tasks and OCR.
document: raw file bytes (PDF, DOCX, HTML, MD, TXT, RTF, ODT, PPTX, XLSX, CSV). Used by the multi-page docling parser. The Python SDK auto-detects the format from a path suffix; bytes-based callers pass format explicitly. See OCR Docling.
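
As a rough illustration of how the three modes divide up local files, the sketch below routes a path to an input mode by its suffix. `input_mode` is a hypothetical helper, not part of the SDK. The document suffixes come from the list above; the image suffixes are an assumption, since the page does not list supported image formats.

```python
from pathlib import Path

# Document formats named in the list above.
DOCUMENT_SUFFIXES = {
    ".pdf", ".docx", ".html", ".md", ".txt",
    ".rtf", ".odt", ".pptx", ".xlsx", ".csv",
}
# Assumption: common image suffixes; the supported set is not stated here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def input_mode(path: str) -> str:
    """Return 'images' or 'document' for a file path, based on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "images"
    if suffix in DOCUMENT_SUFFIXES:
        return "document"
    raise ValueError(f"no input mode known for suffix {suffix!r}")


print(input_mode("report.PDF"))
# document
print(input_mode("scan.png"))
# images
```

Strings you already hold in memory go in `text` and need no routing.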

Named Entity Recognition

GLiNER models support zero-shot NER: define any entity types you need at query time, with no predefined schema.

Python

result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="The merger between Acme Corp and Beta Inc requires FTC approval."),
    labels=["company", "regulatory_body", "legal_action"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']}")
# company: Acme Corp
# company: Beta Inc
# regulatory_body: FTC

TypeScript

const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "The merger between Acme Corp and Beta Inc requires FTC approval." },
  { labels: ["company", "regulatory_body", "legal_action"] },
);
for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text}`);
}
// company: Acme Corp
// company: Beta Inc
// regulatory_body: FTC
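
In the output above, the `legal_action` label matched no span, so nothing is printed for it. If you want to see which requested labels came back empty, a small helper along these lines may help. `unmatched_labels` is a hypothetical name, and the entity list is copied by hand from the example output rather than fetched from a server.

```python
def unmatched_labels(requested: list[str], entities: list[dict]) -> list[str]:
    """Return the requested labels that produced no entities, in request order."""
    found = {entity["label"] for entity in entities}
    return [label for label in requested if label not in found]


labels = ["company", "regulatory_body", "legal_action"]
# Entities shaped like result["entities"] in the example above.
entities = [
    {"label": "company", "text": "Acme Corp"},
    {"label": "company", "text": "Beta Inc"},
    {"label": "regulatory_body", "text": "FTC"},
]
print(unmatched_labels(labels, entities))
# ['legal_action']
```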

Entity Positions

Each entity includes character positions for highlighting or downstream processing:

Python

result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="Tim Cook works at Apple."),
    labels=["person", "organization"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: '{entity['text']}' [{entity['start']}:{entity['end']}]")
# person: 'Tim Cook' [0:8]
# organization: 'Apple' [18:23]

TypeScript

const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Tim Cook works at Apple." },
  { labels: ["person", "organization"] },
);
for (const entity of result.entities) {
  console.log(`${entity.label}: '${entity.text}' [${entity.start}:${entity.end}]`);
}
// person: 'Tim Cook' [0:8]
// organization: 'Apple' [18:23]
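
One way to use these offsets for highlighting is sketched below. It assumes `end` is exclusive, which matches the `[0:8]` span for 'Tim Cook' above, and it skips any span that overlaps an earlier one. `highlight` is a hypothetical helper, and the entity dicts are copied by hand from the example output.

```python
def highlight(text: str, entities: list[dict]) -> str:
    """Wrap each entity span in [label: ...] markers using start/end offsets."""
    parts = []
    cursor = 0
    # Walk spans left to right; drop any span that overlaps one already emitted.
    for entity in sorted(entities, key=lambda e: e["start"]):
        if entity["start"] < cursor:
            continue
        parts.append(text[cursor:entity["start"]])
        span = text[entity["start"]:entity["end"]]
        parts.append(f"[{entity['label']}: {span}]")
        cursor = entity["end"]
    parts.append(text[cursor:])
    return "".join(parts)


text = "Tim Cook works at Apple."
entities = [
    {"label": "person", "text": "Tim Cook", "start": 0, "end": 8},
    {"label": "organization", "text": "Apple", "start": 18, "end": 23},
]
print(highlight(text, entities))
# [person: Tim Cook] works at [organization: Apple].
```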

Batch Extraction

Python

documents = [
    Item(id="doc-1", text="Microsoft acquired Activision for $69 billion."),
    Item(id="doc-2", text="Sundar Pichai leads Google's AI initiatives."),
]
results = client.extract(
    "urchade/gliner_multi-v2.1",
    documents,
    labels=["person", "organization", "money"]
)

TypeScript

const documents = [
  { id: "doc-1", text: "Microsoft acquired Activision for $69 billion." },
  { id: "doc-2", text: "Sundar Pichai leads Google's AI initiatives." },
];
const results = await client.extract(
  "urchade/gliner_multi-v2.1",
  documents,
  { labels: ["person", "organization", "money"] },
);
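
The batch examples stop at the call itself. The sketch below shows one way to consume the results, assuming the call returns a list with one result per input item and that each result carries the `id` given on its `Item`. `entities_by_id` is a hypothetical helper, and `sample_results` is hand-written illustrative data, not real model output.

```python
def entities_by_id(results: list[dict]) -> dict[str, list[tuple[str, str]]]:
    """Map each result's id to its (label, text) pairs."""
    return {
        result["id"]: [
            (entity["label"], entity["text"])
            for entity in result.get("entities", [])
        ]
        for result in results
    }


# Illustrative stand-in for the `results` list returned by the batch call.
sample_results = [
    {"id": "doc-1", "entities": [
        {"label": "organization", "text": "Microsoft"},
        {"label": "organization", "text": "Activision"},
        {"label": "money", "text": "$69 billion"},
    ]},
    {"id": "doc-2", "entities": [
        {"label": "person", "text": "Sundar Pichai"},
        {"label": "organization", "text": "Google"},
    ]},
]

for doc_id, pairs in entities_by_id(sample_results).items():
    print(doc_id, pairs)
```

Keying on `id` rather than list position keeps the lookup valid even if you later filter or reorder the results.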

Response Format

The ExtractResult contains different fields depending on the extraction type used:

Field	Type	When present
`id`	`str or None`	Always; set to the input item's `id` when one was provided
`entities`	`list[Entity]`	NER models (GLiNER)
`relations`	`list[Relation]`	Relation extraction (GLiREL)
`classifications`	`list[Classification]`	Classification models (GLiClass)
`objects`	`list[DetectedObject]`	Object detection (GroundingDINO, OWLv2)
`data`	`dict`	Document/composite extractors (Docling, Donut, document-mode Florence-2)
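
Because the populated fields vary by model, generic code may want to check which ones a result actually carries. A minimal sketch follows, assuming the result behaves like a dict (as in the Python examples above) and that unused fields are either absent or empty. `populated_fields` is a hypothetical helper.

```python
# Field names taken from the table above.
EXTRACTION_FIELDS = ("entities", "relations", "classifications", "objects", "data")


def populated_fields(result: dict) -> list[str]:
    """List the extraction fields that are present and non-empty in a result."""
    return [name for name in EXTRACTION_FIELDS if result.get(name)]


# Hand-written result shaped like a NER response.
ner_result = {
    "id": "doc-1",
    "entities": [{"label": "person", "text": "Tim Cook"}],
    "relations": [],
}
print(populated_fields(ner_result))
# ['entities']
```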

Entity Fields

Field	Type	Description
`text`	`str`	Extracted text span
`label`	`str`	Entity type label
`score`	`float`	Confidence score from 0 to 1
`start`	`int`	Start character position
`end`	`int`	End character position
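
These fields are enough for simple post-processing. The sketch below drops low-confidence entities and groups the rest by label. `group_by_label` is a hypothetical helper, the 0.9 threshold is arbitrary, and the entity list reuses the values from the first example on this page.

```python
from collections import defaultdict


def group_by_label(entities: list[dict], min_score: float = 0.5) -> dict[str, list[str]]:
    """Group entity texts by label, keeping only scores >= min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


entities = [
    {"label": "organization", "text": "Apple", "score": 0.95},
    {"label": "person", "text": "Tim Cook", "score": 0.93},
    {"label": "product", "text": "iPhone 16", "score": 0.89},
    {"label": "location", "text": "Cupertino", "score": 0.87},
]
print(group_by_label(entities, min_score=0.9))
# {'organization': ['Apple'], 'person': ['Tim Cook']}
```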

HTTP API

The server returns msgpack by default. To send and receive JSON instead, set both the Content-Type and Accept headers:

curl -X POST http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "items": [{"text": "Tim Cook is the CEO of Apple."}],
    "params": {"labels": ["person", "organization"]}
  }'

See the HTTP API Reference.
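
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl command above; `build_extract_request` is a hypothetical helper, and the response schema is not shown here, so the code only constructs the request. Sending it requires a running SIE server.

```python
import json
import urllib.request


def build_extract_request(base_url: str, model: str,
                          texts: list[str], labels: list[str]) -> urllib.request.Request:
    """Build a JSON POST request for the extract endpoint shown above."""
    body = json.dumps({
        "items": [{"text": text} for text in texts],
        "params": {"labels": labels},
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/extract/{model}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",  # ask for JSON instead of msgpack
        },
        method="POST",
    )


req = build_extract_request(
    "http://localhost:8080",
    "urchade/gliner_multi-v2.1",
    ["Tim Cook is the CEO of Apple."],
    ["person", "organization"],
)
print(req.get_method(), req.full_url)
# POST http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```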

Framework Integrations

Extraction is also available through the framework integrations below, not just the native SDK:

Framework	Component	Returns
LangChain	`SIEExtractor`	Dict with `entities`, `relations`, `classifications`, `objects`
LlamaIndex	`create_sie_extractor_tool`	Dict with `entities`, `relations`, `classifications`, `objects`
Haystack	`SIEExtractor`	Typed outputs: `Entity`, `Relation`, `Classification`, `DetectedObject`
DSPy	`SIEExtractor`	`dspy.Prediction` with extraction fields
CrewAI	`SIEExtractorTool`	Formatted string with all extraction types

Frequently Asked Questions

What is zero-shot NER?
Zero-shot NER means you can define your entity types at query time without fine-tuning a model. GLiNER models like urchade/gliner_multi-v2.1 accept arbitrary label strings and extract matching spans from text. There is no fixed list of entity types.

Does SIE support relation extraction?
Yes. GLiREL models extract relationships between entities (for example, “CEO of”, “acquired by”). See Relations and Classification.

Can SIE extract data from PDFs and images?
Yes. SIE supports four dedicated OCR models: zai-org/GLM-OCR, lightonai/LightOnOCR-2-1B, PaddlePaddle/PaddleOCR-VL-1.5, and docling (multi-page PDF/DOCX/HTML). They convert documents to Markdown while preserving tables and layout. Donut and Florence-2 are also available for image captioning and visual QA. See OCR and Vision Tasks.

Which model should I use for entity extraction?
urchade/gliner_multi-v2.1 is a strong default for multilingual NER. It handles zero-shot extraction across 100+ languages. Browse all extraction models in the model catalog.