naver-clova-ix/donut-base-finetuned-docvqa

Donut model fine-tuned on DocVQA. It was introduced in the paper OCR-free Document Understanding Transformer by Geewook Kim et al. and first released in this repository.
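As a rough illustration of how a DocVQA-tuned Donut checkpoint is typically queried, the sketch below assumes the Hugging Face `transformers` library (with `torch` and `Pillow` installed) rather than the inference engine this page belongs to. The helper names `build_prompt` and `answer_question` are hypothetical and exist only for this example. The prompt template follows the public Donut DocVQA usage pattern.

```python
import re

MODEL_ID = "naver-clova-ix/donut-base-finetuned-docvqa"


def build_prompt(question: str) -> str:
    """Wrap a question in the task tokens the DocVQA checkpoint expects."""
    return f"<s_docvqa><s_question>{question}</s_question><s_answer>"


def answer_question(image_path: str, question: str) -> dict:
    """Hypothetical helper: run one question against one document image."""
    # Heavy dependencies are imported lazily so the prompt helper above
    # stays usable without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    model.eval()

    # Encode the page image and the question prompt separately.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        build_prompt(question), add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values.to(device),
            decoder_input_ids=decoder_input_ids.to(device),
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    # Strip padding/end tokens and the leading task token, then parse
    # the remaining tagged sequence into a dictionary.
    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = sequence.replace(processor.tokenizer.eos_token, "")
    sequence = sequence.replace(processor.tokenizer.pad_token, "")
    sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
    return processor.token2json(sequence)
```

Calling `answer_question("page.png", "When is the coffee break?")` with a local image path (the file name here is a placeholder) downloads the checkpoint on first use and should return a dictionary parsed from the generated sequence.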

Overview

Architecture: Encoder-Decoder
Parameters: 110M
Tasks: Extract
Outputs: text_regions
License: MIT

Benchmarks

DocVQA

general kie en

Visual question answering on document images

Corpus: 5,188 Queries: 5,188
Quality: ANLS 0.6350
Performance L4-SPOT b1 c4
Performance L4 b1 c16
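The quality figure above is an ANLS score (Average Normalized Levenshtein Similarity), the usual DocVQA metric. The sketch below shows the commonly used definition, assuming case-insensitive comparison and a threshold of 0.5. The function names are hypothetical, and the evaluation harness behind the number on this page may differ in its normalization details.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]


def anls(predictions, references, threshold=0.5):
    """Mean over questions of the best similarity to any accepted answer.

    predictions: list of predicted answer strings, one per question.
    references:  list of lists of accepted answers, one list per question.
    """
    if not predictions:
        return 0.0
    total = 0.0
    for pred, answers in zip(predictions, references):
        p = pred.strip().lower()
        best = 0.0
        for ans in answers:
            a = ans.strip().lower()
            longest = max(len(p), len(a))
            nl = levenshtein(p, a) / longest if longest else 0.0
            # Answers at or beyond the threshold distance score zero.
            score = 1.0 - nl if nl < threshold else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions)
```

Under this definition, a prediction that differs from an accepted answer by a small OCR-style slip still earns partial credit, while an unrelated answer earns none.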
