Build a multimodal wine recommender with OCR
What this is
Real product flows often need search and extraction at the same time. A user might know they want “something fruity and not too tannic”, or they might just have a photo of a bottle they liked. Most stacks would solve those with two separate services and two separate codebases.
This demo wires both into one app on SIE. Type taste preferences and get ranked wine recommendations, or upload a bottle photo and get the wine identified from its label. The recommendation flow uses encode and score. The label flow uses extract for OCR. Both run through one SIE endpoint, behind a single Next.js UI.
The demo is structured so each half can also be run standalone. That makes it a useful reference if you want to lift either the retrieval logic or the OCR matching into your own product flow.
How it is wired
- wine_flavor/: retrieval and reranking from flavor + structure preferences. Uses encode and score.
- wine_picture_detection/: OCR-based wine label detection. Uses extract.
- Root app.py: FastAPI app that joins the two flows and serves the Next.js frontend.
The duplicated database files (wine_flavor.db) and local .env setup are intentional, so each subproject can be run in isolation without depending on the full root app.
What you can do
- Enter flavor and structure preferences to get wine recommendations
- Compare recommendation behavior across different reranking approaches
- Upload a wine label image and inspect the OCR-driven bottle matching flow
- Use the combined app as a reference for wiring multiple SIE primitives into one user-facing demo
Project structure
- app.py: demo backend that wires OCR and retrieval into one FastAPI app
- app/: Next.js frontend for the demo UI
- wine_flavor/: standalone retrieval and reranking prototype
- wine_picture_detection/: standalone OCR and label-matching prototype
SIE features used
- encode for retrieval embeddings
- score for reranking candidate wines
- extract for OCR-based wine label detection
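To make the split between encode and score concrete, here is a minimal two-stage sketch. It does not call SIE: `toy_encode` and `toy_score` are hypothetical stand-ins (a bag-of-words vector and a word-overlap score) for whatever the real primitives return, and the catalog entries are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical catalog; names, profiles, and reviews are invented.
CATALOG = [
    {"name": "Wine A", "profile": "fruity light low tannin",
     "review": "soft and easy with gentle tannin"},
    {"name": "Wine B", "profile": "full-bodied high tannin oak",
     "review": "dense and grippy with dark fruit"},
    {"name": "Wine C", "profile": "fruity medium acidity",
     "review": "fruity and juicy with low tannin"},
]


def toy_encode(text: str) -> Counter:
    """Stand-in for an embedding call: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_score(query: str, document: str) -> float:
    """Stand-in for a reranker: fraction of query words found in the document."""
    q = set(query.lower().split())
    d = set(document.lower().replace(",", " ").split())
    return len(q & d) / len(q) if q else 0.0


def recommend(query: str, n_candidates: int = 2, top_k: int = 1) -> list:
    q_vec = toy_encode(query)
    # Stage 1: retrieve N candidates by vector similarity on the flavor profile.
    candidates = sorted(
        CATALOG,
        key=lambda w: cosine(q_vec, toy_encode(w["profile"])),
        reverse=True,
    )[:n_candidates]
    # Stage 2: rerank only those candidates against their reviews.
    reranked = sorted(
        candidates, key=lambda w: toy_score(query, w["review"]), reverse=True
    )
    return [w["name"] for w in reranked[:top_k]]


print(recommend("fruity low tannin", n_candidates=2, top_k=2))
# ['Wine C', 'Wine A']
```

The first stage ranks Wine A highest on profile similarity, but the second stage moves Wine C to the top because its review matches the query more closely. That reordering is the reason to keep retrieval and reranking as separate steps.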
Why the OCR flow matters
The OCR side of the demo shows that SIE is not only useful for text retrieval. In this example, extract is used to pull readable text from a bottle label image, then that text is matched against the local wine catalog to identify the bottle.
This is important because real product flows often combine search and extraction rather than using only one primitive. A user may not know the exact wine name, but they may still have a label photo. The OCR path turns that image into usable text and then connects it back to the recommendation and catalog experience.
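The matching step described above can be sketched as follows. This is an illustration under assumptions, not the demo's actual matcher: it uses `difflib.SequenceMatcher` from the standard library as a simple fuzzy-matching technique, and `match_label`, `min_score`, and the catalog entries are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical catalog entries, invented for illustration.
CATALOG = [
    "Chateau Exemple Bordeaux 2018",
    "Domaine Test Chablis 2020",
    "Casa Muestra Rioja Reserva 2016",
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so OCR line breaks do not matter."""
    return " ".join(text.lower().split())


def similarity(ocr_text: str, candidate: str) -> float:
    """Character-level similarity in [0, 1] between OCR text and a catalog name."""
    return SequenceMatcher(None, normalize(ocr_text), normalize(candidate)).ratio()


def match_label(ocr_text: str, catalog=CATALOG, min_score: float = 0.5):
    """Return (best catalog name or None, score) for the extracted label text."""
    best = max(catalog, key=lambda name: similarity(ocr_text, name))
    score = similarity(ocr_text, best)
    return (best, score) if score >= min_score else (None, score)


# OCR output is often noisy: line breaks, odd casing, "O" read instead of "0".
name, score = match_label("CHATEAU EXEMPLE\nBordeaux\n2O18")
print(name)
```

The threshold keeps an unreadable label from being forced onto the nearest catalog entry, so the caller can tell the user that no match was found.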
The OCR pipeline is also intentionally model-flexible. You can use this example to try different OCR-capable extraction models through SIE without rewriting the application flow, which makes it a useful reference for developers who want to evaluate image-to-text approaches quickly.
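One way to keep the flow independent of the model is to pass the extractor in as a plain callable. The sketch below assumes that pattern. `make_extractor`, `identify`, `fake_run_model`, and the model names are all hypothetical, and `fake_run_model` stands in for the real extraction request.

```python
from typing import Callable, List, Optional

# An extractor is any callable that turns image bytes into text.
Extractor = Callable[[bytes], str]


def make_extractor(model_name: str,
                   run_model: Callable[[str, bytes], str]) -> Extractor:
    """Bind a model name to a generic 'run this model on this image' function."""
    def extractor(image: bytes) -> str:
        return run_model(model_name, image)
    return extractor


def identify(image: bytes, extractor: Extractor,
             catalog: List[str]) -> Optional[str]:
    """Application flow: unchanged no matter which extractor is passed in."""
    text = extractor(image).lower()
    for name in catalog:
        if name.lower() in text:
            return name
    return None


def fake_run_model(model_name: str, image: bytes) -> str:
    # Hypothetical stand-in for the real extraction request.
    return f"[{model_name}] Domaine Test Chablis 2020"


catalog = ["Domaine Test Chablis"]
for model in ("ocr-model-a", "ocr-model-b"):  # placeholder model names
    print(model, identify(b"", make_extractor(model, fake_run_model), catalog))
```

Swapping models then means changing one string, while `identify` stays untouched.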
Schema design
Wine recommendation
    flowchart TD
        A1["User preferences"] --> B["Normalization"]
        A2["Wine structural and flavor Attributes"] --> B
        B --> C["Vectorization from user preferences and wine attributes"]
        C --> D["1st Retrieval - List of N candidates"]
        D --> E["Pull reviews for candidate wines"]
        E --> F1["Standard reranking on candidate reviews"]
        E --> F2["Custom reranking with embeddings on flavor profiles and reviews"]
        F1 --> G1["Final Wine List"]
        F2 --> G2["Final Wine List"]
Wine identification
    flowchart TD
        A["Image upload"] --> B["Analysis of image quality"]
        B --> C["OCR / Text extraction"]
        C --> D["Score wines against the extracted content"]
        D --> E["Wine identification"]
Prerequisite
Running the full demo
The full app runs through Docker Compose:
- backend: FastAPI on http://localhost:8000
- frontend: Next.js on http://localhost:3000
From the repo root:
    cd examples/wine-recommender
    cp .env.example .env
    # If SIE is not running on your host at port 8080, edit CLUSTER_URL in .env.
    docker compose up --build
Make sure ports 3000 and 8000 are free before starting the stack.
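If you would rather check the ports from a script than by hand, a small standard-library sketch like the one below works. `port_is_free` and `busy_ports` are hypothetical helpers, not part of the repo. They only detect listeners reachable on 127.0.0.1.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts a TCP connection on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success, an error code otherwise.
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports=(3000, 8000)) -> list:
    """List the ports from `ports` that already have a listener."""
    return [p for p in ports if not port_is_free(p)]


if __name__ == "__main__":
    taken = busy_ports()
    if taken:
        print(f"Ports in use: {taken}")
    else:
        print("Ports 3000 and 8000 are free.")
```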
App URLs:
- Frontend: http://localhost:3000
- Backend: http://localhost:8000
Stop it with:
    docker compose down
Environment files
- The root app and wine_flavor/ subproject use the root .env
- wine_picture_detection/ can also use its own local .env / .env.example
- The duplicated setup is intentional so both subprojects can be run individually
If you are running the full demo, put the required backend keys in the root .env. For local or self-hosted SIE without auth, leave API_KEY= blank.
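The blank-key convention can be handled in a few lines on the backend side. The sketch below is an assumption about how such a loader might look, not the repo's code. `CLUSTER_URL` and `API_KEY` come from the .env described above, while `load_sie_config`, the fallback URL, and the bearer-token header format are hypothetical.

```python
import os


def load_sie_config(env=os.environ) -> dict:
    """Read CLUSTER_URL and API_KEY; a blank API_KEY means no auth header."""
    # Assumption: fall back to a local SIE on port 8080 when CLUSTER_URL is unset.
    url = env.get("CLUSTER_URL", "http://localhost:8080").rstrip("/")
    api_key = env.get("API_KEY", "").strip()
    headers = {}
    if api_key:
        # Assumption: bearer-token auth. Check the SIE docs for the real scheme.
        headers["Authorization"] = f"Bearer {api_key}"
    return {"url": url, "headers": headers}


# Self-hosted SIE without auth: API_KEY left blank, so no header is sent.
print(load_sie_config({"CLUSTER_URL": "http://localhost:8080", "API_KEY": ""}))
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.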
What to try
- Start the full app and open http://localhost:3000.
- Try recommendation queries with different structure preferences such as: high acidity + low sweetness, full-bodied + high tannin.
- Upload the sample wine label or your own label image and inspect the detected wine match.
- Compare how the recommendation and OCR flows use different SIE primitives inside the same app.
- Pay attention to the OCR output quality and matching behavior. The image path is useful for understanding how extraction can support search and retrieval when the user starts from a photo instead of structured text.
- This repo is optimized for demoing the product idea, not for production deployment or large-scale operation.
- The main app is intentionally simple: app.py wires together the OCR module and the retrieval module rather than hiding them behind a larger service architecture.
By Valentin Marek.