Vision-grade OCR is here · What's new

Document OCR that actually reads the page.

Drop in any TIFF, PDF or scan. BotifyOCR runs a vision-language model on every page and returns clean Markdown, structured JSON and bounding-box overlays you can review side by side.

No signup · drag & drop in the browser · 100 free pages / day

botifyocr.app / studio
Page 1 of 2 · bboxes on
Radiology Report
Patient ID 116980 · Acquired 2026-05-28
Name
A. Doe
DOB
1984-03-12
Referring MD
Dr. Suresh K.
FINDINGS
Lungs are clear with no focal consolidation.
Cardiomediastinal silhouette unremarkable.
No pleural effusion or pneumothorax.
Mild degenerative changes of thoracic spine.
IMPRESSION
No acute cardiopulmonary process.
Extraction
Markdown · JSON · Text
# Radiology Report
Patient A. Doe · DOB 1984-03-12
## Findings
  • Lungs are clear with no focal consolidation.
  • Cardiomediastinal silhouette unremarkable.
  • No pleural effusion or pneumothorax.
## Impression
No acute cardiopulmonary process.
2 pages · 412 tokens · 1.8s · OCR complete
Built for production OCR

Everything you need to turn scans into clean, reviewable data.

BotifyOCR was built to replace brittle OCR + regex stacks with a single vision-language pipeline you can audit page-by-page.

Vision-language OCR

A multimodal LLM reads the whole page — not just glyphs — so it handles handwriting, multi-column layouts, tables and stamps.

Bounding-box review

Every token has coordinates. Toggle overlays on the page with one click to verify what was extracted and where.
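To make the overlay idea concrete, here is a minimal Python sketch of turning a token's coordinates into a pixel rectangle that could be drawn over the page image. The `Token` class, the `to_pixel_rect` helper and the normalized `(x0, y0, x1, y1)` box layout are assumptions for illustration only, not BotifyOCR's documented output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    # Assumed layout: (x0, y0, x1, y1) as fractions of page width and height.
    bbox: tuple[float, float, float, float]


def to_pixel_rect(token: Token, page_width: int, page_height: int) -> tuple[int, int, int, int]:
    """Scale a normalized box to pixel coordinates (left, top, right, bottom)."""
    x0, y0, x1, y1 = token.bbox
    return (
        round(x0 * page_width),
        round(y0 * page_height),
        round(x1 * page_width),
        round(y1 * page_height),
    )


# Hypothetical token for the "FINDINGS" heading on a 1000 x 2000 px page image.
token = Token(text="FINDINGS", bbox=(0.125, 0.25, 0.5, 0.3125))
rect = to_pixel_rect(token, page_width=1000, page_height=2000)
print(rect)
```

Keeping boxes normalized, if the output works that way, would let the same coordinates be reused at any zoom level or render resolution.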

Markdown, JSON & text

Get a structured Markdown view, raw text, or strict JSON tokens — whichever your downstream pipeline prefers.

Multi-page batching

PDFs and multi-page TIFFs are split, scheduled and recombined automatically. Pages process in parallel on GPU.
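The split, schedule and recombine flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's scheduler: `ocr_page` is a hypothetical stand-in for the real per-page model call, and a thread pool stands in for GPU scheduling.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_page(page: bytes) -> str:
    """Hypothetical placeholder for the real per-page OCR call."""
    return f"<{len(page)} bytes>"


def ocr_document(pages: list[bytes], max_workers: int = 4) -> str:
    """Process pages concurrently, then join the results in original page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, even when pages
        # finish out of order, so recombining is a simple join.
        results = list(pool.map(ocr_page, pages))
    return "\n\n".join(results)


combined = ocr_document([b"page-one", b"p2", b"third"])
print(combined)
```

The point of the sketch is the ordering guarantee: pages may complete in any order, but the recombined document keeps the source page sequence.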

GPU-accelerated

Runs on vLLM with paged attention on a single RTX 4090 / 5090 — typical pages finish in 1-2 seconds.

Self-hostable

Docker Compose deployment, environment-driven configuration, no SaaS lock-in. Keep your documents inside your VPC.

How it works

From scanned page to structured data in four steps.

01

Upload

Drop a TIFF, PDF, PNG or JPG up to 50 MB. Multi-page documents are split client-side for instant feedback.
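A client integrating with an upload step like this might pre-check files before sending them. The sketch below is hypothetical: the extension list is inferred from the formats named above, and it assumes "50 MB" means 50 MiB, which the page does not specify.

```python
from pathlib import Path

# Inferred from the formats listed above; the exact accepted set is an assumption.
ALLOWED_EXTENSIONS = {".tif", ".tiff", ".pdf", ".png", ".jpg", ".jpeg"}
# Assumes "50 MB" means 50 MiB; the real limit may be counted differently.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    errors = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        errors.append("file exceeds the 50 MB limit")
    return errors


print(validate_upload("scan.TIFF", 1024))
print(validate_upload("notes.docx", 60 * 1024 * 1024))
```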

02

Detect & route

Each page is normalized, deskewed, and routed to the best pipeline — vision LLM, layout-aware OCR or both.

03

Extract

A vLLM-served vision-language model reads the page and emits tokens with bounding boxes, reading order and confidence.
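Downstream code could consume tokens like these roughly as follows. The `OcrToken` fields, the 0-to-1 confidence scale and the 0.8 review threshold are all assumptions chosen for illustration; the actual schema may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrToken:
    text: str
    bbox: tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)
    reading_order: int                       # assumed 0-based position on the page
    confidence: float                        # assumed range 0.0 to 1.0


def in_reading_order(tokens: list[OcrToken]) -> list[OcrToken]:
    """Sort tokens into the order a person would read them."""
    return sorted(tokens, key=lambda t: t.reading_order)


def needs_review(tokens: list[OcrToken], threshold: float = 0.8) -> list[OcrToken]:
    """Pick out tokens whose confidence falls below the (hypothetical) threshold."""
    return [t for t in tokens if t.confidence < threshold]


# Made-up tokens, deliberately out of order.
tokens = [
    OcrToken("process.", (0.5, 0.9, 0.7, 0.93), 2, 0.62),
    OcrToken("No", (0.1, 0.9, 0.15, 0.93), 0, 0.99),
    OcrToken("acute", (0.17, 0.9, 0.3, 0.93), 1, 0.97),
]
line = " ".join(t.text for t in in_reading_order(tokens))
flagged = [t.text for t in needs_review(tokens)]
print(line, flagged)
```

Sorting by an explicit reading-order field rather than by box position is what keeps multi-column layouts from being interleaved.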

04

Review & export

Review side-by-side with overlays, edit if needed, then export as Markdown, JSON or plain text.
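As a rough picture of what the three export formats might contain, the Python sketch below renders one small document structure as Markdown, JSON and plain text. The `report` dictionary shape and the three helper functions are hypothetical and only mirror the demo report shown earlier on this page.

```python
import json

# Hypothetical in-memory structure for an extracted document.
report = {
    "title": "Radiology Report",
    "sections": [
        {"heading": "Findings", "lines": ["Lungs are clear with no focal consolidation."]},
        {"heading": "Impression", "lines": ["No acute cardiopulmonary process."]},
    ],
}


def to_markdown(doc: dict) -> str:
    """Render the title as a level-1 heading and sections as level-2 headings."""
    parts = [f"# {doc['title']}"]
    for section in doc["sections"]:
        parts.append(f"## {section['heading']}")
        parts.extend(section["lines"])
    return "\n".join(parts)


def to_text(doc: dict) -> str:
    """Render plain text with upper-cased section headings and no markup."""
    parts = [doc["title"]]
    for section in doc["sections"]:
        parts.append(section["heading"].upper())
        parts.extend(section["lines"])
    return "\n".join(parts)


def to_json(doc: dict) -> str:
    """Serialize the structure as indented JSON."""
    return json.dumps(doc, indent=2)


print(to_markdown(report))
```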

Use cases

Wherever paper meets pipelines.

Healthcare

Radiology, pathology and lab reports — preserve sections, tables and stamps without manual templating.

Legal & insurance

Scanned contracts, claim forms and court filings extracted as structured Markdown ready for downstream LLMs.

Finance & ops

Invoices, POs and bank statements with line-items, totals and bounding boxes for audit.

Government archives

Decades-old TIFF archives digitized at scale — handwriting, faded ink and dot-matrix prints handled.

Pricing

Start free. Scale on your terms.

Free

$0 / forever

For trying it out and small personal projects.

  • 100 pages / day
  • Markdown + JSON export
  • Browser studio
  • Community support
Open Studio
Coming soon

Pro

Soon

For teams running OCR at production volume.

  • Unlimited pages
  • API + webhook access
  • Workspace seats
  • Priority queue
Contact sales

Self-hosted

BYO GPU

For regulated industries that keep data on-prem.

  • Docker Compose stack
  • vLLM + RTX 4090/5090
  • No data egress
  • Email support
Reach out to the team

Ready to see what your scans really say?

Open the studio, drop in a document, and watch the OCR happen in seconds. No signup, no credit card.