Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Updated Feb 26, 2026 - Python
All-in-One Development Tool based on PaddlePaddle
A high-quality PDF to Markdown tool based on large language model visual recognition.
📄🔍 Parse, extract, and analyze documents with ease 📄🔍
Ray-based accelerator for the MinerU VLM inference pipeline. Lightweight, minimally invasive PDF → Markdown processing for multi-GPU setups and domestic Chinese compute environments.
A high-quality PDF to Markdown tool based on large language model visual recognition (desktop edition).
MCP server for Meta's nougat-ocr. Instruct your agent to convert academic papers to Markdown files with high mathematical accuracy.
Omni Doc Converter Agent: an all-purpose document conversion AI agent for PDF, images, DOCX, PPTX, and spreadsheets.
📝 Manage your projects and notes locally with Ironpad, a file-based system that keeps your data safe in Markdown format without cloud reliance.