EXPLORE
PaddlePaddle/PaddleOCR MIRROR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
ai4science chineseocr document-parsing document-translation kie
Python 0 0
opendatalab/MinerU MIRROR
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
ai4science document-analysis extract-data layout-analysis ocr
Python 0 0