mrrtmob/kiri-ocr

Kiri OCR 📄

Kiri OCR is a lightweight OCR library for English and Khmer documents. It provides document-level text detection, recognition, and rendering capabilities.

🚀 Try the Live Demo | 📚 Full Documentation

✨ Key Features

  • High Accuracy: Transformer model with hybrid CTC + attention decoder
  • Bilingual: Native support for English and Khmer, including mixed-language text
  • Document Processing: Automatic text line and word detection
  • Streaming: Real-time character-by-character output (like LLM streaming)
  • Easy to Use: Simple Python API and CLI

📦 Installation

pip install kiri-ocr

💻 Quick Start

CLI Tool

kiri-ocr document.jpg

Python API

from kiri_ocr import OCR

# Initialize (auto-downloads from Hugging Face)
ocr = OCR()

# Extract text from document
text, results = ocr.extract_text('document.jpg')
print(text)

# Get detailed box-by-box results
for line in results:
    print(f"{line['text']} (confidence: {line['confidence']:.1%})")
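A common follow-up is to separate lines the model is unsure about so they can be checked by hand. The sketch below assumes each entry in `results` is a dict with `text` and `confidence` keys, as in the loop above, and that `confidence` is a fraction between 0 and 1 (implied by the `:.1%` format). The helper name `split_by_confidence` and the 0.8 threshold are illustrative and not part of Kiri OCR.

```python
from typing import Dict, List, Tuple


def split_by_confidence(
    results: List[Dict], threshold: float = 0.8
) -> Tuple[List[Dict], List[Dict]]:
    """Split OCR line results into (trusted, needs_review) by confidence."""
    trusted = [r for r in results if r["confidence"] >= threshold]
    needs_review = [r for r in results if r["confidence"] < threshold]
    return trusted, needs_review


# Stand-in data shaped like the `results` list above (values are made up).
sample = [
    {"text": "Invoice 2024", "confidence": 0.97},
    {"text": "T0tal: 15.00", "confidence": 0.62},
]

trusted, needs_review = split_by_confidence(sample)
print([r["text"] for r in trusted])
print([r["text"] for r in needs_review])
```

With the real library, pass the `results` value returned by `ocr.extract_text(...)` in place of `sample`.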

Decoding Methods

Choose the decoding method based on your speed/quality tradeoff:

# Fast (CTC) - Fastest, good for batch processing
ocr = OCR(decode_method="fast")

# Accurate (Decoder) - Balanced speed and quality (default)
ocr = OCR(decode_method="accurate")

# Beam Search - Best quality, slowest
ocr = OCR(decode_method="beam")
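To give a feel for why the CTC path tends to be the fastest, here is a minimal sketch of generic greedy CTC decoding: take the best class per frame, merge consecutive repeats, then drop blanks. It is a textbook illustration, not Kiri OCR's actual implementation; the function name, the blank index of 0, and the sample frame IDs are all assumptions.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_ids: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse per-frame argmax IDs: merge repeats, then remove blanks."""
    decoded: List[int] = []
    previous = None
    for idx in frame_ids:
        # Keep a symbol only when it differs from the previous frame
        # and is not the blank token.
        if idx != previous and idx != blank:
            decoded.append(idx)
        previous = idx
    return decoded


# Hypothetical per-frame argmax output; the blank between the two 3s
# is what allows the same symbol to appear twice in a row.
frames = [0, 3, 3, 0, 3, 5, 5, 0]
print(ctc_greedy_decode(frames))
```

This is a single pass over the frames with no step-by-step decoder calls, whereas attention decoding and beam search generate the output token by token.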

Streaming Recognition

Get character-by-character output like LLM streaming:

from kiri_ocr import OCR

ocr = OCR(decode_method="accurate")

# Stream characters as they're decoded
for chunk in ocr.extract_text_stream_chars('document.jpg'):
    print(chunk['token'], end='', flush=True)
    if chunk['document_finished']:
        print()  # Done!
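If you also need the full text once streaming ends, the chunks can be accumulated as they arrive. The sketch below relies only on the `token` and `document_finished` keys used above; `collect_stream` and `fake_stream` are hypothetical helpers, and the fake generator stands in for the real stream so the example runs without a model.

```python
from typing import Dict, Iterable, Iterator


def collect_stream(chunks: Iterable[Dict]) -> str:
    """Join streamed tokens into one string, stopping at the final chunk."""
    parts = []
    for chunk in chunks:
        parts.append(chunk["token"])
        if chunk["document_finished"]:
            break
    return "".join(parts)


def fake_stream() -> Iterator[Dict]:
    """Stand-in generator that mimics the chunk shape shown above."""
    text = "Hi!"
    for i, ch in enumerate(text):
        yield {"token": ch, "document_finished": i == len(text) - 1}


print(collect_stream(fake_stream()))
```

With the real library, the same helper would be called as `collect_stream(ocr.extract_text_stream_chars('document.jpg'))`.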

📚 Documentation

Full documentation is available on the project Wiki.

📊 Benchmark

Results on synthetic test images (10 popular fonts):

[Figure: Benchmark Graph]

📁 Project Structure

kiri_ocr/
├── core.py               # OCR class
├── model.py              # Transformer model
├── training.py           # Training code
├── cli.py                # Command-line interface
└── detector/             # Text detection
    ├── db/               # DB detector
    └── craft/            # CRAFT detector

☕ Support

If you find this project useful:

Join our Discord Community: https://discord.gg/Vcrw274RVC

⚖️ License

Apache License 2.0
