flash-attention-2

Here are 22 public repositories matching this topic...

Tiny VLMs Lab is a Hugging Face Space and open-source project showcasing lightweight Vision-Language Models for image captioning, OCR, reasoning, and multimodal understanding. It offers a simple Gradio interface to upload images, query models, adjust generation settings, and export results in Markdown or PDF.

  • Updated Nov 26, 2025
  • Python

Core-OCR is an advanced, experimental Optical Character Recognition (OCR) and document analysis suite designed for highly accurate text extraction, table reconstruction, and complex visual reasoning. It is built on the Qwen2.5-VL and Qwen2-VL multimodal architectures.

  • Updated Mar 23, 2026
  • Python

A Gradio-powered web interface for performing advanced OCR tasks using the DeepSeek-OCR model. This experimental app leverages Hugging Face Transformers to process images for text extraction, document conversion, figure parsing, and object localization.

  • Updated Nov 4, 2025
  • Python

Systematically train and benchmark Mistral, Qwen2.5, and SmolLM2 on essay grading across 39 experiments through data analysis and engineering, structured preprocessing, instruction tuning, postprocessing, and leakage-aware evaluation for robust score and rationale generation.

  • Updated Aug 14, 2025
  • Jupyter Notebook

This application allows users to perform various OCR tasks such as converting documents to markdown, extracting text, locating specific text within images, and parsing figures, all through a user-friendly interface. This demo leverages the deepseek-ai/DeepSeek-OCR-2

  • Updated Feb 5, 2026
  • Python
