Rust KV-cache compression for LLM inference. Implements TurboQuant (Zandieh et al., ICLR 2026) plus PQO, our variant that drops QJL, adds a fused CUDA kernel, and shrinks the cache to ~20% of its FP16 size (49% total VRAM at a 32K context). Integrates with mistral.rs.
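The core idea behind this kind of cache compression is scalar quantization of the K/V tensors with a small per-block scale. The sketch below is an illustrative 4-bit quantizer in Rust, not the actual TurboQuant/PQO algorithm (whose codebooks and fused kernel are more involved); codes are kept unpacked in `u8` for clarity, whereas a real implementation packs two codes per byte to reach the ~4-bit footprint.

```rust
/// Quantize a block of f32 values to 4-bit codes plus one f32 scale.
/// Sketch of per-block scalar quantization; not the PQO codebook scheme.
fn quantize_block(block: &[f32]) -> (Vec<u8>, f32) {
    let max_abs = block.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 7.0 };
    let codes = block
        .iter()
        .map(|&v| {
            // Map values to signed codes in [-7, 7], stored with a +8 bias.
            let q = (v / scale).round().clamp(-7.0, 7.0) as i8;
            (q + 8) as u8
        })
        .collect();
    (codes, scale)
}

/// Reconstruct approximate f32 values from codes and scale.
fn dequantize_block(codes: &[u8], scale: f32) -> Vec<f32> {
    codes.iter().map(|&c| (c as i8 - 8) as f32 * scale).collect()
}

fn main() {
    let kv = vec![0.12f32, -0.5, 0.33, 0.9, -0.91, 0.0, 0.44, -0.2];
    let (codes, scale) = quantize_block(&kv);
    let recon = dequantize_block(&codes, scale);
    let max_err = kv
        .iter()
        .zip(&recon)
        .map(|(a, b)| (a - b).abs())
        .fold(0.0f32, f32::max);
    // Round-to-nearest bounds the error by half a quantization step.
    assert!(max_err <= scale / 2.0 + 1e-6);
    println!("scale = {scale:.4}, max abs error = {max_err:.4}");
}
```

Packed, this stores 4 bits per value plus one f32 scale per block, i.e. ~25% of FP16 before further tricks; the ~20% figure in the README implies additional savings beyond plain 4-bit scalar quantization.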
AI-powered document organiser. Drop in a batch of PDFs, DOCX files, or ebooks: it extracts the document text, identifies the title, author, and year using a local or remote LLM, then moves the files into folders and/or keeps the extracted text.