Here are 19 public repositories matching the token-compression topic.
🦞 LLM Token Compression & Reduction Tool: cut AI agent token costs by up to 97% with 6-layer deterministic context compression for AI agent workspaces, no LLM required. Prompt compression, context-window optimization, and cost reduction for any LLM pipeline.
Updated Mar 10, 2026 · Python
📚 Collection of token-level model compression resources.
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
Updated Jul 1, 2025 · Python
Token-Oriented Object Notation: a compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation).
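The token-saving idea behind such compact formats can be sketched in Python: in a list of JSON objects every record repeats the same keys, while a tabular layout states the keys once in a header row and then lists only values. This is an illustrative approximation of the concept, not the actual TOON syntax, and the sample records are made up.

```python
import json

# Sample records: in JSON, the keys "id", "name", "score" repeat per object.
records = [
    {"id": 1, "name": "alpha", "score": 0.91},
    {"id": 2, "name": "beta", "score": 0.87},
    {"id": 3, "name": "gamma", "score": 0.78},
]

as_json = json.dumps(records)

# Compact tabular layout: keys appear once in a header row,
# values follow one record per line (illustrative, not the TOON spec).
header = ",".join(records[0])
rows = "\n".join(",".join(str(v) for v in r.values()) for r in records)
as_tabular = f"{header}\n{rows}"

print(len(as_json), len(as_tabular))  # the tabular form is shorter
```

The saving grows with the number of records, since the per-record key overhead is paid only once; this is the same intuition that tabular or schema-first encodings exploit when feeding structured data to an LLM.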
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
Updated Feb 13, 2026 · Python
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
Updated Mar 3, 2026 · Python
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
Updated Feb 12, 2026 · Python
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
Updated Feb 24, 2026 · Shell
[ICLR 2026] MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding
Updated Feb 27, 2026 · Python
😎 Awesome papers on token redundancy reduction
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3.
Updated Nov 11, 2025 · Python
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Updated Feb 12, 2026 · Python
Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model
Updated May 13, 2025 · Python
[arXiv 2025 Preprint] HiPrune: a training-free visual token pruning method for VLM acceleration.
Updated Nov 10, 2025 · Jupyter Notebook
Hardened Docker container & Compose setup for openclaw.
Updated Mar 14, 2026 · TypeScript
🛠️ Implement TOON in PHP for efficient serialization of JSON-like data, optimizing parsing for Large Language Models while maintaining clarity and structure.
Compress React/Next.js files by ~40% for AI assistants. MCP server + encoder.
Updated Feb 25, 2026 · JavaScript
Maximum meaning, minimum tokens. Rust-based markdown compression for LLM workflows.
Updated Mar 5, 2026 · TypeScript