GeeeekExplorer / nano-vllm Public

Fork 1.9k
Star 12.8k

Pull requests: GeeeekExplorer/nano-vllm

43 Open 58 Closed

Pull requests list

Fix scheduler.postprocess return type

#200 opened Apr 11, 2026 by KinglittleQ

fix(engine): resolve crash on fully cached (100% match) prompt via L-1 fallback strategy

#199 opened Apr 10, 2026 by fishAndShrimp

fix: pass rope_scaling=None for Qwen3 to avoid unhashable dict error

#198 opened Apr 9, 2026 by CruxZhou

docs: use hf download in README

#194 opened Apr 2, 2026 by sablecode

[Feat] Add PyTorch Profiler support for performance analysis

#193 opened Mar 31, 2026 by RagingSilence

fix: correct postprocess return type annotation

#192 opened Mar 30, 2026 by Desirer

Fix CUDA graph block_tables shape mismatch

#191 opened Mar 24, 2026 by ilrewrite

Feature/support llama3

#188 opened Mar 21, 2026 by wudong5

fix: update download command for model weights in README

#185 opened Mar 12, 2026 by SYaoJun

feat: INT8 KV cache quantization (~48% memory reduction)

#184 opened Mar 9, 2026 by dzhengAP

docs: add Chinese README and language links

#183 opened Mar 8, 2026 by LJS1124

fix: rope_scaling unhashable dict error with transformers>=5.1.0

#182 opened Mar 8, 2026 by chenwenxiaolive

1 task done

refactor(block_manager): replace numpy with array for token ID hashing

#180 opened Mar 7, 2026 by fly1989

add a Dockerfile for nano-vllm

#178 opened Mar 3, 2026 by pacoxu

[Doc]Add Repository Architecture Overview Document

#177 opened Feb 26, 2026 by CalvinXKY

Avoid per-step allocations in CUDA-graph decode(fix #175)

#176 opened Feb 23, 2026 by MrAnayDongre

Update embed_head.py

#174 opened Feb 21, 2026 by TianduoWang

enable 'slots=True' for dataclasses

#172 opened Feb 9, 2026 by IceCreamMilkyTea

fix: modify input when input is fp32

#171 opened Feb 8, 2026 by philhuan

fix(rms_norm): add copy for residual

#169 opened Jan 28, 2026 by tpoisonooo

test

#160 opened Jan 15, 2026 by volcano98

fix: clean up hash_to_block_id mapping when deallocating blocks

#153 opened Jan 6, 2026 by ggboooy

remove hard code for block_size

#148 opened Dec 29, 2025 by guodongxiaren

bug for tensor parallelism # issue 144

#145 opened Dec 17, 2025 by LiaoMengqi

Fix: unsqueeze on the last dim for VocabParallelEmbedding

#142 opened Dec 10, 2025 by ljwljwljwljw
