llmfit mis-evaluating model fit memory? #302
I don't understand how llmfit evaluates whether a given model fits. Qwen3.5-9b in Q4_K_M (unsloth; the model file is 5.29 GB, i.e. 5,680,529,408 bytes) runs just fine on my laptop GPU, but according to llmfit it shouldn't. It is slow (10 tok/sec, though that is untuned). If you look at the hardware in the screenshot, 32 GB of main RAM plus 8 GB of VRAM clearly exceeds the requirement in the Notes section. I'm also unclear where the 26.3 GB figure in the Memory section comes from.
Maybe a bug? Or am I just confused and it is explained somewhere I missed?
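For context on where a number like 26.3 GB could come from: I don't know llmfit's actual formula, but a common heuristic in fit checkers is to estimate from parameter count at a fixed precision (often fp16) plus KV-cache at full context, rather than from the quantized file size on disk. Here is a rough sketch of that heuristic; the function name, architecture defaults, and overhead factor are all hypothetical, not taken from llmfit:

```python
# Hypothetical VRAM-estimation heuristic, NOT llmfit's actual code.
# Architecture numbers (layers, kv heads, head dim, context) are
# illustrative guesses for a ~9B model, not official Qwen3.5 values.
def estimate_memory_gb(n_params_billion, bytes_per_weight,
                       ctx_len=32768, n_layers=48, n_kv_heads=8,
                       head_dim=128, kv_bytes=2, overhead=1.2):
    # Weight memory: parameter count times storage width per weight.
    weights = n_params_billion * 1e9 * bytes_per_weight
    # KV cache: 2 tensors (K and V) per layer, sized
    # kv_heads * head_dim * ctx_len, at kv_bytes each.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes
    # Fixed fudge factor for activations / runtime buffers.
    return (weights + kv_cache) * overhead / 1e9

# An fp16-based estimate for a 9B model lands far above the 5.29 GB
# Q4_K_M file size, in the same ballpark as the reported figure:
fp16_estimate = estimate_memory_gb(9, bytes_per_weight=2.0)
q4_estimate = estimate_memory_gb(9, bytes_per_weight=0.63)  # ~Q4_K_M density
print(f"fp16 assumption: {fp16_estimate:.1f} GB")
print(f"quantized:       {q4_estimate:.1f} GB")
```

If llmfit does something like this and ignores the on-disk quantization, that would explain both the 26.3 GB Memory figure and the "doesn't fit" verdict for a model that actually runs in 8 GB of VRAM plus offload.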
