
llmfit mis-evaluating model fit memory? #302

@jdougan

Description


I don't understand how llmfit decides whether a given model fits. Qwen3.5-9b in Q4_K_M (unsloth; the model file is 5.29 GB — 5,680,529,408 bytes) runs just fine on my laptop GPU. According to llmfit, though, it shouldn't. It is slow (10 tok/sec, but that is untuned). If you look at the hardware in the screenshot, 32 GB of main RAM and 8 GB of VRAM clearly exceed the requirement shown in the Notes section. I'm also unclear where the 26.3 GB figure in the Memory section comes from.

Maybe a bug? Or am I just confused and it is explained somewhere I missed?

Image
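For context, here is the kind of fit check I'd expect, as a minimal sketch. Everything here is an assumption for illustration (the function name, the overhead figure, and the FP16 comparison are mine, not llmfit's actual logic):

```python
# Hypothetical model-fit check. The KV-cache overhead and the
# FP16-vs-quantized distinction are assumptions, not llmfit's real code.

GIB = 1024 ** 3

def fits(model_file_bytes: int, vram_bytes: int, ram_bytes: int,
         kv_cache_bytes: int = 2 * GIB) -> bool:
    """Do the quantized weights plus KV cache fit in combined VRAM + RAM?"""
    required = model_file_bytes + kv_cache_bytes
    return required <= vram_bytes + ram_bytes

# Figures from this report: 5.29 GB Q4_K_M file, 8 GB VRAM, 32 GB RAM.
model = 5_680_529_408
print(fits(model, 8 * GIB, 32 * GIB))  # → True: it should fit comfortably

# One way an estimate like 26.3 GB could arise: sizing the model at full
# FP16 precision (~2 bytes/parameter) instead of the quantized file size.
# A ~9B-parameter model is then ~17 GiB of weights before any KV cache.
fp16_weights = 9_000_000_000 * 2
print(fp16_weights / GIB)
```

If llmfit is computing its memory requirement from the parameter count at an unquantized precision rather than from the actual quantized file, that would explain both the 26.3 GB figure and the "doesn't fit" verdict — but that's speculation on my part.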
