Memory calculator #1608

@jeremyfowers

Description

The biggest source of user friction right now is making sure a good model with the right settings (i.e., a recipe) is loaded for agentic workloads like OpenClaw and Claude Code.

We can't automatically choose a recipe for users because we don't know what will cause them to OOM. Knowing this requires a memory calculator that takes into account the model size and how the model's architecture changes the RAM requirement as critical settings like ctx_size vary.

One way to solve this is brute force: load all of the top tool-calling models with varying context sizes, measure memory pressure, record the data in a table, and do a lookup on that table at runtime.
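A minimal sketch of what that runtime lookup could look like. All names and numbers here are made-up placeholders, not real measurements; the idea is just to linearly interpolate peak RAM between the context sizes we actually measured, clamping at the ends of the measured range:

```python
from bisect import bisect_left

# Hypothetical measurements: model -> sorted list of (ctx_size, peak RAM in GiB).
# These values are illustrative placeholders, not real benchmark data.
MEASUREMENTS = {
    "example-7b-q4_k_m": [(4096, 5.2), (16384, 6.8), (65536, 12.1)],
}

def estimate_peak_gib(model: str, ctx_size: int) -> float:
    """Estimate peak RAM by interpolating between measured context sizes."""
    points = MEASUREMENTS[model]
    ctxs = [c for c, _ in points]
    i = bisect_left(ctxs, ctx_size)
    if i == 0:
        return points[0][1]          # clamp below the measured range
    if i == len(points):
        return points[-1][1]         # clamp above the measured range
    (c0, g0), (c1, g1) = points[i - 1], points[i]
    frac = (ctx_size - c0) / (c1 - c0)
    return g0 + frac * (g1 - g0)
```

Interpolation keeps the table small (a handful of context sizes per model) while still giving a usable estimate for any ctx_size in between.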

Alternatively, we could try to come up with a formula for RAM usage. This might be tricky, though, because it's highly model-architecture- and implementation-dependent, and can change across llama.cpp releases.
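For reference, a first-order version of such a formula is weights plus KV cache. The sketch below assumes an f16 KV cache (2 bytes per element) and deliberately ignores compute/scratch buffers, which are exactly the part that varies across llama.cpp releases:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_size: int, bytes_per_elem: int = 2) -> int:
    # One K and one V tensor per layer; f16 by default (2 bytes/elem).
    return 2 * n_layers * n_kv_heads * head_dim * ctx_size * bytes_per_elem

def rough_ram_bytes(gguf_file_bytes: int, n_layers: int, n_kv_heads: int,
                    head_dim: int, ctx_size: int) -> int:
    # Weights in memory are roughly the GGUF file size; add the KV cache.
    # Ignores compute/scratch buffers, which differ per llama.cpp release.
    return gguf_file_bytes + kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_size)
```

For a GQA model with 32 layers, 8 KV heads, and head_dim 128 at ctx_size 8192, this gives exactly 1 GiB of KV cache, which is why ctx_size dominates the difference between models that fit and models that OOM.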

Ideas welcome!

cc @sawansri
