Describe the bug
The GPU Memory Budget subsection of VRAM Estimation appears to be off by roughly two orders of magnitude (gpu_memory_utilization prints as 11200%, and the usable-memory and max-context figures are inflated to match). I didn't see anything obvious in the yaml file for the recipe that could cause the issue.
To Reproduce
Run sparkrun run @experimental/qwen3.5-397b-a17b-int4-autoround-2x-vllm
Diagnostics
{14:24}|spark@spark:~ ➭ sparkrun run @experimental/qwen3.5-397b-a17b-int4-autoround-2x-vllm
sparkrun v0.2.25
Runtime: vllm-distributed
Image: ghcr.io/spark-arena/dgx-vllm-eugr-nightly-tf5:latest
Model: Intel/Qwen3.5-397B-A17B-int4-AutoRound
Mode: cluster (2 nodes)
VRAM Estimation:
Model dtype: int4
KV cache dtype: bfloat16
Architecture: 60 layers, 2 KV heads, 256 head_dim
Model weights: 210.78 GB
KV cache: 30.00 GB (max_model_len=262,144)
Tensor parallel: 2
Per-GPU total: 120.39 GB
DGX Spark fit: YES
GPU Memory Budget:
gpu_memory_utilization: 11200%
Usable GPU memory: 13552.0 GB (121 GB x 11200%)
Available for KV: 13446.6 GB
Max context tokens: 234,996,514
Context multiplier: 896.4x (vs max_model_len=262,144)
Hosts: default cluster 'default'
Head: 127.0.0.1
Workers: 192.168.1.116
Local/Remote
- I am running sparkrun from the spark head node.
Additional context
N/A
Suggested Fix
Unsure
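One purely speculative guess, not a confirmed diagnosis: the printed figures are self-consistent with gpu_memory_utilization being treated as a raw percentage (112) where a fraction was expected, and then being run through percent formatting a second time. The sketch below uses hypothetical values chosen only to reproduce the numbers in the diagnostics above (121 GB per GPU, 210.78 GB weights across TP=2); none of these names come from the sparkrun source.

```python
# Hypothesis sketch: a fraction-vs-percent mixup in the budget math.
# All constants below are taken from the diagnostic output in this report;
# UTIL is a guessed mis-parsed value, not anything confirmed in sparkrun.
GPU_TOTAL_GB = 121            # per-GPU memory shown in the log
UTIL = 112                    # hypothetical: "112%" parsed as 112, not 1.12
WEIGHTS_PER_GPU_GB = 210.78 / 2  # model weights split across tensor parallel = 2

usable = GPU_TOTAL_GB * UTIL               # 13552.0 -- matches the log
available = usable - WEIGHTS_PER_GPU_GB    # 13446.61 -- matches the log

# Python's '%' format type multiplies by 100 again, giving the 11200%:
print(f"gpu_memory_utilization: {UTIL:.0%}")      # 11200%
print(f"Usable GPU memory: {usable:.1f} GB")      # 13552.0 GB
print(f"Available for KV: {available:.1f} GB")    # 13446.6 GB
```

If that guess is right, the fix would be normalizing the configured value to a 0-1 fraction once, before both the arithmetic and the display formatting.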