fix: correct ROCm GPU name extraction and discrete GPU selection#307
Closed
octo-patch wants to merge 347 commits into AlexsJones:main from
Conversation
The name 'huggingface-cli' is deprecated. Their CLI is now called 'hf': https://huggingface.co/docs/huggingface_hub/en/guides/cli
fix: invoke hf instead of huggingface-cli
Restructure single-crate project into Cargo workspace:
- llmfit-core: core library (hardware detection, model fitting, providers)
- llmfit-tui: CLI/TUI binary (unchanged user experience)
- llmfit-desktop: macOS desktop app via Tauri 2

The workspace split enables the desktop app to reuse core logic while keeping the CLI/TUI as the default build target. Moved SortColumn to the core crate for shared use across frontends.

Desktop app features:
- System specs display (RAM, CPU, GPU)
- Model compatibility table with fit scoring
- Dark theme UI using project icon from assets/icon.svg
- Tauri 2 with minimal permissions

No changes to data files — moved as-is via git mv.

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxes53235@gmail.com>
Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
- Remove llama title/subtitle header from desktop app
- Show total + available RAM separately
- Render all detected GPUs with VRAM, backend, and count
- Show unified memory indicator for Apple Silicon
- Responsive grid layout for system spec cards

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…t/workspace-and-desktop-v2 feat: workspace restructure + Tauri desktop app
Adds a build-desktop job that builds the Tauri desktop app for both aarch64-apple-darwin and x86_64-apple-darwin targets. DMGs are uploaded alongside CLI tarballs in GitHub Releases.

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…t/workspace-and-desktop-v2 ci: build macOS desktop app (.dmg) in release workflow
Click any model row to open a detail modal showing:
- Parameters, quantization, runtime, score, speed, use case
- Memory utilization bar (color-coded green/yellow/red)
- Fit analysis with notes
- Installed status badge
- Download button (pulls via Ollama when available)
- Pull progress bar with live status polling

New Tauri commands: start_pull, poll_pull, is_ollama_available
Added runtime, installed, utilization_pct to model data.

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
The Tauri build runs with working-directory: llmfit-desktop, but the workspace target dir may be at the repo root or under the subcrate. Search both locations and fail with diagnostics if neither contains the bundle.

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…t/desktop-modal feat: model detail modal + Ollama download in desktop app
Signed-off-by: Alex <alexsimonjones@gmail.com>
- release.yml now excludes v*-mac tags (CLI + crate + homebrew only)
- New release-desktop.yml triggers on v*-mac tags
- Uses --bundles app to produce a .app bundle without code signing
- Searches both target/ and llmfit-desktop/target/ for the bundle
- Desktop releases no longer slow down normal CLI releases

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
Problem: Multi-GPU systems had their VRAM summed into a single pool, leading to overly optimistic model fit recommendations, since most inference runtimes (llama.cpp, Ollama, etc.) don't support tensor parallelism by default.

Changes:
- NVIDIA detection: group by model, keep max per-card VRAM (never sum)
- AMD ROCm detection: collect per-card VRAM, use max per-card
- Refactor nvidia-smi parsing into a separate testable function
- Update display text from "GB VRAM total" → "GB VRAM each"
- Add unit tests for multi-GPU parsing behavior

This gives more realistic recommendations by assuming models must fit on a single GPU unless explicitly configured for tensor parallelism.
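The "group by model, keep max per-card VRAM" behavior can be sketched as follows. The helper name `per_card_vram` and its signature are illustrative, not the project's actual API:

```rust
use std::collections::HashMap;

/// Group detected GPUs by model name, keeping the card count and the
/// MAXIMUM per-card VRAM — never the sum — so fit recommendations assume
/// a model must fit on one card unless tensor parallelism is configured.
fn per_card_vram(gpus: &[(&str, u64)]) -> HashMap<String, (usize, u64)> {
    let mut out: HashMap<String, (usize, u64)> = HashMap::new();
    for (name, vram_gb) in gpus {
        let entry = out.entry(name.to_string()).or_insert((0, 0));
        entry.0 += 1; // count of identical cards
        entry.1 = entry.1.max(*vram_gb); // max per-card, not summed
    }
    out
}
```

A dual RTX 4090 box would thus report "24 GB VRAM each" rather than a misleading 48 GB pool.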
fix: use per-card VRAM instead of summed for multi-GPU systems
fix: typo in CHANGELOG.md (suppor -> support)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fix compile warnings in providers and TUI
…AlexsJones#49)
- For dense models: use choose_quant before deciding the GPU path
- For MoE models: try the quantization hierarchy in moe_offload_path
- Add moe_memory_for_quant helper to compute MoE memory at a specific quant
- Add test_moe_offload_tries_lower_quantization test
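The hierarchy walk described in the bullets above can be sketched like this. The quant names, effective bits-per-weight values, and the simple bytes-from-bits estimate are all illustrative assumptions, not the project's actual tables:

```rust
/// Rough GB estimate for the ACTIVE experts of a MoE model at a given
/// quantization: billions of params * bits per weight / 8 bits per byte.
fn moe_memory_for_quant(active_params_b: f64, bits_per_weight: f64) -> f64 {
    active_params_b * bits_per_weight / 8.0
}

/// Walk a quantization hierarchy from highest to lowest precision and
/// return the first quant whose active-expert footprint fits in VRAM.
fn choose_moe_quant(active_params_b: f64, vram_gb: f64) -> Option<&'static str> {
    // (name, effective bits per weight), highest precision first.
    let hierarchy = [("Q8_0", 8.5), ("Q6_K", 6.6), ("Q4_K_M", 4.8), ("Q3_K_M", 3.9)];
    hierarchy
        .iter()
        .find(|(_, bits)| moe_memory_for_quant(active_params_b, *bits) <= vram_gb)
        .map(|(name, _)| *name)
}
```

With ~13 B active parameters and 8 GB of VRAM, this walks past Q8_0 and Q6_K and settles on Q4_K_M instead of rejecting the model outright.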
- Add a Remote Ollama instances section to the README
- Documents the OLLAMA_HOST env var for custom endpoints
- Addresses issue AlexsJones#40 (the feature already exists but was undocumented)
- Includes examples for remote servers, custom ports, Docker, etc.
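A minimal sketch of how a client can honour OLLAMA_HOST, assuming the variable may hold either a bare `host:port` or a full URL; the resolver takes the variable's value as a parameter so it is testable without touching process environment (the function name is hypothetical):

```rust
/// Resolve the Ollama endpoint from an OLLAMA_HOST value, falling back
/// to Ollama's default local endpoint when the variable is unset/empty.
fn resolve_ollama_url(env_value: Option<&str>) -> String {
    match env_value {
        Some(host) if !host.is_empty() => {
            // Accept both "host:port" and full "http(s)://host:port" forms.
            if host.starts_with("http://") || host.starts_with("https://") {
                host.to_string()
            } else {
                format!("http://{host}")
            }
        }
        // 11434 is Ollama's default listening port.
        _ => "http://localhost:11434".to_string(),
    }
}
```

At a call site this would be fed with `std::env::var("OLLAMA_HOST").ok().as_deref()`.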
docs: document OLLAMA_HOST environment variable for remote connections
…ysfs Improve GPU identification fallback on Linux containers
- Rename llmfit-tui package to llmfit for crates.io continuity
- Add homepage and keywords to llmfit-core for publishing
- Update authors field to proper format
- Add a version requirement for the llmfit-core dependency

Fixes AlexsJones#58

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
- Publish llmfit-core first (dependency)
- Wait for the crates.io index to update
- Then publish llmfit (depends on llmfit-core)

Signed-off-by: Three Foxes (in a Trenchcoat) <threefoxesyes3inatrenchcoat@gmail.com>
…/crates-io-metadata fix: correct crates.io metadata and prepare for publishing
ci: enable windows build targets
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
…-android-hw-detection Fix Android CPU and Vulkan GPU detection fallback
Add more lfm models
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
…rmux-gpu-limitations docs: document Android GPU detection limitations
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
- test_gguf_source_deserialization — GgufSource JSON round-trips correctly
- test_gguf_sources_default_to_empty — models without gguf_sources in JSON default to []
- test_catalog_popular_models_have_gguf_sources — 5 well-known models (Llama-3.3-70B, Qwen2.5-7B, etc.) have non-empty gguf_sources in the catalog
- test_catalog_gguf_sources_have_valid_repos — every gguf_source in the catalog has owner/repo format, non-empty provider, and contains GGUF
- test_catalog_has_significant_gguf_coverage — at least 25% of catalog models have GGUF sources (currently 30%)

providers.rs (7 tests):
- test_hf_name_to_gguf_candidates_generates_common_patterns — heuristic generates bartowski, ggml-org, TheBloke candidates
- test_hf_name_to_gguf_candidates_strips_owner — strips the Org/ prefix correctly
- test_lookup_gguf_repo_known_mappings — hardcoded mappings resolve for known models
- test_lookup_gguf_repo_unknown_returns_none — unknown models return None
- test_has_gguf_mapping_matches_known_models — boolean check works
- test_gguf_candidates_fallback_covers_major_providers — fallback covers all 3 providers and all end in -GGUF
- test_gguf_candidates_known_mapping_returns_single — hardcoded mapping returns exactly 1 result

Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
Signed-off-by: AlexsJones <alexsimonjones@gmail.com>
The JSON output (--json flag and API) was missing `moe_offloaded_gb`, so MoE models showed only active-expert VRAM as `memory_required_gb` without indicating the additional RAM needed for inactive experts.

Add `moe_offloaded_gb` and `total_memory_gb` (VRAM + offloaded RAM) to both display and API JSON serializers so consumers can see the full memory footprint.

Closes AlexsJones#230

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
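The shape of the added fields can be sketched as below. The struct name, field layout, and hand-rolled serializer are illustrative, not the project's actual serde-based code; only the three field names come from the commit message:

```rust
/// Illustrative fit result carrying the two memory components.
struct FitResult {
    memory_required_gb: f64, // VRAM for active experts
    moe_offloaded_gb: f64,   // RAM for inactive (offloaded) experts
}

impl FitResult {
    /// Emit all three fields so consumers see the full footprint:
    /// total_memory_gb = VRAM + offloaded RAM.
    fn to_json(&self) -> String {
        let total = self.memory_required_gb + self.moe_offloaded_gb;
        format!(
            "{{\"memory_required_gb\":{:.1},\"moe_offloaded_gb\":{:.1},\"total_memory_gb\":{:.1}}}",
            self.memory_required_gb, self.moe_offloaded_gb, total
        )
    }
}
```

A MoE model needing 6 GB of VRAM plus 18 GB of offloaded RAM now reports a 24 GB total instead of a misleading 6 GB.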
…-fields fix: surface MoE offloaded RAM in JSON output
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Add support for Docker Desktop's built-in Model Runner as a fourth runtime provider alongside Ollama, llama.cpp, and MLX.

Detection probes the OpenAI-compatible /v1/models endpoint on localhost:12434 (configurable via DOCKER_MODEL_RUNNER_HOST). Downloads use `docker model pull`.

A new scraper (scripts/scrape_docker_models.py) queries Docker Hub's ai/ namespace and cross-references against the HF model database to produce an embedded catalog (docker_models.json) of confirmed available models. Only models verified in the catalog appear as downloadable via Docker.

- Provider: detect, list installed, pull via docker CLI
- TUI: status bar shows Docker availability, 'D' in Inst column, provider picker includes Docker Model Runner
- Inst column refactored from enum to bitfield for extensibility
- Makefile: `make update-catalogs` refreshes all scrapers and rebuilds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
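The detection probe can be sketched with the standard library only. The function name and the raw-TCP HTTP request are illustrative (the real provider may use an HTTP client); the /v1/models path and 12434 default come from the commit message above:

```rust
use std::io::{Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Probe the OpenAI-compatible /v1/models endpoint; any HTTP response
/// within the timeout is taken as "Model Runner present".
fn probe_model_runner(host: &str) -> bool {
    // Resolve "host:port" and connect with a short timeout so a missing
    // runner does not stall hardware detection.
    let Some(addr) = host.to_socket_addrs().ok().and_then(|mut a| a.next()) else {
        return false;
    };
    let Ok(mut stream) = TcpStream::connect_timeout(&addr, Duration::from_millis(500)) else {
        return false;
    };
    let _ = stream.set_read_timeout(Some(Duration::from_millis(500)));
    let request =
        format!("GET /v1/models HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n");
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }
    let mut buf = [0u8; 16];
    matches!(stream.read(&mut buf), Ok(n) if n > 0 && buf.starts_with(b"HTTP/"))
}
```

A caller would pass `DOCKER_MODEL_RUNNER_HOST` if set, else `"localhost:12434"`.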
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
Signed-off-by: Alex <alexsimonjones@gmail.com>
When rocm-smi reports multiple GPU agents (e.g. a discrete RX 7900 XTX
alongside the integrated Raphael/iGPU on a Ryzen 9800X3D), two bugs
caused the wrong name to be returned:
1. The --showproductname parser used split(':').nth(1) which returns
the field label ("Card series") instead of the model value.
The line format is "GPU[N] : Card series : <name>", so the value is
after the second colon; fixed to splitn(3, ':').nth(2).
2. VRAM filtering correctly identified the discrete GPU by its byte count,
but the GPU index was not tracked, so the subsequent name lookup had no
way to target the right GPU[N] in the product-name output.
Fixed by tracking (gpu_index, vram_bytes) tuples and passing the
first discrete GPU index to the name parser.
Extracted both parsing steps into parse_rocm_vram_indexed and
parse_rocm_gpu_name helper methods so they can be unit-tested without
a real ROCm installation. Five new unit tests are added.
Fixes AlexsJones#271
Owner: Please cut a new PR against HEAD, this has too many changes from a bad rebase.
Fixes #271

Problem

On systems with both a discrete GPU and an iGPU visible to ROCm (e.g. Ryzen 9800X3D + RX 7900 XTX), two bugs in detect_amd_gpu_rocm_info caused the wrong GPU name to be reported and the GPU to still show up with gpu_name: "Card Series":

1. Wrong split index: The --showproductname parser used split(':').nth(1), which returns the field label (literally "Card series") instead of the model value. The line format is GPU[N] : Card series : <name>, so the value lives after the second colon.
2. GPU index not tracked: VRAM filtering correctly identified discrete GPU entries by byte count, but didn't record which GPU[N] indices they belong to. The subsequent name lookup therefore had no way to target the correct GPU in the product-name output and always returned the first matching line (which could be the iGPU).

Solution

- Use splitn(3, ':').nth(2) so the actual model name is returned instead of the label.
- Track (gpu_index, vram_bytes) tuples. The first discrete GPU index is then passed to the name parser, which performs a targeted GPU[N]-prefixed scan before falling back to the first match.
- Extract parse_rocm_vram_indexed and parse_rocm_gpu_name helper methods so they can be unit-tested without a real ROCm installation.

Testing

Five new unit tests added in hardware::tests:
- test_parse_rocm_vram_indexed_single_gpu — single GPU VRAM parsed with correct index
- test_parse_rocm_vram_indexed_dual_gpu_apu — two entries (discrete + iGPU) parsed with correct indices
- test_parse_rocm_gpu_name_single_gpu — card series value extracted (not the label)
- test_parse_rocm_gpu_name_prefers_target_index — discrete GPU name returned when iGPU comes first
- test_parse_rocm_gpu_name_falls_back_without_index — fallback path works when no target index given

All tests pass (cargo test -p llmfit-core -- rocm). No hardware required to run the tests.
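As a sketch of the two helpers, assuming the rocm-smi line shapes described in this PR ("GPU[N] : Card series : <name>" for names, and a per-card "VRAM Total Memory (B)" line); the exact field labels are assumptions and may differ across rocm-smi versions:

```rust
/// Collect (gpu_index, vram_bytes) pairs from lines such as
/// "GPU[1] : VRAM Total Memory (B): 25753026560".
fn parse_rocm_vram_indexed(output: &str) -> Vec<(usize, u64)> {
    let mut cards = Vec::new();
    for line in output.lines() {
        if !line.contains("VRAM Total Memory") {
            continue;
        }
        // The GPU index sits between the square brackets.
        let index = line
            .split(|c| c == '[' || c == ']')
            .nth(1)
            .and_then(|s| s.parse::<usize>().ok());
        // The byte count is everything after the LAST colon.
        let bytes = line
            .rsplit(':')
            .next()
            .and_then(|s| s.trim().parse::<u64>().ok());
        if let (Some(i), Some(b)) = (index, bytes) {
            cards.push((i, b));
        }
    }
    cards
}

/// Extract the model name after the SECOND colon of a
/// "GPU[N] : Card series : <name>" line, preferring the target index.
fn parse_rocm_gpu_name(output: &str, target_index: Option<usize>) -> Option<String> {
    let prefix = target_index.map(|i| format!("GPU[{i}]"));
    let mut fallback = None;
    for line in output.lines() {
        if !line.contains("Card series") {
            continue;
        }
        // split(':').nth(1) would yield the label " Card series";
        // splitn(3, ':').nth(2) yields the actual model value.
        if let Some(name) = line.splitn(3, ':').nth(2) {
            let name = name.trim().to_string();
            if let Some(p) = &prefix {
                if line.starts_with(p.as_str()) {
                    return Some(name); // targeted GPU[N] match
                }
            }
            fallback.get_or_insert(name); // first match, used as fallback
        }
    }
    fallback
}
```

On the Raphael + RX 7900 XTX example above, the VRAM pass picks the discrete card's index by byte count, and passing that index makes the name pass skip the iGPU line.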