Skip to content

Compile bug: #22499

@pkmcenter

Description

@pkmcenter

Git commit

COMPILED, BUT CAN'T RUN ON RTX5080

GPU:RTX 5080(Blackwell)
DRIVER:591.86
CUDA:13.1 / NVCC 13.2
CMake:GGML_CUDA=ON, CMAKE_CUDA_ARCHITECTURES=120
LOG:mmq_x_best=0 + mmq.cuh:4135: fatal error

Operating systems

Windows

GGML backends

CUDA

Problem description & steps to reproduce

Compile and run llama-server.exe on Windows,llama-server.exe runs well., but can't visit.

First Bad Commit

No response

Compile command

cmake -B build `
  -DGGML_CUDA=ON `
  -DCMAKE_CUDA_ARCHITECTURES="120" `
  -DGGML_CUDA_MMQ=OFF `
  .

Relevant log output

slot launch_slot_: id  0 | task 0 | processing task, is_child = 0
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 64000, n_keep = 0, task.n_tokens = 22
slot update_slots: id  0 | task 0 | n_tokens = 0, memory_seq_rm [0, end)
slot init_sampler: id  0 | task 0 | init sampler, took 0.00 ms, tokens: text = 22, total = 22
slot update_slots: id  0 | task 0 | prompt processing done, n_tokens = 22, batch.n_tokens = 22
srv  log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
mmq_x_best=0
D:\LLM\llama.cpp\main\llama.cpp\ggml\src\ggml-cuda\template-instances\../mmq.cuh:4135: fatal error

[INFO] Server stopped

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions