Git commit
Compiled, but can't run on RTX 5080
GPU: RTX 5080 (Blackwell)
Driver: 591.86
CUDA: 13.1 / NVCC 13.2
CMake: GGML_CUDA=ON, CMAKE_CUDA_ARCHITECTURES=120
Log: mmq_x_best=0 + mmq.cuh:4135: fatal error
Operating systems
Windows
GGML backends
CUDA
Problem description & steps to reproduce
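Compile and run llama-server.exe on Windows. llama-server.exe starts and runs normally, but it can't be reached: the first request ends with the fatal error shown under "Relevant log output".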
First Bad Commit
No response
Compile command
cmake -B build `
-DGGML_CUDA=ON `
-DCMAKE_CUDA_ARCHITECTURES="120" `
-DGGML_CUDA_MMQ=OFF `
.
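As a sanity check on the architecture value passed in the command above, here is a minimal sketch that maps the compute capability reported by nvidia-smi to a CMAKE_CUDA_ARCHITECTURES value. It assumes a driver recent enough to support the compute_cap query field. The helper names (cuda_arch_from_compute_cap, parse_gpu_line, query_gpus) are made up for illustration and are not part of llama.cpp.

```python
import subprocess


def cuda_arch_from_compute_cap(compute_cap: str) -> str:
    """Convert a compute capability such as '12.0' into a CMake arch such as '120'."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def parse_gpu_line(line: str) -> tuple[str, str]:
    """Split one 'name, compute_cap' CSV line into (name, cmake_arch)."""
    # rsplit keeps any commas inside the GPU name intact.
    name, compute_cap = (field.strip() for field in line.rsplit(",", 1))
    return name, cuda_arch_from_compute_cap(compute_cap)


def query_gpus() -> list[tuple[str, str]]:
    """Ask nvidia-smi for every GPU's name and compute capability.

    Assumption: nvidia-smi is on PATH and supports the compute_cap field.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling query_gpus() on the affected machine and comparing the second element of each tuple with the value given to -DCMAKE_CUDA_ARCHITECTURES would confirm whether the build targets the installed GPU. This only rules out an architecture mismatch; it does not explain the mmq.cuh error by itself.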
Relevant log output
slot launch_slot_: id 0 | task 0 | processing task, is_child = 0
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 64000, n_keep = 0, task.n_tokens = 22
slot update_slots: id 0 | task 0 | n_tokens = 0, memory_seq_rm [0, end)
slot init_sampler: id 0 | task 0 | init sampler, took 0.00 ms, tokens: text = 22, total = 22
slot update_slots: id 0 | task 0 | prompt processing done, n_tokens = 22, batch.n_tokens = 22
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
mmq_x_best=0
D:\LLM\llama.cpp\main\llama.cpp\ggml\src\ggml-cuda\template-instances\../mmq.cuh:4135: fatal error
[INFO] Server stopped