Compile bug:

### Git commit

COMPILED, BUT CAN'T RUN ON RTX5080

GPU：RTX 5080（Blackwell）
DRIVER：591.86
CUDA：13.1 / NVCC 13.2
CMake：GGML_CUDA=ON, CMAKE_CUDA_ARCHITECTURES=120
LOG：mmq_x_best=0 + mmq.cuh:4135: fatal error

### Operating systems

Windows

### GGML backends

CUDA

### Problem description & steps to reproduce

Compile and run llama-server.exe on Windows，llama-server.exe runs well., but can't visit.

### First Bad Commit

_No response_

### Compile command

```shell
cmake -B build `
  -DGGML_CUDA=ON `
  -DCMAKE_CUDA_ARCHITECTURES="120" `
  -DGGML_CUDA_MMQ=OFF `
  .
```

### Relevant log output

```shell
slot launch_slot_: id  0 | task 0 | processing task, is_child = 0
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 64000, n_keep = 0, task.n_tokens = 22
slot update_slots: id  0 | task 0 | n_tokens = 0, memory_seq_rm [0, end)
slot init_sampler: id  0 | task 0 | init sampler, took 0.00 ms, tokens: text = 22, total = 22
slot update_slots: id  0 | task 0 | prompt processing done, n_tokens = 22, batch.n_tokens = 22
srv  log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
mmq_x_best=0
D:\LLM\llama.cpp\main\llama.cpp\ggml\src\ggml-cuda\template-instances\../mmq.cuh:4135: fatal error

[INFO] Server stopped
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compile bug: #22499

Git commit

Operating systems

GGML backends

Problem description & steps to reproduce

First Bad Commit

Compile command

Relevant log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Compile bug: #22499

Description

Git commit

Operating systems

GGML backends

Problem description & steps to reproduce

First Bad Commit

Compile command

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions