Compile bug: Entry function flash_attn_tile (mangled) uses too much shared data (0xd100 bytes, 0xc000 max) #22491

### Git commit

7b8443ac786c06438e0f407b7adaa72c220b5099 and fc2b0053ffe878ff5a26934bdb555681f15bc699

### Operating systems

Linux

### GGML backends

CUDA

### Problem description & steps to reproduce

I'm trying to build on Arch Linux with CUDA 12.9 for an NVIDIA GTX 1080 Ti (compute capability 6.1). It's possible I'm not using the right C++ compiler, but the build worked up until #22286.

Admittedly I've been getting a lot of warnings saying: `nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).`

Apologies if this is redundant or duplicated (I looked but found nothing related).

### First Bad Commit

7b8443ac786c06438e0f407b7adaa72c220b5099

### Compile command

```shell
cmake -B build -DGGML_CCACHE=OFF -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="61" -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j8
```

### Relevant log output

```shell
[  8%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o
nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
ptxas error   : Entry function '_Z15flash_attn_tileILi320ELi256ELi1ELi32ELb1EEvPKcS1_S1_S1_S1_PKiPfP6float2ffffjfi5uint3iiiiiiiiiiiliiliiiiil' uses too much shared data (0xd100 bytes, 0xc000 max)
ptxas error   : Entry function '_Z15flash_attn_tileILi320ELi256ELi1ELi32ELb0EEvPKcS1_S1_S1_S1_PKiPfP6float2ffffjfi5uint3iiiiiiiiiiiliiliiiiil' uses too much shared data (0xd100 bytes, 0xc000 max)
make[2]: *** [ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make:1040: ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq320-dv256.cu.o] Error 255
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:2412: ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all] Error 2
make: *** [Makefile:146: all] Error 2
```
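For anyone reading the log, here is a small sketch that decodes the two hex sizes from the `ptxas` error and pulls the template literals out of the mangled kernel name. The numbers and the symbol are copied from the log above; the variable names (`used`, `max`, `over`, `sym`, `targs`) are just mine for illustration, and the `grep`/`sed` extraction is a rough shortcut rather than a real demangler.

```shell
#!/bin/sh
# Sizes reported by ptxas, copied from the log above.
used=$((0xd100))
max=$((0xc000))
over=$((used - max))
echo "used=${used} B, limit=${max} B ($((max / 1024)) KiB), over by ${over} B"

# Mangled kernel name from the first ptxas error line.
sym='_Z15flash_attn_tileILi320ELi256ELi1ELi32ELb1EEvPKcS1_S1_S1_S1_PKiPfP6float2ffffjfi5uint3iiiiiiiiiiiliiliiiiil'

# Integer and bool template literals are encoded as L<type><value>E
# (e.g. Li320E, Lb1E); list them in order, space separated.
targs=$(printf '%s\n' "$sym" \
  | grep -oE 'L[ib][0-9]+E' \
  | sed -E 's/^L[ib]([0-9]+)E$/\1/' \
  | paste -sd' ' -)
echo "template args: ${targs}"
```

So the kernel asks for 53504 bytes of shared memory against a 49152 byte (48 KiB) limit, which is 4352 bytes too many. The first two template values, 320 and 256, line up with `dkq320-dv256` in the name of the object file that failed.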
