
CUDA illegal memory access when Gaussian opacity < 1/255 in duplicateToTilesTouched (compact box)#58

Open
LiSaiCSU wants to merge 2 commits into fastgs:main from LiSaiCSU:main


@LiSaiCSU LiSaiCSU commented Jun 4, 2026

Summary

Training crashes with CUDA error: illegal memory access during _C.rasterize_gaussians (often reported asynchronously at (radii > 0).nonzero() in Python). Root cause: compact-box tile culling computes log(con_o.w * 255.0f) without guarding against con_o.w <= 1/255, where the logarithm is zero or negative, while the rasterizer already skips splats with effective alpha below 1/255.

Environment

  • OS: Windows 11
  • GPU: NVIDIA RTX 5090 (sm_120)
  • PyTorch: 2.11 + CUDA 12.8
  • Dataset: COLMAP, ~100 views, images rescaled to ~1600×900
  • Command: python train.py -s data -m data

Reproduction

  1. Train until ~4k–7k iterations (densification continues and opacities drift down).
  2. Crash stack (representative):
    • gaussian_renderer → diff_gaussian_rasterization_fastgs._C.rasterize_gaussians
    • Sometimes misattributed to visibility_filter: (radii > 0).nonzero().
  3. With CUDA_LAUNCH_BLOCKING=1 and pipe.debug=True (--debug_from N):
    • Error localizes to cuda_rasterizer/rasterizer_impl.cu around CHECK_CUDA(FORWARD::render(...)).
  4. The forward arguments saved at the crash (snapshot_fw.dump, written by the debug path) showed:
    • P ≈ 81461 Gaussians
    • opacity min ≈ 0.001086 (< 1/255 ≈ 0.0039215686)
    • Image ~1600×899
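
To make the numbers above concrete, here is a minimal host-side sketch of the threshold term for the reported minimum opacity. The helper name compact_box_t is hypothetical and is not part of the repository; only the expression itself comes from the line quoted under "Root cause". For opacity ≈ 0.001086 the product opacity * 255 is about 0.277, so the logarithm is negative and t comes out at roughly -2.57.

```cpp
#include <cmath>

// Hypothetical helper (not in the repository): evaluates the same expression
// as the quoted line, t = 2 * log(opacity * 255).
// t is positive only when opacity > 1/255; at or below that it is <= 0.
inline float compact_box_t(float opacity) {
    return 2.0f * std::log(opacity * 255.0f);
}
```

Calling compact_box_t(0.001086f) yields a negative value, whereas any opacity above 1/255 (for example 0.5f) yields a positive one.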

Root cause (proposed)

In cuda_rasterizer/auxiliary.h, duplicateToTilesTouched:

float t = 2.0f * log(con_o.w * 255.0f);
<img width="2396" height="925" alt="image" src="https://github.com/user-attachments/assets/ed45a306-801f-4f10-ac3d-85ad0cd4bdd2" />
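
One possible guard, sketched on the host side under stated assumptions: the helper name compact_box_threshold and the constant kMinAlpha are hypothetical, and the actual change in this pull request may differ. The idea is to treat any splat whose opacity does not exceed 1/255 as touching no tiles, mirroring the rasterizer's own alpha cutoff, so that t is only ever used when it is strictly positive.

```cpp
#include <cmath>

// Assumed cutoff, matching the 1/255 alpha threshold mentioned in the summary.
constexpr float kMinAlpha = 1.0f / 255.0f;

// Hypothetical helper (not in the repository).
// Returns false when the splat should be skipped; otherwise writes the
// strictly positive threshold term to `t` and returns true.
inline bool compact_box_threshold(float opacity, float& t) {
    // The negated comparison also rejects NaN opacities.
    if (!(opacity > kMinAlpha)) {
        return false;
    }
    const float candidate = 2.0f * std::log(opacity * 255.0f);
    // Rounding can still give exactly 0 just above the cutoff; skip that too.
    if (!(candidate > 0.0f)) {
        return false;
    }
    t = candidate;
    return true;
}
```

In a CUDA build the same function would presumably carry a __device__ qualifier, and the caller in duplicateToTilesTouched would return early without writing any tile keys when it reports false.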
