building torch 2.10 with ROCm 7.11 #2996

@bluefalcon13

🐛 Describe the bug

When building PyTorch from the release/2.10 branch in a Docker container with ROCm 7.11 installed from the apt repositories, libkineto does not recognize the new /opt/rocm folder structure. Here is the output from the build:

-- Configuring Kineto dependency:
--   KINETO_SOURCE_DIR = /opt/build/pytorch/third_party/kineto/libkineto
--   KINETO_BUILD_TESTS = OFF
--   KINETO_LIBRARY_TYPE = static
--  ROCM_SOURCE_DIR = /opt/rocm
-- Could not find nvcc, please set CUDAToolkit_ROOT.

Additional info:
The Docker build applies "ye ol' shotgun of vars" to try to make sure all of these edge cases get caught:

# Environment for Strix Halo (40 CU)
ENV ROCM_PATH=/opt/rocm/core \
    VIRTUAL_ENV=/opt/venv
ENV PYTORCH_ROCM_ARCH=gfx1151 \
    HIP_PATH=$ROCM_PATH \
    # Many AMD build scripts specifically look for ROCM_HOME
    ROCM_HOME=$ROCM_PATH \
    CMAKE_PREFIX_PATH=$ROCM_PATH \
    C_INCLUDE_PATH=$ROCM_PATH/include \
    CPLUS_INCLUDE_PATH=$ROCM_PATH/include \
    HIP_PLATFORM=amd \
    PATH="$VIRTUAL_ENV/bin:$ROCM_PATH/bin:$ROCM_PATH/llvm/bin:$PATH"

I tried a slew of arguments, both CMAKE_ARGS and environment variables, to make sure I was configuring everything correctly, but my FAVORITE was this:

Building wheel torch-2.10.0+gita3ed239
Found cmake (/opt/venv/bin/cmake) version: 4.0.0 (>=3.27)
-- Building version 2.10.0+gita3ed239
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_ARGS=-DROCM_SOURCE_DIR=/opt/rocm/core -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DCMAKE_DEBUG_FIND_PACKAGE_roctracer=ON --debug-find -DCMAKE_DEBUG_FIND_PACKAGE_roctracer=ON --debug-find -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXECUTABLE=/opt/venv/bin/cmake -DCMAKE_INSTALL_PREFIX=/opt/build/pytorch/torch -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DCMAKE_PREFIX_PATH=/opt/venv/lib/python3.12/site-packages;/opt/rocm/core -DPython_EXECUTABLE=/opt/venv/bin/python3 -DPython_NumPy_INCLUDE_DIR=/opt/venv/lib/python3.12/site-packages/numpy/_core/include -DTORCH_BUILD_VERSION=2.10.0+gita3ed239 -DUSE_NINJA=1 -DUSE_NUMPY=True -DUSE_ROCM=1 /opt/build/pytorch
--------------------------------
-- Configuring Kineto dependency:
--   KINETO_SOURCE_DIR = /opt/build/pytorch/third_party/kineto/libkineto
--   KINETO_BUILD_TESTS = OFF
--   KINETO_LIBRARY_TYPE = static
--  ROCM_SOURCE_DIR = /opt/rocm
-- Could not find nvcc, please set CUDAToolkit_ROOT.
--------------------------------
-- Performing Test HAS_WNO_ERROR_ARRAY_BOUNDS - Success
INFOcaffe2 ROCM_SOURCE_DIR = /opt/rocm

WORKAROUND:
I created symlinks in /opt/rocm so that /opt/rocm/lib points to /opt/rocm/core/lib and /opt/rocm/include points to /opt/rocm/core/include.
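
For anyone wanting to reproduce the workaround, it can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact commands used: `link_rocm_dirs` is a hypothetical helper name, and the skip checks (missing target, pre-existing real directory) are added assumptions to keep it from clobbering anything.

```shell
#!/bin/sh
# Sketch of the symlink workaround: expose <root>/core/{lib,include}
# at <root>/{lib,include}.
# link_rocm_dirs is a hypothetical helper, not part of ROCm or PyTorch.
link_rocm_dirs() {
    root="$1"
    for sub in lib include; do
        target="$root/core/$sub"
        link="$root/$sub"
        # Skip when the real directory is missing, so no dangling link is created.
        if [ ! -d "$target" ]; then
            echo "skip: $target not found" >&2
            continue
        fi
        # Leave an existing real directory or file untouched.
        if [ -e "$link" ] && [ ! -L "$link" ]; then
            echo "skip: $link already exists and is not a symlink" >&2
            continue
        fi
        # -n replaces an existing symlink instead of descending into it.
        ln -sfn "$target" "$link"
    done
}
```

In a Dockerfile this would presumably run as root before the PyTorch build, e.g. by calling `link_rocm_dirs /opt/rocm` in a `RUN` step.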

Versions

Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.4 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version: 22.0.0git (https://github.com/ROCm/llvm-project.git 4adeabb0862ea8119c143d9f8256475b0b687217+PATCHED:f3b5643f91ad4def7b92cd48247bc11f1f39fb5c)
CMake version: version 4.0.0
Libc version: glibc-2.39

Python version: 3.12.3 (main, Jan 22 2026, 20:57:42) [GCC 13.3.0] (64-bit runtime)
Python platform: Linux-6.18.9-zen1-2-zen-x86_64-with-glibc2.39
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Is XPU available: N/A
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A
Caching allocator config: N/A

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
CPU family: 26
Model: 112
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU(s) scaling MHz: 46%
CPU max MHz: 5187.5000
CPU min MHz: 625.0000
BogoMIPS: 5988.80
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpuid_fault cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx_vnni avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect movdiri movdir64b overflow_recov succor smca fsrm avx512_vp2intersect flush_l1d amd_lbr_pmc_freeze
Virtualization: AMD-V
L1d cache: 768 KiB (16 instances)
L1i cache: 512 KiB (16 instances)
L2 cache: 16 MiB (16 instances)
L3 cache: 64 MiB (2 instances)
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerability Gather data sampling: Not affected
Vulnerability Ghostwrite: Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Old microcode: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Mitigation; IBPB on VMEXIT only
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsa: Not affected
Vulnerability Tsx async abort: Not affected
Vulnerability Vmscape: Mitigation; IBPB on VMEXIT

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] optree==0.13.0
[conda] Could not collect
