Problem Description
Summary
test/integration/CMakeLists.txt sets the process-wide LD_PRELOAD environment variable at configure time, which contaminates all subsequent execute_process() calls and post-generate ninja invocations. On incremental ASAN builds (where libintercept.so already exists from a prior build), this causes rpm, ninja, and other system tools to fail at startup with:
rpm: error while loading shared libraries: libclang_rt.asan-x86_64.so: cannot open shared object file: No such file or directory
/usr/local/bin/ninja: error while loading shared libraries: libclang_rt.asan-x86_64.so: cannot open shared object file: No such file or directory
CMake Generate step failed. Build files cannot be regenerated correctly.
Environment
- Build system: TheRock (ROCm super-build) with --preset linux-release-asan
- Container image: ghcr.io/rocm/therock_build_manylinux_x86_64:latest (manylinux_2_28, RHEL 8 based)
- aqlprofile sanitizer: OFF (aqlprofile_SANITIZER=OFF)
- Dependent libraries: libhsa-runtime64.so built with ASAN (ROCR-Runtime_SANITIZER=ASAN)
Root Cause
In test/integration/CMakeLists.txt (around line 66):
# Add a PRELOAD environment with libintercept
set(ENV{LD_PRELOAD} "$ENV{LD_PRELOAD}:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so")
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES ENVIRONMENT "${LD_PRELOAD}" TIMEOUT 45 LABELS "unittests" FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")
set(ENV{LD_PRELOAD} ...) modifies the cmake process environment, not a cmake variable. This environment persists for:
- All subsequent execute_process() calls during configure (e.g., rpm --eval %{?dist} for CPACK)
- The post-generate ninja -t recompact and ninja -t restat operations
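The leak can be reproduced outside the project with a few lines of script-mode CMake. This is only a minimal sketch: DEMO_PRELOAD is a made-up variable standing in for LD_PRELOAD (so the demo cannot break any tool), the file name is hypothetical, and a POSIX printenv is assumed to be on PATH.

```cmake
# leak_demo.cmake (hypothetical file name); run with: cmake -P leak_demo.cmake
# DEMO_PRELOAD is a harmless stand-in for LD_PRELOAD.
set(ENV{DEMO_PRELOAD} "/tmp/libintercept.so")

# The child process inherits the modified environment of the cmake process.
execute_process(
  COMMAND printenv DEMO_PRELOAD
  OUTPUT_VARIABLE child_value
  OUTPUT_STRIP_TRAILING_WHITESPACE)

message(STATUS "child sees DEMO_PRELOAD=${child_value}")
```

The child should report the path set one line earlier, which is the same mechanism by which rpm and ninja end up with libintercept.so in their LD_PRELOAD.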
Why it works on clean builds (and in CI)
On a first configure, libintercept.so does not yet exist in CMAKE_CURRENT_BINARY_DIR. The dynamic linker encounters a nonexistent file in LD_PRELOAD, prints a warning, and continues. System tools like rpm and ninja run successfully.
CI always starts with a clean build directory, so this path is always taken.
Why it fails on incremental builds
On a reconfigure (e.g., after a reboot, stamp file deletion, or any re-run of the configure step), libintercept.so already exists from the prior build. The dynamic linker loads it via LD_PRELOAD, which triggers its transitive dependency chain:
LD_PRELOAD: libintercept.so
→ DT_NEEDED: libhsa-amd-aqlprofile64.so
→ DT_NEEDED: libhsa-runtime64.so.1 (built with ASAN)
→ DT_NEEDED: libclang_rt.asan-x86_64.so → NOT FOUND → FATAL
The ASAN runtime is not discoverable because:
- aqlprofile has SANITIZER=OFF, so no ASAN-related LD_LIBRARY_PATH or RPATH is configured
- The ASAN runtime is not in the system ldconfig cache
- No LD_LIBRARY_PATH points to the clang resource directory
Every system tool invoked by cmake (rpm, ninja, gzip, date) inherits the poisoned LD_PRELOAD and crashes.
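To see which link in this chain fails to resolve on a given machine, one option is CMake's file(GET_RUNTIME_DEPENDENCIES) command (CMake 3.16 or newer) in script mode. This is a hedged sketch rather than a definitive diagnostic: the script name and the LIB variable are hypothetical, and the command's search rules are not identical to the dynamic linker's, so treat the output as a hint.

```cmake
# list_deps.cmake (hypothetical file name); run with:
#   cmake -DLIB=<path to libintercept.so> -P list_deps.cmake
if(NOT DEFINED LIB)
  message(FATAL_ERROR "pass -DLIB=<path to shared library>")
endif()

# Recursively walk the DT_NEEDED entries of the given library.
file(GET_RUNTIME_DEPENDENCIES
  LIBRARIES "${LIB}"
  RESOLVED_DEPENDENCIES_VAR resolved
  UNRESOLVED_DEPENDENCIES_VAR unresolved
  CONFLICTING_DEPENDENCIES_PREFIX conflicts)

foreach(dep IN LISTS resolved)
  message(STATUS "resolved:   ${dep}")
endforeach()
foreach(dep IN LISTS unresolved)
  message(STATUS "UNRESOLVED: ${dep}")
endforeach()
```

If the analysis above is right, libclang_rt.asan-x86_64.so would be expected among the unresolved entries in the affected container.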
Reproduction
- Configure and build aqlprofile in an ASAN super-build (the first configure succeeds)
- Delete the configure stamp: rm -f <build>/profiler/aqlprofile/stamp/configure.stamp
- Re-run configure: cmake --build <build> --target aqlprofile+configure
- Observe failure: rpm: error while loading shared libraries: libclang_rt.asan-x86_64.so
Additional Issue
The set_tests_properties call on the next line uses ${LD_PRELOAD} (a cmake variable), not $ENV{LD_PRELOAD} (the environment variable that was just set). Unless a cmake variable named LD_PRELOAD is defined elsewhere, the ENVIRONMENT property expands to an empty string. Even if $ENV{LD_PRELOAD} had been used, the value would lack the NAME=VALUE form (LD_PRELOAD=...) that the ENVIRONMENT property expects. The set(ENV{...}) call is therefore both harmful (it pollutes configure) and ineffective for its intended purpose (setting the test environment).
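The difference between the two references can be checked in isolation with script-mode CMake. This is a small sketch under stated assumptions: DEMO_PRELOAD is a made-up stand-in for LD_PRELOAD, and the file name is hypothetical.

```cmake
# var_vs_env.cmake (hypothetical file name); run with: cmake -P var_vs_env.cmake
# DEMO_PRELOAD stands in for LD_PRELOAD to keep the demo harmless.
set(ENV{DEMO_PRELOAD} "/tmp/libintercept.so")

# set(ENV{...}) does not create a CMake variable of the same name.
if(DEFINED DEMO_PRELOAD)
  message(STATUS "CMake variable DEMO_PRELOAD is defined")
else()
  message(STATUS "CMake variable DEMO_PRELOAD is NOT defined")
endif()

# An undefined variable reference expands to an empty string.
message(STATUS "variable reference:    '${DEMO_PRELOAD}'")
message(STATUS "environment reference: '$ENV{DEMO_PRELOAD}'")
```

Only the $ENV{...} form should show the path; the plain variable reference should come out empty, which is what the test property receives in the current CMakeLists.txt.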
Suggested Fix
Remove the set(ENV{LD_PRELOAD}) and set the preload directly via test properties:
# BEFORE (buggy):
set(ENV{LD_PRELOAD} "$ENV{LD_PRELOAD}:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so")
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES ENVIRONMENT "${LD_PRELOAD}" TIMEOUT 45 LABELS "unittests" FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")
# AFTER (fixed):
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES
  ENVIRONMENT "LD_PRELOAD=${CMAKE_CURRENT_BINARY_DIR}/libintercept.so"
  TIMEOUT 45
  LABELS "unittests"
  FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")
This scopes LD_PRELOAD to the test execution only, preventing configure-time environment pollution.
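The original line appended to any LD_PRELOAD already present in the environment, whereas the fix above sets the variable outright. If preserving a pre-existing value at test time matters, one possible variant uses the ENVIRONMENT_MODIFICATION test property. That property requires CMake 3.22 or newer; whether the project's minimum CMake version allows it is an assumption here.

```cmake
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES
  # Append to whatever LD_PRELOAD ctest inherits when the test runs;
  # path_list_append joins entries with the host path separator (":" on Linux).
  ENVIRONMENT_MODIFICATION
    "LD_PRELOAD=path_list_append:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so"
  TIMEOUT 45
  LABELS "unittests"
  FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")
```

Like the ENVIRONMENT form, this is evaluated by ctest when the test is launched, so nothing changes in the environment of the cmake process during configure.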
Workaround
Delete the stale libintercept.so before reconfiguring:
rm -f <build>/profiler/aqlprofile/build/test/integration/libintercept.so
Operating System
manylinux
CPU
NA
GPU
NA
ROCm Version
7.11