[Issue]: set(ENV{LD_PRELOAD}) in test/integration/CMakeLists.txt poisons cmake configure on incremental builds #3438

@davidd-amd

Problem Description

Summary

test/integration/CMakeLists.txt sets the process-wide LD_PRELOAD environment variable at configure time, which contaminates every subsequent execute_process() call and the post-generate ninja invocations. On incremental ASAN builds (where libintercept.so already exists from a prior build), this causes rpm, ninja, and other system tools to fail at load time with:

rpm: error while loading shared libraries: libclang_rt.asan-x86_64.so: cannot open shared object file: No such file or directory

/usr/local/bin/ninja: error while loading shared libraries: libclang_rt.asan-x86_64.so: cannot open shared object file: No such file or directory

CMake Generate step failed.  Build files cannot be regenerated correctly.

Environment

  • Build system: TheRock (ROCm super-build) with --preset linux-release-asan
  • Container image: ghcr.io/rocm/therock_build_manylinux_x86_64:latest (manylinux_2_28, RHEL 8 based)
  • aqlprofile sanitizer: OFF (aqlprofile_SANITIZER=OFF)
  • Dependent libraries: libhsa-runtime64.so built with ASAN (ROCR-Runtime_SANITIZER=ASAN)

Root Cause

In test/integration/CMakeLists.txt (around line 66):

# Add a PRELOAD environment with libintercept
set(ENV{LD_PRELOAD} "$ENV{LD_PRELOAD}:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so")

add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES ENVIRONMENT "${LD_PRELOAD}" TIMEOUT 45 LABELS "unittests" FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")

set(ENV{LD_PRELOAD} ...) modifies the environment of the running cmake process, not a CMake variable. Every child process launched afterwards inherits that environment, including:

  • All subsequent execute_process() calls during configure (e.g., rpm --eval %{?dist} for CPACK)
  • The post-generate ninja -t recompact and ninja -t restat operations
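The inheritance described above can be observed in isolation with a small script. This is a minimal sketch, not code from the repository: DEMO_PRELOAD and its path are hypothetical stand-ins for LD_PRELOAD, chosen so the demonstration cannot itself break any child process.

```cmake
# demo_env_leak.cmake: run with `cmake -P demo_env_leak.cmake`
# DEMO_PRELOAD is a hypothetical variable name used instead of LD_PRELOAD.
set(ENV{DEMO_PRELOAD} "/hypothetical/path/libintercept.so")

# Launch a child process that dumps its own environment.
# `cmake -E environment` prints the environment the child actually received.
execute_process(
  COMMAND ${CMAKE_COMMAND} -E environment
  OUTPUT_VARIABLE child_env
)

# Check whether the value set above reached the child process.
string(FIND "${child_env}" "DEMO_PRELOAD=/hypothetical/path/libintercept.so" pos)
if(pos GREATER -1)
  message(STATUS "child process inherited DEMO_PRELOAD")
else()
  message(STATUS "child process did not inherit DEMO_PRELOAD")
endif()
```

The same mechanism applies to any tool cmake spawns after the set(ENV{...}) call, which is why the effect is not limited to the test/integration directory.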

Why it works on clean builds (and in CI)

On a first configure, libintercept.so does not yet exist in CMAKE_CURRENT_BINARY_DIR. The dynamic linker encounters a nonexistent file in LD_PRELOAD, prints a warning, and continues. System tools like rpm and ninja run successfully.

CI always starts with a clean build directory, so this path is always taken.

Why it fails on incremental builds

On a reconfigure (e.g., after a reboot, stamp file deletion, or any re-run of the configure step), libintercept.so already exists from the prior build. The dynamic linker loads it via LD_PRELOAD, which triggers its transitive dependency chain:

LD_PRELOAD: libintercept.so
  → DT_NEEDED: libhsa-amd-aqlprofile64.so
  → DT_NEEDED: libhsa-runtime64.so.1  (built with ASAN)
    → DT_NEEDED: libclang_rt.asan-x86_64.so  →  NOT FOUND  →  FATAL

The ASAN runtime is not discoverable because:

  • aqlprofile has SANITIZER=OFF, so no ASAN-related LD_LIBRARY_PATH or RPATH is configured
  • The ASAN runtime is not in the system ldconfig cache
  • No LD_LIBRARY_PATH points to the clang resource directory

Every system tool invoked by cmake (rpm, ninja, gzip, date) inherits the poisoned LD_PRELOAD and crashes.

Reproduction

  1. Configure and build aqlprofile in an ASAN super-build (first time — succeeds)
  2. Delete the configure stamp: rm -f <build>/profiler/aqlprofile/stamp/configure.stamp
  3. Re-run configure: cmake --build <build> --target aqlprofile+configure
  4. Observe failure: rpm: error while loading shared libraries: libclang_rt.asan-x86_64.so

Additional Issue

The set_tests_properties call that follows uses ${LD_PRELOAD} (a CMake variable), not $ENV{LD_PRELOAD} (the environment variable that was just set). Unless a CMake variable named LD_PRELOAD is defined elsewhere, the ENVIRONMENT property expands to an empty string. Even if such a variable held a library path, ENVIRONMENT expects NAME=VALUE entries, so the value would still lack the LD_PRELOAD= prefix. The set(ENV{...}) call is therefore both harmful (it pollutes configure) and ineffective for its apparent purpose (setting the test environment).
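The difference between the two lookups can be checked with a short script. This is an illustrative sketch that assumes no CMake variable named DEMO_PRELOAD is defined; the name and path are hypothetical stand-ins for LD_PRELOAD.

```cmake
# demo_var_vs_env.cmake: run with `cmake -P demo_var_vs_env.cmake`
# DEMO_PRELOAD is a hypothetical name chosen to avoid touching LD_PRELOAD.
set(ENV{DEMO_PRELOAD} "/hypothetical/path/libintercept.so")

# ${DEMO_PRELOAD} reads a CMake variable, which was never set here.
message(STATUS "variable:    '${DEMO_PRELOAD}'")

# $ENV{DEMO_PRELOAD} reads the process environment modified above.
message(STATUS "environment: '$ENV{DEMO_PRELOAD}'")
```

Under that assumption the first message shows an empty value between the quotes, while the second shows the path.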

Suggested Fix

Remove the set(ENV{LD_PRELOAD}) and set the preload directly via test properties:

# BEFORE (buggy):
set(ENV{LD_PRELOAD} "$ENV{LD_PRELOAD}:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so")
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES ENVIRONMENT "${LD_PRELOAD}" TIMEOUT 45 LABELS "unittests" FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")

# AFTER (fixed):
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES
  ENVIRONMENT "LD_PRELOAD=${CMAKE_CURRENT_BINARY_DIR}/libintercept.so"
  TIMEOUT 45
  LABELS "unittests"
  FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")

This scopes LD_PRELOAD to the test execution only, preventing configure-time environment pollution.
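Note that the fix above replaces any LD_PRELOAD value inherited at test time, whereas the original line tried to append to it. If preserving an inherited value matters, one possible variant is the ENVIRONMENT_MODIFICATION test property. This is a sketch that assumes the project can require CMake 3.22 or newer; it has not been tested against this repository.

```cmake
# Sketch only: ENVIRONMENT_MODIFICATION requires CMake 3.22 or newer.
add_test(NAME testv2 COMMAND testv2)
set_tests_properties(testv2 PROPERTIES
  # Append to whatever LD_PRELOAD ctest inherits when the test runs, using
  # the host path-list separator (":" on Linux). Nothing is changed at
  # configure time.
  ENVIRONMENT_MODIFICATION
    "LD_PRELOAD=path_list_append:${CMAKE_CURRENT_BINARY_DIR}/libintercept.so"
  TIMEOUT 45
  LABELS "unittests"
  FAIL_REGULAR_EXPRESSION "${AQLPROFILE_DEFAULT_FAIL_REGEX}")
```

Either form keeps the preload out of the cmake process environment, so execute_process() calls and post-generate ninja steps are unaffected.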

Workaround

Delete the stale libintercept.so before reconfiguring:

rm -f <build>/profiler/aqlprofile/build/test/integration/libintercept.so

Operating System

manylinux

CPU

NA

GPU

NA

ROCm Version

7.11

