[DO NOT MERGE] Test by lamb-j · Pull Request #1154 · ROCm/llvm-project

lamb-j · 2026-01-20T23:00:02Z

No description provided.

.github/workflows/ci_weekly.yml

+    runs-on: ubuntu-latest
+    steps:
+      - run: echo "Skipped"


To fix the problem, the workflow should explicitly specify restricted GITHUB_TOKEN permissions instead of relying on repository defaults. Since this job only runs a shell command and does not need to interact with the GitHub API, the safest and least-privileged configuration is to set permissions: { contents: read } (or even permissions: {} if your policies allow that). Adding the permissions block at the workflow root will apply to all jobs that do not override it.

The single best way to fix this without changing existing behavior is to add a permissions block near the top of .github/workflows/ci_weekly.yml, just under the name: (or under on: if you prefer), setting contents: read. No imports or additional definitions are needed because this is a YAML configuration change only. The donothing job can remain unchanged and will inherit these minimal permissions.

Concretely:

Edit .github/workflows/ci_weekly.yml.

Insert a top-level permissions: section after line 1 (or between lines 2 and 3) with contents: read.

Leave the jobs section and donothing job untouched so functionality stays the same.

) When cc1 runs out-of-process and crashes, sys::ExecuteAndWait returns -2 for signal-killed children. The resignaling block added in 15488a7 only handled CommandRes > 128, so the driver would exit normally with code 1 instead of dying by signal.

Currently, if there are operations between the loops, we get a dominance issue because the delinearized index is added after those operations. This PR fixes that. For testing, we also add a transform pattern that calls coalesceLoops directly, because the existing pattern calls coalescePerfectlyNestedSCFForLoops, which does not consider the loop nest perfectly nested if there are operations between the loops (the safer choice for that usage). Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…lvm#183572) `BitCastOp::fold` called `Type::getIntOrFloatBitWidth()` on the source element type without first verifying it satisfies `isIntOrFloat()`. When the source vector has `index` element type (e.g. `vector<16xindex>`), the assertion `only integers and floats have a bitwidth` fires. Add an `srcElemType.isIntOrFloat()` guard to the condition so that the constant-folding path is skipped for non-integer/float element types. Fixes llvm#177835

…oison (llvm#183596) When `constFoldBinaryOp<IntegerAttr>` is called with a `ub.poison` operand, it propagates the poison attribute as its result. The fold method for `arith.addui_extended` then attempted to cast this result to `TypedAttr` via `llvm::cast<TypedAttr>(sumAttr)`, which failed with an assertion because `PoisonAttr` does not implement the `TypedAttr` interface. Fix this by checking whether the folded sum is a poison attribute before the cast. When poison is detected, it is propagated to both the sum and overflow results. Fixes llvm#181534

…s of partial specializations (llvm#183348) This fixes a helper so that it implements retrieval of the argument replaced for a template parameter for partial specializations. This was left out of the original patch because it is quite hard to test. The helper also implements the retrieval for variable templates, but only for completeness' sake: no current users rely on it, and I don't think a similar test case can be written with variable templates. This fixes a regression introduced in llvm#161029, which will be backported to llvm-22, so there are no release notes. Fixes llvm#181062 Fixes llvm#181410

…#183487) BF16 source operands use F32 inline constant values, so set OP_SEL to select the high half of the constant, since BF16 encoding matches the high 16 bits of F32 encoding. This behaviour is different from F16 source operands which use F16 constant values in the low 16 bits. Fixes: llvm#183337
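The encoding relationship described above can be checked numerically. The following Python sketch is only an illustration and is not code from the patch; the helper names (`f32_bits`, `bf16_bits_from_f32`, `f16_bits`) are hypothetical. It shows that the BF16 encoding of a value sits in the high 16 bits of its F32 encoding, while the F16 encoding of the same value is a different 16-bit pattern.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 encoding of x as a 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bf16_bits_from_f32(x: float) -> int:
    """Return the high 16 bits of the F32 encoding (plain truncation, no rounding)."""
    return f32_bits(x) >> 16


def f16_bits(x: float) -> int:
    """Return the IEEE-754 binary16 encoding of x as a 16-bit integer."""
    return struct.unpack(">H", struct.pack(">e", x))[0]


if __name__ == "__main__":
    for value in (1.0, 0.5, -2.0):
        print(
            f"{value:5}: f32={f32_bits(value):#010x} "
            f"bf16(high half)={bf16_bits_from_f32(value):#06x} "
            f"f16={f16_bits(value):#06x}"
        )
```

For 1.0 the F32 encoding is 0x3F800000, so the high half 0x3F80 is the BF16 encoding, whereas the F16 encoding is 0x3C00.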

This is another instance of the logic from llvm#183159. If we know one source is not-infinity, and the other source is less than or equal to 1, this cannot overflow. Special case llvm.amdgcn.trig.preop, as a substitute for proper range tracking. This almost enables pruning edge case handling in trig function implementations, if not for the recursion depth limit (but that's a problem for another day).

The Metal Shader Converter can output shader reflection information into a JSON file. This connects the -Fre flag (DXC's flag for reflection) to the Metal Shader Converter tool step to produce the JSON file. For now, the -Fre flag produces an error when used without the -metal flag. This is required to address llvm/offload-test-suite#452

Summary: This is needed on some platforms like Windows when the generated command line becomes too large. This seems to be occurring in practice so we need to support this. Uses the same basic support clang does. No test because there isn't any current infrastructure to support it, will likely be "tested" by ROCBLAS builds not failing anymore on Windows.

Summary: This patch matches CUDA, moving the HIP compilation jobs to the new driver by default. The old behavior will return with `--no-offload-new-driver`. The main difference is that objects compiled with the old driver are no longer compatible and will need to be recompiled or the old driver used.

llvm#180563) Fixes llvm#154713. The crash was due to `Index` sometimes being an unsigned 64-bit integer that was zero-extended to a signed 64-bit integer, triggering an assertion failure in `APSInt::getExtValue`. This patch zero-extends it to an unsigned 64-bit integer instead, since `HandleLValueVectorElement` takes a `uint64_t` anyway.

…s in `JSONFormat` This change fixes the diagnostic infrastructure in the `JSONFormat` implementation to pass model objects (`EntityId`, `EntityLinkage`, `BuildNamespace`, `NestedBuildNamespace`, `SummaryName`) directly to `ErrorBuilder` instead of manually extracting their components. This relies on existing `llvm::format_provider` specializations for these objects. To support consistent string conversion for `BuildNamespaceKind` and `EntityLinkageType` across both serialization and `operator<<`, `toString`/`fromString` functions have been introduced in an internal header, `ModelStringConversions.h`. `EntityLinkage::LinkageType` is promoted to a standalone enum class `EntityLinkageType` at namespace scope, following the same pattern as `BuildNamespaceKind`. Tests have been added for `operator<<` and `format_provider` for all affected types, and a new `ModelStringConversionsTest.cpp` directly unit-tests the `toString`/`fromString` functions, including round-trip and unknown-input cases.

…-for-size` passed together (llvm#183642) Currently, the assert fires when both `UseExtTspForPerf` and `UseExtTspForSize` are true for a given function. Ideally, `-enable-ext-tsp-block-placement` and `-apply-ext-tsp-for-size` should be allowed together, meaning that block placement runs for performance on hot functions and for size on cold functions. The diff makes `UseExtTspForPerf` and `UseExtTspForSize` mutually exclusive per function: functions with the `OptForSize` attribute use ext-tsp block placement for size, while the others use ext-tsp block placement for perf. Co-authored-by: Sharon Xu <sharonxu@fb.com>
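The per-function selection described above can be modelled as a small decision function. This is a hedged Python sketch, not the code from the patch: `select_ext_tsp` and its parameter names are hypothetical, and it assumes that the size mode takes priority whenever the function is marked `OptForSize` and the size flag is passed.

```python
from itertools import product
from typing import Tuple


def select_ext_tsp(
    opt_for_size: bool, enable_perf: bool, apply_for_size: bool
) -> Tuple[bool, bool]:
    """Return (use_for_perf, use_for_size) for one function.

    Assumption: size placement wins for OptForSize functions when the
    size flag is set; every other function may use perf placement.
    """
    use_for_size = apply_for_size and opt_for_size
    use_for_perf = enable_perf and not use_for_size
    return use_for_perf, use_for_size


if __name__ == "__main__":
    # Enumerate every flag combination and confirm the two modes never overlap.
    for combo in product((False, True), repeat=3):
        perf, size = select_ext_tsp(*combo)
        assert not (perf and size)
        print(combo, "->", (perf, size))
```

With both flags passed, an `OptForSize` function gets `(False, True)` and any other function gets `(True, False)`, so the two modes cannot both be active for the same function.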

…tpcrel (llvm#155776) Similar to llvm#132569 for RISC-V, replace the unofficial `@plt` and `@gotpcrel` relocation specifiers, currently only used by clang -fexperimental-relative-c++-abi-vtables, with %pltpcrel and %gotpcrel. The syntax is not used in human-written assembly code and is not supported by the GNU assembler. Also replace the recent `@funcinit` with `%funcinit(x)`.

…lvm#183778) After successfully commuting an instruction to be compatible with the current VGPR MSB mode, update CurrentMode with the commuted instruction's mode requirements. This locks in the mode bits the commuted instruction relies on, preventing later instructions from piggybacking and corrupting those bits. Without this fix, a subsequent instruction needing a different mode could piggyback onto the preceding s_set_vgpr_msb and change mode bits that the commuted instruction depends on. For example, a nullopt src1 position (treated as 0) could be overwritten to a different value, causing incorrect register encoding for the commuted instruction. The fix still allows compatible piggybacking - instructions that only add new mode bits without changing existing ones can still piggyback.

…rsion (llvm#183472) Wire the llvm-mc --reloc-section-sym={all,internal,none} option through the clang driver (-Wa,--reloc-section-sym=) and cc1as (--reloc-section-sym=). The option is only valid for ELF targets. GNU Assembler will add the option as well.

…3885) This test exercises macOS-specific linker functionality (-delay_library) and uses a hardcoded local working directory for the launch info. It should not run against a remote platform where neither condition holds. Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>

) Bumps `.debug_rnglists` maximum supported version to DWARFv6. This does not mean we officially support DWARFv6. It just lets us test the features gradually. Added a unit test since there was no prior test in the entire LLVM test-suite that checked this.

Depends on: * llvm#183838 * llvm#183841 * llvm#183859 Bumps the supported version to 6. Unit header layout hasn't changed between versions AFAIK, so re-used the DWARF5 `FileCheck` in the test. This by no means claims full DWARFv6 support, but is handy for testing DWARFv6 features while full support is being gradually implemented.

llvm#183785) `foldReshapeOp` (in `ReshapeOpsUtils.h`) and `FoldReshapeWithConstant` (in `TensorOps.cpp`) both tried to create a new `DenseElementsAttr` constant when folding a reshape op whose operand is a constant. Neither checked that the result type was statically shaped before doing so, but `DenseElementsAttr::reshape()` and `DenseElementsAttr::getFromRawBuffer()` both assert `hasStaticShape()`. Guard both fold paths with a `hasStaticShape()` check so they return early when the result type contains a dynamic dimension. Fixes llvm#177845

… used across blocks (llvm#183828) The noSkipBlockErasure callback in TestVisitors.cpp dropped uses of op results within the same region before erasing a block, but did not drop uses of the block's own arguments (e.g. function entry block arguments). When the block was subsequently erased its block arguments were destroyed while their use-lists were still non-empty, triggering the assertion in IRObjectWithUseList::~IRObjectWithUseList(). Fix this by also iterating over the block's arguments and dropping any uses that belong to the same parent region. This mirrors the existing logic for op result uses and makes the block-erasure walk handle IRs where function arguments are consumed by ops in sibling blocks. Also replace `block->front().getParentRegion()` with `block->getParent()` for robustness (avoids UB when the block has no ops). Add a regression test based on the reproducer from llvm#182996. Fixes llvm#182996

…183757) The `noSkipBlockErasure` callback in `testNoSkipErasureCallbacks` called `block->front().getParentRegion()` to get the parent region of a block. This dereferences the ilist sentinel node when the block has no operations, triggering an assertion failure. Use `block->getParent()` instead, which directly returns the region containing the block without requiring any operations to be present. Fixes llvm#183511

…ub.poison (llvm#183816) `AffineLinearizeIndexOp::fold` guarded the constant-folding path with `llvm::is_contained(adaptor.getMultiIndex(), nullptr)`, which only catches operands that have not been evaluated at all. When an operand folds to `ub.PoisonAttr`, the attribute is non-null so the guard passed, and the subsequent `cast<IntegerAttr>(indexAttr)` call crashed with an assertion failure. Fix by replacing the null-only check with one that requires every multi-index attribute to be a concrete `IntegerAttr`, returning `nullptr` for any other attribute (including null and PoisonAttr). Fixes llvm#178204
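The difference between the null-only guard and the stricter guard can be shown with a toy model. The Python sketch below is an illustration under stated assumptions, not MLIR code: the `IntegerAttr` and `PoisonAttr` classes are hypothetical stand-ins, `None` stands in for an operand that has not been folded, and the row-major linearization is a simplified model of what the fold computes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IntegerAttr:
    """Toy stand-in for a concrete integer constant."""
    value: int


@dataclass(frozen=True)
class PoisonAttr:
    """Toy stand-in for a poison constant (non-null, but not an integer)."""


def fold_linearize(multi_index: Sequence[object], basis: Sequence[int]) -> Optional[int]:
    """Fold only when every index operand is a concrete IntegerAttr.

    A null-only check would let PoisonAttr through; requiring IntegerAttr
    rejects both None and PoisonAttr, so the fold bails out instead of crashing.
    """
    if not all(isinstance(attr, IntegerAttr) for attr in multi_index):
        return None
    result, stride = 0, 1
    # Row-major linearization: the last index has stride 1.
    for attr, dim in zip(reversed(multi_index), reversed(basis)):
        result += attr.value * stride
        stride *= dim
    return result


if __name__ == "__main__":
    print(fold_linearize([IntegerAttr(1), IntegerAttr(2)], [3, 4]))  # concrete operands
    print(fold_linearize([IntegerAttr(1), PoisonAttr()], [3, 4]))    # poison operand
    print(fold_linearize([IntegerAttr(1), None], [3, 4]))            # unevaluated operand
```

Only the first call folds to a value; the poison and unevaluated cases both return `None`, which corresponds to declining to fold.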

) In CGOpenMPRuntimeGPU::translateParameter, reference-type captured variables were translated to pointer parameters with two address-space annotations: 1. LangAS::opencl_global on the pointee (for map'd variables), which correctly produces ptr addrspace(1) in NVPTX IR. 2. getLangASFromTargetAS(NVPTX_local_addr=5) on the pointer itself, annotating the parameter as living in NVPTX local (stack) memory. The second annotation is incorrect at the Clang type-system level: EmitParmDecl only supports parameters to be in LangAS::Default (or the special cases for OpenCL). Temporarily add an assert in EmitParmDecl that catches parameters with non-default address spaces in non-OpenCL compilations, and fix the violation by dropping the NVPTX_local_addr addAddressSpace call. Should fix the issue noticed in llvm#181256 (comment), allowing removing that special case there for OpenMP, though I haven't tested the combination yet. That PR would fix EmitParmDecl to actually support non-default address spaces from Sema, and will remove this assert again. Co-authored-by: Jameson Nash <vtjnash@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…#179688) The usage of pointers to member functions with Pointer Authentication requires generation of `*_vfpthunk_` functions. These thunk functions can be later inlined and optimized by replacing the indirect call instruction with a direct one and then inlining that function call. In absence of `!dbg` metadata attached to the original call instruction, such inlining ultimately results in an assertion "!dbg attachment points at wrong subprogram for function" in the assertions-enabled builds. By manually executing `opt` with `-verify-each` option on the LLVM IR produced by the frontend, an actual issue can be observed: "inlinable function call in a function with debug info must have a !dbg location" after the replacement of indirect call instruction with the direct one takes place. This commit fixes the issue by attaching artificial `!dbg` locations to the original call instruction (as well as most other instructions in `*_vfpthunk_` function) the same way it is done for other compiler-generated helper functions.

#1601) …rrno. (llvm#183099) It came up in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123826 that GCC was simplifying pow(x, 2.0) -> x * x, even when doing so caused -fmath-errno to be ignored. This patch fixes a similar bug in LLVM. For ConstantFolding folding powf expressions that may raise exceptions, see llvm#183102. Co-authored-by: Ricardo Jesus <rjj@nvidia.com>
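Why the `pow(x, 2.0) -> x * x` rewrite matters for error reporting can be seen with a small analogy. This Python sketch is not LLVM or GCC code: it relies on CPython's `math.pow` raising `OverflowError` on a range error, which stands in here for the C library setting `errno`, and the helper names are hypothetical.

```python
import math


def pow_reports_range_error(x: float) -> bool:
    """Return True if math.pow(x, 2.0) signals a range error."""
    try:
        math.pow(x, 2.0)
    except OverflowError:
        return True
    return False


def squared(x: float) -> float:
    """The simplified form: plain multiplication, which reports nothing."""
    return x * x


if __name__ == "__main__":
    big = 1e200
    print("pow reports a range error:", pow_reports_range_error(big))
    print("x * x silently yields:", squared(big))
```

For an input such as 1e200 the library call signals the overflow, while the multiplication quietly produces infinity, so replacing one with the other drops the error report.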

github-advanced-security bot found potential problems Jan 21, 2026

skganesan008 temporarily deployed to rock-ci January 21, 2026 02:37 — with GitHub Actions Inactive

skganesan008 temporarily deployed to rock-ci January 22, 2026 02:40 — with GitHub Actions Inactive

skganesan008 temporarily deployed to rock-ci January 23, 2026 02:37 — with GitHub Actions Inactive

andykaylor and others added 23 commits February 26, 2026 19:02

[CIR][docs] Fix table of contents for CIR eh and cleanups doc (llvm#1…

bf8e006

…83594) When this document was converted from rst to markdown, the contents didn't get updated correctly.

merge main into amd-staging

d8eab11

Add remaining patterns for floating-point flag matches (llvm#173912)

32f9a40

As in title. Only `reassoc` pattern was supplied -- for completeness all should be supplied. Make FastMathFlag ctor public as well.

[CIR][NFC] Fix unused variable warnings (llvm#183604)

19c862d

We have accumulated four places where variables were only being used in asserts. This change silences the warnings for that.

[Clang] Enable response file support for 'llvm-offload-binary' (llvm#…

566e25d

…183548) Summary: These command line invocations can become so large that they no longer fit, we should support response files in this case so the build on Windows can be unblocked with the new driver.

[PowerPC] Remove NoNaNsFPMath uses (llvm#183449)

dc66b51

[Hexagon] Fix assert on sign-bit CONST32 immediates (llvm#182118)

4c60c01

This patch fixes a HexagonConstPropagation assert when evaluating sign-bit CONST32/CONST64 immediates (e.g. 0x80000000) after ConstantInt stopped implicitly truncating, by allowing truncation for that signed case.
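The sign-bit case can be illustrated with plain integer arithmetic. The Python sketch below is a toy model and not the `ConstantInt` API: the helpers `fits_signed` and `sign_extend` are hypothetical names. It shows why 0x80000000 is rejected by a strict signed 32-bit range check, and what value results once truncation to 32 bits is allowed.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed two's-complement integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def sign_extend(value: int, bits: int) -> int:
    """Truncate value to `bits` bits and reinterpret it as a signed integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


if __name__ == "__main__":
    imm = 0x80000000
    print("fits signed 32-bit without truncation:", fits_signed(imm, 32))
    print("value after truncating to 32 bits:", sign_extend(imm, 32))
```

The immediate 0x80000000 does not fit as a signed 32-bit value, but truncating it yields -2147483648, which does.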

[MLIR][XeGPU] Add distribution pattern for xegpu.load & store for sg …

52f76c0

…to wi pass (llvm#181917) This PR adds distribution pattern for xegpu.load & store ops for the new sg-to-wi pass

[cmake] Don't use PCH when clang-cache launcher is used (llvm#183620)

4efc6a9

Fix incremental build failure when using PCH build and clang-cache together.

Men-cotton and others added 30 commits February 28, 2026 13:12

[CIR] Use -verify on clang/test/CIR/CodeGenHLSL/matrix-element-expr…

d72e95b

…-load.hlsl (llvm#182817) Update clang/test/CIR/CodeGenHLSL/matrix-element-expr-load.hlsl to use `-verify` with expected CIR NYI diagnostics.

[amd/device-libs] __builtin_elementwise_max ...

04484e4

__builtin_elementwise_{max,min} becomes __builtin_elementwise_{maximum,minimum}

[clang-tidy] Teach misc-unused-using-decls that exported using-decl…

ce6a3d9

…s aren't unused (llvm#183638) Fixes llvm#162619.

InstCombine: Stop applying nofpclass from use nofpclass attribute (ll…

1ff1e5f

…vm#183835) Functionally reverts a80d432, with new test. This should be applied somewhere, but this is the wrong place. Fixes regression reported after llvm#182444

merge main into amd-staging (#1599)

a3f9f6a

RISCVMCAsmInfo: Remove redundant UseAtForSpecifier = false. NFC (ll…

55f9cf3

…vm#183890) UseAtForSpecifier defaults to false in MCAsmInfo, and RISCVMCAsmInfo never calls initializeAtSpecifiers (which sets it to true).

[LV] Add tail-folding & required scalar epilogue tests for IG narrowing.

ab2908e

Add additional tests to cover missing code paths when narrowing interleave groups: * tail-folding * interleave-groups that require a scalar iteration.

[llvm][DebugInfo] Bump DWARFContext maximum DWARF version (llvm#183838)

c40b0b2

To start testing DWARFv6 feature support, we need to bump this version so that the tooling works. This does not mean we officially support DWARFv6. It just lets us test the features gradually.

[llvm][DebugInfo] Bump DWARFDebugLine maximum DWARF version (llvm#183841

ce3460e

) Bumps `.debug_line` maximum supported version to DWARFv6. This does not mean we officially support DWARFv6. It just lets us test the features gradually.

[ARM][MVE] Add SLI and SRI recognition. (llvm#183471)

9b1f784

This uses the newly added code from llvm#182051 to optimize to MVE sli and sri. The only major difference is the legal types supported, but we also lower intrinsics via VSLIIMM/VSRIIMM, so that only one tablegen pattern is needed.

[CMake][LLVM] Disable PCH on Clang for file with custom flags too (ll…

3403aac

…vm#183813) Precompiled headers are already skipped when building ConstantFolding.cpp with MSVC, they cause problems with Clang too so disable it there the same way.

Restore llvm#125407, Make covmap tolerant of nested Decisions (llvm#1…

2456214

…83073) Change(s): - Suppress range errors in CounterExpr

[AArch64] Add fcvt-i256 test cases. NFC

0b61f15

merge main into amd-staging

e61d49a

merge main into amd-staging (#1600)

baed2c8

[Revert_patches.txt] cleanup (#1605)

bf52cf2

lamb-j wants to merge 6892 commits into amd/dev/paakan/amd-main-test from amd-staging

20 participants

@@ -1,4 +1,6 @@
 name: WIP Placeholder CI Weekly
+permissions:
+  contents: read
 on:
     # For AMD GPU families that expect_failure, we run builds and tests from this scheduled trigger
