[DO NOT MERGE] Test#1154

Open
lamb-j wants to merge 6892 commits into amd/dev/paakan/amd-main-test from amd-staging

lamb-j commented Jan 20, 2026

No description provided.

Comment on lines +12 to +14
    runs-on: ubuntu-latest
    steps:
      - run: echo "Skipped"

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {}

Copilot Autofix

AI about 1 month ago

To fix the problem, the workflow should explicitly specify restricted GITHUB_TOKEN permissions instead of relying on repository defaults. Since this job only runs a shell command and does not need to interact with the GitHub API, the safest and least-privileged configuration is to set permissions: { contents: read } (or even permissions: {} if your policies allow that). Adding the permissions block at the workflow root will apply to all jobs that do not override it.

The single best way to fix this without changing existing behavior is to add a permissions block near the top of .github/workflows/ci_weekly.yml, just under the name: (or under on: if you prefer), setting contents: read. No imports or additional definitions are needed because this is a YAML configuration change only. The donothing job can remain unchanged and will inherit these minimal permissions.

Concretely:

  • Edit .github/workflows/ci_weekly.yml.
  • Insert a top-level permissions: section after line 1 (or between lines 2 and 3) with contents: read.
  • Leave the jobs section and donothing job untouched so functionality stays the same.
Suggested changeset 1
.github/workflows/ci_weekly.yml

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/ci_weekly.yml b/.github/workflows/ci_weekly.yml
--- a/.github/workflows/ci_weekly.yml
+++ b/.github/workflows/ci_weekly.yml
@@ -1,4 +1,6 @@
 name: WIP Placeholder CI Weekly
+permissions:
+  contents: read
 
 on:
     # For AMD GPU families that expect_failure, we run builds and tests from this scheduled trigger
EOF
andykaylor and others added 23 commits February 26, 2026 19:02
…83594)

When this document was converted from rst to markdown, the contents
didn't get updated correctly.
)

When cc1 runs out-of-process and crashes, sys::ExecuteAndWait returns -2
for signal-killed children. The resignaling block added in 15488a7
only handled CommandRes > 128, so the driver would exit normally with
code 1 instead of dying by signal.
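
The gap described above can be illustrated with a small model. This is only a sketch in Python, not the driver's C++ code: the function names are hypothetical, and the `== -2` test is an assumption based solely on the return-value convention stated in the message.

```python
def looks_signal_killed_old(command_res: int) -> bool:
    # Models the earlier check described above: only shell-style
    # "128 + signal number" exit codes were treated as a signal death.
    return command_res > 128


def looks_signal_killed_fixed(command_res: int) -> bool:
    # Assumed shape of the fix: additionally treat -2, the value the
    # message says is returned for signal-killed children.
    return command_res > 128 or command_res == -2


# A child killed by a signal and reported as -2 slips past the old check.
print(looks_signal_killed_old(-2), looks_signal_killed_fixed(-2))
```

With the old predicate, a `-2` result is indistinguishable from an ordinary failure, which matches the "exit normally with code 1" symptom in the message.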
Currently, if there are operations between the loops, we get a dominance
issue because the delinearized index is added after the operations. This
PR fixes that.

For testing, we also add a transform pattern that calls coalesceLoops
directly. The existing pattern calls coalescePerfectlyNestedSCFForLoops,
which does not consider the loop nest perfectly nested if there are
operations between the loops; that behavior is safer for that usage.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…lvm#183572)

`BitCastOp::fold` called `Type::getIntOrFloatBitWidth()` on the source
element type without first verifying it satisfies `isIntOrFloat()`. When
the source vector has `index` element type (e.g. `vector<16xindex>`),
the assertion `only integers and floats have a bitwidth` fires.

Add an `srcElemType.isIntOrFloat()` guard to the condition so that the
constant-folding path is skipped for non-integer/float element types.

Fixes llvm#177835
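
The guard-before-query pattern from this message can be sketched as a toy model. This is Python, not MLIR: `ElemType` and `can_fold_bitcast` are hypothetical names, and the "equal bit widths" folding condition is an assumption chosen only to give the guard something to protect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElemType:
    name: str
    bit_width: Optional[int]  # None models a type such as `index`

    def is_int_or_float(self) -> bool:
        return self.bit_width is not None

    def get_int_or_float_bit_width(self) -> int:
        # Mirrors the assertion text quoted in the commit message.
        assert self.is_int_or_float(), "only integers and floats have a bitwidth"
        return self.bit_width


def can_fold_bitcast(src: ElemType, dst: ElemType) -> bool:
    # The guards run first; short-circuit evaluation keeps the width
    # query (and its assertion) from being reached for `index`.
    return (src.is_int_or_float()
            and dst.is_int_or_float()
            and src.get_int_or_float_bit_width()
            == dst.get_int_or_float_bit_width())
```

For example, `can_fold_bitcast(ElemType("index", None), ElemType("i32", 32))` returns `False` instead of tripping the assertion.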
…oison (llvm#183596)

When `constFoldBinaryOp<IntegerAttr>` is called with a `ub.poison`
operand, it propagates the poison attribute as its result. The fold
method for `arith.addui_extended` then attempted to cast this result to
`TypedAttr` via `llvm::cast<TypedAttr>(sumAttr)`, which failed with an
assertion because `PoisonAttr` does not implement the `TypedAttr`
interface.

Fix this by checking whether the folded sum is a poison attribute before
the cast. When poison is detected, it is propagated to both the sum and
overflow results.

Fixes llvm#181534
…s of partial specializations (llvm#183348)

This fixes a helper so it implements retrieval of the argument replaced
for a template parameter for partial specializations.

This was left out of the original patch, since it's quite hard to
actually test.

This helper implements the retrieval for variable templates, but only
for completeness' sake: no current users rely on this, and I don't
think a similar test case is possible to implement with variable
templates.

This fixes a regression introduced in llvm#161029 which will be backported
to llvm-22, so there are no release notes.

Fixes llvm#181062
Fixes llvm#181410
…#183487)

BF16 source operands use F32 inline constant values, so set OP_SEL to
select the high half of the constant, since BF16 encoding matches the
high 16 bits of F32 encoding. This behaviour is different from F16
source operands which use F16 constant values in the low 16 bits.

Fixes: llvm#183337
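
The encoding relationship behind this change can be checked with plain bit manipulation. The sketch below uses Python's `struct` module and has nothing to do with the AMDGPU backend itself; the helper names are made up for illustration, and truncation is used as the BF16 conversion, which is exact for a value like 1.0.

```python
import struct


def f32_bits(x: float) -> int:
    # Raw IEEE-754 binary32 encoding as an unsigned 32-bit integer.
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f16_bits(x: float) -> int:
    # Raw IEEE-754 binary16 encoding as an unsigned 16-bit integer.
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bf16_bits_by_truncation(x: float) -> int:
    # BF16 keeps the sign, the 8 exponent bits, and the top 7 mantissa
    # bits of F32, i.e. the high 16 bits of the F32 encoding.
    return f32_bits(x) >> 16


print(hex(f32_bits(1.0)))                 # 0x3f800000
print(hex(bf16_bits_by_truncation(1.0)))  # 0x3f80 (high half of the F32 constant)
print(hex(f32_bits(1.0) & 0xFFFF))        # 0x0 (low half holds no BF16 payload)
print(hex(f16_bits(1.0)))                 # 0x3c00 (a different encoding entirely)
```

Reading the low 16 bits of an F32 constant would therefore yield zero rather than the intended BF16 value, which is why the high half has to be selected.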
This is another instance of the logic from llvm#183159. If we know
one source is not-infinity, and the other source is less than or
equal to 1, this cannot overflow. Special case llvm.amdgcn.trig.preop,
as a substitute for proper range tracking. This almost enables pruning
edge case handling in trig function implementations, if not for the
recursion depth limit (but that's a problem for another day).
As in title. Only `reassoc` pattern was supplied -- for completeness all
should be supplied. Make FastMathFlag ctor public as well.
The Metal Shader converter can output shader reflection information into
a JSON file. This connects the -Fre flag (DXC's flag for reflection) to
the Metal Shader Converter tool step to produce the JSON file. As a
temporary state the -Fre flag will error when used without the -metal
flag.

This is required to address
llvm/offload-test-suite#452
Summary:
This is needed on some platforms like Windows when the generated command
line becomes too large. This seems to be occurring in practice so we
need to support this. Uses the same basic support clang does.

No test because there isn't any current infrastructure to support it,
will likely be "tested" by ROCBLAS builds not failing anymore on
Windows.
Summary:
This patch matches CUDA, moving the HIP compilation jobs to the new
driver by default. The old behavior will return with
`--no-offload-new-driver`. The main difference is that objects compiled
with the old driver are no longer compatible and will need to be
recompiled or the old driver used.
Summary:
This is needed on some platforms like Windows when the generated command
line becomes too large. This seems to be occurring in practice so we
need to support this. Uses the same basic support clang does.

No test because there isn't any current infrastructure to support it,
will likely be "tested" by ROCBLAS builds not failing anymore on
Windows.
We have accumulated four places where variables were only being used in
asserts. This change silences the warnings for that.
…183548)

Summary:
These command line invocations can become so large that they no longer
fit; we should support response files in this case so the build on
Windows can be unblocked with the new driver.
llvm#180563)

Fixes llvm#154713.

The crash was due to `Index` sometimes being an unsigned 64-bit integer
which was being zero-extended to a signed 64-bit, triggering an
assertion failure in `APSInt::getExtValue`. This patch zero-extends it
to an unsigned 64-bit integer instead, since `HandleLValueVectorElement`
takes in a `uint64_t` anyway.
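
The range problem can be seen with ordinary integers. The following is a Python illustration only; it does not model `APSInt`, and the helper names are hypothetical.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def fits_in_int64(value: int) -> bool:
    return INT64_MIN <= value <= INT64_MAX


def fits_in_uint64(value: int) -> bool:
    return 0 <= value <= UINT64_MAX


# An index with the top bit set is a valid unsigned 64-bit value, but it
# has no signed 64-bit representation with the same numeric value.
index = 2**63
print(fits_in_uint64(index))  # True
print(fits_in_int64(index))   # False
```

Keeping the value unsigned avoids asking for a signed 64-bit view of a number that only fits in the unsigned range.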
In CGOpenMPRuntimeGPU::translateParameter, reference-type captured
variables were translated to pointer parameters with two address-space
annotations:

1. LangAS::opencl_global on the pointee (for map'd variables), which
correctly produces ptr addrspace(1) in NVPTX IR.
2. getLangASFromTargetAS(NVPTX_local_addr=5) on the pointer itself,
annotating the parameter as living in NVPTX local (stack) memory.

The second annotation is incorrect at the Clang type-system level:
EmitParmDecl only supports parameters to be in LangAS::Default (or the
special cases for OpenCL).

Temporarily add an assert in EmitParmDecl that catches parameters with
non-default address spaces in non-OpenCL compilations, and fix the
violation by dropping the NVPTX_local_addr addAddressSpace call.

Should fix the issue noticed in

llvm#181256 (comment),
allowing removing that special case there for OpenMP, though I haven't
tested the combination yet. That PR would fix EmitParmDecl to actually
support non-default address spaces from Sema, and will remove this
assert again.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…s in `JSONFormat`

This change fixes the diagnostic infrastructure in `JSONFormat`
implementation to pass model objects (`EntityId`, `EntityLinkage`,
`BuildNamespace`, `NestedBuildNamespace`, `SummaryName`) directly to
`ErrorBuilder` instead of manually extracting their components. This
relies on existing `llvm::format_provider` specializations for these
objects.

To support consistent string conversion for `BuildNamespaceKind` and
`EntityLinkageType`, across both serialization and `operator<<`,
`toString`/`fromString` functions have been introduced in an internal
header `ModelStringConversions.h`.

`EntityLinkage::LinkageType` is promoted to a standalone enum class
`EntityLinkageType` at namespace scope, following the same pattern as
`BuildNamespaceKind`.

Tests have been added for `operator<<` and `format_provider` for all
affected types, and a new `ModelStringConversionsTest.cpp` directly
unit-tests the `toString/fromString` functions including round-trip and
unknown-input cases.
This patch fixes a HexagonConstPropagation assert when evaluating
sign-bit CONST32/CONST64 immediates (e.g. 0x80000000) after ConstantInt
stopped implicitly truncating, by allowing truncation for that signed
case.
…to wi pass (llvm#181917)

This PR adds distribution pattern for xegpu.load & store ops for the new
sg-to-wi pass
Fix incremental build failure when using PCH build and clang-cache
together.
Men-cotton and others added 30 commits February 28, 2026 13:12
…-load.hlsl (llvm#182817)

Update clang/test/CIR/CodeGenHLSL/matrix-element-expr-load.hlsl to use
`-verify` with expected CIR NYI diagnostics.
…-for-size` passed together (llvm#183642)

Currently, the assert fires when both `UseExtTspForPerf` and
`UseExtTspForSize` are true on a given function.

Ideally, we should allow `-enable-ext-tsp-block-placement` and
`-apply-ext-tsp-for-size` to be passed together, meaning: run the block
placement for performance on hot functions, and run the placement for
size on cold functions.

The diff makes `UseExtTspForPerf` and `UseExtTspForSize` mutually
exclusive per-function: functions with the `OptForSize` attribute use
ext-tsp block placement for size, while the others use ext-tsp block
placement for perf.

Co-authored-by: Sharon Xu <sharonxu@fb.com>
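
The per-function selection described in this message can be sketched as follows. This is a Python model with hypothetical names; in particular, how the real pass treats a function when only one of the two flags is given is an assumption here, not something the message states.

```python
from typing import NamedTuple


class ExtTspChoice(NamedTuple):
    use_for_perf: bool
    use_for_size: bool


def choose_ext_tsp(enable_perf: bool, apply_for_size: bool,
                   has_opt_for_size: bool) -> ExtTspChoice:
    # Size-oriented placement applies only to functions marked OptForSize.
    use_for_size = apply_for_size and has_opt_for_size
    # Every other function falls back to performance-oriented placement,
    # so the two results can never both be True for one function.
    use_for_perf = enable_perf and not use_for_size
    return ExtTspChoice(use_for_perf, use_for_size)


print(choose_ext_tsp(True, True, has_opt_for_size=True))
print(choose_ext_tsp(True, True, has_opt_for_size=False))
```

Because `use_for_perf` is derived from the negation of `use_for_size`, the condition that used to trigger the assert (both true at once) cannot arise in this model.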
  __builtin_elementwise_{max,min}
becomes
  __builtin_elementwise_{maximum,minimum}
…tpcrel (llvm#155776)

Similar to llvm#132569 for RISC-V, replace the unofficial `@plt` and
`@gotpcrel` relocation specifiers, currently only used by clang
-fexperimental-relative-c++-abi-vtables, with %pltpcrel %gotpcrel. The
syntax is not used in human-written assembly code, and is not supported
by GNU assembler.

Also replace the recent `@funcinit` with `%funcinit(x)`.
…lvm#183778)

After successfully commuting an instruction to be compatible with the
current VGPR MSB mode, update CurrentMode with the commuted
instruction's mode requirements. This locks in the mode bits the
commuted instruction relies on, preventing later instructions from
piggybacking and corrupting those bits.

Without this fix, a subsequent instruction needing a different mode
could piggyback onto the preceding s_set_vgpr_msb and change mode bits
that the commuted instruction depends on. For example, a nullopt src1
position (treated as 0) could be overwritten to a different value,
causing incorrect register encoding for the commuted instruction.

The fix still allows compatible piggybacking - instructions that only
add new mode bits without changing existing ones can still piggyback.
…rsion (llvm#183472)

Wire the llvm-mc --reloc-section-sym={all,internal,none} option through
the clang driver (-Wa,--reloc-section-sym=) and cc1as
(--reloc-section-sym=). The option is only valid for ELF targets.

GNU Assembler will add the option as well.
…3885)

This test exercises macOS-specific linker functionality (-delay_library)
and uses a hardcoded local working directory for the launch info. It
should not run against a remote platform where neither condition holds.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
…vm#183835)

Functionally reverts a80d432, with new
test.
This should be applied somewhere, but this is the wrong place.

Fixes regression reported after llvm#182444
…vm#183890)

UseAtForSpecifier defaults to false in MCAsmInfo, and RISCVMCAsmInfo
never calls initializeAtSpecifiers (which sets it to true).
Add additional tests to cover missing code paths when narrowing
interleave groups:
 * tail-folding
 * interleave-groups that require a scalar iteration.
In order to start testing DWARFv6 feature support we need to bump this
version for tooling to work.

This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.
)

Bumps `.debug_line` maximum supported version to DWARFv6.

This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.
)

Bumps `.debug_rnglists` maximum supported version to DWARFv6.

This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.

Added unit-test since there was no prior test in the entire LLVM
test-suite that checked this.
This uses the newly added code from llvm#182051 to optimize to MVE sli and
sri. The only major difference is the legal types supported, but we also
lower intrinsics via VSLIIMM/VSRIIMM, so that only one tablegen pattern
is needed.
…vm#183813)

Precompiled headers are already skipped when building ConstantFolding.cpp with MSVC. They cause problems with Clang too, so disable them there the same way.
Depends on:
* llvm#183838
* llvm#183841
* llvm#183859

Bumps the supported version to 6. Unit header layout hasn't changed
between versions AFAIK, so re-used the DWARF5 `FileCheck` in the test.
This by no means claims full DWARFv6 support, but is handy for testing
DWARFv6 features while full support is being gradually implemented.
llvm#183785)

`foldReshapeOp` (in `ReshapeOpsUtils.h`) and `FoldReshapeWithConstant`
(in `TensorOps.cpp`) both tried to create a new `DenseElementsAttr`
constant when folding a reshape op whose operand is a constant. Neither
checked that the result type was statically shaped before doing so, but
`DenseElementsAttr::reshape()` and
`DenseElementsAttr::getFromRawBuffer()` both assert `hasStaticShape()`.

Guard both fold paths with a `hasStaticShape()` check so they return
early when the result type contains a dynamic dimension.

Fixes llvm#177845
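
The early-return guard can be modelled in a few lines. This is a Python sketch rather than the MLIR code: shapes are tuples in which `None` stands for a dynamic dimension, and `fold_reshape_of_constant` is a hypothetical stand-in for the two fold paths named above.

```python
from math import prod
from typing import List, Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


def has_static_shape(shape: Shape) -> bool:
    return all(dim is not None for dim in shape)


def fold_reshape_of_constant(
    values: Sequence[int], result_shape: Shape
) -> Optional[Tuple[List[int], Shape]]:
    # Return early when the result type has a dynamic dimension, instead
    # of reaching code that requires a static shape.
    if not has_static_shape(result_shape):
        return None
    # A reshape must preserve the number of elements.
    if prod(result_shape) != len(values):
        return None
    return list(values), result_shape
```

Here `fold_reshape_of_constant([1, 2, 3, 4], (2, None))` simply declines to fold, while `(2, 2)` produces the reshaped constant.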
… used across blocks (llvm#183828)

The noSkipBlockErasure callback in TestVisitors.cpp dropped uses of op
results within the same region before erasing a block, but did not drop
uses of the block's own arguments (e.g. function entry block arguments).
When the block was subsequently erased its block arguments were
destroyed while their use-lists were still non-empty, triggering the
assertion in IRObjectWithUseList::~IRObjectWithUseList().

Fix this by also iterating over the block's arguments and dropping any
uses that belong to the same parent region. This mirrors the existing
logic for op result uses and makes the block-erasure walk handle IRs
where function arguments are consumed by ops in sibling blocks.

Also replace `block->front().getParentRegion()` with
`block->getParent()` for robustness (avoids UB when the block has no
ops).

Add a regression test based on the reproducer from
llvm#182996.

Fixes llvm#182996
…83073)

Change(s):

- Suppress range errors in CounterExpr
…183757)

The `noSkipBlockErasure` callback in `testNoSkipErasureCallbacks` called
`block->front().getParentRegion()` to get the parent region of a block.
This dereferences the ilist sentinel node when the block has no
operations, triggering an assertion failure.

Use `block->getParent()` instead, which directly returns the region
containing the block without requiring any operations to be present.

Fixes llvm#183511
…ub.poison (llvm#183816)

`AffineLinearizeIndexOp::fold` guarded the constant-folding path with
`llvm::is_contained(adaptor.getMultiIndex(), nullptr)`, which only
catches operands that have not been evaluated at all. When an operand
folds to `ub.PoisonAttr`, the attribute is non-null so the guard passed,
and the subsequent `cast<IntegerAttr>(indexAttr)` call crashed with an
assertion failure.

Fix by replacing the null-only check with one that requires every
multi-index attribute to be a concrete `IntegerAttr`, returning
`nullptr` for any other attribute (including null and PoisonAttr).

Fixes llvm#178204
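
The difference between a null-only check and a "must be a concrete integer" check can be sketched like this. The classes below are Python stand-ins with hypothetical names, not MLIR attributes, and the row-major linearization is an assumption used only to give the fold a result.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IntegerAttr:
    value: int


@dataclass(frozen=True)
class PoisonAttr:
    pass


def fold_linearize_index(multi_index: Sequence[object],
                         basis: Sequence[int]) -> Optional[IntegerAttr]:
    # Require every operand to be a concrete IntegerAttr. A "not None"
    # check alone would let PoisonAttr through to the arithmetic below.
    if not all(isinstance(attr, IntegerAttr) for attr in multi_index):
        return None
    linear = 0
    for attr, extent in zip(multi_index, basis):
        linear = linear * extent + attr.value
    return IntegerAttr(linear)


print(fold_linearize_index([IntegerAttr(1), IntegerAttr(2)], [3, 4]))
print(fold_linearize_index([IntegerAttr(1), PoisonAttr()], [3, 4]))
```

The first call folds to `IntegerAttr(value=6)`; the second returns `None`, which corresponds to declining to fold.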
)

In CGOpenMPRuntimeGPU::translateParameter, reference-type captured
variables were translated to pointer parameters with two address-space
annotations:

1. LangAS::opencl_global on the pointee (for map'd variables), which
correctly produces ptr addrspace(1) in NVPTX IR.
2. getLangASFromTargetAS(NVPTX_local_addr=5) on the pointer itself,
annotating the parameter as living in NVPTX local (stack) memory.

The second annotation is incorrect at the Clang type-system level:
EmitParmDecl only supports parameters to be in LangAS::Default (or the
special cases for OpenCL).

Temporarily add an assert in EmitParmDecl that catches parameters with
non-default address spaces in non-OpenCL compilations, and fix the
violation by dropping the NVPTX_local_addr addAddressSpace call.

Should fix the issue noticed in


llvm#181256 (comment),
allowing removing that special case there for OpenMP, though I haven't
tested the combination yet. That PR would fix EmitParmDecl to actually
support non-default address spaces from Sema, and will remove this
assert again.

Co-authored-by: Jameson Nash <vtjnash@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…#179688)

The usage of pointers to member functions with Pointer Authentication
requires generation of `*_vfpthunk_` functions. These thunk functions
can be later inlined and optimized by replacing the indirect call
instruction with a direct one and then inlining that function call.

In absence of `!dbg` metadata attached to the original call instruction,
such inlining ultimately results in an assertion "!dbg attachment points
at wrong subprogram for function" in the assertions-enabled builds. By
manually executing `opt` with `-verify-each` option on the LLVM IR
produced by the frontend, an actual issue can be observed: "inlinable
function call in a function with debug info must have a !dbg location"
after the replacement of indirect call instruction with the direct one
takes place.

This commit fixes the issue by attaching artificial `!dbg` locations to
the original call instruction (as well as most other instructions in
`*_vfpthunk_` function) the same way it is done for other
compiler-generated helper functions.
#1601)

…rrno. (llvm#183099)

It came up in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123826 that
GCC was simplifying pow(x, 2.0) -> x * x, even when doing so caused
-fmath-errno to be ignored. This patch fixes a similar bug in LLVM.

For ConstantFolding folding powf expressions that may raise exceptions,
see llvm#183102.

Co-authored-by: Ricardo Jesus <rjj@nvidia.com>
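
Python happens to offer a close analogue of the behavioral difference at stake. In the sketch below, `math.pow` raising `OverflowError` stands in for a libm call that reports a range error through errno, while plain multiplication silently produces infinity. The function names are hypothetical, and this is an analogy rather than a model of LLVM's transformation.

```python
import math
from typing import Tuple


def square_with_error_reporting(x: float) -> Tuple[float, bool]:
    # math.pow signals overflow, much like pow() setting errno to ERANGE.
    try:
        return math.pow(x, 2.0), False
    except OverflowError:
        return math.inf, True


def square_by_multiplication(x: float) -> Tuple[float, bool]:
    # x * x overflows to infinity without any error being reported.
    return x * x, False


print(square_with_error_reporting(1e200))  # (inf, True)
print(square_by_multiplication(1e200))     # (inf, False)
```

Both paths agree on the numeric result, but only the first one reports the error, which is the observable difference that replacing `pow(x, 2.0)` with `x * x` would discard.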