merge main into amd-staging #430

ronlieb · 2025-10-29T23:31:43Z

No description provided.

…r create_nd_tdesc. (llvm#164283) Current lowering pattern for create_nd_tdesc restricts source memref to static shape. In case of a dynamic ranked memref, create_nd_tdesc already provides shape as an argument. Lowering can use those values instead of returning a mismatch error.

…#165588) Missing fold to make use of VTESTPD

Fixes the failing DIA unit test (https://lab.llvm.org/buildbot/#/builders/197/builds/10342) after llvm#165363. Now that the native plugin is the default, we need to set the symbol file plugin for DIA via the settings.

…#165516)

… in the AffineForEmptyLoopFolder (llvm#164064) Co-authored-by: Jakub Kuderski <[email protected]>

…/TESTP node just uses the ZF flag (llvm#165601) If we're just comparing against zero then move the constant to the RHS to reduce duplicated folds. Noticed while triaging llvm#156233

…vm#165453) Modify the python wrapper to return uint32_t, which prevents incorrect child name-to-index mapping and avoids performing redundant operations on non-existent SBValues.
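
A minimal sketch of why narrowing an unsigned 32-bit index to a signed `int` can corrupt a lookup result. It assumes a `UINT32_MAX`-style "not found" sentinel, which the commit message does not state; the helper names are hypothetical and are not part of the LLDB API.

```python
import ctypes

# Assumed "not found" sentinel for this sketch (not taken from the commit).
UINT32_MAX = 0xFFFFFFFF


def narrow_to_int32(value: int) -> int:
    """Reinterpret a 32-bit unsigned value as a signed 32-bit int."""
    return ctypes.c_int32(value).value


def keep_uint32(value: int) -> int:
    """Keep the value in the unsigned 32-bit range."""
    return ctypes.c_uint32(value).value


# After narrowing, the sentinel no longer compares equal to UINT32_MAX.
narrowed = narrow_to_int32(UINT32_MAX)
preserved = keep_uint32(UINT32_MAX)
```

With the narrowed value, a caller checking `index == UINT32_MAX` would miss the "not found" case and could go on to query a child that does not exist.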

This follows similar reasoning as 45ce887 (llvm#159556): LV does not preserve LCSSA, it constructs it just before processing a loop to vectorize. Runtime check expressions are invariant to that loop, so expanding them should not break LCSSA form for the loop we are about to vectorize. LV creates SCEV and memory runtime checks early on and then disconnects the blocks temporarily. The patch fixes a mis-compile, where previously LCSSA construction during SCEV expand may replace uses in currently unreachable SCEV/memory check blocks. Fixes llvm#162512 PR: llvm#165505

… operand in the AffineForEmptyLoopFolder" (llvm#165607) Reverts llvm#164064 Broke Windows on mlir-s390x-linux buildbot build, needs investigations.

…165570) Previously, `hoistCommonCodeFromSuccessors` returned early if any successor of BB had more than one predecessor. However, if that successor is an unreachable BB, we can relax the condition and still perform `hoistCommonCodeFromSuccessors`, on the assumption that UB is never reached. See discussion dtcxzyw/llvm-opt-benchmark#2989 for details. Alive2 proof: https://alive2.llvm.org/ce/z/OJOw0s Promising optimization impact: dtcxzyw/llvm-opt-benchmark#2995

After the default PDB plugin changed to the native one (llvm#165363), this test failed, because it uses the size of public symbols and the native plugin sets the size to 0 (as PDB doesn't include this information explicitly). A PDB was built because the final executable in that test was linked with `-gdwarf`.

…nsert instruction If the gather/buildvector node has a match, the matching node has a scheduled copyable parent, and the parent node of the original node has a last instruction that is non-schedulable and is part of that scheduled copyable parent, the matching node should be excluded as non-matching, since it would produce a wrong def-use chain. Fixes llvm#165435

Attaching using `core`, `gdbremote` or `attachInfo` may produce an error. Fail early if it does.

This pr adds the equivalent validation of `llvm.loop` metadata that is [done in DXC](https://github.com/microsoft/DirectXShaderCompiler/blob/8f21027f2ad5dcfa63a275cbd278691f2c8fad33/lib/DxilValidation/DxilValidation.cpp#L3010). This is done as follows:

- Add `llvm.loop` to the metadata allow-list in `DXILTranslateMetadata`
- Iterate through all `llvm.loop` metadata nodes and strip all incompatible ones
- Raise an error for ill-formed nodes that are compatible with DXIL

Resolves: llvm#137387

... which silently caused the wrong overload to be selected.

…vm#165616)

…165618)

To my knowledge, NetBSD is mostly like other BSDs, but doesn't have `xlocale.h`. I think c664a7f may have inadvertently broken this. With this change, I was able to run [zig-bootstrap](https://github.com/ziglang/zig-bootstrap) to completion for `x86_64-netbsd10.1-none`.

…m#165611) When we create a `SparseIterator`, we sometimes wrap it in a `FilterIterator`, which delegates _some_ calls to the underlying `SparseIterator`. After construction, e.g. in `makeNonEmptySubSectIterator()`, we call `setSparseEmitStrategy()`. This sets the strategy in only one of the two iterators: if we call `setSparseEmitStrategy()` immediately after creating the `SparseIterator`, then the wrapped `SparseIterator` will have the right strategy and the `FilterIterator` strategy will be uninitialized; if we call `setSparseEmitStrategy()` after wrapping the iterator in `FilterIterator`, then the opposite happens. If we make `setSparseEmitStrategy()` a virtual method so that it's included in the `FilterIterator` pattern, and then do all reads of `emitStrategy` via a virtual method as well, it's pretty simple to ensure that the value of `strategy` is being set consistently and correctly.

Without this, the UB of strategy being uninitialized manifests as a sporadic test failure in mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir, when run downstream with the right flags (e.g. asan + assertions off). The test sometimes fails with `ne_sub<trivial<dense[0,1]>>.begin' op created with unregistered dialect`. It can also be directly observed with msan that this uninitialized read is the cause of that issue, but msan causes other problems with this test.
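
The delegation idea described above can be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical Python stand-ins, not the MLIR C++ classes, and Python methods are always dynamically dispatched, so the sketch shows the forwarding pattern rather than the `virtual` change itself.

```python
class SparseIterator:
    """Hypothetical stand-in for the wrapped iterator."""

    def __init__(self):
        self._emit_strategy = None  # unset until configured

    def set_emit_strategy(self, strategy):
        self._emit_strategy = strategy

    def get_emit_strategy(self):
        return self._emit_strategy


class FilterIterator(SparseIterator):
    """Hypothetical wrapper that forwards strategy calls to the wrapped iterator."""

    def __init__(self, wrapped):
        super().__init__()
        self._wrapped = wrapped

    def set_emit_strategy(self, strategy):
        # Forward the write so wrapper and wrapped iterator cannot disagree.
        self._wrapped.set_emit_strategy(strategy)

    def get_emit_strategy(self):
        # Reads also go through the wrapped iterator, never a local copy.
        return self._wrapped.get_emit_strategy()


inner = SparseIterator()
outer = FilterIterator(inner)
outer.set_emit_strategy("functional")  # placeholder strategy value
```

Because both the write and the read are forwarded, it no longer matters whether the strategy is set before or after the iterator is wrapped.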

…165290) This PR introduces an allow-list for module metadata. It covers the llvm metadata nodes `llvm.ident` and `llvm.module.flags`, as well as the generated `dx.` options. Resolves: llvm#164473.
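
As a rough illustration of the allow-list approach, the sketch below filters a list of metadata names. The two exact names and the `dx.` prefix come from the description above; the function names and the sample input (including `dx.version` and `llvm.dbg.cu`) are assumptions for illustration and do not reflect the actual pass implementation.

```python
# Exact names and prefix taken from the description above.
ALLOWED_NAMES = {"llvm.ident", "llvm.module.flags"}
ALLOWED_PREFIXES = ("dx.",)


def is_allowed(name: str) -> bool:
    """Return True if a module metadata name passes the allow-list."""
    return name in ALLOWED_NAMES or name.startswith(ALLOWED_PREFIXES)


def strip_disallowed(names):
    """Keep only allow-listed names, preserving their original order."""
    return [name for name in names if is_allowed(name)]


# Sample input is illustrative only.
kept = strip_disallowed(
    ["llvm.ident", "llvm.dbg.cu", "dx.version", "llvm.module.flags"]
)
```

An allow-list fails closed: a metadata kind that nobody anticipated is dropped rather than passed through.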

…lvm#165496) We currently use a background thread to read the DAP output. This means the test thread and the background thread can race at times and we may have inconsistent timing due to these races. To improve the consistency I've removed the reader thread and instead switched to using the `selectors` module that wraps `select` in a platform independent way.
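
A minimal sketch of reading from a stream with the standard `selectors` module instead of a background reader thread. It uses a local socket pair as a stand-in for the DAP output stream; the function name and buffer size are assumptions, and this is not the actual test-suite code.

```python
import selectors
import socket


def read_available(sock, timeout):
    """Wait up to `timeout` seconds for data; return bytes, or None on timeout."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    try:
        events = sel.select(timeout=timeout)
        if not events:
            return None  # nothing became readable in time
        return sock.recv(4096)
    finally:
        sel.close()


# A connected socket pair stands in for the adapter's output stream.
reader, writer = socket.socketpair()
```

The calling thread decides when to wait and for how long, so there is no second thread to race with.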

) And reduce the number of getLLVMStyleWithColumnLimit calls.

Fix getShadowAddress computation by adding ShadowBase if it is not zero. Co-authored-by: anoopkg6 <[email protected]>
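
A small sketch of the arithmetic this fix describes: add the shadow base to the computed offset only when the base is non-zero. The and-mask/xor-mask offset step and all numeric values are assumptions modelled on typical sanitizer memory-map parameters; they are not taken from the patch.

```python
def shadow_offset(addr: int, and_mask: int, xor_mask: int) -> int:
    """Assumed offset step: clear the and-mask bits, then apply the xor-mask."""
    offset = addr
    if and_mask:
        offset &= ~and_mask
    if xor_mask:
        offset ^= xor_mask
    return offset


def shadow_address(addr: int, and_mask: int, xor_mask: int, shadow_base: int) -> int:
    """Offset plus the shadow base, added only when the base is non-zero."""
    offset = shadow_offset(addr, and_mask, xor_mask)
    if shadow_base:
        offset += shadow_base
    return offset


# Made-up parameters, chosen only to make the arithmetic easy to follow.
without_base = shadow_address(0x1234, and_mask=0xF000, xor_mask=0, shadow_base=0)
with_base = shadow_address(0x1234, and_mask=0xF000, xor_mask=0, shadow_base=0x10000)
```

On a layout whose base is zero the two results coincide, so a missing base would only show up on targets with a non-zero base.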

This consists of marking the various strict opcodes as legal, and adjusting instruction selection patterns so that 'op' is 'any_op'. The changes are similar to those in D114946 for AArch64. Custom lowering and promotion are set for some FP16 strict ops to work correctly. This PR is part of the work on adding strict FP support in ARM, which was previously discussed in llvm#137101.

…165574) The CRC optimization relies on stripping the auxiliary data completely, and should hence be forbidden when it has a user in the exit-block. Forbid this case, fixing a miscompile. Fixes llvm#165382.

…165281) CreateProcess fails with ERROR_INVALID_PARAMETER when duplicate HANDLEs are passed via `PROC_THREAD_ATTRIBUTE_HANDLE_LIST`. This can happen, for example, if stdout and stdin are the same device (e.g. a bidirectional named pipe), or if stdout and stderr are the same device. Fixes msys2/MINGW-packages#26030
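
The kind of fix described here can be sketched as an order-preserving de-duplication of the handle list before it is handed to the process-creation call. The function name and the integer handle values are hypothetical; the actual patch is not shown in this PR text.

```python
def dedupe_handles(handles):
    """Drop repeated handles while keeping the first occurrence of each."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(handles))


# stdin and stdout sharing one device would yield the same handle twice.
unique_handles = dedupe_handles([3, 4, 3, 5, 4])
```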

z1-cciauto · 2025-10-29T23:32:28Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2548

silee2 and others added 30 commits October 29, 2025 09:55

[X86] vector-reduce-or-cmp.ll - add v4i64 signbit test coverage (llvm…

34a3488

…#165588) Missing fold to make use of VTESTPD

[LLDB][PDB] Explicitly set DIA plugin in unit test (llvm#165592)

957598f

Fixes the failing DIA unit test (https://lab.llvm.org/buildbot/#/builders/197/builds/10342) after llvm#165363. Now that the native plugin is the default, we need to set the symbol file plugin for DIA via the settings.

[flang][cuda][NFC] Enhance test for tma_bulk_g2s lowering (llvm#165603)

40c917f

[mlir][bufferize] Use the flag of skipRegions to print op (NFC) (llvm…

2dca188

…#165516)

[mlir][affine] Add fold logic when the affine.yield has IV as operand…

e24e7ff

… in the AffineForEmptyLoopFolder (llvm#164064) Co-authored-by: Jakub Kuderski <[email protected]>

[X86] combinePTESTCC - canonicalize constants to the RHS if the PTEST…

ba769e1

…/TESTP node just uses the ZF flag (llvm#165601) If we're just comparing against zero then move the constant to the RHS to reduce duplicated folds. Noticed while triaging llvm#156233

[lldb] Do not narrow GetIndexOfChildWithName return type to int (ll…

d87c80b

…vm#165453) Modify the python wrapper to return uint32_t, which prevents incorrect child name-to-index mapping and avoids performing redundant operations on non-existent SBValues.

Revert "[mlir][affine] Add fold logic when the affine.yield has IV as…

7b98280

… operand in the AffineForEmptyLoopFolder" (llvm#165607) Reverts llvm#164064 Broke Windows on mlir-s390x-linux buildbot build, needs investigations.

[lldb-dap] Report any errors during attach request (llvm#165270)

b17f1fd

Attaching using `core`, `gdbremote` or `attachInfo` may produce an error. Fail early if it does.

[AMDGPU] Support true16 spill restore with sram-ecc (llvm#165320)

5f1813e

[AArch64][PAC] Fix an implicit pointer-to-bool conversion (llvm#165056)

9d1b6ee

... which silently caused the wrong overload to be selected.

[TSan][Test-Only][Darwin] Fix typo in external.cpp again (llvm#165612)

8fdac32

[mlir][amdgpu][rocdl] Allow for graceful wmma conversion failures (ll…

3167752

…vm#165616)

[flang][cuda] Convert src and dst to llvm.ptr in tma_bulk_load (llvm#…

5c8492a

…165618)

[gn build] Port e938943

c1423f3

[DirectX] Use an allow-list of DXIL compatible module metadata (llvm#…

ad29838

…165290) This PR introduces an allow-list for module metadata. It covers the llvm metadata nodes `llvm.ident` and `llvm.module.flags`, as well as the generated `dx.` options. Resolves: llvm#164473.

[clang-format][NFC] Port FormatTestComments to verifyFormat (llvm#164310

4cb73cd

) And reduce the number of getLLVMStyleWithColumnLimit calls.

[dfsan] Fix getShadowAddress computation (llvm#162864)

71be1c1

Fix getShadowAddress computation by adding ShadowBase if it is not zero. Co-authored-by: anoopkg6 <[email protected]>

[HashRecognize] Forbid optz when data.next has exit-block user (llvm#…

3bc9b28

…165574) The CRC optimization relies on stripping the auxiliary data completely, and should hence be forbidden when it has a user in the exit-block. Forbid this case, fixing a miscompile. Fixes llvm#165382.

[flang][rt] Add install target for header files (llvm#165610)

4d6bff4

lb90 and others added 2 commits October 29, 2025 15:28

merge main into amd-staging

7e6dae3

ronlieb requested review from a team and dpalermo October 29, 2025 23:31

ronlieb requested review from krzysz00 and kuhar as code owners October 29, 2025 23:31

dpalermo approved these changes Oct 29, 2025

z1-cciauto merged commit 30ad1e7 into amd-staging Oct 30, 2025
12 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251029174548 branch October 30, 2025 02:40


27 participants
