merge main into amd-staging #442

z1-cciauto · 2025-10-30T19:06:57Z

No description provided.

…n.cpp (NFC)

…lvm#162819) This PR enables AMDGPUUniformIntrinsicCombine pass in the llc pipeline. Also introduces the "amdgpu-uniform-intrinsic-combine" command-line flag to enable/disable the pass. see the PR:llvm#116953

This patch addresses two use-after-move issues:

1. `Timing.cpp`: A variable was std::moved and then immediately passed to an `assert()` check. Since the moved-from state made the assertion condition trivially true, the check was effectively useless. The `assert()` is removed.
2. `Query.cpp`: The `matcher` object was moved-from and then subsequently used as if it still retained valid state. The fix ensures no subsequent use for the moved-from variable.

Testing: `ninja check-mlir`
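As a rough illustration of the bug class described above (not the actual code from `Timing.cpp` or `Query.cpp`), the sketch below uses a hypothetical `Registry` sink and shows a read after `std::move` next to the usual fix of reading what is needed before the move.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink that takes ownership of its argument.
struct Registry {
  std::vector<std::string> names;
  void add(std::string name) { names.push_back(std::move(name)); }
};

// Buggy shape: `name` is read after it has been moved from. A moved-from
// std::string is valid but unspecified, so the result carries no meaning.
inline bool addAndCheckBuggy(Registry &r, std::string name) {
  r.add(std::move(name));
  return !name.empty(); // use-after-move
}

// Fixed shape: read everything that is needed before the move, and do not
// touch `name` afterwards.
inline bool addAndCheckFixed(Registry &r, std::string name) {
  const bool nonEmpty = !name.empty();
  r.add(std::move(name));
  return nonEmpty;
}
```

A check placed after the move, as in `addAndCheckBuggy`, can neither confirm nor refute anything about the original value, which is why removing it or hoisting it above the move are the two sensible fixes.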

In `TextDiagnostic.cpp`, we're using column and byte indices everywhere, but we were using plain integers for them, which made it hard to know what to pass where and what was produced. To make matters worse, what `SourceManager` considers a "column" is a byte in `TextDiagnostic`. Add `Bytes` and `Columns` structs, which are unrelated types, so that APIs using them can differentiate between values interpreted as columns and values interpreted as bytes.
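A minimal sketch of the idea, assuming nothing about the real definitions in `TextDiagnostic.cpp`: the field name `V` and the helper `bytesToColumns` are hypothetical, and the helper handles ASCII and tabs only.

```cpp
#include <string_view>

// Two unrelated wrapper types: neither converts implicitly to the other,
// so passing a byte offset where a column is expected fails to compile.
struct Bytes { int V = 0; };
struct Columns { int V = 0; };

// Hypothetical helper: number of display columns occupied by the first
// N bytes of Line, expanding tabs to the next multiple of TabStop.
inline Columns bytesToColumns(std::string_view Line, Bytes N, int TabStop = 8) {
  Columns C;
  for (int I = 0; I < N.V && I < static_cast<int>(Line.size()); ++I) {
    if (Line[I] == '\t')
      C.V += TabStop - (C.V % TabStop); // jump to the next tab stop
    else
      ++C.V;
  }
  return C;
}
```

With this shape, a call such as `bytesToColumns(Line, Columns{3})` is rejected by the compiler, which is the property the commit message is after.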

….py module (llvm#165535) This commit extracts some MIR-related code from `common.py` and `update_mir_test_checks.py` into a dedicated `mir.py` module to improve code organization. This is a preparation step for llvm#164965 and also moves some pieces already moved by llvm#140296 All code intentionally moved verbatim with minimal necessary adaptations: * `log()` calls converted to `print(..., file=sys.stderr)` at `mir.py` lines 62, 64 due to a `log` locality.

…. NFC

Follow on from llvm#164372 This changes the DW_AT_name for `_BitInt(N)` from `_BitInt` to `_BitInt(N)`

…vm#165527) Allow the stack move optimization (which merges two allocas) when the address of only one alloca is captured (and the provenance is not captured). Both addresses need to be captured to observe that the allocas were merged. Fixes llvm#165484.

) This documents two things: * The recommended way to go about adding a new pass. * The criteria for enabling a pass. RFC: https://discourse.llvm.org/t/rfc-guidelines-for-adding-enabling-new-passes/88290

We've upgraded to LLVM 22 now, so we can remove a bunch of TODOs.

MemoryAccess base class was included from Core.h when it was a subclass of ExecutorProcessControl, but this changed in 0faa181

Also rename map to Map, remove the m_ prefix from member variables and fix the naming of the existing color variables.

Initial parsing/sema/codegen support for the threadset clause in task and taskloop directives [Section 14.8 in the OpenMP 6.0 spec]

…ency." (llvm#165688) Reverts llvm#165496 Due to flaky failures on Arm 32-bit since this change. Detailed in llvm#165496 (comment).

Currently all `runInTerminal` tests are skipped in debug builds because, when attaching, it times out parsing the debug symbols of lldb-dap. Add this test to the skipped set since it runs in a terminal.

….reduce intrinsics. (llvm#165400) This is the first step in removing some NEON reduction intrinsics that duplicate the behaviour of their llvm.vector.reduce counterpart. NOTE: The i8/i16 variants differ in that the NEON versions return an i32 result. However, this looks to be more about making their code generation convenient, with SelectionDAG discarding the extra bits. This is only relevant for the next phase because the Clang usage always truncates the result, making llvm.vector.reduce a drop-in replacement.

…lvm#164246) This patch adds test cases that demonstrate missing dependencies in DA caused by the lack of overflow handling. These issues will be addressed by properly inserting overflow checks and bailing out when one is detected.

It covers the following dependence test functions:

- Strong SIV
- Weak-Crossing SIV
- Weak-Zero SIV
- Symbolic RDIV
- GCD MIV

It does NOT cover:

- Exact SIV
- Exact RDIV
- Banerjee MIV

Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694

@Michael137

It looks like the documentation for `llvm-cxxfilt`'s `--[no-]strip-underscore` options wasn't updated when llvm#106233 was made. CC @Michael137 (I don't have merge rights myself).

As noticed on llvm#165676 - if we're increasing the use of an operand we should freeze it

…gers (llvm#165540) This patch allows us to narrow single bit-test/twiddle operations for larger than legal scalar integers to efficiently operate just on the i32 sub-integer block actually affected. The BITOP(X,SHL(1,IDX)) patterns are split, with the IDX used to access the specific i32 block as well as the specific bit within that block. BT comparisons are relatively simple, and build on the truncated shifted loads fold from llvm#165266. BTC/BTR/BTS bit twiddling patterns need to match the entire RMW pattern to safely confirm only one block is affected, but a similar approach is taken and creates codegen that should allow us to further merge with matching BT opcodes in a future patch (see llvm#165291). The resulting codegen is notably more efficient than the heavily micro-coded memory folded variants of BT/BTC/BTR/BTS. There is still some work to improve the bit insert 'init' patterns included in bittest-big-integer.ll but I'm expecting this to be a straightforward future extension. Fixes llvm#164225
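The block-splitting idea can be sketched at the source level as follows. This is only an illustration of the arithmetic, not the DAG combine itself: the `Wide` type, the little-endian block order, and the function names are assumptions, and every function requires `Idx < 128`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 128-bit integer stored as four 32-bit blocks, least
// significant block first (an assumption made for this sketch).
using Wide = std::array<std::uint32_t, 4>;

// BT: only block Idx / 32 is read, and only bit Idx % 32 within it.
inline bool testBit(const Wide &X, unsigned Idx) {
  return (X[Idx / 32] >> (Idx % 32)) & 1u;
}

// BTS / BTR / BTC: the read-modify-write touches a single 32-bit block.
inline void setBit(Wide &X, unsigned Idx) { X[Idx / 32] |= 1u << (Idx % 32); }
inline void resetBit(Wide &X, unsigned Idx) { X[Idx / 32] &= ~(1u << (Idx % 32)); }
inline void complementBit(Wide &X, unsigned Idx) { X[Idx / 32] ^= 1u << (Idx % 32); }
```

The point of the narrowing is visible here: no operation needs the other three blocks, so a 32-bit load, a shift or mask, and at most one 32-bit store are enough.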

…4217) This reverts commit 78bf682. Original PR: llvm#157463 Revert PR: llvm#158566 The relevant buildbots have been updated to a ROCm version that does not use the macros anymore to avoid the failures. Implements SWDEV-522062.

Not sure if this warrants a PR, but I realized there was a typo in a test filename from my previous PR llvm#164387.

… AVX targets (llvm#165676) If the PTEST is just using the ZF result and one of the operands is an i32/i64 sign mask, we can use the TESTPD/PS instructions instead and avoid the use of an extra constant. Fixes some codegen identified in llvm#156233

…sm (llvm#149308) First batch of changes to add support for inline-asm callbr for the AMDGPU backend.

) A collection of small changes to get a number of lit tests working on z/OS.

We do not have native instructions for direct bfloat comparisons. However, we can expand bfloat to float, and do float comparison instead. TODO: handle bfloat comparison for ballot intrinsic on global isel path. Fixes: SWDEV-563403
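The expansion works because bfloat16 is the upper 16 bits of an IEEE-754 binary32 value, so widening is exact. A minimal host-side sketch of the same idea, with hypothetical helper names and raw `uint16_t` bit patterns standing in for a bfloat type:

```cpp
#include <cstdint>
#include <cstring>

// Widen a bfloat16 bit pattern to float by placing it in the high half
// of a 32-bit word; the low 16 mantissa bits become zero.
inline float bf16ToFloat(std::uint16_t Bits) {
  std::uint32_t Wide = static_cast<std::uint32_t>(Bits) << 16;
  float F;
  std::memcpy(&F, &Wide, sizeof(F)); // bit cast without aliasing issues
  return F;
}

// Compare two bfloat16 values by comparing their exact float expansions.
inline bool bf16LessThan(std::uint16_t A, std::uint16_t B) {
  return bf16ToFloat(A) < bf16ToFloat(B);
}
```

Because the widening is lossless, ordering, equality, and NaN behaviour of the float comparison match what a native bfloat comparison would give.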

…ansformDialectExtension.cpp (NFC)

…r AMDGPU (llvm#164358) Introduces the builtins for extended image insts for amdgcn.

This adds a few new features to hdrgen, all meant to facilitate using it with inputs and outputs that are outside the llvm-libc source tree. The new `extra_standards` field is a dictionary to augment the set of names that can be used in `standards` lists. The keys are the identifiers used in YAML ("stdc") and the values are the pretty names generated in the header comments ("Standard C"). This lets a libc project that's leveraging the llvm-libc sources along with its own code define new APIs outside the formal and de facto standards that llvm-libc draws its supported APIs from. The new `license_text` field is a list of lines of license text that replaces the standard LLVM license text used at the top of each generated header. This lets other projects use hdrgen with their own inputs to produce generated headers that are not tied to the LLVM project. Finally, for any function attributes that are not in a canonical list known to be provided by __llvm-libc-common.h, an include will be generated for "llvm-libc-macros/{attribute name}.h", expecting that file to define the "attribute" name as a macro. All this can be used immediately by builds that drive hdrgen and build libc code outside the LLVM CMake build. Future changes could add CMake plumbing to facilitate augmenting the LLVM CMake build of libc with outside sources via overlays and cache files.

…llvm#165347) This PR switches from using `llvm::sys::fs::make_absolute()` to `FileManager::makeAbsolutePath()` so that `FileSystemOptions` (i.e. the `-working-directory` option) and the `VFS`'s CWD have a say in how the prebuilt module paths are resolved. This matches how the rest of the compiler treats input files.

…witch (llvm#165724) In the previous implementation, this would fail for cases like `TypeSwitch<T*, std::optional<U>>` because `std::nullopt` does not match `ResultT` exactly and the overload for callable types would be selected. Add new overloads that support `nullptr` and `std::nullopt`. These can be added alongside generic callables because we wouldn't want to call any 'null' function refs anyway. I selected the `nullptr` and `nullopt` specializations because of how often they appear in the codebase -- currently, you will see lots of code like `.Default(std::optional<T>())` that can be simplified with this patch.

…ency." (llvm#165688)" This reverts commit f205be0. This new select mechanism has exposed the fact that the resources the Arm Linux bot has can vary a lot. We do limit it to a low number of parallel tests but in this case, I think it's write performance somewhere. Reland the changes since they work elsewhere, and disable lldb-dap tests on Arm Linux while I fix our buildbot.

Fixes 17dbd86.

…able (llvm#164936) llvm#163091: Remove the unistd.faccessat entrypoint for x86 Linux if the faccessat2 syscall is not available. Tested with a nonexistent symbol; the exclusion works.

This patch fixes a bug in strftime's return value when the formatted output exactly fills the buffer, not including the null terminator. The previous check failed to account for the null terminator in this case, incorrectly returning the written count instead of 0.
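The contract in question comes from the C standard: `strftime` returns the number of characters written, excluding the terminating null, only if the result including the null fits in the buffer, and returns 0 otherwise. The sketch below exercises that boundary through the host C library rather than llvm-libc; the helper name `formatYear` is hypothetical.

```cpp
#include <cstddef>
#include <ctime>

// Formats a fixed year with "%Y" (four characters, "2025") into a buffer
// of capacity Cap and returns strftime's result unchanged.
inline std::size_t formatYear(char *Buf, std::size_t Cap) {
  std::tm T{};
  T.tm_year = 2025 - 1900; // tm_year counts years since 1900
  T.tm_mday = 1;
  return std::strftime(Buf, Cap, "%Y", &T);
}
```

With a 4-byte buffer the four digits would fit but the null terminator would not, so the call must report failure by returning 0; with 5 bytes it returns 4.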

…lvm#164035) In the case of a partial unswitch, we take the invariant part of an expression consisting of either conjunctions or disjunctions, and hoist it out of the loop, conditioning a branch on it (==the invariant part). We can't correctly calculate the branch probability of this new branch, but can use the probability of the existing branch as a bound. That would preserve block frequencies better than allowing for the default, static (50-50) probability for that branch. Issue llvm#147390

…nc ups.

This fails on macOS because setting it to unlimited there just sets the limit to the max value, which causes differences that show up in the check lines.

…lvm#165057) The C standard behavior of `assert` cannot be accomplished with clang modules, either as a normal modular header, or a textual header. As a normal modular header: `#define NDEBUG` `#include <assert.h>` This pattern doesn't work, NDEBUG has to be passed on the command line to take effect, and then will affect all `assert`s in the includer. As a textual header: `#define NDEBUG` `#include <modular_header_that_has_an_assert.h>` This pattern doesn't work for similar reasons, modular_header_that_has_an_assert.h captured the value of NDEBUG when its module built and won't pick it up from the includer. -DNDEBUG can be passed when building the module, but will similarly affect the entire module. This has the additional problem that every module will contain a declaration for `assert`, which can possibly conflict with each other if they use different values of NDEBUG. So really <assert.h> just doesn't work properly with clang modules. Avoid the issue by not mentioning it in the Modules documentation, and use "X macros" as the example for textual headers. Don't use [extern_c] in the example modules, that should very rarely be used. Don't put multiple `header` declarations in a submodule, that has the confusing effect of "fusing" the headers. e.g. <sys/errno.h> does not include <errno.h>, but if it's in the same submodule, then an `#include <sys/errno.h>` will mysteriously also include <errno.h>.

This enables the use of readfile substitutions for populating environment variables. This is necessary in some compiler-rt tests. Reviewers: pawosm-arm Reviewed By: pawosm-arm Pull Request: llvm#165140

This section type is about to be used by llvm#147424 so let's give it a more generic name. Reviewers: smithp35, MaskRay Reviewed By: MaskRay Pull Request: llvm#155540

) VPWidenCanonicalIV and VPBlend recipes are created by VPPredicator, and VPCanonicalIVPHI and VPInstruction recipes are created by VPlanConstruction. WidenPHIs are never created.

Fixes 17dbd86 (again)

…lvm#165745) This makes the sorting behavior more uniform: functions and macros are always sorted (separately), not only when merging. This changes the sort order used for functions and other things sorted by their symbol names. Symbols are sorted alphabetically without regard to leading underscores, and then for identifiers that differ only in the number of leading underscores, the fewer underscores the earlier in the sort order. For the functions declared in a generated header, adjacent names with and without underscores will be grouped together without blank lines. This is implemented by factoring the name field, equality, and sorting support out of the various entity classes into a new common superclass (hdrgen.Symbol). This uncovered YAML's requirement to quote the string "NULL" to avoid pyyaml parsing it as None (equivalent to Javascript null) rather than a string.
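The ordering rule can be sketched as a two-part key: the name with leading underscores stripped, then the underscore count as a tie-breaker. hdrgen itself is written in Python, so the C++ below is only an illustration; the function names are hypothetical and plain byte-wise comparison is assumed for "alphabetically".

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <vector>

// Number of leading '_' characters in Name.
inline std::size_t leadingUnderscores(std::string_view Name) {
  std::size_t N = Name.find_first_not_of('_');
  return N == std::string_view::npos ? Name.size() : N;
}

// Order by the name without leading underscores first; among names that
// differ only in underscore count, fewer underscores sort earlier.
inline bool symbolLess(std::string_view A, std::string_view B) {
  std::size_t NA = leadingUnderscores(A), NB = leadingUnderscores(B);
  std::string_view SA = A.substr(NA), SB = B.substr(NB);
  return std::tie(SA, NA) < std::tie(SB, NB);
}

inline std::vector<std::string> sortSymbols(std::vector<std::string> Names) {
  std::sort(Names.begin(), Names.end(),
            [](const std::string &A, const std::string &B) {
              return symbolLess(A, B);
            });
  return Names;
}
```

Under this key, `abort`, `_abort`, and `__abort` end up adjacent and in that order, which is what allows a generated header to group them without blank lines.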

Some minor adjustments around environment variables to make a handful of tests work with the internal shell that did not before. Reviewers: fmayer, alexander-shaposhnikov Reviewed By: fmayer, alexander-shaposhnikov Pull Request: llvm#165141

…lvm#165651) The boolean expression to determine if more bytes are needed for a signed LEB128 value is quite complex: !((((Value == 0 ) && ((Byte & 0x40) == 0)) || ((Value == -1) && ((Byte & 0x40) != 0)))) This patch simplifies it to an equivalent expression using a ternary operator, which is much easier to understand.
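The exact replacement expression is not quoted above, so the ternary below is one equivalent form written for illustration, not necessarily the text of the patch. The idea is that the remaining value must equal the sign extension of bit 6 of the byte just emitted: -1 if that bit is set, 0 otherwise. The encoder is a minimal sketch with hypothetical function names.

```cpp
#include <cstdint>
#include <vector>

// The original condition, as quoted in the commit message.
inline bool moreOriginal(std::int64_t Value, std::uint8_t Byte) {
  return !(((Value == 0) && ((Byte & 0x40) == 0)) ||
           ((Value == -1) && ((Byte & 0x40) != 0)));
}

// An equivalent ternary form (assumed shape): more bytes are needed
// unless Value is the sign extension of the byte's bit 6.
inline bool moreTernary(std::int64_t Value, std::uint8_t Byte) {
  return Value != ((Byte & 0x40) ? -1 : 0);
}

// Minimal signed LEB128 encoder built on the ternary form.
inline std::vector<std::uint8_t> encodeSLEB128(std::int64_t Value) {
  std::vector<std::uint8_t> Out;
  bool More;
  do {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7; // arithmetic shift assumed (guaranteed since C++20)
    More = moreTernary(Value, Byte);
    if (More)
      Byte |= 0x80; // continuation bit
    Out.push_back(Byte);
  } while (More);
  return Out;
}
```

The two predicates agree for every input: when bit 6 is set both reduce to `Value != -1`, and when it is clear both reduce to `Value != 0`.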

Identified with modernize-use-default-member-init.

z1-cciauto · 2025-10-30T19:07:30Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2566

joker-eph and others added 30 commits October 29, 2025 23:48

[MLIR] Apply clang-tidy fixes for llvm-qualified-auto in Vectorizatio…

e0d9c9c

…n.cpp (NFC)

[AMDGPU] Enable "amdgpu-uniform-intrinsic-combine" pass in pipeline. (l…

4d7093b

…lvm#162819) This PR enables AMDGPUUniformIntrinsicCombine pass in the llc pipeline. Also introduces the "amdgpu-uniform-intrinsic-combine" command-line flag to enable/disable the pass. see the PR:llvm#116953

[clang] Update C++ DR status page

e760542

[AArch64][GlobalISel] Add some GISel test coverage for icmp-and tests…

31890c5

…. NFC

[DebugInfo] Add bit size to _BitInt name in debug info (llvm#165583)

30579c0

Follow on from llvm#164372 This changes the DW_AT_name for `_BitInt(N)` from `_BitInt` to `_BitInt(N)`

[DeveloperPolicy] Add guidelines for adding/enabling passes (llvm#158591

43ea75d

) This documents two things: * The recommended way to go about adding a new pass. * The criteria for enabling a pass. RFC: https://discourse.llvm.org/t/rfc-guidelines-for-adding-enabling-new-passes/88290

[libc++] Fix LLVM 22 TODOs (llvm#153367)

bb1158f

We've upgraded to LLVM 22 now, so we can remove a bunch of TODOs.

[GVN] Add tests for pointer replacement with different addr size (NFC)

689e95c

[AMDGPU] insert eof white space (llvm#165673)

eccbfde

[ORC] Fix missing include for MemoryAccess interface (NFC) (llvm#165576)

932fa0e

MemoryAccess base class was included from Core.h when it was a subclass of ExecutorProcessControl, but this changed in 0faa181

[clang][NFC] Make ellipse strings constexpr (llvm#165680)

96feee4

Also rename map to Map, remove the m_ prefix from member variables and fix the naming of the existing color variables.

[clang][OpenMP] New OpenMP 6.0 threadset clause (llvm#135807)

25ece5b

Initial parsing/sema/codegen support for the threadset clause in task and taskloop directives [Section 14.8 in the OpenMP 6.0 spec]

Revert "[lldb-dap] Improving consistency of tests by removing concurr…

f205be0

…ency." (llvm#165688) Reverts llvm#165496 Due to flaky failures on Arm 32-bit since this change. Detailed in llvm#165496 (comment).

[lldb-dap][test] skip io_redirection in debug builds (llvm#165593)

838f643

Currently all `runInTerminal` tests are skipped in debug builds because, when attaching, it times out parsing the debug symbols of lldb-dap. Add this test to the skipped set since it runs in a terminal.

[LoongArch][NFC] Pre-commit tests for vector type average (llvm#161076)

84fc780

[X86] Add ldexp test coverage for avx512 targets (llvm#165698)

3b30010

Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694

[llvm-cxxfilt] update docs to reflect llvm#106233 (llvm#165709)

a8656c5

It looks like the documentation for `llvm-cxxfilt`'s `--[no-]strip-underscore` options wasn't updated when llvm#106233 was made. CC @Michael137 (I don't have merge rights myself).

[X86] combinePTESTCC - ensure repeated operands are frozen (llvm#165697)

8c8bead

As noticed on llvm#165676 - if we're increasing the use of an operand we should freeze it

[CIR] Upstream handling for __builtin_prefetch (Typo Fix) (llvm#165209)

5c5cef3

Not sure if this warrants a PR, but I realized there was a typo in a test filename from my previous PR llvm#164387.

[AMDGPU][FixIrreducible][UnifyLoopExits] Support callbr with inline-a…

8954011

…sm (llvm#149308) First batch of changes to add support for inline-asm callbr for the AMDGPU backend.

bunch of small changes to fix a number of LIT tests on z/OS (llvm#165567

6106b94

) A collection of small changes to get a number of lit tests working on z/OS.

changpeng and others added 25 commits October 30, 2025 09:44

[MLIR] Apply clang-tidy fixes for bugprone-argument-comment in TestTr…

9cf3e8a

…ansformDialectExtension.cpp (NFC)

[AMDGPU][Clang] Support for type inferring extended image builtins fo…

24c75a2

…r AMDGPU (llvm#164358) Introduces the builtins for extended image insts for amdgcn.

[lldb][test] Fix typo in Arm Linux lldb-dap skip

eec44c0

Fixes 17dbd86.

[libc] Remove faccessat entrypoint if faccessat2 syscall is not avail…

b1acd6d

…able (llvm#164936) llvm#163091: Remove the unistd.faccessat entrypoint for x86 Linux if the faccessat2 syscall is not available. Tested with a nonexistent symbol; the exclusion works.

Move GlobalISel sync up meeting information from "past" to current sy…

8d9cd5b

…nc ups.

[lit] Move ulimit_unlimited.txt test to non Darwin tests

160058f

This fails on macOS because setting it to unlimited there just sets the limit to the max value, which causes differences that show up in the check lines.

[RISCV] Adjust stackmaps test to provide coverage for non-64 bit values

b73951f

[lit] Expand late substitutions before running builtins

28e98b8

This enables the use of readfile substitutions for populating environment variables. This is necessary in some compiler-rt tests. Reviewers: pawosm-arm Reviewed By: pawosm-arm Pull Request: llvm#165140

ELF: Rename RandomizePaddingSection to PaddingSection.

87673d3

This section type is about to be used by llvm#147424 so let's give it a more generic name. Reviewers: smithp35, MaskRay Reviewed By: MaskRay Pull Request: llvm#155540

[LV] Strengthen assert: VPlan0 doesn't have WidenPHIs (NFC) (llvm#165715

01fbbda

) VPWidenCanonicalIV and VPBlend recipes are created by VPPredicator, and VPCanonicalIVPHI and VPInstruction recipes are created by VPlanConstruction. WidenPHIs are never created.

[lldb][test] Fix typo in lldb-dap skip for Arm 32-bit

25afea7

Fixes 17dbd86 (again)

[ASan] Make tests work with internal shell

546e91b

Some minor adjustments around environment variables to make a handful of tests work with the internal shell that did not before. Reviewers: fmayer, alexander-shaposhnikov Reviewed By: fmayer, alexander-shaposhnikov Pull Request: llvm#165141

[Hexagon] Use default member initializations (NFC) (llvm#165653)

a1db777

Identified with modernize-use-default-member-init.

[llvm] Proofread HowToCrossCompileBuiltinsOnArm.rst (llvm#165655)

2504f5f

merge main into amd-staging

5723b8a

z1-cciauto requested review from Groverkss and nicolasvasilache as code owners October 30, 2025 19:06

z1-cciauto requested a review from a team October 30, 2025 19:06

ronlieb closed this Oct 30, 2025

58 participants
