Conversation

@vpietila-amd (Contributor)

Proposed changes

Added concepts and builder functionality for building forward convolutions from the CK library. A limitation of the current implementation is that the convolution specialization is hard-coded to DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3; the implementation will be generalized to other specializations in a later PR. Added unit tests to verify that the builder builds valid instances. Most of the parameter checking is done at compile time.

The implementation in this PR builds upon the prototype from this branch: https://github.com/ROCm/composable_kernel/tree/jshumway/convolution-builder
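
For orientation, here is a minimal usage sketch, assuming a compile-time signature struct whose members mirror the ConvSignatureDescriptor concept quoted later in this review. The enumerator spellings (ConvDirection::FORWARD, the 2D layout value) and the builder entry point build_conv_fwd are illustrative assumptions, not the merged API.

```cpp
// Illustrative sketch only -- member names follow the ConvSignatureDescriptor
// concept discussed in the review below; enumerator spellings are assumptions.
struct ConvSignature
{
    static constexpr unsigned int spatial_dim = 2;
    static constexpr auto direction = ConvDirection::FORWARD;               // assumed enumerator name
    static constexpr auto layout    = GroupConvLayout2D::GNHWC_GKYXC_GNHWK; // a 2D layout named in the commits
    static constexpr auto data_type = DataType::FP16;
};

// The builder turns such a compile-time description into a concrete
// DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3 instance, e.g. (hypothetical call):
// auto instance = builder::build_conv_fwd<ConvSignature>();
```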

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added the test to the REGRESSION_TESTS list defined at the top of tests/CMakeLists.txt, if the test takes more than 30 seconds to run.
  • I have added inline documentation which enables the maintainers to understand the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Discussion

shumway and others added 24 commits October 17, 2025 03:40
- Add experimental/builder directory with README documentation.
- Create initial test infrastructure with CMakeLists.txt and placeholder test.
- Update root CMakeLists.txt to support CK_EXPERIMENTAL_BUILDER option.
- Update .gitignore to not treat `experimental/builder` as a CMake build directory.

This establishes the directory structure for a high-level builder pattern that will provide a semantically clear interface for constructing CK operations, with an initial focus on convolution kernels for MIOpen integration.
- Add experimental/builder CMakeLists.txt with proper subdirectory structure
- Add placeholder include/ck_tile/builder CMakeLists.txt for header installation
- Fix gtest.cmake to use include_guard to prevent multiple inclusions
- Update root CMakeLists.txt to include full builder directory instead of just tests
case DataType::FP32: return "FP32";
case DataType::BF16: return "BF16";
case DataType::FP8: return "FP8";
case DataType::I8: return "I8";

Collaborator:
Is "S8" more common (signed eight bit integer)?

Contributor Author:
Changed to S8, although we are not yet using the type.

{
switch(layout)
{
case GroupConvLayout::CHANNELS_FIRST: return "Channels-first (NCHW)";

Collaborator:
I've been trying to figure this out. I really like channels-first and channels-last for convolutions, but it looks like grouped convolutions turn this into an alphabet soup. I think the filter in grouped convolutions can still be channels-first or channels-last, but the image and feature-map (input and output) tensors appear to have a lot of different conventions for where the group index fits in the layout. This is OK for now, but we probably want to think about how best to describe these layouts.

Contributor Author:
I revised the implementation and introduced separate conv layout enums GroupConvLayout1D, GroupConvLayout2D, and GroupConvLayout3D. They explicitly indicate the layout of the input, filter, and output tensors. I'm wondering if they should be merged into a single enum, since they carry somewhat redundant dimension information. What do you think?
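
For illustration, a rough sketch of the split, using only the layout names that appear in the commit messages elsewhere in this PR; the actual enumerator lists are likely longer.

```cpp
// Sketch only -- each enumerator spells out the input / filter / output
// tensor layouts, so the dimensionality shows up in both the enum name
// and its values.
enum class GroupConvLayout2D
{
    GNHWC_GKYXC_GNHWK,  // channels-last, group index leading
    NGCHW_GKYXC_NGKHW,  // channels-first
    NGCHW_GKCYX_NGKHW,
};

enum class GroupConvLayout3D
{
    GNDHWC_GKZYXC_GNDHWK,
    NDHWGC_GKZYXC_NDHWGK, // channels-last
};
```

Merging them into a single enum would remove the repeated dimensionality, but keeping them separate lets a mismatch between spatial_dim and the layout be rejected at compile time.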

case ConvFwdSpecialization::FILTER_3x3:
return ck::tensor_operation::device::ConvolutionForwardSpecialization::Filter3x3;
case ConvFwdSpecialization::DEFAULT:
default: return ck::tensor_operation::device::ConvolutionForwardSpecialization::Default;

Collaborator:
Robin suggested removing the "default:", making the function consteval, and throwing an error string for unexpected values. That way we get a compile-time error if we have an unsupported input value. In principle the compiler will also error if we omit "default:" and miss a value, but the consteval + throw pattern is robust if the function is only used at compile time. The downside is that the function can only be used at compile time, but that may be correct for this code.
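
For reference, the pattern would look roughly like this; the sketch is built from the switch quoted above, the function name is illustrative, and the remaining enumerators are elided.

```cpp
// consteval forces compile-time evaluation; reaching the throw is not a valid
// constant expression, so an unhandled enumerator becomes a compile-time error.
consteval auto to_ck_specialization(ConvFwdSpecialization specialization)
{
    switch(specialization)
    {
    case ConvFwdSpecialization::FILTER_3x3:
        return ck::tensor_operation::device::ConvolutionForwardSpecialization::Filter3x3;
    case ConvFwdSpecialization::DEFAULT:
        return ck::tensor_operation::device::ConvolutionForwardSpecialization::Default;
    // ... remaining enumerators ...
    }
    throw "unsupported ConvFwdSpecialization"; // only reachable for unhandled values
}
```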

Contributor Author:
Should we make all factory-internal functions consteval rather than constexpr? At least the ones that use a switch?

vpietila-amd and others added 3 commits October 22, 2025 10:11
  - Add missing 2D layouts: GNHWC_GKYXC_GNHWK, NGCHW_GKCYX_NGKHW
  - Add missing 3D layout: GNDHWC_GKZYXC_GNDHWK
  - Add 1D layouts (NWGC, NGCW, GNWC, NGCW_GKCX) for future support
  - Add 3 tests for new 2D/3D layouts
  - All tests pass (5/5)
  - Add test for 2D NGCHW_GKYXC_NGKHW (channels-first) with Filter1x1Stride1Pad0
  - Add test for 3D NDHWGC_GKZYXC_NDHWGK (channels-last)
  - All 7 tests pass (complete coverage for all 2D/3D forward layouts)
spolifroni-amd previously approved these changes Oct 23, 2025
{ t.src_vector_dim } -> std::convertible_to<size_t>;
{ t.src_scalar_per_vector } -> std::convertible_to<size_t>;
{ t.dest_scalar_per_vector_k1 } -> std::convertible_to<size_t>;
{ t.add_extra } -> std::convertible_to<bool>;

Collaborator:
This should be named lds_padding.

@bartekxk (Contributor) left a comment:
Looks very nice

concept InputVectorTransferDescriptor = requires(T t) {
{ t.src_vector_dim } -> std::convertible_to<size_t>;
{ t.src_scalar_per_vector } -> std::convertible_to<size_t>;
{ t.dest_scalar_per_vector_k1 } -> std::convertible_to<size_t>;

Contributor:
Maybe lds_dst_scalar_per_vector is a better name.

Collaborator:
Indeed more descriptive.

Contributor Author:
Renamed as suggested.

template <typename T>
concept InputVectorTransferDescriptor = requires(T t) {
{ t.src_vector_dim } -> std::convertible_to<size_t>;
{ t.src_scalar_per_vector } -> std::convertible_to<size_t>;

Contributor:
Can we add a boolean IsDirectLoad? It will be introduced in #3082.

Contributor Author:
Added a requirement for the is_direct_load member variable.
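
Putting the feedback from this thread together, the concept would look roughly like this; whether lds_padding belongs to this same descriptor is an assumption based on the snippet quoted earlier.

```cpp
#include <concepts>
#include <cstddef>

template <typename T>
concept InputVectorTransferDescriptor = requires(T t) {
    { t.src_vector_dim }            -> std::convertible_to<std::size_t>;
    { t.src_scalar_per_vector }     -> std::convertible_to<std::size_t>;
    { t.lds_dst_scalar_per_vector } -> std::convertible_to<std::size_t>; // was dest_scalar_per_vector_k1
    { t.lds_padding }               -> std::convertible_to<bool>;        // was add_extra (assumed same descriptor)
    { t.is_direct_load }            -> std::convertible_to<bool>;        // new requirement from this thread
};
```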

{
return ck::tensor_operation::device::ConvolutionForwardSpecialization::Filter1x1Stride1Pad0;
}
else if constexpr(specialization == ConvFwdSpecialization::ODD_C)

Contributor:
We can delete this specialization since it duplicates ODD_C. These instances should be deleted in #2281.

FP16,
BF16,
FP8,
S8

Contributor:
Maybe I8 instead of S8?

Contributor Author:
I initially had I8, but @shumway was thinking that S8 is more standard for a signed 8-bit integer.

Contributor:
I think the general convention is I for signed, U for unsigned.

Here's a similar topic

enum class BlockGemmPipelineVersion
{
V1,
V3,

Contributor:
missed V2

// Fused element-wise operations.
enum class ElementwiseOperation
{
BIAS,

Contributor:
Missed BIAS_BNORM_CLAMP

Comment on lines +45 to +51
concept ConvSignatureDescriptor = requires(T t) {
{ t.spatial_dim } -> std::convertible_to<unsigned int>;
{ t.direction } -> std::convertible_to<ConvDirection>;
requires std::convertible_to<decltype(t.layout), GroupConvLayout1D> ||
std::convertible_to<decltype(t.layout), GroupConvLayout2D> ||
std::convertible_to<decltype(t.layout), GroupConvLayout3D>;
{ t.data_type } -> std::convertible_to<DataType>;

Collaborator:
What about the fused op (or actually elementwise, since it can be applied to the inputs as well)? Should we actually check it?
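
One possible way to check it, sketched below: extend the concept quoted above with a requirement on the elementwise op. The member name elementwise_op is an assumption.

```cpp
template <typename T>
concept ConvSignatureDescriptor = requires(T t) {
    { t.spatial_dim } -> std::convertible_to<unsigned int>;
    { t.direction }   -> std::convertible_to<ConvDirection>;
    requires std::convertible_to<decltype(t.layout), GroupConvLayout1D> ||
             std::convertible_to<decltype(t.layout), GroupConvLayout2D> ||
             std::convertible_to<decltype(t.layout), GroupConvLayout3D>;
    { t.data_type }      -> std::convertible_to<DataType>;
    { t.elementwise_op } -> std::convertible_to<ElementwiseOperation>; // assumed member name
};
```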
