[CK_TILE] Add mxfp4 flatmm #3080
Draft
DDEle wants to merge 3 commits into develop from cktile_mxfp4_flatmm
+2,954
−6
commit 3e1a851 Author: Ding, Yi <[email protected]> Date: Thu Oct 23 06:10:54 2025 +0000 Fix & clean after rebase
commit 1edf485 Author: Ding, Yi <[email protected]> Date: Wed Oct 22 10:46:13 2025 +0000 Squashed commit of the following:
commit 0b6b9db Author: mtgu0705 <[email protected]> Date: Mon Sep 22 02:04:27 2025 -0500 fix bandwidth calculation
commit 9aebf53 Author: mtgu0705 <[email protected]> Date: Mon Sep 22 00:58:59 2025 -0500 updates
commit 62607de Author: mtgu0705 <[email protected]> Date: Fri Sep 19 00:39:46 2025 -0500 fix a bug, set the A DS_read preload size to 4 for MXFP4
commit 92ad6fc Author: mtgu0705 <[email protected]> Date: Thu Sep 18 01:19:03 2025 -0500 fix a_wrap preload issue for large MPerBlock.
commit f2db447 Author: mtgu0705 <[email protected]> Date: Wed Sep 17 21:34:03 2025 -0500 optimized the VGPR repack issue for MXFP4
commit 346a400 Author: Gino Lu <[email protected]> Date: Wed Sep 17 04:19:44 2025 -0500 fix time error
commit 80c1743 Author: mtgu0705 <[email protected]> Date: Wed Sep 17 03:58:00 2025 -0500 updated, function passed.
commit ce26d90 Author: mtgu0705 <[email protected]> Date: Tue Sep 16 22:21:39 2025 -0500 fix, function partially passed
commit 0a89ed1 Author: mtgu0705 <[email protected]> Date: Tue Sep 16 03:01:12 2025 -0500 fix, reference function passed, next check kernel function
commit ec9bcef Author: Gino Lu <[email protected]> Date: Tue Sep 16 02:29:01 2025 -0500 let pack/unpack return pk_fp4_t
commit a333206 Author: mtgu0705 <[email protected]> Date: Mon Sep 15 20:50:26 2025 -0500 fix
commit 3893c06 Author: Gino Lu <[email protected]> Date: Mon Sep 15 05:51:06 2025 -0500 fix bug
commit 8052bea Author: mtgu0705 <[email protected]> Date: Mon Sep 15 04:02:05 2025 -0500 fix core dump issue, function is not correct.
commit 9ceb3fd Author: mtgu0705 <[email protected]> Date: Mon Sep 15 03:03:02 2025 -0500 updates, build pass
commit cc94eb6 Author: mtgu0705 <[email protected]> Date: Mon Sep 15 00:05:18 2025 -0500 updates
commit 22586c3 Author: Gino Lu <[email protected]> Date: Sun Sep 14 23:40:28 2025 -0500 fix bug
commit e92e67b Author: Gino Lu <[email protected]> Date: Fri Sep 12 03:28:50 2025 -0500 fix interface
commit 8b1dd60 Author: Gino Lu <[email protected]> Date: Fri Sep 12 02:53:50 2025 -0500 add interface in warp_gemm_impl
commit c6135f6 Author: mtgu0705 <[email protected]> Date: Wed Sep 10 05:03:08 2025 -0500 updates some fixes.
commit b0d71b8 Author: mtgu0705 <[email protected]> Date: Tue Sep 9 04:37:42 2025 -0500 fix after merge ginolu/add_wgmfma_dispatcher
commit f119c30 Merge: c5030e6 72c8ef8 Author: mtgu0705 <[email protected]> Date: Mon Sep 8 22:09:15 2025 -0500 Merge remote-tracking branch 'origin/ginolu/add_wgmfma_dispatcher' into mtgu/cktile_mxfp4_flatmm_dev
commit c5030e6 Author: mtgu0705 <[email protected]> Date: Mon Sep 8 21:42:47 2025 -0500 update mx flatmm tail pipeline
commit 72c8ef8 Merge: 9661bb4 e4a7728 Author: Gino Lu <[email protected]> Date: Mon Sep 8 19:10:23 2025 -0500 Merge branch 'develop' into ginolu/add_wgmfma_dispatcher
commit 9661bb4 Author: Gino Lu <[email protected]> Date: Mon Sep 8 19:09:55 2025 -0500 fix type error
commit 0509597 Author: mtgu0705 <[email protected]> Date: Mon Sep 8 04:01:40 2025 -0500 update hotloop pipeline
commit 754ae04 Merge: 15d4440 83f607e Author: Gino Lu <[email protected]> Date: Fri Sep 5 04:22:26 2025 -0500 Merge branch 'develop' into ginolu/add_wgmfma_dispatcher
commit 15d4440 Author: Gino Lu <[email protected]> Date: Fri Sep 5 04:21:26 2025 -0500 fix clang format
commit 146963d Author: mtgu0705 <[email protected]> Date: Wed Sep 3 10:00:54 2025 -0500 some updates
commit 12526b6 Merge: 47cee04 00fd72b Author: asleepzzz <[email protected]> Date: Wed Sep 3 13:22:03 2025 +0800 Merge branch 'develop' into ginolu/add_wgmfma_dispatcher
commit 47cee04 Author: Gino Lu <[email protected]> Date: Mon Sep 1 02:11:02 2025 -0500 fix vec size error
commit d289292 Author: Gino Lu <[email protected]> Date: Mon Sep 1 01:23:39 2025 -0500 fix format error
commit 16993ac Author: mtgu0705 <[email protected]> Date: Sat Aug 30 03:19:07 2025 -0500 update codes
commit 9c37e55 Author: mtgu0705 <[email protected]> Date: Fri Aug 29 11:27:33 2025 -0500 init ck_tile mxfp4 flatmm
commit 5c484a5 Author: Feng Shijie <[email protected]> Date: Thu Aug 28 08:02:50 2025 +0000 Add bias for f16xf4 moe_flatmm
commit dd6539f Author: Feng Shijie <[email protected]> Date: Wed Aug 27 13:39:47 2025 +0000 update case construction
commit 65b7024 Author: Feng Shijie <[email protected]> Date: Tue Aug 26 12:32:29 2025 +0000 support swiglu activaion and use rcpf to accelerate silu
commit b422e41 Author: Gino Lu <[email protected]> Date: Tue Aug 26 02:33:55 2025 -0500 first commit
commit d05eed9 Author: root <[email protected]> Date: Fri Aug 22 04:01:59 2025 -0500 add line to last
commit d69cab7 Author: root <[email protected]> Date: Fri Aug 22 03:20:46 2025 -0500 adjust A_LDS descriptor to avoid bankconflict
commit 65989e9 Author: root <[email protected]> Date: Thu Aug 21 09:46:52 2025 -0500 enable hotloop
commit c378e9b Author: Feng Shijie <[email protected]> Date: Thu Aug 21 09:12:21 2025 +0000 support atomic_pk_add_bf16 on gfx950
commit 85976b0 Author: Feng Shijie <[email protected]> Date: Thu Aug 21 06:58:55 2025 +0000 use int64_t as expert stride to avoid overflow
commit 9fbcc8f Author: Feng Shijie <[email protected]> Date: Wed Aug 20 13:53:32 2025 +0000 use v4i32 as the storage type for B to avoid repack operation
commit 81899bd Author: Feng Shijie <[email protected]> Date: Wed Aug 20 06:40:03 2025 +0000 add pk_fp4_t and e8m0_t support for amd_buffer_load_impl
commit c27eb07 Author: Feng Shijie <[email protected]> Date: Wed Aug 20 04:39:14 2025 +0000 optimize cvt_pkf4_to_f16 implementation
commit 3ca0bd5 Author: Feng Shijie <[email protected]> Date: Tue Aug 19 14:56:46 2025 +0000 optimize A_LDS descriptor to avoid bankconflict
commit f7f0306 Author: Feng Shijie <[email protected]> Date: Mon Aug 18 18:43:37 2025 +0000 fix gate-up when GU_NRepeat > 1
commit be55c0f Author: Feng Shijie <[email protected]> Date: Mon Aug 18 17:28:11 2025 +0000 add fp16xf4 moe
commit 599e1f5 Author: Feng Shijie <[email protected]> Date: Sun Aug 17 17:51:18 2025 +0000 rename example
commit 7899fb4 Author: Feng Shijie <[email protected]> Date: Fri Aug 15 06:20:46 2025 +0000 remove additional check when e8m0->float
commit 714b341 Author: Feng Shijie <[email protected]> Date: Thu Aug 14 09:34:12 2025 +0000 eliminate repeat dequant
commit 53e8c0c Merge: 5de6208 cc9c7b9 Author: Feng Shijie <[email protected]> Date: Wed Aug 13 16:51:49 2025 +0000 Merge remote-tracking branch 'origin/moe_flatmm' into feat-mixed_input_flatmm
commit 5de6208 Author: Feng Shijie <[email protected]> Date: Wed Aug 13 16:16:48 2025 +0000 update f16xMXF4
commit 732ebde Author: Feng Shijie <[email protected]> Date: Wed Aug 13 10:48:53 2025 +0000 update scale-preshuffle for MXF4
commit edb58d0 Author: Feng Shijie <[email protected]> Date: Mon Aug 11 11:24:34 2025 +0000 update
commit cc9c7b9 Author: Feng Shijie <[email protected]> Date: Mon Aug 11 08:38:23 2025 +0000 optimize gemm2 atomic_add pattern
commit 200a11a Author: Feng Shijie <[email protected]> Date: Mon Aug 11 07:59:47 2025 +0000 update scale for mxfp4
commit 87aed56 Author: Feng Shijie <[email protected]> Date: Mon Aug 11 07:56:14 2025 +0000 update case construction
commit 8b85fa6 Author: Feng Shijie <[email protected]> Date: Mon Aug 11 06:03:06 2025 +0000 update granularity control
commit 1b8c709 Author: Feng Shijie <[email protected]> Date: Mon Aug 11 03:42:46 2025 +0000 fix TileConfig
commit 8ba1c70 Author: Gino Lu <[email protected]> Date: Thu Aug 7 21:37:28 2025 +0800 Add e8m0 scaled convert into CK_TILE (#2617)
  * first commit
  * remove redundent code
  * modify according to comments.
  * fix type_convert error with scaled_type_convert
commit f788d3d Author: Feng Shijie <[email protected]> Date: Fri Aug 8 20:19:16 2025 +0000 add mixed_prec fp16xfp4
commit 3dea10a Author: Feng Shijie <[email protected]> Date: Thu Aug 7 09:22:04 2025 +0000 debug mixed_prec flatmm
commit 0ba513b Merge: 90e910f c0cb4d0 Author: lalala-sh <[email protected]> Date: Wed Aug 6 16:49:47 2025 +0800 Merge pull request #2626 from ROCm/felix/flatmm_fix_splitk fix split k
commit 6d3cbc7 Author: Feng Shijie <[email protected]> Date: Wed Aug 6 08:33:33 2025 +0000 add moe_flatmm
commit c0cb4d0 Author: coderfeli <[email protected]> Date: Wed Aug 6 02:45:31 2025 +0000 fix split k
commit 90e910f Author: Feng Shijie <[email protected]> Date: Mon Aug 4 07:16:36 2025 +0000 fix flatmm with scaling when WarpTileM == 32
commit aa5e008 Author: Feng Shijie <[email protected]> Date: Fri Aug 1 11:01:23 2025 +0000 optimize scaling epilogue
commit ac5908c Author: Feng Shijie <[email protected]> Date: Fri Aug 1 07:28:38 2025 +0000 fix wrong config for fp8 scaling
commit 3f43b84 Author: Feng Shijie <[email protected]> Date: Wed Jul 30 06:20:30 2025 +0000 prune debug message
commit 2e5d4c7 Author: Feng Shijie <[email protected]> Date: Wed Jul 30 04:52:08 2025 +0000 fix compile error
commit c117a19 Author: Feng Shijie <[email protected]> Date: Tue Jul 29 15:42:58 2025 +0000 Add persistent option on flatmm for tuning
commit a587701 Author: AMD-dteng <[email protected]> Date: Tue Jul 29 22:48:00 2025 +0800 update pipeline v1: add atomic IGLP schedule
commit f9e4814 Author: lalala-sh <[email protected]> Date: Thu Jul 24 09:09:27 2025 +0000 fix error log throwing
commit 1b6d7cf Author: Feng Shijie <[email protected]> Date: Mon Jul 28 08:24:51 2025 +0000 crz idea
commit 5473f06 Author: Feng Shijie <[email protected]> Date: Sun Jul 27 11:57:38 2025 +0000 Add permuteN optimzization when NRepeat % 2 == 0 on flatmm
commit bfb9f40 Author: sjfeng <[email protected]> Date: Sun Jul 27 17:24:08 2025 +0800 try to remove c_shuffle_lds
commit 1264f4d Author: Feng Shijie <[email protected]> Date: Fri Jul 25 07:41:48 2025 +0000 fix loop-dim mismatch and improve c_shuffle alu parallelism
commit 1239d8a Merge: 4066454 b908f5e Author: lalala-sh <[email protected]> Date: Thu Jul 24 08:46:51 2025 +0000 merge flatmm -scale
commit 4066454 Author: lalala-sh <[email protected]> Date: Thu Jul 24 16:19:58 2025 +0800 revert delete of inc file
commit 6839098 Author: solin <[email protected]> Date: Thu Jul 24 04:38:16 2025 +0000 reorg flatmm code
commit b908f5e Author: Feng Shijie <[email protected]> Date: Wed Jul 23 19:12:31 2025 +0000 fix flatmm syntax error on gfx950
commit 5a1183e Author: Feng Shijie <[email protected]> Date: Wed Jul 23 19:04:22 2025 +0000 support flatmm scaling
commit 89fa639 Author: valarLip <[email protected]> Date: Wed Jul 23 08:44:12 2025 +0000 merge flatmm pipe v0 from dteng_flatmm_opt
commit 3f7d848 Author: lalala-sh <[email protected]> Date: Wed Jul 23 15:38:12 2025 +0800 build pass
commit 6dacf83 Author: lalala-sh <[email protected]> Date: Wed Jul 23 07:20:26 2025 +0000 fix bug
commit 7e1bd4b Author: lalala-sh <[email protected]> Date: Wed Jul 23 15:01:53 2025 +0800 sync
commit 46a538e Author: valarLip <[email protected]> Date: Tue Jul 22 08:09:35 2025 +0000 adaptive scheduler instead of Macro definition
commit 9aa3396 Author: lalala-sh <[email protected]> Date: Thu Jul 17 08:40:35 2025 +0000 fix tail handler bug
commit fb76450 Author: lalala-sh <[email protected]> Date: Wed Jul 16 10:12:19 2025 +0000 merge from dteng_flatmm_opt
---------
Co-authored-by: lalala-sh <[email protected]>
Co-authored-by: AMD-dteng <[email protected]>
Co-authored-by: solin <[email protected]>
Co-authored-by: sjfeng <[email protected]>
Co-authored-by: valarLip <[email protected]>
Co-authored-by: asleepzzz <[email protected]>
Co-authored-by: Feng Shijie <[email protected]>
Co-authored-by: coderfeli <[email protected]>
Co-authored-by: Gino Lu <[email protected]>
Co-authored-by: mtgu0705 <[email protected]>
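For readers unfamiliar with the data format behind commits such as "add pk_fp4_t and e8m0_t support" and "remove additional check when e8m0->float", here is a minimal host-side sketch of MXFP4 dequantization: two 4-bit E2M1 values packed per byte, multiplied by a shared power-of-two E8M0 scale. It is not code from this PR. The function names are hypothetical, and the nibble order (low nibble first) is an assumption that may not match the layout `pk_fp4_t` uses in CK_TILE.

```python
# Illustrative reference only; names and nibble order are assumptions,
# not the CK_TILE implementation.

# E2M1 magnitudes, indexed by the low 3 bits of a 4-bit code.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_e2m1(nibble):
    """Decode one 4-bit E2M1 code: bit 3 is the sign, bits 0-2 the magnitude."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[nibble & 0x7]


def decode_e8m0(byte):
    """Decode an E8M0 scale: an 8-bit biased exponent, value 2**(byte - 127)."""
    if byte == 0xFF:  # 0xFF encodes NaN in E8M0
        return float("nan")
    return 2.0 ** (byte - 127)


def dequantize_block(packed, scale_byte):
    """Expand packed FP4 bytes into floats, applying one shared block scale."""
    scale = decode_e8m0(scale_byte)
    values = []
    for b in packed:
        low, high = b & 0xF, (b >> 4) & 0xF  # assumed order: low nibble first
        values.append(decode_e2m1(low) * scale)
        values.append(decode_e2m1(high) * scale)
    return values
```

For example, `dequantize_block(bytes([0x21]), 128)` decodes the codes 1 and 2 (0.5 and 1.0) and applies a scale of 2. A reference like this is mainly useful for checking kernel output on small inputs.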
Proposed changes
Please describe the motivation behind the pull request, whether it enables a new feature or fixes a bug. If there are associated pull requests or issues, please link them to the pull request.
Checklist
Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
clang-format on all changed files
Discussion
If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered