Add CutlassCompiledKernel deriving CompiledKernelBase #5088
base: main
Review updated until commit fa97d4e Description
!test |
Co-authored-by: Ryan Spring <[email protected]>
…o jh/cutlass_executor.compiled_kernel
!test |
This test is currently skipped in CI because |
!test |
This refactors CompiledKernel to create a new base class, CompiledKernelBase, that mostly handles kernel naming and recording the generated CUDA code. It then introduces CutlassCompiledKernel, which actually compiles and executes our customized CUTLASS kernels. This will be used in CutlassExecutor in future PRs.

Stacked on #5087.
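
For readers unfamiliar with the split described above, here is a minimal C++ sketch of the general shape: a base class that only tracks the kernel name and the generated source, and a derived class that owns compilation and execution. This is not the code in this PR. The names KernelBaseSketch and CutlassKernelSketch and all of their members are hypothetical, and the compile step is a placeholder rather than a real CUTLASS or NVCC invocation.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a base class that only handles kernel naming
// and recording the generated code. Not the project's actual interface.
class KernelBaseSketch {
 public:
  explicit KernelBaseSketch(std::string name) : name_(std::move(name)) {}
  virtual ~KernelBaseSketch() = default;

  const std::string& kernelName() const { return name_; }
  const std::string& kernelCode() const { return code_; }
  bool isCompiled() const { return compiled_; }

  // Each backend decides how recorded code becomes something runnable.
  virtual void compile() = 0;

 protected:
  void recordCode(std::string code) { code_ = std::move(code); }
  void markCompiled() { compiled_ = true; }

 private:
  std::string name_;
  std::string code_;
  bool compiled_ = false;
};

// Hypothetical derived class that owns the compile and run steps.
class CutlassKernelSketch : public KernelBaseSketch {
 public:
  CutlassKernelSketch(std::string name, std::string code)
      : KernelBaseSketch(std::move(name)) {
    recordCode(std::move(code));
  }

  void compile() override {
    if (kernelCode().empty()) {
      throw std::runtime_error("no generated code to compile");
    }
    // Placeholder: a real backend would invoke a compiler and load the
    // resulting binary here.
    markCompiled();
  }

  void run() const {
    if (!isCompiled()) {
      throw std::runtime_error("kernel must be compiled before running");
    }
    // Placeholder: a real backend would launch the kernel here.
  }
};
```

In this sketch, a caller would construct a CutlassKernelSketch with a name and source string, call compile(), then run(). Keeping the naming and code recording in the base class means a second backend can reuse them without inheriting any compile logic.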