Collaborator

@jacobhinkle jacobhinkle commented Aug 26, 2025

The C++ test passes, but running NVFUSER_ENABLE=cutlass_scheduler pytest tests/python/test_narrow_precision.py -vs shows that test_scaled_mm fails with a link error 🤕

…tlassScheduler classes with basic infrastructure
…eduler - implementing JIT compilation framework
- Reset CutlassCompiledKernel to standalone class (not inheriting from CompiledKernel)
- Remove NVRTC compilation option, focus on nvcc system call compilation
- Implement nvcc compilation pipeline with temporary directories and dlopen
- Generate exact same kernel code as static nvfp4_scaled_mm.cu
- Add proper error handling and compilation output capture
- Update Cutlass scheduler to only accept ScaledMmaOp (already implemented)
- Add CursorPlan.md documenting the implementation approach
- Builds successfully and is ready for testing with the scaled_mm tests
- Added test_cutlass_executor.cpp with FusionExecutorCache tests
- Updated test_cutlass_scheduler.cpp to use CutlassExecutor directly
- Created cutlass_executor_example.py demonstrating nvfp4 scaled matmul
- Added test_scaled_mm.cpp for comprehensive scaled_mm testing
- Updated CMakeLists.txt to include new test files
- All tests build successfully and the CutlassExecutor integration is working
- Current issue: nvcc compilation fails because the include path for cutlass_utils.h is missing
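
For readers unfamiliar with the flow described in the commits above (write the generated kernel to a temporary directory, call nvcc as a subprocess, capture its output, then dlopen the shared library), here is a minimal Python sketch of that pattern. It is not the C++ code in this PR: the names build_command, compile_and_load, CompileError, kernel.cu and kernel.so are all hypothetical, and it assumes a POSIX system where ctypes.CDLL wraps dlopen. The include_dirs parameter marks where an include path such as the one needed for cutlass_utils.h would be passed.

```python
import ctypes
import subprocess
import tempfile
from pathlib import Path


class CompileError(RuntimeError):
    """Raised when the compiler is missing or exits non-zero (hypothetical name)."""


def build_command(compiler, src, out, include_dirs=(), extra_flags=()):
    """Assemble a shared-library compile command (nvcc-style flags)."""
    cmd = [compiler, "-shared", "-Xcompiler", "-fPIC"]
    # Each include directory becomes -I<dir>; a missing entry here is the
    # kind of problem that yields "header not found" at compile time.
    cmd += [f"-I{d}" for d in include_dirs]
    cmd += list(extra_flags)
    cmd += ["-o", str(out), str(src)]
    return cmd


def compile_and_load(source, include_dirs=(), extra_flags=(), compiler="nvcc"):
    """Write `source` to a temp dir, compile it, and load the result.

    Returns (library, tempdir). The caller must keep `tempdir` alive for as
    long as the library is in use, then call tempdir.cleanup().
    """
    tmp = tempfile.TemporaryDirectory(prefix="jit_kernel_")
    src = Path(tmp.name) / "kernel.cu"  # hypothetical file names
    out = Path(tmp.name) / "kernel.so"
    src.write_text(source)
    cmd = build_command(compiler, src, out, include_dirs, extra_flags)
    try:
        # capture_output keeps stdout/stderr so they can be reported on failure
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError as exc:
        tmp.cleanup()
        raise CompileError(f"compiler not found: {compiler}") from exc
    if proc.returncode != 0:
        tmp.cleanup()
        raise CompileError(
            f"command failed: {' '.join(cmd)}\n{proc.stdout}{proc.stderr}"
        )
    # On POSIX, ctypes.CDLL loads the shared object via dlopen
    return ctypes.CDLL(str(out)), tmp
```

With nvcc installed, a call might look like compile_and_load(kernel_source, include_dirs=[some_include_dir], extra_flags=["-std=c++17"]), where kernel_source and some_include_dir are placeholders. Returning the temporary directory alongside the handle is one way to keep the .so on disk while it is loaded; the PR may manage lifetimes differently.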