@jacobhinkle
Collaborator

No description provided.

@github-actions

Description

  • Add CuTE layout printing for scheduled fusion

  • Introduce printCute() to dump Fusion IR in CuTE format

  • Implement CuteConverter for translating IterDomains to CuTE layouts

  • Support symbolic strides and shapes via MultipliedString and IntTuple
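As a rough illustration of the output format described above, the sketch below prints a nested shape and stride in CuTE's `shape:stride` notation, with leaves that may be symbolic strings. The names `Node`, `makeLeaf`, `makeTuple`, `toString`, and `layoutString` are hypothetical stand-ins written for this example; they are not the PR's `Int`, `IntTuple`, or `MultipliedString` types, whose definitions are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a nested integer tuple with symbolic leaves.
// The PR's actual Int / IntTuple / MultipliedString types may differ.
struct Node;
using NodePtr = std::shared_ptr<Node>;
struct Node {
  // A leaf holds text such as "4" or "i2*2"; a tuple holds child nodes.
  std::variant<std::string, std::vector<NodePtr>> value;
};

NodePtr makeLeaf(std::string s) {
  return std::make_shared<Node>(Node{std::move(s)});
}

NodePtr makeTuple(std::vector<NodePtr> children) {
  return std::make_shared<Node>(Node{std::move(children)});
}

// Render a node as "x" for a leaf or "(a,b,...)" for a tuple.
std::string toString(const NodePtr& n) {
  assert(n != nullptr);
  if (const auto* s = std::get_if<std::string>(&n->value)) {
    return *s;
  }
  const auto& children = std::get<std::vector<NodePtr>>(n->value);
  std::string out = "(";
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i > 0) {
      out += ",";
    }
    out += toString(children[i]);
  }
  return out + ")";
}

// Join shape and stride in the "shape:stride" form.
std::string layoutString(const NodePtr& shape, const NodePtr& stride) {
  return toString(shape) + ":" + toString(stride);
}
```

With a shape of `(4,(2,i2))` and a stride of `(i2*2,(i2,1))`, `layoutString` yields the two renderings joined by a colon. Keeping leaves as strings is a simplification for the sketch; it sidesteps any symbolic arithmetic.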


Changes walkthrough 📝

Relevant files

Enhancement (7 files)
  • lower2device.cpp: Enable CuTE layout dumping in debug mode (+3/-0)
  • fusion.cpp: Implement printCute for CuTE layout output (+66/-0)
  • cute_translation.cpp: Implement CuTE layout conversion logic (+305/-0)
  • tensor_view.cpp: Add fullName method for TensorView (+9/-3)
  • fusion.h: Declare printCute method for Fusion (+10/-0)
  • cute_translation.h: Define CuTE layout translation structures (+240/-0)
  • interface_nodes.h: Declare fullName in TensorView (+2/-0)

Configuration changes (3 files)
  • options.cpp: Add FusionIrCute debug dump option (+1/-0)
  • options.h: Add FusionIrCute to DebugDumpOption (+1/-0)
  • CMakeLists.txt: Include cute_translation.cpp in build (+1/-0)

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Unimplemented Support

The Swizzle2D operation is not supported: the converter throws via NVF_THROW when it encounters one, so CuTE layout generation cannot complete for any fusion that contains such an operation.

} else if (auto* swizzle = dynamic_cast<Swizzle2D*>(expr)) {
  // TODO
  NVF_THROW("Swizzle2D support not yet implemented ", swizzle->toString());

Incomplete Merge Handling

The getLayout function does not check merges for contiguity and always represents them as IntTuples, so contiguous merges that could be flattened into a single shape/stride pair are left nested.

  // TODO: Check whether this is a contiguous merge. If so, we can flatten
  // it. If it is not a contiguous merge, create a new IntTuple consisting
  // of the incoming shapes/strides.
  // For now, we don't flatten any merges; we always represent them as an
  // IntTuple.
  Int out_shape = std::make_shared<IntTuple>(
      std::vector<Int>{outer_shape, inner_shape});
  Int out_stride = std::make_shared<IntTuple>(
      std::vector<Int>{outer_stride, inner_stride});

  id_size_stride.emplace(
      merge->out(), std::pair<Int, Int>{out_shape, out_stride});
} else if (auto* swizzle = dynamic_cast<Swizzle*>(expr)) {
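
The contiguity check mentioned in the TODO could look roughly like the sketch below. It assumes concrete integer extents and strides and a merged index of the form `outer_idx * inner.shape + inner_idx`; under that assumption the two modes collapse into one exactly when the outer stride equals the inner shape times the inner stride. `Mode` and `tryFlattenMerge` are hypothetical names for this illustration, not code from the PR, and the symbolic case handled by the PR's types is not covered.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper type: one mode of a layout with concrete values.
struct Mode {
  std::int64_t shape;
  std::int64_t stride;
};

// Try to flatten merge(outer, inner) into a single mode.
// Assumed indexing: merged_idx = outer_idx * inner.shape + inner_idx, so the
// memory offset is outer_idx * outer.stride + inner_idx * inner.stride.
// That equals merged_idx * inner.stride exactly when
// outer.stride == inner.shape * inner.stride.
std::optional<Mode> tryFlattenMerge(const Mode& outer, const Mode& inner) {
  assert(outer.shape > 0 && inner.shape > 0);
  if (outer.stride == inner.shape * inner.stride) {
    return Mode{outer.shape * inner.shape, inner.stride};
  }
  // Not contiguous: the caller would keep the nested (tuple) representation.
  return std::nullopt;
}
```

For example, an outer mode of shape 4 and stride 8 merged with an inner mode of shape 8 and stride 1 flattens to shape 32 and stride 1, whereas an outer stride of 16 leaves a gap and the function returns `std::nullopt`. The sketch ignores the special case of extent-1 modes, which can be merged regardless of stride.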

Unhandled Multiplication Case

The operator* overload for Int does not properly handle multiplication of two IntTuples. It falls back to a MultipliedString built by concatenating the operands' string forms, which may result in incorrect stride calculations.

// TODO: Handle multiplication of IntTuples properly
return {MultipliedString{a.toString() + "*" + b.toString()}};
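
One possible direction, offered only as a sketch, is to distribute a scalar factor over every leaf of a nested tuple instead of stringifying the whole product. The code below does this for concrete integers. `Tree`, `makeLeaf`, `makeTuple`, `scale`, and `toString` are hypothetical names for this example, and whether elementwise scaling is the semantics the PR intends for its `Int` type is an assumption. The tuple-times-tuple case is deliberately left out because the source does not define what it should mean.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical nested integer tuple with concrete leaves (not the PR's Int).
struct Tree;
using TreePtr = std::shared_ptr<Tree>;
struct Tree {
  std::variant<std::int64_t, std::vector<TreePtr>> value;
};

TreePtr makeLeaf(std::int64_t v) {
  return std::make_shared<Tree>(Tree{v});
}

TreePtr makeTuple(std::vector<TreePtr> children) {
  return std::make_shared<Tree>(Tree{std::move(children)});
}

// Multiply every leaf by `factor`, preserving the nesting structure.
TreePtr scale(const TreePtr& t, std::int64_t factor) {
  assert(t != nullptr);
  if (const auto* v = std::get_if<std::int64_t>(&t->value)) {
    return makeLeaf(*v * factor);
  }
  std::vector<TreePtr> scaled;
  for (const auto& child : std::get<std::vector<TreePtr>>(t->value)) {
    scaled.push_back(scale(child, factor));
  }
  return makeTuple(std::move(scaled));
}

// Render as "x" for a leaf or "(a,b,...)" for a tuple.
std::string toString(const TreePtr& t) {
  assert(t != nullptr);
  if (const auto* v = std::get_if<std::int64_t>(&t->value)) {
    return std::to_string(*v);
  }
  const auto& children = std::get<std::vector<TreePtr>>(t->value);
  std::string out = "(";
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i > 0) {
      out += ",";
    }
    out += toString(children[i]);
  }
  return out + ")";
}
```

Scaling the tuple `(4,(2,1))` by 3 gives `(12,(6,3))`, with the nesting preserved, so a downstream consumer still sees a structured stride rather than an opaque string.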
