Fix transfer_size calculation in pointwise scheduler #5068

tbqh · 2025-08-25T15:53:02Z

Resolves #4773 which tracks a wrong calculation in the pointwise scheduler noticed by @zasdfgbnm:

The units of cur_transfer_size and right_transfer_size were byte ^ N, where N is the number of dimensions on the left/right of the break point... Later in the code, we compare the transfer sizes with the L2 cache size. This makes no sense: a quantity with unit byte ^ N cannot be compared with a quantity with unit byte.
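
To make the unit mismatch concrete, here is a small worked equation. The symbols are assumptions introduced for illustration only: $b$ is the per-element size multiple (taken as constant across dimensions) and $n_1, \dots, n_N$ are the element counts of the $N$ dimensions on one side of the break point.

```latex
% Pre-fix loop: the size multiple b is applied once per dimension.
\[
  S_{\text{old}} = \prod_{i=1}^{N} (n_i \, b) = b^{N} \prod_{i=1}^{N} n_i
  \qquad [\text{byte}^{N}]
\]
% Intended quantity: b is applied once to the total element count.
\[
  S_{\text{new}} = b \prod_{i=1}^{N} n_i
  \qquad [\text{byte}]
\]
% Example with N = 2, (n_1, n_2) = (128, 256), b = 4:
\[
  S_{\text{old}} = 4^{2} \cdot 32768 = 524288,
  \qquad
  S_{\text{new}} = 4 \cdot 32768 = 131072
\]
```

The two values differ by a factor of $b^{N-1}$, so the overestimate grows with the number of dimensions on that side of the break point.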

Timing of the PointwiseTest suite in test_nvfuser before and after the fix:

./bin/test_nvfuser --gtest_filter=PointwiseTest.*

- Before PR: 30 tests from PointwiseTest (7252 ms total)
- After PR:  30 tests from PointwiseTest (7500 ms total)

TODO: test benchmarks

github-actions · 2025-08-25T15:53:54Z

Description

Fix incorrect transfer_size calculation unit mismatch
Remove duplicated bit_multiple multiplication in loop
Correctly compute transfer size in bytes
Align transfer size with L2 cache comparison

Changes walkthrough 📝

Relevant files

Bug fix

pointwise.cpp: `Correct transfer size computation in pointwise scheduler`
csrc/scheduler/pointwise.cpp (+4/-5)

- Initialize `cur_transfer_size_bit` and `right_transfer_size_bit` with correct bit multiples
- Remove redundant multiplication by `lhs_bit_multiple` and `rhs_bit_multiple` inside loops
- Correctly accumulate element counts across loop dimensions
- Ensure final transfer size is in bytes for proper L2 cache comparison

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests

⚡ Recommended focus areas for review

Possible Issue

The initialization of cur_transfer_size_bit and right_transfer_size_bit has been changed to use only the bit multiples, but the subsequent multiplication with elem_counts may not correctly represent the total transfer size in bits, potentially leading to incorrect transfer cost estimation.

int64_t cur_transfer_size_bit = lhs_bit_multiple;
int64_t right_transfer_size_bit = rhs_bit_multiple;

for (const auto left_i : arange(break_point_i)) {
  cur_transfer_size_bit = cur_transfer_size_bit * elem_counts[left_i];
}

for (const auto right_i : arange(break_point_i, ref_loop.size())) {
  right_transfer_size_bit =
      right_transfer_size_bit * elem_counts[right_i];
}
cur_transfer_size_bit *= right_transfer_size_bit;

tbqh · 2025-08-25T15:56:04Z

csrc/scheduler/pointwise.cpp

-        int64_t cur_transfer_size_bit = 1;
-        int64_t right_transfer_size_bit = 1;
+        int64_t cur_transfer_size_bit = lhs_bit_multiple;
+        int64_t right_transfer_size_bit = rhs_bit_multiple;


The issue prior to this change is that we were multiplying by the element bit-size inside the for loop that multiplies all dimensions. The bit-size should be included only once in the whole product. Now the for loop computes the total number of elements on each side of the break point, and the bit multiple enters through the initial value.
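
The difference between the two loop shapes can be sketched in isolation. This is a simplified illustration, not the scheduler code: the function names `transferSizeBitOld` and `transferSizeBitNew` are hypothetical, and a single `bit_multiple` argument stands in for `lhs_bit_multiple` or `rhs_bit_multiple`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the pre-fix loop: the bit multiple is applied
// once per dimension, so the result scales with bit_multiple^N.
int64_t transferSizeBitOld(
    const std::vector<int64_t>& elem_counts,
    int64_t bit_multiple) {
  int64_t size = 1;
  for (int64_t count : elem_counts) {
    size = size * count * bit_multiple;
  }
  return size;
}

// Hypothetical stand-in for the post-fix loop: the bit multiple is the
// initial value, and the loop only multiplies in element counts.
int64_t transferSizeBitNew(
    const std::vector<int64_t>& elem_counts,
    int64_t bit_multiple) {
  int64_t size = bit_multiple;
  for (int64_t count : elem_counts) {
    size *= count;
  }
  return size;
}
```

For two dimensions of 128 and 256 elements and a 32-bit multiple, the first function returns 32768 * 32 * 32 while the second returns 32768 * 32. The two agree only when there is exactly one dimension on that side of the break point.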

naoyam · 2025-08-25

The change makes sense. Let's see what the performance looks like.

zasdfgbnm · 2025-08-28T03:36:15Z

In the above code (line 372), we have

    // How much would this transfer cost if it was done as a 1-D schedule
    int64_t transfer_size_1d_bit = 1;

    for (const auto i : arange(ref_loop.size())) {
      transfer_size_1d_bit =
          transfer_size_1d_bit * elem_counts[i] * dtype_sum_bit;
    }

This number still has a unit of bit^n instead of bit.
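
One way the 1-D estimate could be brought to a unit of bit is to multiply `dtype_sum_bit` in once, after the element counts have been reduced. The sketch below is an assumption about a possible follow-up, not a change made in this PR; the free function `transferSize1dBit` is hypothetical and takes the values as plain arguments.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: total number of elements across all dimensions,
// multiplied once by the summed dtype size in bits.
int64_t transferSize1dBit(
    const std::vector<int64_t>& elem_counts,
    int64_t dtype_sum_bit) {
  const int64_t total_elems = std::accumulate(
      elem_counts.begin(),
      elem_counts.end(),
      int64_t{1},
      std::multiplies<int64_t>());
  return total_elems * dtype_sum_bit;
}
```

With this shape the result is linear in `dtype_sum_bit` regardless of how many dimensions `elem_counts` holds.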

Commit 7980964: Fix transfer_size calculation in pointwise scheduler

tbqh requested review from naoyam and zasdfgbnm August 25, 2025 15:53
