Skip to content

Conversation

@wujingyue
Copy link
Collaborator

  • Replay loop unconditionally
  • WIP
  • Revert "Replay loop unconditionally"
  • Fix
  • Fix
  • Fix
  • Nit
  • Remove a check
  • Remove a fixme
  • Fix
  • Remove ignore_reductions
  • Build replay in a separate method
  • Refactor
  • Comment
  • Review
  • Minor
  • Smoke test

@wujingyue
Copy link
Collaborator Author

!test

@github-actions
Copy link

github-actions bot commented Oct 9, 2025

Description

  • Fix contiguity replay in TransformReplay::selfReplay

  • Refactor selfReplay to use IterDomainMap consistently

  • Handle symbolic and broadcast domains in contiguity logic

  • Add test for contiguity with empty allocation domain


Changes walkthrough 📝

Relevant files
Bug fix
nodes.cpp
Improve contiguity validation error handling                         

csrc/ir/nodes.cpp

  • Replace NVF_CHECK with NVF_CHECK_EQ for clearer error
  • Improve error message formatting for contiguity validation
  • +3/-6     
    transform_replay.cpp
    Refactor and fix selfReplay contiguity logic                         

    csrc/transform_replay.cpp

  • Refactor selfReplay into lambda returning IterDomainMap
  • Properly handle contiguity for symbolic and broadcast IDs
  • Fix reduction axis handling in loop and allocation replay
  • Remove unused ignore_reductions flag
  • +131/-85
    Tests
    test_alias.cpp
    Add test for slice accumulation aliasing                                 

    tests/cpp/test_alias.cpp

  • Add AccumulateSlices test for aliasing with symbolic domains
  • Use auto for tensor variable declarations
  • Include necessary ATen ops for new test
  • +55/-25 
    test_replay.cpp
    Add test for empty allocation contiguity                                 

    tests/cpp/test_replay.cpp

  • Add ContiguityWithEmptyAllocation test
  • Test contiguity replay when allocation domain is empty
  • Use IsFalse matcher for contiguity validation
  • +16/-0   

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review

    Possible Issue

    The condition if (self->loop() != self->logical()) is only applied to the loop replay section, but the allocation replay logic that follows is always executed. This may lead to inconsistent behavior if the loop and allocation domains are expected to be handled together. The separation might unintentionally bypass important checks or transformations for allocation when the loop and logical domains match.

    if (self->loop() != self->logical()) {
      std::vector<IterDomain*> new_loop;
      for (auto* new_id : new_self->logical()) {
        if (mapped_new_ids.count(new_id) == 0) {
          NVF_THROW("");
          NVF_ERROR(
              new_id->isReduction(),
              new_id->toString(),
              " should be a reduction.");
          new_loop.push_back(new_id);
        }
      }
    
      for (IterDomain* loop_id : self->loop()) {
        IterDomain* new_loop_id = getOrDefault(replay, loop_id);
        if (new_loop_id == nullptr) {
          continue;
        }
        new_loop_id->parallelize(loop_id->getParallelType());
        new_loop.push_back(new_loop_id);
      }
    
      new_self->setLoopDomain(new_loop);
    Logic Gap

    In the allocation replay section, when self->loop() == self->logical(), the code maps contiguity from the logical domain instead of the allocation domain. However, it skips unmapped IDs using getOrDefault(replay, id), but does not validate whether all necessary IDs are properly mapped. This could result in incorrect or missing contiguity information if some IDs fail to map.

    } else {
      const std::vector<IterDomain*>& new_logical = new_self->logical();
      const auto new_rank = std::ssize(new_logical);
      std::vector<std::optional<bool>> new_contiguities(new_rank, std::nullopt);
    
      int new_pos = 0;
      for (auto [id, contiguity] : zip(self->logical(), self->contiguity())) {
        IterDomain* new_id = getOrDefault(replay, id);
        if (new_id == nullptr) {
          continue;
        }
    
        // Find the corresponding contiguity in new_logical. Mapped IterDomains
        // in self->logical() and new_logical follow the same order. So it's safe
        // to only increment `new_pos`.
        while (new_pos < new_rank && new_logical.at(new_pos) != new_id) {
          new_pos++;
        }
        NVF_ERROR_LT(
            new_pos,
            new_rank,
            "Failed to find ",
            new_id->toString(),
            " in ",
            new_logical);
        std::optional<bool>& new_contiguity = new_contiguities.at(new_pos);
    
        new_contiguity = contiguity;
        // When used during or before concretization, TransformReplay::selfReplay
        // can be applied to replay transformations from symbolic dimensions to
        // concrete dimensions, or in the reverse direction. Therefore,
        // `new_contiguity` is not always identical to `contiguity`.
        if (new_id->isBroadcast()) {
          new_contiguity = std::nullopt;
        } else if (new_id->isSymbolic()) {
          // See AliasTest.AccumulateSlices for an example. aliasOutputToInput is
          // called before concretization and tries to replay contiguity from a
          // broadcast IterDomain to a symbolic IterDomain. However, a symbolic
          // IterDomain can't have contiguity null.
          if (!new_contiguity.has_value()) {
            new_contiguity = true;
          }
        }
      }
    
      new_self->setContiguity(new_contiguities);
    }
    Redundant Check

    The use of NVF_CHECK_EQ with a custom error message duplicates functionality already provided by the macro itself. This may reduce code clarity and maintainability, as the error message is redundant and could become inconsistent with future changes to the macro's behavior.

    NVF_CHECK_EQ(
        contiguity.size(),
        allocation_domain.size(),
        "Invalid contiguity information provided, incorrect size.");

    @wujingyue wujingyue changed the base branch from main to wjy/replay October 9, 2025 04:38
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    1 participant