Conversation

@zasdfgbnm (Collaborator)

No description provided.

@github-actions bot commented Oct 18, 2025

Review updated until commit f756717

Description

  • Introduce output evaluation via ExpressionEvaluator for specific outputs

  • Centralize output size inference using inferOutputSizes

  • Add debug check for contiguity flag consistency

  • Remove duplicated ExpressionEvaluator logic from runtime


Changes walkthrough 📝

Relevant files — Enhancement

allocations.cpp (csrc/runtime/allocations.cpp): Support direct output evaluation in allocation inference

  • Add early evaluation path for outputs marked with AllocationType::Evaluate
  • Use ExpressionEvaluator with meta tensors to infer output sizes
  • Bind input tensors with preserved strides and dtype on meta device
  • Return evaluated results directly without further allocation
  • +39/-0

fusion_kernel_runtime.cpp (csrc/runtime/fusion_kernel_runtime.cpp): Simplify input preparation with centralized output inference

  • Replace inline ExpressionEvaluator logic with inferOutputSizes call
  • Introduce consistency check between scheduler type and output alias
  • Use segmented fusion to generate runtime fusion for evaluation
  • Pass contiguity update flag based on scheduler type
  • +36/-59

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Possible Issue

The handling of outputs marked with AllocationType::Evaluate relies on a dynamic cast and on direct access to output alias information. Edge cases may be mishandled: for example, when an output is not a TensorView, or when alias information is missing, evaluation could be incorrect or skipped entirely.

    // If any output is marked to be evaluated by ExpressionEvaluator,
    // evaluate outputs directly following ATen's striding rules.
    bool has_evaluate_outputs = false;
    for (Val* out : fusion->outputs()) {
      if (auto* tv = dynamic_cast<TensorView*>(out)) {
        if (fusion->getOutputAlias(tv).type == AllocationType::Evaluate) {
          has_evaluate_outputs = true;
          break;
        }
      }
    }
    
    if (has_evaluate_outputs) {
      ExpressionEvaluator eval_fusion;
      // Bind inputs; for tensors, bind meta tensors preserving size/stride and dtype.
      for (auto i : arange(fusion->inputs().size())) {
        const auto& pv = args[i];
        if (pv.is<at::Tensor>()) {
          const auto t = pv.as<at::Tensor>();
          if (t.defined()) {
            const auto meta_t = at::empty_strided(
                t.sizes(),
                t.strides(),
                c10::TensorOptions().device(c10::Device(c10::DeviceType::Meta)).dtype(t.dtype()));
            eval_fusion.bind(fusion->inputs()[i], meta_t);
          } else {
            eval_fusion.bind(fusion->inputs()[i], t);
          }
        } else {
          eval_fusion.bind(fusion->inputs()[i], pv);
        }
      }
      for (auto v : fusion->outputs()) {
        auto result = eval_fusion.evaluate(v);
        output_tensor_proxies.push(result);
      }
      return output_tensor_proxies;
    }
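One way to harden the path against the edge case flagged above is an all-or-nothing guard: when any output is marked Evaluate, require that every TensorView output is, so no output is silently missed. The sketch below uses hypothetical stand-in types, not the real nvFuser Fusion/TensorView classes, to illustrate the shape of such a guard:

```cpp
#include <vector>

// Hypothetical stand-ins for nvFuser types; the real check would use
// dynamic_cast<TensorView*> and fusion->getOutputAlias(tv).type.
enum class AllocationType { New, Evaluate };

struct OutputInfo {
  bool is_tensor_view;    // stands in for the dynamic_cast succeeding
  AllocationType alias;   // stands in for getOutputAlias(tv).type
};

// Returns true when the evaluation path should run. Sets `mixed` when some
// but not all TensorView outputs are marked Evaluate, which is the case the
// review flags as potentially mishandled.
bool shouldEvaluateOutputs(const std::vector<OutputInfo>& outputs, bool& mixed) {
  bool any = false;
  bool all = true;
  for (const OutputInfo& out : outputs) {
    if (!out.is_tensor_view) {
      continue;  // non-TensorView outputs carry no Evaluate alias
    }
    const bool ev = out.alias == AllocationType::Evaluate;
    any = any || ev;
    all = all && ev;
  }
  mixed = any && !all;  // caller can turn this into an NVF_ERROR-style check
  return any && all;
}
```

A caller could assert `!mixed` before taking the early-return branch, turning a silent miss into a loud failure.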
Consistency Check

The debug check comparing need_update_contiguity and has_evaluate_outputs assumes the two conditions always match. If other parts of the system do not keep the scheduler type and the output alias type in sync, this assertion will fail, which would point to a deeper inconsistency in the design or state management.

    // Debug check: ensure alias-based expectation matches scheduler-based need_update_contiguity
    bool has_evaluate_outputs = false;
    for (Val* out : fusion_to_run->outputs()) {
      if (auto* tv = dynamic_cast<TensorView*>(out)) {
        if (fusion_to_run->getOutputAlias(tv).type == AllocationType::Evaluate) {
          has_evaluate_outputs = true;
          break;
        }
      }
    }
    NVF_ERROR(
        need_update_contiguity == has_evaluate_outputs,
        "need_update_contiguity mismatches has_evaluate_outputs in prepareInputs");
Redundant Logic

The same pattern of checking for Evaluate outputs and setting need_update_contiguity is duplicated across multiple functions. This redundancy increases maintenance burden and risks the copies diverging if the logic is updated in one place but not the others.

    // Debug check: ensure alias-based expectation matches scheduler-based need_update_contiguity
    bool has_evaluate_outputs = false;
    for (Val* out : fusion_to_run->outputs()) {
      if (auto* tv = dynamic_cast<TensorView*>(out)) {
        if (fusion_to_run->getOutputAlias(tv).type == AllocationType::Evaluate) {
          has_evaluate_outputs = true;
          break;
        }
      }
    }
    NVF_ERROR(
        need_update_contiguity == has_evaluate_outputs,
        "need_update_contiguity mismatches has_evaluate_outputs in prepareInputs");
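The duplication could be reduced by factoring the loop into a single helper shared by both the allocation path and the debug check. A minimal sketch with stub types follows; the real version would take the actual Fusion type and live near the other allocation utilities, and none of these names are confirmed by the PR:

```cpp
#include <algorithm>
#include <vector>

// Stub types standing in for nvFuser's Fusion/TensorView; only the shape of
// the refactor is illustrated here, not the real API.
enum class AllocationType { New, Evaluate };

struct Output {
  bool is_tensor_view;    // stands in for dynamic_cast<TensorView*>
  AllocationType alias;   // stands in for getOutputAlias(tv).type
};

struct Fusion {
  std::vector<Output> outputs;
};

// Single definition of the predicate currently duplicated across functions;
// both the evaluation path and the NVF_ERROR consistency check could call it.
bool hasEvaluateOutputs(const Fusion& fusion) {
  return std::any_of(
      fusion.outputs.begin(), fusion.outputs.end(), [](const Output& o) {
        return o.is_tensor_view && o.alias == AllocationType::Evaluate;
      });
}
```

With this in place, each call site shrinks to one call, and any future change to what counts as an "evaluate output" happens in exactly one spot.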
