[Compiler Toolkit] Enable nested_compile_region on TransformerBlock #1973
Needs to be run with the fix in pytorch/pytorch#166702.
Current output: P2016557983
Observations
subgraph_0, subgraph_2, ... This is not what we want. We should see one instance of subgraph_0 and multiple invoke_subgraph nodes calling that same subgraph_0, each with different layer weights.
... subgraph_1), where subgraph_1 internally calls invoke_subgraph for the TransformerBlock. We are getting into a nested HOP/subgraph region.
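To make the expected shape concrete, here is a toy, pure-Python model of the structure described above: one shared subgraph body, invoked once per layer with that layer's weights. It is only an illustration. `ToyGraph`, `trace_layers`, and the `layers.{i}` labels are hypothetical names, and none of this is PyTorch's actual tracing or `invoke_subgraph` implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    # Hypothetical stand-in for a traced graph; not a PyTorch data structure.
    subgraphs: dict = field(default_factory=dict)  # structural key -> subgraph name
    calls: list = field(default_factory=list)      # (subgraph name, weights label)

    def invoke_subgraph(self, key, weights):
        # Register the body only the first time this structure is seen;
        # later calls with the same key reuse the existing name.
        name = self.subgraphs.setdefault(key, f"subgraph_{len(self.subgraphs)}")
        self.calls.append((name, weights))
        return name


def trace_layers(num_layers):
    # Every layer has the same structure, so each one should map to the
    # same subgraph and differ only in the weights passed to the call.
    g = ToyGraph()
    for i in range(num_layers):
        g.invoke_subgraph(key="TransformerBlock", weights=f"layers.{i}")
    return g


g = trace_layers(3)
print(sorted(set(g.subgraphs.values())))  # distinct subgraph bodies
print(g.calls)                            # one call per layer
```

In this model the desired outcome is a single entry in `subgraphs` and as many entries in `calls` as there are layers. The output reported above corresponds instead to a new subgraph name appearing for each layer.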