Fix issue 4929 #4935
!test --diff
Review updated until commit d11803f
  expr, indexed_ids, for_loops, isSharedMemoryTvForLdStMatrix(tv, expr));
  for (const auto& [indexed_id, index] : override_index) {
-   index_info.index_map.emplace(traversalGraph().toGroup(indexed_id), index);
+   index_info.index_map[traversalGraph().toGroup(indexed_id)] = index;
This is a trivial bug fix. override_index didn't actually override existing mappings because of the use of emplace, which leaves an entry untouched when the key is already present.
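To illustrate the difference, here is a minimal sketch using a plain `std::unordered_map`. The `IndexMap` alias, the string keys and values, and both helper functions are hypothetical stand-ins; the real `index_map` uses nvFuser's own key and value types, which are not shown in this diff.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for index_info.index_map. The real map is keyed by
// the result of traversalGraph().toGroup(...); strings are used here only
// to keep the sketch self-contained.
using IndexMap = std::unordered_map<std::string, std::string>;

// Old behavior: emplace inserts only if the key is absent. Returns false
// (and leaves the stored value unchanged) when the key already exists.
inline bool overrideWithEmplace(
    IndexMap& map,
    const std::string& key,
    const std::string& index) {
  return map.emplace(key, index).second;
}

// Fixed behavior: operator[] followed by assignment replaces the value
// whether or not the key was already present.
inline void overrideWithAssign(
    IndexMap& map,
    const std::string& key,
    const std::string& index) {
  map[key] = index;
}
```

With a map that already holds an entry for a key, `overrideWithEmplace` leaves the old value in place, while `overrideWithAssign` replaces it. In C++17, `insert_or_assign` would be another way to get the replacing behavior.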
!test
LGTM
68c955b revealed a bug with |
!test
Fixes #4929.
This is a follow-up bug fix (#4742 (comment)).
There are actually two bugs. The first is in index overriding: the override index failed to replace the existing index, so the generated code used the wrong index:
Wrong:
T4[((nvfuser_index_t)threadIdx.x)] = T3[0];
Correct:
T4[__to_index(T0[((nvfuser_index_t)threadIdx.x)])] = T3[0];
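As a rough host-side illustration of why the two forms differ, the sketch below simulates the threads with a loop. The function names, the `std::vector` buffers, and the assumption that every index is in bounds are all hypothetical; this is not the code nvFuser generates.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// "Wrong" form: simulated thread tid writes to out[tid], ignoring the
// index tensor. Assumes num_threads <= out_size.
inline std::vector<float> writeDirect(
    std::size_t out_size,
    std::size_t num_threads,
    float value) {
  std::vector<float> out(out_size, 0.0f);
  for (std::size_t tid = 0; tid < num_threads; ++tid) {
    out[tid] = value;
  }
  return out;
}

// "Correct" form: simulated thread tid writes to out[index[tid]], i.e. the
// destination is looked up through the index tensor (T0 in the PR text).
// Assumes every entry of index lies in [0, out_size).
inline std::vector<float> writeIndirect(
    std::size_t out_size,
    const std::vector<std::int64_t>& index,
    float value) {
  std::vector<float> out(out_size, 0.0f);
  for (std::size_t tid = 0; tid < index.size(); ++tid) {
    out[static_cast<std::size_t>(index[tid])] = value;
  }
  return out;
}
```

With an index tensor such as `{2, 0}` and an output of size 4, the direct form fills slots 0 and 1, while the indirect form fills slots 2 and 0, so the two only agree when the index tensor happens to be the identity.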
The second bug is missing RAW (read-after-write) syncs. The sync analysis needed to be extended to consider indirect indexing.