[GPU] Update sdpa opt to support unaligned head size #30099

Open
wants to merge 12 commits into
base: master

yeonbok
Contributor

@yeonbok yeonbok commented Apr 13, 2025

Details:

  • Updated sdpa_opt to support unaligned head_size
  • Previously, SDPA was decomposed for an unaligned head_size such as 72, because sdpa_opt did not support that shape
  • SDPA is no longer decomposed for unaligned head_size
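
To illustrate what "unaligned" means for a head size such as 72, here is a minimal sketch that splits a head size into full subgroup-wide blocks plus leftovers. The subgroup width of 16 and the names `ASSUMED_SUBGROUP_SIZE`, `head_split`, and `split_head_size` are assumptions made for illustration; they are not taken from this PR.

```c
#include <assert.h>

/* Assumed subgroup width; the PR text does not state the actual value. */
#define ASSUMED_SUBGROUP_SIZE 16u

/* Hypothetical helper type: how a head size divides into subgroup-wide blocks. */
typedef struct {
    unsigned full_blocks; /* blocks that can be read with one full-width read */
    unsigned leftovers;   /* trailing elements that do not fill a whole block */
} head_split;

/* Splits head_size into full blocks and a leftover tail. */
static head_split split_head_size(unsigned head_size) {
    assert(head_size > 0u);
    head_split s;
    s.full_blocks = head_size / ASSUMED_SUBGROUP_SIZE;
    s.leftovers = head_size % ASSUMED_SUBGROUP_SIZE;
    return s;
}
```

Under that assumption, `split_head_size(72)` yields 4 full blocks and 8 leftover elements, while an aligned size such as 64 yields 4 full blocks and no leftovers.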

Tickets:

@yeonbok yeonbok requested review from a team as code owners April 13, 2025 11:48
@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Apr 13, 2025
@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch from d07b14a to a54a3c3 Compare April 14, 2025 23:02
@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch from f3270db to 1eae485 Compare April 15, 2025 23:56
@yeonbok yeonbok added this to the 2025.2 milestone Apr 16, 2025
@yeonbok yeonbok changed the title [WIP] Support sdpa opt for unaligned head size [GPU] Support sdpa opt for unaligned head size Apr 16, 2025
@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch 2 times, most recently from c2107d0 to 80a5123 Compare April 16, 2025 07:59
@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch 3 times, most recently from edb323b to 48fc974 Compare April 16, 2025 08:13
@yeonbok yeonbok changed the title [GPU] Support sdpa opt for unaligned head size [GPU] Update sdpa opt to support unaligned head size Apr 16, 2025
@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch 2 times, most recently from e9d5c44 to a28bd5d Compare April 16, 2025 08:27
Comment on lines 1569 to 1571
#ifdef BEAM_TABLE_TYPE
const uint b_idx = beam_table[FUNC_CALL(get_bt_index_value)(OPTIONAL_SHAPE_INFO_TENSOR b0_idx, b1_idx, 0, 0, start_partition_idx + seq_len_leftovers_start + sglid, sgid * SUBGROUP_SIZE)];
const uint value_offset = FUNC_CALL(get_input2_index)(OPTIONAL_SHAPE_INFO_TENSOR b_idx, b1_idx, 0, 0, start_partition_idx + seq_len_leftovers_start + sglid, sgid * SUBGROUP_SIZE);
Contributor

I think we should also handle leftovers when reading the Value input, since it also uses a block read:
const INPUT2_TYPE value_packed = VALUE_BLOCK_READ(value_input, value_offset);

Contributor Author

The dummy values are only loaded and applied to the leftover part of acc_output_res, and are then discarded. But yes, I will handle them, in case of a future driver change or use of sdpa_opt on platforms beyond Xe2 (> Xe2).

Contributor Author

fixed
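
The leftover concern raised in this thread can be sketched on the host side as a guarded subgroup-wide read. This is not the code from the PR and does not show how the fix was actually implemented; it only emulates the idea that lanes past the valid range should not read memory. The subgroup width of 16 and the names `ASSUMED_SUBGROUP_SIZE` and `guarded_block_read` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed subgroup width; the PR text does not state the actual value. */
#define ASSUMED_SUBGROUP_SIZE 16

/* Emulates one subgroup-wide read: lane i reads src[offset + i].
 * Lanes at or beyond `valid` receive 0 and never touch memory,
 * so a partial final block cannot read past the end of the row. */
static void guarded_block_read(const float *src, size_t offset, size_t valid,
                               float out[ASSUMED_SUBGROUP_SIZE]) {
    assert(valid <= ASSUMED_SUBGROUP_SIZE);
    for (size_t lane = 0; lane < ASSUMED_SUBGROUP_SIZE; ++lane) {
        out[lane] = (lane < valid) ? src[offset + lane] : 0.0f;
    }
}
```

With a head size of 72 and the assumed width of 16, the last block starts at offset 64 and has 8 valid lanes; the remaining 8 lanes receive zeros instead of whatever lies beyond the row.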

@yeonbok yeonbok force-pushed the taylor_sdpa_opt_unaligned_head branch from e9790f6 to 038cb58 Compare April 17, 2025 05:43
@yeonbok yeonbok requested a review from sshlyapn April 17, 2025 06:41
4 participants