@zxd1997066 zxd1997066 commented Aug 20, 2025

This PR adds ported distributed test cases to the torch-xpu-ops CI.

  • Set ZE_AFFINITY_MASK to ensure Xelink is used.
  • Add CCL_ROOT for Xelink. This workaround can be removed after oneCCL is upgraded to 2021.16.2.
  • Increase the distributed test time limit. The test part currently takes about 1 hour once the ported cases are added.
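
For illustration only, here is a minimal Python sketch of how a test launcher might apply the three changes above: set the two environment variables and enforce a time limit on the test process. The helper names `build_dist_env` and `run_with_limit`, the mask value `"0,1,2,3"`, the oneCCL path, and the 60-second limit are all placeholders assumed for the example; they are not taken from this PR's workflow files.

```python
import os
import subprocess
import sys


def build_dist_env(base_env, affinity_mask, ccl_root):
    """Return a copy of base_env with ZE_AFFINITY_MASK and CCL_ROOT set.

    Both values are supplied by the caller; this PR does not state them.
    """
    env = dict(base_env)
    env["ZE_AFFINITY_MASK"] = affinity_mask  # which Level Zero devices are visible
    env["CCL_ROOT"] = ccl_root               # oneCCL installation root
    return env


def run_with_limit(cmd, env, timeout_s):
    """Run cmd under env; return its exit code, or None if the limit is hit."""
    try:
        return subprocess.run(cmd, env=env, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    env = build_dist_env(os.environ, "0,1,2,3", "/opt/intel/oneapi/ccl/latest")
    # Stand-in for a real distributed test command.
    cmd = [sys.executable, "-c", "import os; print(os.environ['ZE_AFFINITY_MASK'])"]
    print("exit code:", run_with_limit(cmd, env, timeout_s=60))
```

Copying the environment rather than mutating `os.environ` keeps the settings scoped to the test process, which makes the workaround easy to drop once it is no longer needed.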

disable_e2e
disable_ut
disable_build

@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch 6 times, most recently from a498bbd to 5f55483 on August 22, 2025 15:28
@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch from 5f55483 to ef62eaa on August 26, 2025 09:45

@daisyden daisyden left a comment

LGTM

@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch 2 times, most recently from 90423a8 to a6b85e2 on August 29, 2025 02:44
@chuanqi129

@zxd1997066 please rebase the PR against the latest code base.

@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch 9 times, most recently from bf75fbe to bec9de4 on September 10, 2025 07:51
@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch 3 times, most recently from df64f28 to eab58fa on September 11, 2025 02:32
@zxd1997066 zxd1997066 force-pushed the xiangdong/dist_cases branch 2 times, most recently from f6588df to 7187356 on September 11, 2025 05:19