Skip to content

Asynchronous setup #514

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 8, 2025
Merged

Asynchronous setup #514

merged 4 commits into from
May 8, 2025

Conversation

chhwang
Copy link
Contributor

@chhwang chhwang commented Apr 25, 2025

Cherry-picked a part of features from #167: now Communicator::setup() is unneeded. Communicator::sendMemory() conducts the task inline, and Communicator::recvMemory() and Communicator::connect() conducts the task asynchronously without explicit setup.

@chhwang
Copy link
Contributor Author

chhwang commented Apr 25, 2025

/azp run

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@chhwang chhwang linked an issue Apr 25, 2025 that may be closed by this pull request
@chhwang chhwang requested a review from Binyang2014 April 25, 2025 08:28
@Binyang2014
Copy link
Contributor

Try to understand this PR correctly. It only addresses the connection phase and does not resolve the issue of supporting on-side registered memory. In the send/recv case, both ranks still need to send their registered memory information to the peer, even though the receiving side does not require the peer's registered memory right?

@chhwang
Copy link
Contributor Author

chhwang commented May 7, 2025

Try to understand this PR correctly. It only addresses the connection phase and does not resolve the issue of supporting on-side registered memory. In the send/recv case, both ranks still need to send their registered memory information to the peer, even though the receiving side does not require the peer's registered memory right?

This is not what this PR addresses. What you are talking about is the channel's requirement. This PR is only about simplifying the initialization phase.

@chhwang chhwang enabled auto-merge (squash) May 8, 2025 21:57
@chhwang chhwang merged commit d636093 into main May 8, 2025
14 checks passed
@chhwang chhwang deleted the chhwang/init branch May 8, 2025 22:01
chhwang added a commit that referenced this pull request May 10, 2025
@chhwang chhwang mentioned this pull request May 11, 2025
chhwang added a commit that referenced this pull request May 13, 2025
* In cases when the same `tag` is used for receiving data from the same
remote rank, #514 changed the behavior of `Communicator::connect` and
`Communicator::recvMemory` to receive data in the order of
`std::shared_future::get()` is called, instead of the original behvaior
that receive data in the order of the method calls. Since the original
behavior is more intuitive, we get that back. Now when `get()` is called
on a future, the async function will first call `wait()` on the latest
previously returned future. In a recursive manner, this will call
`wait()` on all previous futures that are not yet ready.
* Removed all deprecated API calls and replaced into the new ones.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] deviceHandle() interface is counter-intuitive
2 participants