Skip to content

Conversation

@raffenet
Copy link
Contributor

@raffenet raffenet commented Oct 7, 2025

Pull Request Description

Optimize progress for a batch of local communication requests by skipping netmod progress. Shows a 5-10% improvement in ch4/shm bandwidth measurements on a single node of Cascade Lake.

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your companies PR approval manager.

@raffenet
Copy link
Contributor Author

raffenet commented Oct 7, 2025

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

raffenet commented Oct 7, 2025

test:mpich/ch4/most

state->vci_count = idx;

if (local_only) {
state->flag &= ~MPIDI_PROGRESS_NM;
Copy link
Contributor

@hzhou hzhou Oct 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still perform netmod progress during "global" progress.

if (state->flag & MPIDI_PROGRESS_NM && (is_global || !made_progress)) { \

Make it

        if (is_global || (state->flag & MPIDI_PROGRESS_NM && !made_progress)) { \

@raffenet
Copy link
Contributor Author

raffenet commented Oct 8, 2025

test:mpich/ch4/most

@raffenet raffenet force-pushed the shm-local-only branch 2 times, most recently from 95989b8 to c405c76 Compare October 9, 2025 18:05
@raffenet
Copy link
Contributor Author

raffenet commented Oct 9, 2025

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

raffenet commented Oct 9, 2025

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

raffenet commented Oct 9, 2025

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

raffenet commented Oct 9, 2025

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

test:mpich/ch4/most

@raffenet
Copy link
Contributor Author

ch4/ofi vci build crashed when the node it was running on went offline. this is ready for review.

@raffenet raffenet requested a review from hzhou October 13, 2025 16:47
int vci_target = MPIDI_WIN_TARGET_VCI(win, target_rank);
sreq = MPIDIG_request_create(MPIR_REQUEST_KIND__RMA, 2, vci, vci_target);
MPIR_ERR_CHKANDSTMT(sreq == NULL, mpi_errno, MPIX_ERR_NOREQ, goto fn_fail, "**nomemreq");
MPIDI_REQUEST_SET_LOCAL(sreq, MPIDI_rank_is_local(target_rank, win->comm_ptr), NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two separate comments:

  1. We should add is_local to MPIDIG_request_create to ensure request created in the ch4-layer always have locality set
  2. What about requests created in the MPIR-layer or requests that straddles between shm/netmod such as collectives or anysource receive? We may need expand MPIDI_REQUEST(rreq, is_local) to enum (or more like optional).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two separate comments:

  1. We should add is_local to MPIDIG_request_create to ensure request created in the ch4-layer always have locality set

Can do.

  1. What about requests created in the MPIR-layer or requests that straddles between shm/netmod such as collectives or anysource receive? We may need expand MPIDI_REQUEST(rreq, is_local) to enum (or more like optional).

Actually this already happens for anysrc_partner via MPID_Request_create_hook. We can also initialize is_local=0 there and then rely on the shm layer to set it to true when appropriate. That would work since the optimization is to skip NM progress.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also initialize is_local=0 there and then rely on the shm layer to set it to true when appropriate. That would work since the optimization is to skip NM progress.

Yeah, we can do that and refactor later.

@raffenet
Copy link
Contributor Author

test:mpich/ch4/most
test:mpich/ch4/gpu/most

@raffenet
Copy link
Contributor Author

test:mpich/ch4/xpmem

When global progress is triggered we should ignore the progress state
flags and just call every progress function. We also need to trigger
global progress in the single vci case because we plan to make the
progress loop locality aware.
When a request is created, initialize is_local to false. Avoids
accidentally reading an uninitialized value. Other layers will update
the value to true where appropriate, such as the shared memory or
generic active message code.
IPC protocol is local by definition, so make sure to reflect that in any
requests created in IPC code.
Skip netmod progress when the request(s) being waited on are determined
to be local only.
@raffenet
Copy link
Contributor Author

test:mpich/ch4/most
test:mpich/ch4/gpu/ofi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants