Skip to content

Conversation

@roiedanino
Copy link
Contributor

What?

Implements mca_spml_ucx_put_signal and mca_spml_ucx_put_signal_nb for OpenSHMEM signaling operations with UCX backend.

Why

The put_signal operations are part of the OpenSHMEM 1.5 specification and were previously unimplemented (returning OSHMEM_ERR_NOT_IMPLEMENTED).

@roiedanino roiedanino changed the title OSHMEM/MCA/SPML/UCX: implement put_signal and put_signal_nbi OSHMEM/MCA/SPML/UCX: v1.5 implement put_signal and put_signal_nbi Dec 14, 2025
@roiedanino
Copy link
Contributor Author

Hi @manjugv @janjust
Can take a look here as well?

Thanks!

janjust
janjust previously approved these changes Dec 16, 2025
gleon99
gleon99 previously approved these changes Dec 18, 2025
return res;
}

return mca_spml_ucx_signal_common(ctx, sig_addr, signal, sig_op, dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need a fence here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not according to the v1.5 spec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not according to the v1.5 spec

can you pls elaborate why not needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we use non-blocking signal here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From v1.5 documentation, unless I missed something:
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, we do need a fence here

@roiedanino roiedanino dismissed stale reviews from gleon99 and janjust via 63dbb0e December 18, 2025 13:46
janjust
janjust previously approved these changes Dec 18, 2025
return OSHMEM_ERROR;
}

return res;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move the return to lines 1662/1658 directly?

return res;
}

return mca_spml_ucx_signal_common(ctx, sig_addr, signal, sig_op, dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need fence before this, to ensure ordering

return res;
}

return mca_spml_ucx_signal_common(ctx, sig_addr, signal, sig_op, dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not according to the v1.5 spec

can you pls elaborate why not needed?

return res;
}

return mca_spml_ucx_signal_common(ctx, sig_addr, signal, sig_op, dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we use non-blocking signal here?

@roiedanino
Copy link
Contributor Author

@yosefe Can re-review please?

}

/* This routine is not implemented */
static inline int mca_spml_ucx_signal_common(shmem_ctx_t ctx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split to mca_spml_ucx_signal_common and mca_spml_ucx_signal_common_nb

@roiedanino roiedanino force-pushed the oshmem/put-signal branch 2 times, most recently from 006d7f1 to 963553a Compare January 11, 2026 13:15
@roiedanino roiedanino requested a review from yosefe January 11, 2026 13:16
Comment on lines 1669 to 1672
uint64_t *sig_addr,
uint64_t signal,
int sig_op,
int dst)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls align

@roiedanino roiedanino requested a review from yosefe January 11, 2026 16:34
yosefe
yosefe previously approved these changes Jan 12, 2026
Copy link
Contributor

@yosefe yosefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems my previous comment about using non-blocking atomic was wrong. "add" and "swap" a

int pe)
{
#if HAVE_DECL_UCP_ATOMIC_OP_NBX
return mca_atomic_ucx_op_nb(ctx, target, value, size, pe, UCP_ATOMIC_OP_SWAP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how mca_atomic_ucx_op_nb is different from mca_atomic_ucx_op?
can't we just reuse mca_atomic_ucx_op?

@roiedanino roiedanino requested a review from yosefe January 14, 2026 13:39
return res;
}

return mca_spml_ucx_signal(ctx, sig_addr, signal, sig_op, dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need fence also here in the blocking version because it controls the remote ordering

Signed-off-by: Roie Danino <[email protected]>

OSHMEM/MCA/ATOMIC: added atomic_set mca function, and made non-retrieving atomic operations non-blocking

Signed-off-by: Roie Danino <[email protected]>
@janjust janjust merged commit a5a7068 into open-mpi:main Jan 15, 2026
20 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants