@awlauria awlauria commented Aug 17, 2021

The session directory created during MPI process execution is
sometimes left behind even after the process terminates. This mostly
happens when the ORTE daemon is SIGKILL'd.

This change makes socket binding reliable by unlinking the existing
socket file (if one exists in the session_directory) and rebinding it, thus
avoiding bind() failures caused by unclean session directories.

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit e228a1c)
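
The unlink-then-bind pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `bind_unix_socket_fresh` and the use of a `SOCK_STREAM` Unix domain socket are assumptions made for the example. It relies on the fact that `bind()` on a path where a stale socket file already exists fails with `EADDRINUSE`, so the leftover file is removed first.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (not the Open MPI implementation): bind a Unix
 * domain stream socket at `path`, first removing any stale file left
 * there by a previous run. Returns the bound descriptor, or -1 with
 * errno set. */
int bind_unix_socket_fresh(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);

    /* sun_path is a fixed-size buffer; reject paths that do not fit. */
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path); /* length checked above */

    /* Remove a leftover socket file. ENOENT only means there was
     * nothing to clean up, so it is not treated as an error. */
    if (unlink(path) != 0 && errno != ENOENT) {
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        int saved = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

One design caveat of this sketch: an unconditional `unlink()` would also remove a socket file that a live process is still using, so it only suits paths (such as a per-job session directory) that a single owner is expected to manage.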

@awlauria awlauria added this to the v4.1.2 milestone Aug 17, 2021
@awlauria awlauria changed the title from "v4.1.x: Cleanup session dir when orted exits unexpectedly." to "v4.1.x: v4.0.x: Unlink and rebind socket when session directory already exists" Aug 18, 2021
@awlauria awlauria changed the title from "v4.1.x: v4.0.x: Unlink and rebind socket when session directory already exists" to "v4.1.x: Unlink and rebind socket when session directory already exists" Aug 18, 2021
@awlauria awlauria force-pushed the orte_exit_session_cheanup_v4.1.x branch from 5aa22fa to 1e6b91d Compare August 18, 2021 14:32
@awlauria awlauria requested a review from rhc54 August 23, 2021 19:40
@awlauria awlauria changed the title from "v4.1.x: Unlink and rebind socket when session directory already exists" to "v4.1.x: Unlink and rebind socket when session directory already exists" Aug 24, 2021
@jsquyres jsquyres merged commit ead8bfb into open-mpi:v4.1.x Aug 29, 2021
@awlauria awlauria deleted the orte_exit_session_cheanup_v4.1.x branch March 17, 2022 17:25