v4.1.x: --do-not-launch throws misleading error #10643

@jjhursey

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

Open MPI v4.1.x branch

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone -b v4.1.x git@github.com:open-mpi/ompi.git at commit 88bb627

Please describe the system on which you are running

  • Operating system/version: RHEL 8.4
  • Computer hardware: ppc64le
  • Network type: Ethernet

Details of the problem

shell$  mpirun --host f5n17:2 --do-not-launch hostname
 Data for JOB [17227,1] offset 0 Total slots allocated 2

 ========================   JOB MAP   ========================

 Data for node: f5n17	Num slots: 2	Max slots: 0	Num procs: 2
 	Process OMPI jobid: [17227,1] App: 0 Process rank: 0 Bound: N/A
 	Process OMPI jobid: [17227,1] App: 0 Process rank: 1 Bound: N/A

 =============================================================
[f5n18:3126629] LAUNCH MSG RAW SIZE: 783
--------------------------------------------------------------------------
An internal error has occurred in ORTE:

[[17227,0],0] FORCE-TERMINATE AT (null):0 - error base/plm_base_launch_support.c(595)

This is something that should be reported to the developers.
--------------------------------------------------------------------------

The "An internal error has occurred in ORTE" message is misleading here: ORTE is simply terminating without launching, which is exactly what the user asked for with --do-not-launch.

We need to suppress the error message in this case.
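
To illustrate the intended behavior, here is a minimal, self-contained sketch of the decision logic. It is not the actual ORTE code: `job_state_t`, its `do_not_launch` field, and `should_report_internal_error` are hypothetical names made up for this example, and the real fix would live wherever the forced termination in base/plm_base_launch_support.c is reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the job state; NOT an ORTE type. */
typedef struct {
    bool do_not_launch;   /* true when the user passed --do-not-launch */
} job_state_t;

/*
 * Decide whether the "internal error" banner should be printed.
 * A forced termination that the user requested via --do-not-launch
 * is expected behavior, so it is not reported as an error.
 */
static bool should_report_internal_error(const job_state_t *job,
                                         bool force_terminated)
{
    assert(job != NULL);

    if (!force_terminated) {
        return false;   /* nothing was force-terminated, nothing to report */
    }
    if (job->do_not_launch) {
        return false;   /* termination was requested, so stay quiet */
    }
    return true;        /* genuine unexpected forced termination */
}
```

With a check like this in place, the job map would still be printed for --do-not-launch, but the run would end without the "should be reported to the developers" banner.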
