develop: Fix hang bug by updating MPI bcast to be from 0 not IAPROC-1 #1430

Merged
sbanihash merged 2 commits into
NOAA-EMC:develop from
JessicaMeixner-NOAA:dev/initfixsavepoints
May 19, 2025
Conversation

Collaborator

@JessicaMeixner-NOAA JessicaMeixner-NOAA commented May 14, 2025

Pull Request Summary

This PR fixes the root rank used in the MPI bcast call.

Description

When saving points for the unstructured grid, the MPI bcast root was incorrectly set to IAPROC-1 instead of 0 as intended. This PR is a bug fix; the same fix for the dev/ufs-weather-model branch is in the corresponding PR #1426.
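
To illustrate why a rank-dependent root can hang, here is a minimal pure-Python sketch, not the WW3 code. It assumes IAPROC is the 1-based index of the calling process (so IAPROC-1 is that process's own 0-based rank), which is not stated in this PR. MPI requires every process in the communicator to pass the same root to a broadcast; the helper names `bcast_roots` and `is_consistent` are hypothetical and only model that rule.

```python
def bcast_roots(nproc, root_of):
    """Return the root rank that each process would pass to a broadcast.

    root_of maps a 1-based process index (assumed IAPROC-style) to a root rank.
    """
    return [root_of(iaproc) for iaproc in range(1, nproc + 1)]


def is_consistent(roots):
    """A broadcast is only well-defined if every process names the same root."""
    return len(set(roots)) == 1


# Assumed buggy pattern: root = IAPROC-1, so each process names itself as root.
buggy = bcast_roots(4, lambda iaproc: iaproc - 1)

# Fixed pattern from this PR: every process names rank 0 as root.
fixed = bcast_roots(4, lambda iaproc: 0)

print("buggy roots:", buggy, "consistent:", is_consistent(buggy))
print("fixed roots:", fixed, "consistent:", is_consistent(fixed))
```

Under that assumption, the buggy form has every process acting as a sender with no matching receivers, which is erroneous in MPI and may or may not hang depending on the implementation, message size, and process count.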

Issue(s) addressed

Refs #1350

Commit Message

Fix hang bug by updating MPI bcast to be from 0 not IAPROC-1

Check list

Testing

  • How were these changes tested? matrix regtests
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Yes, although the hang usually appeared only at higher processor counts.
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)? hera intel
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.)
    Only the expected non-b4b (non-bit-for-bit) differences.
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):
**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR1_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e_c                     (1 files differ)
mww3_test_03/./work_PR2_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_e                     (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (9 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (14 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (10 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (16 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (16 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2                     (15 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (16 files differ)
mww3_test_09/./work_MPI_ASCII                     (0 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (6 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)
ww3_tp2.6/./work_ST4_ASCII                     (0 files differ)
ww3_ufs1.3/./work_a                     (3 files differ)

matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt

Collaborator

@sbanihash sbanihash left a comment

I have reviewed and tested this PR and approve the changes.
MATDIFF_1430

matrixCompSummary.txt
matrixCompFull.txt
matrixDiff.txt

*will not merge since Ming is also testing this PR.

@mingchen-NOAA
Collaborator

I have reviewed and tested this PR.

matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt

@sbanihash sbanihash merged commit 93ee103 into NOAA-EMC:develop May 19, 2025
3 of 6 checks passed