Skip to content

Bugfix for regional runs when dycore is compiled in double precision#25

Merged
junwang-noaa merged 4 commits into
NOAA-EMC:dev/emcfrom
climbfuji:regional_mpibugfix
Jun 29, 2020
Merged

Bugfix for regional runs when dycore is compiled in double precision#25
junwang-noaa merged 4 commits into
NOAA-EMC:dev/emcfrom
climbfuji:regional_mpibugfix

Conversation

@climbfuji
Copy link
Copy Markdown

This PR fixes issue #24.

It also adds one additional line to the initial printout, which we found useful to make sure that all constants and tracers are set up correctly.

@climbfuji
Copy link
Copy Markdown
Author

@climbfuji
Copy link
Copy Markdown
Author

@mzhangw

…r bit-for-bit identical results on Cheyenne with Intel 19.1 and SGI MPT 2.19
@climbfuji climbfuji requested a review from bensonr June 25, 2020 03:21
Copy link
Copy Markdown
Collaborator

@bensonr bensonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buf1-4 are now allocatable variables, but the count argument for irecv is still using the ibufexch parameter. I recommend changing the count in the irecv be the size of buf1 and buf2 respectively.

@climbfuji
Copy link
Copy Markdown
Author

buf1-4 are now allocatable variables, but the count argument for irecv is still using the ibufexch parameter. I recommend changing the count in the irecv be the size of buf1 and buf2 respectively.

Good catch. Thanks very much. Will cleanup!

@climbfuji
Copy link
Copy Markdown
Author

buf1-4 are now allocatable variables, but the count argument for irecv is still using the ibufexch parameter. I recommend changing the count in the irecv be the size of buf1 and buf2 respectively.

Good catch. Thanks very much. Will cleanup!

I fixed this - removed ibufexch entirely and used size(bufX) in the messages (since the count isn't known at the time the first MPI commands occur). I reran the tests on Cheyenne, it compiles and runs in DEBUG mode for both 32bit and 64bit without crashing, and the PROD runs in 64-bit give b4b identical results from run to run (which was the issue that triggered this PR in the first place).

Copy link
Copy Markdown
Collaborator

@bensonr bensonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look consistent.

I would recommend some inline comments or even protection logic to ensure buffering/unbuffering goes as planned. Any future modification to the logic needs to understand index changes must be thoroughly tested - especially when packing multiple variables into the same buffer.

@climbfuji
Copy link
Copy Markdown
Author

Changes look consistent.

I would recommend some inline comments or even protection logic to ensure buffering/unbuffering goes as planned. Any future modification to the logic needs to understand index changes must be thoroughly tested - especially when packing multiple variables into the same buffer.

@bensonr does this look better? It still works on Cheyenne as expected.

Copy link
Copy Markdown
Collaborator

@bensonr bensonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@climbfuji
Copy link
Copy Markdown
Author

Associated PRs:
#25
NOAA-EMC/ufsatm#133
ufs-community/ufs-weather-model#155

For regression testing, see ufs-community/ufs-weather-model#155.

@junwang-noaa junwang-noaa merged commit 61a6c1d into NOAA-EMC:dev/emc Jun 29, 2020
aerorahul pushed a commit that referenced this pull request Jul 17, 2021
updated RELEASE.md for the 2020.02 release (issue #25)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants