Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update to ESMCOMP cmeps1.0.24 #25

Open
wants to merge 25 commits into
base: noresm
Choose a base branch
from

Conversation

mvertens
Copy link
Collaborator

@mvertens mvertens commented Nov 24, 2024

Description of changes

Testing must still be carried out

Specific notes

Contributors other than yourself, if any:

CMEPS Issues Fixed (include github issue #):

Are changes expected to change answers? (specify if bfb, different at roundoff, more substantial)

Any User Interface Changes (namelist or namelist defaults changes)?

Testing performed

Please describe the tests along with the target model and machine(s)
If possible, please also added hashes that were used in the testing

jedwards4b and others added 23 commits August 12, 2024 11:25
This is needed for bit-for-bit reproducibility. Most of the calls to
FieldRegridStore weren't setting srcTermProcessing at all; this can lead
to irreproducibility of the results. Two of the calls had
srcTermProcessing set to 1 before, which leads to reproducibility but
(according to Gerhard Theurich) is often worse for performance than a
value of 0.
Fix for exchange grid with add_gusts false: Only add gust fields if add_gusts is true

### Description of changes

Runs with `aoflux_grid = "xgrid"` and `add_gusts = .false.` have been failing with:

```
20240903 141813.539 ERROR            PET003 ESMCI_Container.h:178 ESMCI::Container::get() Invalid argument  - key does not exist: So_ugustOut
```

This PR fixes this issue by avoiding adding the two new gust fields to the list of mediator fields if add_gusts is false.

The explanation for why this caused an issue with the exchange grid is:
- In `mediator/esmFldsExchange_cesm_mod.F90`: `esmFldsExchange_cesm`, `So_ugustOut` and `So_u10withGust` are added as atmosphere-ocean fluxes regardless of the value of `add_gusts`. 
- But in `mediator/med_phases_aofluxes_mod.F90`: `set_aoflux_out_pointers`, the `fldbun_getfldptr` call is only made if `add_gusts` is true.
- There is a difference in behavior with xgrid than with ogrid here: with xgrid, the call to `fldbun_getfldptr` is responsible for adding the field to `FBaof_x`. (I think with ogrid or agrid, the fields have been added to the relevant field bundle already.)
- So we end up with these gust fields not in `FBaof_x`, but still in `fldnames_aof_out`, and this leads to an inconsistency.

**@megandevlan and @jedwards4b: I'm not positive that this is the right fix: It does solve the issue, but I'd like one or both of your input on whether this is the right way to solve the issue. In particular, can we rely on CAM working correctly when add_gusts is false if these fields are absent? It looks to me like CAM will set the relevant fields to 0 if they are absent; will that lead to correct behavior when add_gusts = .false.?**

### Specific notes

Contributors other than yourself, if any:

CMEPS Issues Fixed (include github issue #):

Are changes expected to change answers? No, but there are FIELDLIST diffs for cases with add_gusts false: the following fields will no longer be present on the cpl hi files when add_gusts is false:
- Med_aoflux_ocn_So_u10withGust
- Med_aoflux_ocn_So_ugustOut
- Med_aoflux_atm_So_u10withGust
- Med_aoflux_atm_So_ugustOut
- atmExp_So_u10withGust
- atmExp_So_ugustOut

Any User Interface Changes (namelist or namelist defaults changes)? No

### Testing performed
Please describe the tests along with the target model and machine(s) 
If possible, please also added hashes that were used in the testing

Ran three versions of `SMS_D_Ln9.ne30pg3_ne30pg3_mg17.FLTHIST.derecho_intel.cam-outfrq9s` from cesm3_0_alpha03c:
1. `add_gusts = .false.`, `aoflux_grid = "ogrid"`
2. `add_gusts = .false.`, `aoflux_grid = "xgrid"` (**This is the case that was failing previously.**
3. `add_gusts = .true.`, `aoflux_grid = "xgrid"`

All three pass now.

Also ran `SMS_Ld40.TL319_t232.C_JRA.derecho_intel` (which has add_gusts = .false.) with `aoflux_grid = "xgrid"`; this also passes now but failed before the changes in this PR.

I have also run the following tests with comparisons against baselines: all pass and are bit-for-bit, though with FIELDLIST diffs as expected:
- SMS.TL319_t232.G_JRA.derecho_intel.mom-no_stoch_physics
-  ERP_Ln9.f09_f09_mg17.F1850.derecho_intel.cam-outfrq9s
- SMS_Ld2.ne30pg3_t232.BLT1850.derecho_intel.allactive-defaultio with change to user_nl_cpl to set add_gusts false
Fix xgrid reproducibility by using srcTermProcessing=0 for all xgrid FieldRegridStore calls

### Description of changes

Use srcTermProcessing=0 for all xgrid FieldRegridStore calls.

This is needed for bit-for-bit reproducibility. Most of the calls to FieldRegridStore weren't setting srcTermProcessing at all; this can lead to irreproducibility of the results. Two of the calls had srcTermProcessing set to 1 before, which leads to reproducibility but (according to Gerhard Theurich) is often worse for performance than a value of 0.

Note that we use a value of 0 in the calls in med_map_mod.

### Specific notes

Contributors other than yourself, if any: Guidance from @oehmke and @theurich

CMEPS Issues Fixed (include github issue #): Resolves ESCOMP#505 

Are changes expected to change answers? YES: roundoff-level differences expected for runs with `aoflux_grid = "xgrid"` (but those runs had expected roundoff-level differences from run to run before this fix anyway)

Any User Interface Changes (namelist or namelist defaults changes)? No

### Testing performed
Please describe the tests along with the target model and machine(s) 
If possible, please also added hashes that were used in the testing

In the context of cesm3_0_alpha03c, with the change here put on top of the change in ESCOMP#501, ran `REP_Ld2.ne30pg3_t232.BLT1850.derecho_intel.allactive-xgrid`, where the `xgrid` testmod contained:

include_user_mods:
```
../defaultio
```

user_nl_cpl:
```
aoflux_grid = "xgrid"
```
wav2ocn and ocn2wav maps are still needed in some cases
…u_config

Update XML variables for GPU configurations
This is needed to avoid a segmentation fault in the call to flux_atmocn
in SMS_Ld2.ne30pg3_t232.BMT1850.derecho_gnu.allactive-defaultio when
running with aoflux_grid=xgrid
This is needed to avoid a segmentation fault in the call to flux_atmocn
in SMS_Ld2.ne30pg3_t232.BMT1850.derecho_gnu.allactive-defaultio when
running with aoflux_grid=xgrid.

I think that the previous fix wasn't correct when using xgrid, because I
think nothing is originally in fldbun_a in that case.
Fix logic for adding psfc to aoflux_in

### Description of changes

Fix the logic for adding psfc to aoflux_in, particularly needed for cases using the exchange grid.

This is needed to avoid a segmentation fault in the call to flux_atmocn in SMS_Ld2.ne30pg3_t232.BMT1850.derecho_gnu.allactive-defaultio when running with aoflux_grid=xgrid. (This shows up when running with xgrid because the FB_fldchk can't be used in this situation - see details in my [comment below](ESCOMP#514 (comment)).)

### Specific notes

Contributors other than yourself, if any:

CMEPS Issues Fixed (include github issue #):

Are changes expected to change answers? No

Any User Interface Changes (namelist or namelist defaults changes)? No

### Testing performed

SMS_Ld2.ne30pg3_t232.BMT1850.derecho_gnu.allactive-defaultio with aoflux_grid="xgrid", from cesm3_0_beta03

Also ran many other tests from the CESM prealpha and prebeta test suite.
Make xgrid the default aoflux_grid when atm and ocn are running on different grids
@mvertens mvertens marked this pull request as draft November 24, 2024 21:42
@mvertens mvertens mentioned this pull request Nov 24, 2024
2 tasks
@mvdebolskiy mvdebolskiy marked this pull request as ready for review March 5, 2025 16:16
@mvdebolskiy mvdebolskiy self-requested a review March 5, 2025 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants