Add FV3 suite WoFS_v0#1158

Closed
MicroTed wants to merge 18 commits into ufs-community:develop from MicroTed:wofs_sdf

Conversation

@MicroTed
Contributor

@MicroTed MicroTed commented Apr 5, 2022

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.

  • Results for one or more of the regression tests change and the reasons for the changes are understood and explained below.

  • New or updated input data is required by this PR. If checked, please work with the code managers to update input data sets on all platforms.

Instructions: All subsequent sections of text should be filled in as appropriate.

Description

Adds a new suite definition file to FV3 (FV3_WoFS_v0) to use the NSSL cloud microphysics scheme with the Noah LSM. For use with the next SRW app release.

Testing

Testing done on Jet

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3
  • opnReqTest for newly added/changed feature
  • CI

Dependencies

If testing this branch requires non-default branches in other repositories, list them. Those branches should have matching names (ideally).

Do PRs in upstream repositories need to be merged first?

"waiting for other repos"

@junwang-noaa
Collaborator

@MicroTed It looks like you created two PRs, #1158 and #1159; maybe they can be combined? Also, please fill in the checklist, description, issues (all PRs need to have an associated issue), testing information, and dependencies (any submodules that need to be updated). Thanks

@MicroTed
Contributor Author

MicroTed commented Apr 6, 2022

@junwang-noaa Only one of the drafts will be used, depending on whether the RT tests for v1nssl should be converted to WoFS_v0 or not. I may have jumped the gun a little on that. 1158 just adds the new SDF without touching the RT files, which is the easiest path.

@MicroTed MicroTed changed the title from "Wofs sdf" to "Add FV3 suite WoFS_v0" Apr 8, 2022
@MicroTed MicroTed marked this pull request as ready for review April 8, 2022 03:13
@RatkoVasic-NOAA
Collaborator

@MicroTed could you please try to run a case with this new SDF by reversing the MPI layout in the input.nml file? For example, if you have layout = 6,11, try layout = 11,6 (this will change the MPI decomposition).
For the compilation flag, please add the option -DREPRO=ON (this will use less aggressive optimization).
The purpose of this is to see if we get bit-wise identical results with two different MPI layouts.

NOTE: Our current SDF (suite_FV3_GFS_v15_thompson_mynn_lam3km.xml) reproduces results only at coarse resolution (as currently tested in the UFS regression tests); at 3km it produces different results.
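
As an illustration of the layout reversal described above, here is a minimal Python sketch that swaps the two integers on a `layout = a,b` line in namelist text. The helper name `swap_layout` and the sample namelist content are hypothetical and only serve the example; a real input.nml may format the entry differently.

```python
import re

# Matches a line that starts with "layout = a,b" (ignoring leading blanks).
# Names such as io_layout are not matched because the line must begin with "layout".
_LAYOUT_RE = re.compile(r"^([ \t]*layout[ \t]*=[ \t]*)(\d+)[ \t]*,[ \t]*(\d+)",
                        re.MULTILINE | re.IGNORECASE)


def swap_layout(namelist_text: str) -> str:
    """Return the namelist text with the two layout integers exchanged."""
    return _LAYOUT_RE.sub(
        lambda m: f"{m.group(1)}{m.group(3)},{m.group(2)}", namelist_text
    )


# Hypothetical namelist fragment, for illustration only.
example = """&fv_core_nml
  layout = 6,11
  io_layout = 1,1
/
"""

swapped = swap_layout(example)
print(swapped)
```

Only the text is rewritten; the total number of MPI tasks (the product of the two values) stays the same, which is what makes the two runs comparable.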

@MicroTed
Contributor Author

MicroTed commented Apr 8, 2022

@RatkoVasic-NOAA

please try to run case

Sure thing. Is there any particular current RT test that would be good?

by reversing MPI layout in input.nml file

My experience with changing the layout is that optimization has to be off completely (to turn off vectorization). Otherwise there can be a change in the "leftover" operations that don't fit into the vector and have different round-off characteristics. It has been a few years since I've tested that, so YMMV.
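
The round-off effect mentioned above comes from floating-point addition not being associative. The following Python sketch is only an analogy, not compiler behavior: the hypothetical `lane_sum` function mimics a vectorized loop by accumulating into several interleaved partial sums and then adding any leftover elements one at a time. Changing the number of lanes changes the grouping and can change the result.

```python
def lane_sum(values, lanes):
    """Sum values using `lanes` interleaved partial sums plus a scalar remainder loop."""
    full = len(values) - len(values) % lanes   # elements that fill complete vectors
    partial = [0.0] * lanes
    for i in range(full):
        partial[i % lanes] += values[i]        # each lane accumulates independently
    total = 0.0
    for p in partial:                          # combine the lanes
        total += p
    for v in values[full:]:                    # "leftover" elements handled one by one
        total += v
    return total


# Values chosen so that 1.0 is lost when added directly to 1e16.
values = [1e16, 1.0, -1e16, 1.0]

sequential = lane_sum(values, 1)   # plain left-to-right summation
two_lanes = lane_sum(values, 2)    # two interleaved partial sums
print(sequential, two_lanes, sequential == two_lanes)
```

With a different MPI layout the loop lengths change, so the split between full vectors and leftover elements changes too, which is the mechanism described in the comment above.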

@RatkoVasic-NOAA
Collaborator

@MicroTed
Dynamics-only runs reproduce results with different MPI layouts, so runs with physics should too (physics is column-only and independent of MPI).

There are two tests in rt.conf:
RUN | regional_control
RUN | regional_3km

The first one uses low resolution and the second one high resolution. Just a reminder to use -DREPRO=ON in the COMPILE line and, of course, to use suite_FV3_WoFS_v0.xml in the same line.
Also, you have to use your own input.nml and model_configure files.
When you run rt.sh, run it with the -k option (which will keep the run directories). Then go into each run directory, save the output, and resubmit job_card (with the layout line in input.nml edited).
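
Once the output from both layouts has been saved, one way to check for bit-wise identical results is to compare file digests. This is a minimal sketch, not part of rt.sh: the function names and the `*.nc` pattern are assumptions for illustration. A byte-level comparison is stricter than a field-level one, since any embedded metadata that differs between runs would also be flagged.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(dir_a, dir_b, pattern="*.nc"):
    """List file names from dir_a that are missing from dir_b or differ byte-for-byte."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    diffs = []
    for name in sorted(p.name for p in dir_a.glob(pattern)):
        other = dir_b / name
        if not other.is_file() or file_digest(dir_a / name) != file_digest(other):
            diffs.append(name)
    return diffs
```

An empty list from `differing_files(saved_output_dir, rerun_dir)` (both directory names being whatever was used when saving the runs) would indicate that the matched files are identical.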

@MicroTed
Contributor Author

MicroTed commented Apr 8, 2022

@RatkoVasic-NOAA
I guess it is expected that regional_3km has no real convection in the domain for a 6hr forecast? There doesn't seem to be much for the microphysics to do there.

@RatkoVasic-NOAA
Collaborator

@RatkoVasic-NOAA I guess it is expected that regional_3km has no real convection in the domain for a 6hr forecast? There doesn't seem to be much for the microphysics to do there.

Right

@junwang-noaa
Collaborator

@MicroTed May I ask if you ran the decomposition test as Ratko suggested (reversing the MPI layout in the input.nml file)? We need to run the ORT tests (threading, decomposition, restart reproducibility, and debug) for a new physics suite. Thanks.

@MicroTed
Contributor Author

Dynamics-only runs reproduce results with different MPI layouts, so runs with physics should too (physics is column-only and independent of MPI).

Not completely column-only -- the Thompson interface (mp_thompson.F90) has 2D loops that will vary in width with MPI. The NSSL scheme is 2D within the microphysics code (works on 2D slabs rather than just columns)

@junwang-noaa I did try flipping the layout on the regional_3km, and results do change, even after changing the REPRO options to '-fp-model precise'
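
To make the point about slab width concrete, the sketch below shows how the per-task subdomain shape changes when the layout is flipped. It assumes a simple even split with the remainder given to the first tasks; this is an illustrative assumption, not FV3's actual decomposition routine, and the function names and grid size are hypothetical.

```python
def split_extent(n, parts):
    """Split n grid points into `parts` contiguous chunks, as evenly as possible."""
    base, extra = divmod(n, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def tile_shapes(nx, ny, layout):
    """Return the (width, height) of each task's subdomain for layout = (px, py)."""
    px, py = layout
    return [(w, h) for h in split_extent(ny, py) for w in split_extent(nx, px)]


# Hypothetical 120 x 120 grid, same task count (60) with the layout flipped.
shapes_a = tile_shapes(120, 120, (10, 6))
shapes_b = tile_shapes(120, 120, (6, 10))
print(shapes_a[0], shapes_b[0])
```

Any loop that runs over the width of a subdomain therefore has a different trip count under the two layouts, even though each grid column sees the same inputs.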

@ChunxiZhang-NOAA ChunxiZhang-NOAA self-requested a review April 12, 2022 13:09
@junwang-noaa
Collaborator

@MicroTed is the test reproducing with restart runs?
@ligiabernardet @ywangwof @arunchawla-NOAA The new suite file FV3_WoFS_v0 does not reproduce decomposition. Do you want it to be in SRW v2 release?

@ywangwof

Yes. We want it to be in the SRW v2 release. Could you provide more details about the decomposition reproducibility? This PR just adds one CCPP suite file, so I am wondering why it would affect the decomposition.

@junwang-noaa
Collaborator

junwang-noaa commented Apr 12, 2022

@ywangwof The requirement for a public release is that the suite files are well tested. When developers run a test using the suite file (e.g. regional_3km_wofs) with different threads, a different decomposition, and a restart, they should get the same answer. The test must also be able to run in debug mode. These tests are required for public release (@ligiabernardet please let me know if the requirement has changed).

@MinsukJi-NOAA
Contributor

@MicroTed

In the rt.conf file, you have a new test regional_3km_wofs. However, your new test file is tests/tests/regional_3km_wofsv0. These two names have to match to be able to run the regression test.
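
A mismatch like this can be caught before running anything. The sketch below assumes only what is visible in this thread, namely that rt.conf entries look like `RUN | <test name>` and that each name must correspond to a file under tests/tests/. The function name `missing_tests` is hypothetical, and any further fields on a RUN line are simply ignored.

```python
def missing_tests(rt_conf_text, test_files):
    """Return RUN test names from rt.conf text that have no matching test file name."""
    missing = []
    for line in rt_conf_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Only lines whose first field is exactly RUN are treated as test entries.
        if len(fields) >= 2 and fields[0] == "RUN" and fields[1] not in test_files:
            missing.append(fields[1])
    return missing


# Illustrative input reproducing the mismatch described above.
conf = "RUN | regional_control\nRUN | regional_3km_wofs\n"
available = {"regional_control", "regional_3km_wofsv0"}
print(missing_tests(conf, available))
```

In practice `available` could be built from the file names in the tests/tests/ directory.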

@MicroTed
Contributor Author

These two names have to match to be able to run the regression test.

@MinsukJi-NOAA Thanks for pointing that out! I renamed the file.

@MicroTed
Contributor Author

could you please try to run a case with this new SDF by reversing the MPI layout in the input.nml file? For example, if you have layout = 6,11, try layout = 11,6 (this will change the MPI decomposition).

@RatkoVasic-NOAA
I have done some tests with this, and I find the same behavior you get with suite_FV3_GFS_v15_thompson_mynn_lam3km.xml. The WoFS suite has reproducible results for regional_control (10,6 vs. 6,10) but not for various tests on 3km grids. (Even with DEBUG=ON.) The NSSL microphysics module is virtually the same as in WRF, where it has no problems with the reproducibility tests. So I suspect there is something common to both SDFs or namelist options that could be the culprit.

At this point, it seems reasonable to note that there is an issue somewhere and move on?

@JeffBeck-NOAA
Contributor

Pinging @junwang-noaa, since I think Ratko may be on leave. It sounds like there is a systematic problem with physics reproducibility at higher resolutions that we'll need to address at some point, but it's unrelated to the SDFs themselves. Therefore, do you think we could get this PR back into the merge queue? Thanks!

@MicroTed
Contributor Author

MicroTed commented Apr 21, 2022

The dcp test for the global works if nssl_invertccn=.false. I don't understand yet why the "true" option fails, but it is used for LAM domains to avoid needing BCs for the CCN variable (BC can be just zero).

@MicroTed
Contributor Author

The dcp issue seems to be fixed. I opened a PR for CCPP for both the restart and dcp fixes:

NCAR/ccpp-physics#904

@MinsukJi-NOAA
Contributor

The dcp issue seems to be fixed. I opened a PR for CCPP for both the restart and dcp fixes:

NCAR/ccpp-physics#904

Passing of the rrfs_v1nssl decomposition test confirmed.

@MinsukJi-NOAA
Contributor

MinsukJi-NOAA commented Apr 24, 2022

@MicroTed I have made a PR to your wofs_sdf branch. All changes are related only to ORT test and confined in the tests/opnReqTests/ directory. After you merge in these changes, you can invoke ./opnReqTests -n regional_3km_wofs -c thr,dcp,dbg. My finding so far is that the decomposition test fails. The only difference I see between suite_FV3_WoFS_v0.xml and suite_FV3_RRFS_V1nssl.xml is lsm_noah vs noahmpdrv.

@MicroTed
Contributor Author

My finding so far is that the decomposition test fails.

@MinsukJi-NOAA OK, good to know, and thanks for doing that -- I haven't gone back to the 3km test yet. Considering that FV3_GFS_v15_thompson_mynn_lam3km also has trouble with decomposition for the 3km test, maybe it is not surprising. Maybe even expected?

Modify opnReqTest related scripts to test regional_3km_wofs
@junwang-noaa
Collaborator

@MicroTed If I understand correctly, the suite_FV3_RRFS_V1nssl.xml has decomposition test passed. So for the suite file suite_FV3_WoFS_v0.xml, are you considering using noahmpdrv instead of lsm_noah for the regional_3km_wofs decomposition test?

@MicroTed
Contributor Author

the suite_FV3_RRFS_V1nssl.xml has decomposition test passed

Yes, it passes the rrfs_v1nssl RT case, which is a global coarse grid. But, like the lam3km suite, apparently not on the 3km regional grid. Is there a regional case that passes decomposition? The rt.conf has a comment that rap_decomp and rap_sfcdiff_decomp also don't work. Is there one that does work?

@junwang-noaa
Collaborator

If the suite_FV3_WoFS_v0.xml is going to be used by SRW as a regional test, I think the decomposition test needs to be fixed; if not, please set up a global test case with suite_FV3_WoFS_v0.xml so that the ORT will pass.

@MicroTed
Contributor Author

MicroTed commented Apr 25, 2022

SRW as a regional test, I think the decomposition test needs to be fixed

Sure, I can add the global test. It works fine. I'm also testing regional_control adapted for the WoFS_v0 suite, but that has a note that restart tests don't work?

Update: "Test dcp regional_wofs PASS". So whatever the problem is with the 3km test, it is likely the same thing that affects both the lam3km and wofs suites.
@junwang-noaa Do you have a preference for adding the global test, the regional (coarse grid) test, or both for the WoFS suite?

@junwang-noaa
Collaborator

@MicroTed I'd suggest adding the test that will be used in the public release and fixing the issue if there is one. I think Ratko is also debugging the lam3km test case; maybe you can work with him.

@MicroTed
Contributor Author

Update: We found that the regional_3km_wofs passes the decomposition test if nrows_blend=0. The problem with nrows_blend > 0 appears to be the setting of loop limits for the blending points in the corner MPI tiles (in the bc_time_interpolation subroutine in fv_regional_bc.F90).

@MicroTed
Contributor Author

MicroTed commented May 12, 2022

Never mind this -- I think there is another route.

Would it be OK to proceed with the FV3 PR to add the WoFS suite so it can be included in the SRW-v3? NOAA-EMC/ufsatm#514
At this point the remaining ORT issues appear to be in the dycore rather than the physics suite.
Thanks!

@junwang-noaa
Collaborator

A PR with the suite files was committed to the SRW public release branch.
This PR will be moved to the Q4FY2022 regression test development EPIC; it will be committed after the restart issue is resolved.

@SamuelTrahanNOAA
Collaborator

This PR included changes to GFDL_atmos_cubed_sphere which have already been merged.

@DeniseWorthen
Collaborator

Closing in favor of PR #1460


Development

Successfully merging this pull request may close these issues.

Add a new suite definition file to FV3 (FV3_WoFS_v0) for SRW release