
[develop:] Add the RRFS_SaS physics suite to SRW options#1201

Merged
MichaelLueken merged 27 commits into ufs-community:develop from natalie-perlin:feature/rrfs_sas on Mar 6, 2025

Conversation

@natalie-perlin natalie-perlin commented Feb 14, 2025

DESCRIPTION OF CHANGES:

The RRFS_sas suite is added to the SRW physics suite options.
This suite was used in RRFSv1 production runs from approximately August to December 2024.
The fractional approach in RUC_LSM is not yet used (mosaic_lu=0, mosaic_soil=0); it will be turned on when data generation for the fractional vegetation and soil, part of UFS_UTILS, is integrated into the SRW workflow. All other suites that use RUC_LSM, such as FV3_HRRR and FV3_HRRR_gf, could also be configured to use fractional vegetation and soil data once the data generation becomes functional.

This code still uses the older (v1) format of sfc_data.nc.

Files modified or added (*):

  • parm/FV3.input.yml
  • parm/diag_table.RRFS_sas (*)
  • parm/field_table.RRFS_sas (*)
  • scripts/exregional_make_orog.sh
  • scripts/exregional_make_ics.sh
  • scripts/exregional_make_lbcs.sh
  • scripts/exregional_run_fcst.sh
  • ush/link_fix.py
  • ush/setup.py
  • ush/valid_param_vals.yaml
  • tests/WE2E/test_configs/grids_extrn_mdls_suites_community/config.grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas.yaml
  • tests/WE2E/test_configs/grids_extrn_mdls_suites_community/config.grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot.yaml
  • tests/WE2E/test_configs/grids_extrn_mdls_suites_community/config.grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas.yaml

NB: With the hashes of ufs-srweather-app and FV3atm currently used in this draft PR, an additional suite definition file, suite_RRFS_sas.xml, must be placed at ./sorc/ufs-weather-model/FV3/ccpp/suites/suite_RRFS_sas.xml

@MichaelLueken contributed changes to the UFS_UTILS branch in the NOAA-EPIC repository that allow the input data for the FV3atm model to be prepared correctly.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

Tested on Hera

New WE2E tests prepared:
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot
grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas

To check out the PR and test it (example for Hera):

git clone https://github.com/ufs-community/ufs-srweather-app.git
cd ufs-srweather-app
git fetch origin pull/1201/head:PR1201
git checkout PR1201
./manage_externals/checkout_externals
./devbuild.sh -v --platform=hera --compiler=intel 2>&1 | tee log.srw.build
module use $PWD/modulefiles
module load wflow_hera
conda activate srw_app
cd ./tests/WE2E
./run_WE2E_tests.py -m=hera -c=intel -a epic -t grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas

To run the other tests, replace the last line with:

./run_WE2E_tests.py -m=hera -c=intel -a epic -t grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot

or

./run_WE2E_tests.py -m=hera -c=intel -a epic -t grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas
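
The three invocations above differ only in the test name, so they can be generated in a loop. This is a minimal sketch, assuming it is used from tests/WE2E with the workflow environment already loaded. The helper name `build_commands` is hypothetical; the flags are copied unchanged from the commands in this description.

```python
import shlex

# Test names taken from the "New WE2E tests prepared" list in this PR.
RRFS_SAS_TESTS = [
    "grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas",
    "grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot",
    "grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas",
]


def build_commands(machine="hera", compiler="intel", account="epic"):
    """Return one run_WE2E_tests.py argument list per RRFS_sas test.

    Hypothetical helper: it only assembles the arguments shown above.
    """
    return [
        ["./run_WE2E_tests.py", f"-m={machine}", f"-c={compiler}",
         "-a", account, "-t", test]
        for test in RRFS_SAS_TESTS
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be
    # reviewed or pasted into a shell.
    for argv in build_commands():
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch side-effect free; the lists could be passed to `subprocess.run` if running the tests one after another is preferred.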

TESTS on PLATFORMS:

  • derecho.intel
  • gaea.intel
  • gaea-c6.intel
  • hera.gnu
  • hera.intel
  • hercules.intel
  • jet.intel
  • orion.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

DEPENDENCIES:

DOCUMENTATION:

ISSUE:

This closes issue #1192.

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

LABELS (optional):

A Code Manager needs to add the following labels to this PR:

  • Work In Progress
  • bug
  • enhancement
  • documentation
  • release
  • high priority
  • run_ci
  • run_we2e_fundamental_tests
  • run_we2e_comprehensive_tests
  • Needs Cheyenne test
  • Needs Jet test
  • Needs Hera test
  • Needs Orion test
  • help wanted

CONTRIBUTORS (optional):

@MichaelLueken
@ulmononian

Log files:
test_grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas.txt

test_grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot.txt

test_grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas.txt

@natalie-perlin natalie-perlin marked this pull request as draft February 14, 2025 14:48
@MichaelLueken MichaelLueken added Work in Progress release This PR/issue is related to a release branch Priority: HIGH labels Feb 14, 2025
@MichaelLueken MichaelLueken linked an issue Feb 14, 2025 that may be closed by this pull request
@natalie-perlin natalie-perlin changed the title Add the RRFS_SaS physics suite to SRW options [develop:] Add the RRFS_SaS physics suite to SRW options Feb 14, 2025
@MichaelLueken
Collaborator

@natalie-perlin -

All three rrfs_sas WE2E tests are failing on Hera:

  1. grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas - failing in run_fcst with the following seg fault [h5c05:1495455:0:1495455] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
  2. grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas - failing in get_extrn_lbcs/ics due to looking for /scratch1/NCEPDEV/nems/role.epic/UFS_SRW_data/develop/input_model_data/RRFS/2024060517/rrfs.t17z.natlev.f000.grib2 while only prslev files are available. I suspect that the test would ultimately fail in run_fcst with a seg fault as well.
  3. grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot - failing in run_fcst with the same seg fault as grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas above.

The experiment directories on Hera can be found at /scratch1/NCEPDEV/stmp2/Michael.Lueken/ufs-srweather-app/hera/expt_dirs.

We need to make sure that these tests run correctly before moving forward with merging this PR and creating the release branch for SRW v3.0.0.

@natalie-perlin
Collaborator Author

natalie-perlin commented Mar 4, 2025 via email

Comment thread parm/wflow/coldstart.yaml Outdated

@mkavulich mkavulich left a comment

Putting in an official "request changes" review because these changes seem problematic. Not just the inclusion of a potentially bad partition name, but changing the default PPN value to not use multiple cores on a node.

@natalie-perlin
Collaborator Author

> Putting in an official "request changes" review because these changes seem problematic. Not just the inclusion of a potentially bad partition name, but changing the default PPN value to not use multiple cores on a node.

Resolved and reverted the changes.

@natalie-perlin natalie-perlin requested a review from mkavulich March 4, 2025 16:58
@MichaelLueken
Collaborator

@natalie-perlin -

The grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas WE2E test has successfully cleared the get_extrn_lbcs/ics tasks, but is now seg faulting in run_fcst.

All RRFS_sas WE2E tests are still seg faulting in run_fcst.

@natalie-perlin
Collaborator Author

> @natalie-perlin -
>
> The grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas WE2E test has successfully cleared the get_extrn_lbcs/ics tasks, but is now seg faulting in run_fcst.
>
> All RRFS_sas WE2E tests are still seg faulting in run_fcst.

Are there any logs I could take a look at?

@MichaelLueken
Collaborator

@natalie-perlin -

Sure, my logs are available on Hera - /scratch1/NCEPDEV/stmp2/Michael.Lueken/ufs-srweather-app/hera/expt_dirs

@natalie-perlin
Collaborator Author

> @natalie-perlin -
>
> Sure, my logs are available on Hera - /scratch1/NCEPDEV/stmp2/Michael.Lueken/ufs-srweather-app/hera/expt_dirs

@MichaelLueken @mkavulich -
A test with RRFS_sas completes if the smoke/dust tracers are added back to field_table.RRFS_sas.

For the test submitted below, I simply replaced field_table in the working directory with the field_table from a previous, otherwise identical experiment that was successful, which contained three additional tracers for smoke and dust. Doing so before the forecast started allowed it to finish successfully:

Took 0:40:41.104866; will no longer monitor.
All 1 experiments finished
Calculating core-hour usage and printing final summary
----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas_202  COMPLETE              16.13
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE              16.13

Detailed summary written to /scratch2/NCEPDEV/stmp1/Natalie.Perlin/SRW1/expt_dirs/WE2E_summary_20250304203559.txt

The RRFS_sas definition in FV3.input.yml contains several parameters for smoke/dust options under gfs_physics_nml. It could be that they require the tracers to be present in the field_table. So I will have to put them back for the physics suite to work (see the earlier request by @mkavulich questioning their need to be in the table: #1201 (comment)).
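
Whether a given field_table carries the smoke/dust entries can be checked before the forecast starts. The following is a minimal sketch, assuming the usual FMS field_table layout in which each record opens with a line such as `"TRACER", "atmos_mod", "<name>"`. The tracer names come from the run_fcst diagnostics in this PR; the function names are hypothetical, and which names correspond to the three re-added entries is an assumption, so the required list is a parameter.

```python
import re

# Tracer names reported in the run_fcst diagnostics of this PR.
# Assumption: these are the entries the suite expects to find.
SMOKE_DUST_TRACERS = ("smoke", "dust", "coarsepm")

# Matches the opening line of a field_table record, e.g.
#   "TRACER", "atmos_mod", "smoke"
TRACER_RE = re.compile(r'"TRACER"\s*,\s*"atmos_mod"\s*,\s*"([^"]+)"')


def list_tracers(field_table_text):
    """Return tracer names in file order, skipping comment lines."""
    names = []
    for line in field_table_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = TRACER_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


def missing_tracers(field_table_text, required=SMOKE_DUST_TRACERS):
    """Return the required tracer names that the table does not define."""
    present = set(list_tracers(field_table_text))
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Illustrative excerpt only, not the real parm/field_table.RRFS_sas.
    sample = '''
 "TRACER", "atmos_mod", "sgs_tke"
           "longname",     "subgrid scale turbulent kinetic energy" /
 "TRACER", "atmos_mod", "smoke"
           "longname",     "smoke mixing ratio" /
'''
    print(missing_tracers(sample))
```

An empty result would mean that every required tracer has an entry; it says nothing about whether the physics actually needs them.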

@mkavulich mkavulich left a comment

@natalie-perlin Thanks for the update. For the record, these field_table entries should not be necessary when smoke and dust are turned off, but I won't hold up this PR further as I hope that this issue can be resolved properly through this discussion between me and @benkozi before the release.

@natalie-perlin
Collaborator Author

> @natalie-perlin Thanks for the update. For the record, these field_table entries should not be necessary when smoke and dust are turned off, but I won't hold up this PR further as I hope that this issue can be resolved properly through this discussion between me and @benkozi before the release.

Thank you for pointing to the related discussion. A few things could indeed be hard-coded, so it would be helpful to consult the developers who put together the options for RRFS_sas and get their perspective.

@natalie-perlin
Collaborator Author

The additional tests using RRFS_sas were successful after the recent change to field_table.RRFS_sas:

Took 0:17:54.632937; will no longer monitor.
All 1 experiments finished
Calculating core-hour usage and printing final summary
----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas_2025  COMPLETE              65.49
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE              65.49

Detailed summary written to /scratch2/NCEPDEV/stmp1/Natalie.Perlin/SRW1/expt_dirs/WE2E_summary_20250304211302.txt

and

Took 0:21:17.255083; will no longer monitor.
All 1 experiments finished
Calculating core-hour usage and printing final summary
----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot_2025  COMPLETE              21.07
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE              21.07

Detailed summary written to /scratch2/NCEPDEV/stmp1/Natalie.Perlin/SRW1/expt_dirs/WE2E_summary_20250304214716.txt

@natalie-perlin
Collaborator Author

(closed by accident, reopened immediately after)

@MichaelLueken MichaelLueken added the run_we2e_jenkins_coverage_tests SRW App automated CI testing with modified Jenkinsfile label Mar 5, 2025
@MichaelLueken
Collaborator

@natalie-perlin -

The grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas WE2E test failed on Hera GNU in the run_fcst task with the following error:

FATAL from PE 10: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability

The directory for the logs can be found - /scratch2/NAGAPE/epic/role.epic/jenkins/workspace/s-srweather-app_pipeline_PR-1201/hera/expt_dirs/grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas/log

The grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot WE2E test also failed on Hera GNU in the run_fcst task with the same error:

FATAL from PE 10: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability

The directory for the logs can be found - /scratch2/NAGAPE/epic/role.epic/jenkins/workspace/s-srweather-app_pipeline_PR-1201/hera/expt_dirs/grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot/log

When the PR was closed, Jenkins dropped the run, leading to the tests not being run on Gaea-C5. I have connected this PR to the sandbox Jenkins pipeline for running the SRW App, and am rerunning them now so that we can have a successful run on Gaea-C5.

@MichaelLueken
Collaborator

@natalie-perlin -

If the failures in grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas and grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot on Hera GNU will take a long time to debug, then we may need to remove these two tests from the coverage.hera.gnu.com suite, rename comprehensive to comprehensive.hera.intel, and create comprehensive.hera.gnu with these two tests removed. We'll also need to ensure that the release documentation notes that RRFS_SaS will only work with RRFS ICs/LBCs on GNU-built executables.

I'll let you know once the Gaea-C5 tests complete and then we can decide on a path forward at that time.
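
The suite split proposed in the comment above amounts to copying a suite listing while leaving out two test names. This is a minimal sketch, assuming a suite file is plain text with one test name per line; the function name `drop_tests` is hypothetical and no real suite file is read here.

```python
# Tests that failed on Hera GNU according to the comment above.
GNU_FAILING_TESTS = (
    "grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas",
    "grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot",
)


def drop_tests(suite_text, excluded=GNU_FAILING_TESTS):
    """Return the suite listing without the lines naming an excluded test."""
    excluded = set(excluded)
    kept = [line for line in suite_text.splitlines()
            if line.strip() not in excluded]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Illustrative three-line suite; a real suite would be read from disk.
    suite = "\n".join([
        "grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas",
        "grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas",
        "grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot",
    ])
    print(drop_tests(suite), end="")
```

Matching whole lines rather than substrings avoids dropping a test whose name merely contains one of the excluded names.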

@natalie-perlin
Collaborator Author

natalie-perlin commented Mar 5, 2025

> @natalie-perlin -
>
> The grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas WE2E test failed on Hera GNU in the run_fcst task with the following error:
>
> FATAL from PE 10: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability
>
> The directory for the logs can be found - /scratch2/NAGAPE/epic/role.epic/jenkins/workspace/s-srweather-app_pipeline_PR-1201/hera/expt_dirs/grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas/log
>
> The grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot WE2E test also failed on Hera GNU in the run_fcst task with the same error:
>
> FATAL from PE 10: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability
>
> The directory for the logs can be found - /scratch2/NAGAPE/epic/role.epic/jenkins/workspace/s-srweather-app_pipeline_PR-1201/hera/expt_dirs/grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot/log
>
> When the PR was closed, Jenkins dropped the run, leading to the tests not being run on Gaea-C5. I have connected this PR to the sandbox Jenkins pipeline for running the SRW App, and am rerunning them now so that we can have a successful run on Gaea-C5.

I looked at the failed run and compared it with the successful run built with the Intel compiler.
The failure occurs during or after writing the restart files and printing some diagnostics.
The diagnostic values reported before the restart files are written are OK and similar in both cases. However, the diagnostic values printed while the restart files are written make no sense and look totally wrong in both cases. With the GNU compiler, the error is reported just before the sigmab, smoke, dust, and coarsepm diagnostics are output. Incidentally, these are the same tracer variables that were added to field_table.RRFS_sas on top of the other tracers, and which were questioned by @mkavulich in #1201 (comment)

[UPDATE/CORRECTION after more debugging of the printouts: the large values that looked wrong are not individual values but sums across the domain, so very large or very small numbers are expected.]

Let me show the comparison between the two cases (in the following comment)...

@natalie-perlin
Collaborator Author

natalie-perlin commented Mar 5, 2025

Log files from the test grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas compiled with the two compilers:
Intel: grid_RRFS_CONUScompact_13km_RRFS_sas_Intel.run_fcst_mem000_2020081000.log
GNU: grid_RRFS_CONUScompact_13km_RRFS_sas_GNU.run_fcst_mem000_2020081000.log

Snippets from these log files showing reasonable diagnostics before the restart files are written:

For INTEL:

 PS max =    1030.006      min =    668.5580
 Mean specific humidity (mg/kg) above 75 mb=   3.144656
 Total surface pressure (mb) =    959.8101
 mean dry surface pressure =    956.8243
 Total Water Vapor (kg/m**2) =   30.31927
 --- Micro Phys water substances (kg/m**2) ---
 Total cloud water=  1.2329951E-02
 Total rain  water=  1.0958425E-02
 Total cloud ice  =  2.1680442E-03
 Total snow       =  0.1021250
 Total graupel    =  6.9936033E-04
 ---------------------------------------------
 TE ( Joule/m^2 * E9) =   2.657238
 UA_top max =    3.326609      min =   -43.13605
 UA max =    50.27020      min =   -66.57390
 VA max =    40.04541      min =   -43.69222
 W  max =    9.640343      min =   -2.380886
 Bottom w max =   0.2637150      min =  -0.5851771
 Bottom: w/dz max =   5.7890285E-03  min =  -1.2974470E-02
 DZ (m) max =   -43.53698      min =   -7996.342
 Bottom DZ (m) max =   -43.53698      min =   -48.85549
 TA max =    312.7988      min =    191.4844
 OM max =    15.94011      min =   -33.98397
 sphum max =   2.5438514E-02  min =   1.1871029E-08
 liq_wat max =   7.0170858E-03  min =   0.0000000E+00
 ice_wat max =   1.7198172E-04  min =   0.0000000E+00
 rainwat max =   3.3556593E-03  min =   0.0000000E+00
 snowwat max =   8.7828189E-03  min =   0.0000000E+00
 graupel max =   2.0635454E-03  min =   0.0000000E+00
 water_nc max =   1.2106742E+09  min =   0.0000000E+00
 ice_nc max =    6034922.      min =   0.0000000E+00
 rain_nc max =    152626.3      min =   0.0000000E+00
 o3mr max =   7.0201381E-06  min =   8.7957503E-08
 liq_aero max =   1.2403390E+10  min =    1044834.
 ice_aero max =    3072979.      min =    83.73697
 sgs_tke max =    98.52718      min =   9.9647114E-05
 sigmab max =   0.9538116      min =   0.0000000E+00
 smoke max =   3.9311839E-12  min =  -4.9303807E-32
 dust max =   3.9311839E-12  min =  -4.9303807E-32
 coarsepm max =   3.9311839E-12  min =  -4.9303807E-32
  ---isec,seconds        3600       21600
  gfs diags time since last bucket empty:    1.00000000000000      hrs
 in atmos_model update, fhzero=   1.00000000000000      fhour=   6.000000
  0.0000000E+00
 write out restart at n_atmsteps=         539  seconds=       21600
 integration length=   5.988889
PASS: fcstRUN phase 2, n_atmsteps =              539 time is         0.120679
  aft fcst run output time=       21600 FBcount=           8 na=         540

For GNU:

PS max =    1030.00586      min =    668.552307
 Mean specific humidity (mg/kg) above 75 mb=   3.14498115
 Total surface pressure (mb) =    959.703674
 mean dry surface pressure =    956.737183
 Total Water Vapor (kg/m**2) =   30.1104698
 --- Micro Phys water substances (kg/m**2) ---
 Total cloud water=   1.16923004E-02
 Total rain  water=   1.09344041E-02
 Total cloud ice  =   3.80886067E-03
 Total snow       =  0.112456001
 Total graupel    =   6.85933861E-04
 ---------------------------------------------
 TE ( Joule/m^2 * E9) =   2.65678787
 UA_top max =    3.07937288      min =   -42.7700386
 UA max =    50.2304802      min =   -66.6390457
 VA max =    39.7548981      min =   -43.6520157
 W  max =    10.4711180      min =   -2.70421958
 Bottom w max =   0.262817174      min =  -0.577513099
 Bottom: w/dz max =    5.76946745E-03  min =   -1.28027778E-02
 DZ (m) max =   -43.5313873      min =   -7996.33496
 Bottom DZ (m) max =   -43.5313873      min =   -48.8622131
 TA max =    312.811615      min =    185.741989
 OM max =    16.5468864      min =   -37.9224739
 sphum max =    2.54375171E-02  min =    1.18712258E-08
 liq_wat max =    6.94136741E-03  min =    0.00000000
 ice_wat max =    2.51571881E-04  min =    0.00000000
 rainwat max =    3.37443710E-03  min =    0.00000000
 snowwat max =    8.33155029E-03  min =    0.00000000
 graupel max =    2.41297297E-03  min =    0.00000000
 water_nc max =    1.21068915E+09  min =    0.00000000
 ice_nc max =    12226132.0      min =    0.00000000
 rain_nc max =    207424.625      min =    0.00000000
 o3mr max =    7.00858664E-06  min =    8.79201778E-08
 liq_aero max =    1.24026952E+10  min =    790926.750
 ice_aero max =    3072979.50      min =    83.7369766
 sgs_tke max =    96.5668182      min =    9.94388611E-05
 sigmab max =          Infinity  min =         -Infinity
 smoke max =    5.61190591E-12  min =   -7.34683969E-40
 dust max =    5.61190591E-12  min =   -7.34683969E-40
 coarsepm max =    5.61190591E-12  min =   -7.34683969E-40
 ---isec,seconds        3600       21600
  gfs diags time since last bucket empty:    1.0000000000000000      hrs
 in atmos_model update, fhzero=   1.0000000000000000      fhour=   6.00000000       0.00000000
 write out restart at n_atmsteps=         539  seconds=       21600 integration length=   5.98888874
PASS: fcstRUN phase 2, n_atmsteps =              539 time is         0.255279
  aft fcst run output time=       21600 FBcount=           8 na=         540
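
The non-finite sigmab range in the GNU snippet above is easy to miss by eye. This is a minimal sketch for flagging such lines automatically, assuming the `name max = X min = Y` layout seen in these logs; the function name `non_finite_fields` is hypothetical, and the sample lines are copied from the GNU log in this comment.

```python
import math
import re

# Matches range diagnostics such as
#   " sigmab max =   0.9538116      min =   0.0000000E+00"
RANGE_RE = re.compile(r"^\s*(.+?)\s+max\s*=\s*(\S+)\s+min\s*=\s*(\S+)")


def non_finite_fields(log_lines):
    """Return names of fields whose printed max or min is NaN or infinite."""
    flagged = []
    for line in log_lines:
        match = RANGE_RE.match(line)
        if not match:
            continue
        name, vmax, vmin = match.groups()
        try:
            values = (float(vmax), float(vmin))
        except ValueError:
            continue  # not a numeric range line
        if not all(math.isfinite(v) for v in values):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    gnu_lines = [
        " sgs_tke max =    96.5668182      min =    9.94388611E-05",
        " sigmab max =          Infinity  min =         -Infinity",
        " smoke max =    5.61190591E-12  min =   -7.34683969E-40",
    ]
    print(non_finite_fields(gnu_lines))
```

Python's `float` accepts the `Infinity`, `-Infinity`, and `NaN` spellings that appear in such output, so no special-casing is needed.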

What differed between the two compilers were the domain-wide diagnostics (sums of variables across the domain): the GNU run has trouble getting the sigmab tracer values.

from a log for Intel compiler:

fv_restart_end u    =         94555238797321
 fv_restart_end v    =       -287073339864390
 fv_restart_end w    =       -804312876631954
 fv_restart_end delp =       6747838341418396
 fv_restart_end phis =         75299195483211
 fv_restart_end pt   =       6653716402889439
 fv_restart_end q(prog) nq   =          17     44978875745104058
 fv_restart_end sphum =       5612476784143819
 fv_restart_end liq_wat =         53793424619715
 fv_restart_end ice_wat =        165856989231169
 fv_restart_end rainwat =        166244411791397
 fv_restart_end snowwat =        268293114067506
 fv_restart_end graupel =         53976310399506
 fv_restart_end water_nc =         71445093391257
 fv_restart_end ice_nc =        222539610350493
 fv_restart_end rain_nc =        207865452297167
 fv_restart_end o3mr =       5162549350763347
 fv_restart_end liq_aero =       7474436942872038
 fv_restart_end ice_aero =       6890171368760198
 fv_restart_end sgs_tke =       5728053129465091
 fv_restart_end sigmab =        270161926852600
 fv_restart_end smoke =       4210337278699585
 fv_restart_end dust =       4210337278699585
 fv_restart_end coarsepm =       4210337278699585
 ZS   3583.279      0.0000000E+00   507.9726
 PS    1030.006       668.5580       959.8101
 PS* max =    1030.006      min =    668.5580
 U  max =    50.70411      min =   -64.04237
 V  max =    41.54031      min =   -41.68392
 W  max =    9.640343      min =   -2.380886
 T  max =    312.7988      min =    191.4844
 sphum  2.5438514E-02  1.1871029E-08  1.2427761E-02
 liq_wat  7.0170858E-03  0.0000000E+00  7.2612288E-06
 ice_wat  1.7198172E-04  0.0000000E+00  0.0000000E+00
 rainwat  3.3556593E-03  0.0000000E+00  2.0496880E-06
 snowwat  8.7828189E-03  0.0000000E+00  0.0000000E+00
 graupel  2.0635454E-03  0.0000000E+00  0.0000000E+00
 water_nc  1.2106742E+09  0.0000000E+00   2035357.
 ice_nc   6034922.      0.0000000E+00  0.0000000E+00
 rain_nc   152626.3      0.0000000E+00   69.77973
 o3mr  7.0201381E-06  8.7957503E-08  9.5039326E-08
 liq_aero  1.2403390E+10   1044834.      6.2676973E+08
 ice_aero   3072979.       83.73697       68284.45
 sgs_tke   98.52718      9.9647114E-05  0.6085497
 sigmab  0.9538116      0.0000000E+00  3.7779235E-03
  smoke  3.9311839E-12 -4.9303807E-32  9.7542254E-13
 dust  3.9311839E-12 -4.9303807E-32  9.7542254E-13
 coarsepm  3.9311839E-12 -4.9303807E-32  9.7542254E-13
 MPP_DOMAINS_STACK high water mark=       84480

Tabulating mpp_clock statistics across    160 PEs...

from a log for GNU compiler:

fv_restart_end u    =        79586039562181
 fv_restart_end v    =      -282597543257808
 fv_restart_end w    =      -878812006546620
 fv_restart_end delp =      6747833189328867
 fv_restart_end phis =        75299195886392
 fv_restart_end pt   =      6653719932302147
 fv_restart_end q(prog) nq   =          17    44968673549739408
 fv_restart_end sphum =      5612521141829971
 fv_restart_end liq_wat =        52535943745418
 fv_restart_end ice_wat =       158426520765577
 fv_restart_end rainwat =       168570341772723
 fv_restart_end snowwat =       270735408983169
 fv_restart_end graupel =        58671201880286
 fv_restart_end water_nc =        69811891460157
 fv_restart_end ice_nc =       212427693727321
 fv_restart_end rain_nc =       210755774104853
 fv_restart_end o3mr =      5162551714123224
 fv_restart_end liq_aero =      7475109176285876
 fv_restart_end ice_aero =      6890246136495557
 fv_restart_end sgs_tke =      5721676368494276
 in wrt run, nfhour=   6.0000000000000000       cfhour=006
 fv_restart_end sigmab =       272617016832102
 fv_restart_end smoke =      4210672406412966
 fv_restart_end dust =      4210672406412966
 fv_restart_end coarsepm =      4210672406412966
 ZS   3583.27905       0.00000000       507.972626
 PS    1030.00586       668.552307       959.703674
 PS* max =    1030.00586      min =    668.552307
 U  max =    50.6850433      min =   -64.0808182
 ichunk2d,jchunk2d          -1          -1
 ichunk3d,jchunk3d,kchunk3d          -1          -1          -1
 in wrt run,filename=            1 ./dynf006.nc
 V  max =    41.1943817      min =   -41.8301888
 W  max =    10.4711180      min =   -2.70421958
 T  max =    312.811615      min =    185.741989
 sphum   2.54375171E-02   1.18712258E-08   1.24036120E-02
 liq_wat   6.94136741E-03   0.00000000       7.22720915E-06
 ice_wat   2.51571881E-04   0.00000000       0.00000000
 rainwat   3.37443710E-03   0.00000000       1.99346346E-06
 snowwat   8.33155029E-03   0.00000000       0.00000000
 graupel   2.41297297E-03   0.00000000       0.00000000
 water_nc   1.21068915E+09   0.00000000       2021423.50
 ice_nc   12226132.0       0.00000000       0.00000000
 rain_nc   207424.625       0.00000000       72.3030548
 o3mr   7.00858664E-06   8.79201778E-08   9.50611039E-08
 liq_aero   1.24026952E+10   790926.750       626512896.
 ice_aero   3072979.50       83.7369766       68771.9609
 sgs_tke   96.5668182       9.94388611E-05  0.604896128
  sgs_tke   96.5668182       9.94388611E-05  0.604896128

FATAL from PE    10: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability


FATAL from PE    11: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability

@MichaelLueken
Collaborator

The Jenkins automated tests have successfully passed on both Hera GNU:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
custom_ESGgrid_Central_Asia_3km_20250305231546                     COMPLETE             317.09
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_nemsio_2019061200_202503  COMPLETE              19.56
get_from_NOMADS_ics_FV3GFS_lbcs_FV3GFS_20250305231549              COMPLETE              27.39
grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_HRRR_2025030523  COMPLETE             450.88
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20250305231  COMPLETE              39.05
long_fcst_20250305231553                                           COMPLETE             465.69
MET_verification_only_vx_20250305231554                            COMPLETE               0.53
2019_halloween_storm_20250305231556                                COMPLETE             720.22
2020_jan_cold_blast_20250305231558                                 COMPLETE             727.85
vx-det_long-fcst_custom-vx-config_aiml-fourcastnet_20250305231559  COMPLETE               0.98
vx-det_long-fcst_custom-vx-config_aiml-panguweather_2025030523160  COMPLETE               0.99
vx-det_long-fcst_custom-vx-config_gfs_20250305231602               COMPLETE               1.00
vx-det_long-fcst_winter-wx_SRW-staged_20250305231603               COMPLETE               1.74
vx-det_multicyc_fcst-overlap_ncep-hrrr_20250305231605              COMPLETE               8.83
vx-det_multicyc_last-obs-00z_ncep-hrrr_20250305231606              COMPLETE               1.83
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            2783.63

and Hera Intel:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used
----------------------------------------------------------------------------------------------------
2019_memorial_day_heat_wave_20250305230700                         COMPLETE              67.55
custom_ESGgrid_Peru_12km_20250305230702                            COMPLETE              45.90
get_from_HPSS_ics_FV3GFS_lbcs_FV3GFS_fmt_grib2_2019061200_2025030  COMPLETE              13.67
get_from_HPSS_ics_GDAS_lbcs_GDAS_fmt_netcdf_2022040400_ensemble_2  COMPLETE            1768.20
get_from_HPSS_ics_HRRR_lbcs_RAP_20250305230706                     COMPLETE              23.25
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_plot_20  COMPLETE              29.76
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_RAP_suite_RAP_20250305230709  COMPLETE              18.53
grid_RRFS_CONUS_25km_ics_GSMGFS_lbcs_GSMGFS_suite_GFS_v15p2_20250  COMPLETE              13.81
grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas_202  COMPLETE              18.18
grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas_2025  COMPLETE              68.87
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_202503  COMPLETE             558.47
grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16_20250305  COMPLETE             734.91
grid_RRFS_CONUScompact_3km_ics_HRRR_lbcs_RAP_suite_HRRR_202503052  COMPLETE             755.23
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot_2025  COMPLETE              22.88
MET_ensemble_verification_only_vx_time_lag_20250305230720          COMPLETE               4.92
pregen_grid_orog_sfc_climo_20250305230722                          COMPLETE              15.44
vx-det_long-fcst_custom-vx-config_aiml-graphcast_20250305230724    COMPLETE               0.87
vx-det_multicyc_long-fcst-overlap_nssl-mpas_20250305230726         COMPLETE              12.95
vx-det_multicyc_long-fcst-no-overlap_nssl-mpas_20250305230727      COMPLETE              14.17
vx-det_multicyc_first-obs-00z_ncep-hrrr_20250305230729             COMPLETE               1.06
vx-det_multicyc_no-00z-obs_nssl-mpas_20250305230731                COMPLETE               1.02
vx-det_multicyc_no-fcst-overlap_ncep-hrrr_20250305230732           COMPLETE               2.42
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE            4192.06

Moving forward with merging this work now.

@MichaelLueken MichaelLueken merged commit 2a1c163 into ufs-community:develop Mar 6, 2025
natalie-perlin added a commit to natalie-perlin/ufs-srweather-app that referenced this pull request Mar 14, 2025
…ty#1201)

* RRFS_sas suite is added to the SRW physics suite options.

* New WE2E tests prepared:
  * grid_RRFS_CONUScompact_13km_ics_HRRR_lbcs_RAP_suite_RRFS_sas
  * grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_RRFS_sas_plot
  * grid_RRFS_CONUScompact_25km_ics_RRFS_lbcs_RRFS_suite_RRFS_sas

---------

Co-authored-by: Natalie Perlin <Natalie.Perlin@noaa.gov>
Co-authored-by: Brandon Selbig <156852197+selbigmtnwx23@users.noreply.github.com>
Co-authored-by: Michael Lueken <63728921+MichaelLueken@users.noreply.github.com>
Co-authored-by: Gillian Petro <96886803+gspetro-NOAA@users.noreply.github.com>

Labels

Priority: HIGH release This PR/issue is related to a release branch run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests run_we2e_jenkins_coverage_tests SRW App automated CI testing with modified Jenkinsfile

Development

Successfully merging this pull request may close these issues.

Add a new physics suite RRFS_sas to SRW options

5 participants