Bug fix to explicit in-canopy vertical diffusion of tracers #3186

Merged
gspetro-NOAA merged 21 commits into ufs-community:develop from noaa-oar-arl:fix_canopy_vdf on Apr 23, 2026

iri01 commented Apr 8, 2026

Contributor

Commit Queue Requirements:

  • This PR addresses a relevant WM issue (if not, create an issue).
  • All subcomponent pull requests (if any) have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines), preferably on Ursa (Derecho or Hercules are acceptable alternatives). Exceptions: documentation-only PRs, CI-only PRs, etc.
    • Commit log file w/full results from RT suite run (if applicable).
    • Verify that test_changes.list indicates which tests, if any, are changed by this PR. Commit test_changes.list, even if it is empty.
  • Fill out all sections of this template.

Description:

Commit Message:

* UFSWM - hash update for ccpp changes
  * UFSATM - hash update for ccpp changes
    * ccpp-physics - Bug fix to explicit in-canopy vertical diffusion of tracers

Priority:

  • Critical Bugfix: expected to be part of operational AQMv8 implementation, planned for retro runs starting in May

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Documentation:

  • Documentation update NOT required.
    • Explanation: bug fix

Changes

Regression Test Changes (Please commit test_changes.list):

  • Ongoing RT
  • Baseline Changes:
    Canopy ON regression test baselines are expected to change because the fix removes unwanted effects of explicit in-canopy diffusion on hpbl/kpbl, which alters the overall canopy effect on predicted meteorology and chemistry.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Orion
    • Hercules
    • GaeaC6
    • Derecho
    • Ursa
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@drnimbusrain

Contributor

@gspetro-NOAA We kindly request a high-priority review of this bug fix. We are targeting getting this fix into UWM for UFS-AQMv8 operations, which plans to start retro runs in May. This PR depends on UFSATM PR NOAA-EMC/ufsatm#1091 and ccpp PR ufs-community/ccpp-physics#371.

BrianCurtis-NOAA added the Baseline Updates, Priority: Critical, UFSATM, and CCPP labels Apr 8, 2026
@BrianCurtis-NOAA

Collaborator

@iri01 @drnimbusrain Please run the full suite on either Ursa or Hercules, and push the log and test_changes.list.

The description listed atmos-cubed-sphere, but I don't see a PR there or any changes for it, so I've removed it. Please add info on ACS if there are indeed changes there.

Once the full testing is complete and the CCPP-Physics and UFSATM repos get approvals from the CMs, this will get onto the commit queue ASAP.

iri01 commented Apr 8, 2026

Contributor Author

Thank you, @BrianCurtis-NOAA. Yes, I'm running the RTs on Ursa and will update the info soon.

gspetro-NOAA moved this from Evaluating to Pre-testing required in PRs to Process Apr 9, 2026
@jkbk2004

Collaborator

@iri01 @drnimbusrain @grantfirl The cpld_debug_sfs_intel case crashes on Derecho:
fv3.exe 0000000005EBC9D1 sfccyc_module_mp_ 6967 sfcsub.F
fv3.exe 0000000005E81EA0 sfccyc_module_mp_ 6238 sfcsub.F
fv3.exe 0000000005D0D71E sfccyc_module_mp_ 879 sfcsub.F
fv3.exe 0000000005CB2EDB gcycle_mod_mp_gcy 263 gcycle.F90
fv3.exe 0000000005A10C42 gfs_phys_time_var 927 GFS_phys_time_vary.fv3.F90
fv3.exe 0000000005514AEB _v17_coupled_p8_u 1332 ccpp_fv3_gfs_v17_coupled_p8_ugwpv1_time_vary_cap.F90

iri01 commented Apr 13, 2026

Contributor Author

@jkbk2004, thank you for letting us know.
@BrianCurtis-NOAA @gspetro-NOAA, the RT for "cpld_debug_sfs_intel" passes on Ursa. Do you want us to run any other RTs?

Ursa path:
/scratch3/NAGAPE/gpu-arl-gpu/Irena.Ivanova/ufs-weather-mod/fix_canopy_vdf/tests/logs/RegressionTests_ursa.log
PASS -- TEST 'cpld_debug_sfs_intel' [12:32, 08:42](2439 MB)
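
For anyone comparing results across platforms, a result line like the one above can be pulled out of a regression test log with a few lines of Python. This is a minimal sketch, not part of the UFS tooling: the function name `parse_rt_line` and the field names are hypothetical, the `FAIL` status is an assumption (only a `PASS` line appears in this thread), and since the thread does not say what the two bracketed times measure, they are simply converted to seconds and labeled first and second.

```python
import re
from typing import NamedTuple, Optional

# Pattern modeled on the single PASS line quoted in this thread.
# The FAIL alternative is an assumption about how failures are reported.
RT_LINE = re.compile(
    r"^(?P<status>PASS|FAIL)\s+--\s+TEST\s+'(?P<name>[^']+)'"
    r"\s+\[(?P<t1>\d+:\d+),\s*(?P<t2>\d+:\d+)\]\((?P<mem>\d+)\s*MB\)"
)


class RTResult(NamedTuple):
    status: str
    name: str
    first_time_s: int   # first bracketed time, in seconds
    second_time_s: int  # second bracketed time, in seconds
    memory_mb: int


def _to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_rt_line(line: str) -> Optional[RTResult]:
    """Return the parsed result, or None if the line is not a result line."""
    match = RT_LINE.match(line.strip())
    if match is None:
        return None
    return RTResult(
        status=match["status"],
        name=match["name"],
        first_time_s=_to_seconds(match["t1"]),
        second_time_s=_to_seconds(match["t2"]),
        memory_mb=int(match["mem"]),
    )


result = parse_rt_line("PASS -- TEST 'cpld_debug_sfs_intel' [12:32, 08:42](2439 MB)")
print(result.status, result.name, result.memory_mb)
```

Running `parse_rt_line` over every line of a log and keeping the non-`None` results gives a per-test table that can be diffed between machines.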

@drnimbusrain

Contributor

@gspetro-NOAA @jkbk2004 Yes, "cpld_debug_sfs_intel" passes on Ursa, as shown in @iri01's RT above. We are not sure why it fails on Derecho, and we do not have access to that system.

@gspetro-NOAA

Collaborator

I will run the test on Derecho and see if the test will pass on rerun. Derecho can be finicky. 🙄 If not, I'll try to provide more error info.

@drnimbusrain

Contributor

> I will run the test on Derecho and see if the test will pass on rerun. Derecho can be finicky. 🙄 If not, I'll try to provide more error info.

Thank you, @gspetro-NOAA! I hope we can move it forward.

gspetro-NOAA moved this from Pre-testing required to Waiting for Reviews (subcomponent) in PRs to Process Apr 15, 2026
gspetro-NOAA moved this from Waiting for Reviews (subcomponent) to Schedule in PRs to Process Apr 15, 2026
@gspetro-NOAA

Collaborator

@iri01 Could you sync w/develop?

gspetro-NOAA commented Apr 16, 2026

Collaborator

Ok, I am also getting that crash on Derecho in cpld_debug_sfs_intel. The first place an error appears in the err file shows:

dec1365.hsn.de.hpc.ucar.edu: rank 10 died from signal 9
forrtl: error (78): process killed (SIGTERM)
Image              PC                Routine            Line        Source
libc.so.6          000015493FE1D900  Unknown               Unknown  Unknown
libc.so.6          000015493FEDAC97  __sched_yield         Unknown  Unknown
libmpi_intel.so.1  00001549409D6AF8  Unknown               Unknown  Unknown
libmpi_intel.so.1  00001549409ED3FC  MPI_Probe             Unknown  Unknown
fv3.exe            00000000010B0319  Unknown               Unknown  Unknown
fv3.exe            0000000000DBE8C1  Unknown               Unknown  Unknown
fv3.exe            0000000000DBE337  Unknown               Unknown  Unknown
...

I'm wondering if it might be an MPI issue?
I also see an exit message:

dec1370.hsn.de.hpc.ucar.edu: rank 317 exited with code 1   
forrtl: error (78): process killed (SIGTERM)            
Image              PC                Routine            Line        Source 
libc.so.6          000014AE64E1D900  Unknown               Unknown  Unknown
fv3.exe            0000000005E3D156  sfccyc_module_mp_        5105  sfcsub.F
fv3.exe            0000000005EBBB7A  sfccyc_module_mp_        6949  sfcsub.F
fv3.exe            0000000005E81EA0  sfccyc_module_mp_        6238  sfcsub.F
fv3.exe            0000000005D0D71E  sfccyc_module_mp_         879  sfcsub.F
fv3.exe            0000000005CB2EDB  gcycle_mod_mp_gcy         263  gcycle.F90
fv3.exe            0000000005A10C42  gfs_phys_time_var         927  GFS_phys_time_vary.fv3.F90
fv3.exe            0000000005514AEB  _v17_coupled_p8_u        1332  ccpp_fv3_gfs_v17_coupled_p8_ugwpv1_time_vary_cap.F90
fv3.exe            00000000047A609D  ccpp_static_api_m         148  ccpp_static_api.F90
fv3.exe            000000000479888F  ccpp_driver_mp_cc         133  CCPP_driver.F90
fv3.exe            0000000001A0FF2A  atmos_model_mod_m         287  atmos_model.F90
fv3.exe            0000000001459CB4  module_fcst_grid_        1486  module_fcst_grid_comp.F90

The out file stops here:

i=          50  slmask(i)=   1.00000000000000       outlon=
   305.586423145820       outlat=  -34.9855817437601
  unable to interpolate.  filled with nearest point value at           22  point
 s
  in fixrdc for mon=           4  fngrib=IMS-NIC.blended.ice.monthly.clim.grb
  file IMS-NIC.blended.ice.monthly.clim.grb opened. unit=        9998
  first grib record.
  kpds( 1-10)=           7         120         255         192          91
         102           0          20           1           1
  kpds(11-20)=           0           0           1           0           0
          10          31           1           2           0
  kpds(21-  )=          21           2
  input grib file dates=          20           4           1           0
 imax,jmax,ijmax=        7200        3600    25920000
  kgds( 1-12)=           0        7200        3600       89975          25
         192      -89975      359975          50          50           0
           0
  kgds(13-22)=          -1          -1          -1          -1          -1
          -1           0         255          -1          -1
 lat/lon grid
 imax,jmax,ijmax,dlon,dlat,ijordr,wlon,rnlat=
        7200        3600    25920000  5.000000000000000E-002
 -5.000000000000000E-002 T  2.500000000000000E-002   89.9750000000000

What other information would it be useful for you to see? I can maybe move my whole run directory to a platform where you have access if you let me know where is best.

In case others have Derecho access, my run_dir is at /glade/derecho/scratch/gpetro/FV3_RT/rt_69698.
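
When triaging a traceback like the one above, it can help to drop the `Unknown` frames and keep only the rows that carry a source file and line number. The following is a rough sketch under stated assumptions: the helper `resolved_frames` and the `Frame` type are hypothetical names, and the parser assumes each frame row has exactly the five whitespace-separated columns shown in the header (Image, PC, Routine, Line, Source).

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    image: str
    pc: str
    routine: str
    line: int
    source: str


def resolved_frames(traceback_text: str) -> List[Frame]:
    """Keep only traceback rows that have a numeric source line."""
    frames = []
    for raw in traceback_text.splitlines():
        parts = raw.split()
        # Skip the header row, 'Unknown' frames, and any non-traceback lines:
        # a usable row has five columns with a numeric fourth column.
        if len(parts) != 5 or not parts[3].isdigit():
            continue
        image, pc, routine, line, source = parts
        frames.append(Frame(image, pc, routine, int(line), source))
    return frames


# Two rows copied from the rank 317 traceback in this thread.
sample = """\
libc.so.6          000014AE64E1D900  Unknown               Unknown  Unknown
fv3.exe            0000000005E3D156  sfccyc_module_mp_        5105  sfcsub.F
"""
frames = resolved_frames(sample)
print(frames[0].source, frames[0].line)  # sfcsub.F 5105
```

The first resolved frame is the innermost one with debug information, which for the rank 317 traceback is in sfcsub.F.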

gspetro-NOAA moved this from Schedule to Review in PRs to Process Apr 16, 2026
iri01 commented Apr 16, 2026

Contributor Author

> @iri01 Could you sync w/develop?

@gspetro-NOAA: We just synced.

@drnimbusrain

Contributor

Thank you, @gspetro-NOAA, for the additional information. @iri01 has synced our fork/branch with upstream and is running another RT on Ursa. However, this crash is strange and does not seem related to the canopy changes to the UFSATM/ccpp PBL scheme in this PR.

Have other recent PRs passed cpld_debug_sfs_intel on Derecho?

Comment thread tests/logs/OpnReqTests_control_p8_ursa.log
gspetro-NOAA removed the In Testing label Apr 21, 2026
@gspetro-NOAA

Collaborator

Testing has completed successfully; leaving a note in sub-PRs.

@drnimbusrain

Contributor

@gspetro-NOAA OK, saw that UFSATM was also merged, so I reverted .gitmodules and submodule here. Thank you.

gspetro-NOAA merged commit b075cdc into ufs-community:develop Apr 23, 2026
10 checks passed
github-project-automation Bot moved this from Schedule to Done in PRs to Process Apr 23, 2026

Labels

Baseline Updates, CCPP, Priority: Critical, Ready for Commit Queue, UFSATM

Projects

Archived in project

8 participants