Use thinning for ice obs in LETKF #1814

Merged: guillaumevernieres merged 4 commits into develop from feature/letkf_thinning on Jul 30, 2025

Conversation

@shlyaeva
Collaborator

@shlyaeva shlyaeva commented Jul 24, 2025

Description

Use thinning for AMSR2 ice observations in LETKF. This both makes sense scientifically and brings the runtime down to ~8 min on Hera with 480 MPI tasks.

Since thinning is done with the "reduce obs space" option, we need to run it only once. Hence the switch to the linear observer here: in this case, filters are run only on the ensemble mean.
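
As a rough illustration of why running the thinning once is enough, the sketch below picks the observations to keep a single time and then reuses that selection for every ensemble member. It is a toy box thinning in plain Python, not the UFO filter configured in jcb-gdas; the names `thin_once` and `apply_to_members` and the 1-degree mesh are hypothetical.

```python
def thin_once(lats, lons, mesh_deg):
    """Keep the first observation found in each mesh_deg x mesh_deg box.

    Returns the indices of the retained observations, in input order.
    """
    kept = {}
    for i, (lat, lon) in enumerate(zip(lats, lons)):
        cell = (int((lat + 90.0) // mesh_deg), int((lon % 360.0) // mesh_deg))
        kept.setdefault(cell, i)  # only the first obs in a cell is stored
    return sorted(kept.values())


def apply_to_members(keep, member_hofx):
    """Subset each member's H(x) with the selection computed once."""
    return [[hofx[i] for i in keep] for hofx in member_hofx]


# Four hypothetical ice observations; the first two share a 1-degree box.
lats = [70.0, 70.1, 75.0, 70.2]
lons = [10.0, 10.2, 10.0, 200.0]
keep = thin_once(lats, lons, mesh_deg=1.0)

# Two hypothetical ensemble members evaluated at the same four locations.
members = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
thinned = apply_to_members(keep, members)
```

Because the selection depends only on observation locations, every member ends up with the same reduced set, so the filter does not need to be repeated per member.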

Companion PRs

NOAA-EMC/jcb-gdas#153

Issues

Resolves #1542

Automated CI tests to run in Global Workflow

  • atm_jjob
  • C96C48_ufs_hybatmDA
  • C96C48_hybatmsnowDA
  • C96_gcafs_cycled
  • C48mx500_3DVarAOWCDA
  • C48mx500_hybAOWCDA
  • C96C48_hybatmDA

@shlyaeva shlyaeva self-assigned this Jul 24, 2025
@shlyaeva shlyaeva added the hera-GW-RT (Queue for automated testing with global-workflow on Hera) and orion-GW-RT (Queue for automated testing with global-workflow on Orion) labels Jul 24, 2025
@emcbot emcbot added the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) and orion-GW-RT-Running (Automated testing with global-workflow running on Orion) labels and removed the hera-GW-RT (Queue for automated testing with global-workflow on Hera) and orion-GW-RT (Queue for automated testing with global-workflow on Orion) labels Jul 24, 2025
@emcbot

emcbot commented Jul 24, 2025

Automated GW-GDASApp Testing Results:
Machine: hera

Start: Thu Jul 24 21:35:16 UTC 2025 on hfe11
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Thu Jul 24 22:10:47 UTC 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp -E atm_jjob|C96C48_ufs_hybatmDA|C96C48_hybatmsnowDA|C96_gcafs_cycled|C96C48_hybatmDA
Tests:                                 *SUCCESS*
Tests: Completed at Thu Jul 24 22:30:13 UTC 2025
Tests: 100% tests passed, 0 tests failed out of 65

@emcbot emcbot added the hera-GW-RT-Passed (Automated testing with global-workflow successful on Hera) label and removed the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label Jul 24, 2025
@emcbot

emcbot commented Jul 24, 2025

Automated GW-GDASApp Testing Results:
Machine: orion

Start: Thu Jul 24 04:41:17 PM CDT 2025 on orion-login-1.hpc.msstate.edu
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Thu Jul 24 05:36:55 PM CDT 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp -E C96_gcafs_cycled|atm_jjob|C96C48_ufs_hybatmDA|C96C48_hybatmsnowDA|C96_gcafs_cycled|C96C48_hybatmDA
Tests:                                  *Failed*
Tests: Failed at Thu Jul 24 06:30:09 PM CDT 2025
Tests: 88% tests passed, 7 tests failed out of 56
	2133 - test_gdasapp_C48mx500_hybAOWCDA_gdas_fcst_202103241800 (Failed)
	2135 - test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_fcst_202103241800 (Failed)
	2140 - test_gdasapp_C48mx500_hybAOWCDA_gdas_marineanlinit_202103250000 (Failed)
	2141 - test_gdasapp_C48mx500_hybAOWCDA_gdas_marineanlvar_202103250000 (Failed)
	2142 - test_gdasapp_C48mx500_hybAOWCDA_gdas_ocnanalecen_202103250000 (Failed)
	2143 - test_gdasapp_C48mx500_hybAOWCDA_gdas_marineanlchkpt_202103250000 (Failed)
	2144 - test_gdasapp_C48mx500_hybAOWCDA_gdas_marineanlfinal_202103250000 (Failed)
Tests: see output at /work2/noaa/da/role-da/CI/orion/GDASApp/workflow/PR/1814/global-workflow/sorc/gdas.cd/build/log.ctest

@emcbot emcbot added the orion-GW-RT-Failed (Automated testing with global-workflow failed on Orion) label and removed the orion-GW-RT-Running (Automated testing with global-workflow running on Orion) label Jul 24, 2025
@RussTreadon-NOAA
Contributor

@shlyaeva and @guillaumevernieres

Orion and Hercules are suffering from sluggish /work and /work2 performance. A check of the gdas_fcst and enkfgdas_fcst job log files for C48mx500_hybAOWCDA shows that the jobs were killed for exceeding the specified wall clock limit:

orion-login-3:/work2/noaa/da/role-da/CI/orion/GDASApp/workflow/PR/1814/global-workflow/sorc/gdas.cd/build/gdas/test/gw-ci/C48mx500_hybAOWCDA/COMROOT/C48mx500_hybAOWCDA/logs/2021032418$ grep -r "DUE TO TIME" .
./enkfgdas_fcst_mem001.log.0:slurmstepd: error: *** JOB 20652372 ON orion-12-05 CANCELLED AT 2025-07-24T18:00:09 DUE TO TIME LIMIT ***
./enkfgdas_fcst_mem002.log.0:slurmstepd: error: *** JOB 20652361 ON orion-11-65 CANCELLED AT 2025-07-24T17:59:39 DUE TO TIME LIMIT ***
./gdas_fcst_seg0.log:slurmstepd: error: *** JOB 20652355 ON orion-05-68 CANCELLED AT 2025-07-24T17:59:39 DUE TO TIME LIMIT ***
./enkfgdas_fcst_mem002.log:slurmstepd: error: *** JOB 20652724 ON orion-05-68 CANCELLED AT 2025-07-24T18:20:12 DUE TO TIME LIMIT ***
./enkfgdas_fcst_mem001.log:slurmstepd: error: *** JOB 20652727 ON orion-06-68 CANCELLED AT 2025-07-24T18:20:41 DUE TO TIME LIMIT ***
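
The same check can be summarized per log file with a short script. The following is a minimal sketch that parses `grep`-style lines of the form shown above; the helper name `time_limit_kills` is hypothetical, and the pattern assumes the exact slurmstepd wording seen in these logs.

```python
import re
from collections import defaultdict

# Matches "<log file>:slurmstepd: error: *** JOB <id> ON <node> CANCELLED AT <time> DUE TO TIME LIMIT ***"
TIME_LIMIT = re.compile(
    r"^(?P<log>[^:]+):slurmstepd: error: \*\*\* JOB (?P<job>\d+) "
    r"ON (?P<node>\S+) CANCELLED AT (?P<when>\S+) DUE TO TIME LIMIT \*\*\*$"
)


def time_limit_kills(grep_lines):
    """Group wall-clock cancellations by the log file they were found in."""
    kills = defaultdict(list)
    for line in grep_lines:
        match = TIME_LIMIT.match(line.strip())
        if match:
            kills[match["log"]].append(
                (int(match["job"]), match["node"], match["when"])
            )
    return dict(kills)


# Two of the lines reported above, used as sample input.
sample = [
    "./enkfgdas_fcst_mem001.log.0:slurmstepd: error: *** JOB 20652372 ON orion-12-05 CANCELLED AT 2025-07-24T18:00:09 DUE TO TIME LIMIT ***",
    "./gdas_fcst_seg0.log:slurmstepd: error: *** JOB 20652355 ON orion-05-68 CANCELLED AT 2025-07-24T17:59:39 DUE TO TIME LIMIT ***",
]
kills = time_limit_kills(sample)
```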

Collaborator

@JohnSteffen-NOAA JohnSteffen-NOAA left a comment

We get the split altimeters with this too!

@guillaumevernieres
Contributor

I just merged jcb-gdas. @shlyaeva, can you update the hash?
We should probably rerun with all the tests on.

@shlyaeva
Collaborator Author

I think we probably don't need all the tests: the jcb-gdas hash was recently updated on develop, and now the only updates are ocean-related. What do you think?

@shlyaeva shlyaeva added the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label Jul 25, 2025
@emcbot emcbot added the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label and removed the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label Jul 25, 2025
@emcbot

emcbot commented Jul 26, 2025

Automated GW-GDASApp Testing Results:
Machine: hera

Start: Fri Jul 25 17:58:34 UTC 2025 on hfe04
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Fri Jul 25 18:38:47 UTC 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp
Tests:                                  *Failed*
Tests: Failed at Sat Jul 26 10:55:07 UTC 2025
Tests: 99% tests passed, 2 tests failed out of 156
	2148 - test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_marineanlletkf_202103250000 (Timeout)
	2151 - test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_marineanlecen_202103250000 (Timeout)
Tests: see output at /scratch3/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1814/global-workflow/sorc/gdas.cd/build/log.ctest

@emcbot emcbot added the hera-GW-RT-Failed (Automated testing with global-workflow failed on Hera) label Jul 26, 2025
@emcbot emcbot removed the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label Jul 26, 2025
@guillaumevernieres guillaumevernieres added the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label and removed the hera-GW-RT-Failed (Automated testing with global-workflow failed on Hera), orion-GW-RT-Failed (Automated testing with global-workflow failed on Orion), and hera-GW-RT-Passed (Automated testing with global-workflow successful on Hera) labels Jul 26, 2025
@emcbot emcbot added the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label and removed the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label Jul 26, 2025
@emcbot

emcbot commented Jul 27, 2025

Automated GW-GDASApp Testing Results:
Machine: hera

Start: Sat Jul 26 17:08:30 UTC 2025 on hfe07
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Sat Jul 26 17:42:09 UTC 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp -E atm_jjob|C96C48_ufs_hybatmDA|C96_gcafs_cycled|C96C48_hybatmDA
Tests:                                  *Failed*
Tests: Failed at Sun Jul 27 09:54:05 UTC 2025
Tests: 98% tests passed, 2 tests failed out of 86
	2148 - test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_marineanlletkf_202103250000 (Timeout)
	2151 - test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_marineanlecen_202103250000 (Timeout)
Tests: see output at /scratch3/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1814/global-workflow/sorc/gdas.cd/build/log.ctest

@emcbot emcbot added the hera-GW-RT-Failed (Automated testing with global-workflow failed on Hera) label and removed the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label Jul 27, 2025
@RussTreadon-NOAA
Contributor

@guillaumevernieres and @shlyaeva

A check of Hera /scratch3/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1814/global-workflow/sorc/gdas.cd/build/log.ctest shows

85/86 Test #2148: test_gdasapp_C48mx500_hybAOWCDA_enkfgdas_marineanlletkf_202103250000 ...***Timeout 28800.03 sec
booting  for cycle 202103250000
C48mx500_hybAOWCDA_enkfgdas_marineanlletkf_202103250000 is in unrecognized state: . Rewinding...
202103250000: Rewind tasks for 202103250000 in state "activated" since 2025-07-26 17:42:46
202103250000: No tasks to rewind.

A check of C48mx500_hybAOWCDA confirms that enkfgdas_marineanlletkf is not defined. Only gdas_marineanlletkf is in the xml file.

It looks like this GDASApp PR needs to be run inside @DavidNew-NOAA's g-w PR #3882. Item 2 in the description for g-w PR #3882 states

2. The `ocnanalecen` and `marineanlletkf` will now be `enkfgdas` rather than `gdas` jobs.

I think we are dealing with a case of GDASApp being ahead of g-w develop.
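
A check like the one described above can be scripted. The sketch below collects the task names defined in a workflow XML document and tests for the one ctest was waiting on; the XML string is a hypothetical, heavily simplified stand-in for the real workflow file, and `defined_tasks` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the experiment's workflow XML.
WORKFLOW_XML = """
<workflow>
  <task name="gdas_marineanlletkf"/>
  <metatask name="enkfgdas_fcst">
    <task name="enkfgdas_fcst_mem001"/>
  </metatask>
</workflow>
"""


def defined_tasks(xml_text):
    """Return the set of task names, including tasks nested in metatasks."""
    root = ET.fromstring(xml_text.strip())
    return {task.get("name") for task in root.iter("task")}


tasks = defined_tasks(WORKFLOW_XML)
is_missing = "enkfgdas_marineanlletkf" not in tasks
```

With this sample input, `is_missing` is true while `gdas_marineanlletkf` is present, which mirrors the mismatch between the test name and the tasks in the xml file.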

@DavidNew-NOAA
Collaborator

@RussTreadon-NOAA Indeed, GDASApp is ahead of GW. I tested my NOAA-EMC/global-workflow#3882 PR for some standard cases, which passed, and then I merged its companion, #1803. However, the 96C48mx500_S2SW_cyc_gfs case, which I didn't test, is failing in CI, so I need to work out that bug this morning and hopefully have CI re-run today.

@shlyaeva
Collaborator Author

No problem, this PR isn't urgent, and is not needed for v17.

@shlyaeva shlyaeva added the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label and removed the hera-GW-RT-Failed (Automated testing with global-workflow failed on Hera) label Jul 30, 2025
@emcbot emcbot added the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label and removed the hera-GW-RT (Queue for automated testing with global-workflow on Hera) label Jul 30, 2025
@emcbot

emcbot commented Jul 30, 2025

Automated GW-GDASApp Testing Results:
Machine: hera

Start: Wed Jul 30 17:06:21 UTC 2025 on hfe11
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Wed Jul 30 17:40:22 UTC 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp -E atm_jjob|C96C48_ufs_hybatmDA|C96_gcafs_cycled|C96C48_hybatmDA
Tests:                                 *SUCCESS*
Tests: Completed at Wed Jul 30 18:27:48 UTC 2025
Tests: 100% tests passed, 0 tests failed out of 86

@emcbot emcbot added the hera-GW-RT-Passed (Automated testing with global-workflow successful on Hera) label and removed the hera-GW-RT-Running (Automated testing with global-workflow running on Hera) label Jul 30, 2025
@guillaumevernieres guillaumevernieres merged commit 0e9024e into develop Jul 30, 2025
11 checks passed
@guillaumevernieres guillaumevernieres deleted the feature/letkf_thinning branch July 30, 2025 20:14
DavidNew-NOAA pushed a commit that referenced this pull request Jan 16, 2026
# Description

Use thinning for AMSR2 ice observations in LETKF. This both makes sense
scientifically and brings the runtime down to ~8 min on Hera with 480 MPI
tasks.

Since thinning is done with the "reduce obs space" option, we need to run it
only once. Hence the switch to the linear observer here: in this case,
filters are run only on the ensemble mean.

# Companion PRs

NOAA-EMC/jcb-gdas#153

# Issues

Resolves #1542

# Automated CI tests to run in Global Workflow
<!-- Which Global Workflow CI tests are required to adequately test this
PR? -->
- [ ] atm_jjob <!-- JEDI atm single cycle DA !-->
- [ ] C96C48_ufs_hybatmDA <!-- JEDI atm cycled DA !-->
- [x] C96C48_hybatmsnowDA <!-- JEDI snow cycled DA !-->
- [ ] C96_gcafs_cycled <!-- JEDI aerosol cycled DA !-->
- [x] C48mx500_3DVarAOWCDA <!-- JEDI low-res marine 3DVar cycled DA !-->
- [x] C48mx500_hybAOWCDA <!-- JEDI marine hybrid envar cycled DA !-->
- [ ] C96C48_hybatmDA <!-- GSI atm cycled DA !-->

---------

Co-authored-by: Anna Shlyaeva <anna.v.shlyaeva@noaa.gov>
Labels

hera-GW-RT-Passed (Automated testing with global-workflow successful on Hera)

Development

Successfully merging this pull request may close these issues.

Tune the number of obs per local volume in marine LETKF

6 participants