Fix GW jjob tests for upcoming GW PR #2420 #1041

Merged
CoryMartin-NOAA merged 3 commits into develop from feature/update_jjob_tests on Apr 15, 2024

@DavidNew-NOAA
Collaborator

@DavidNew-NOAA DavidNew-NOAA commented Apr 12, 2024

This PR addresses issue #1011: the "gdasatmanlvar" jjob test fails because the "gdasatmanlvar" run script was renamed, and the "gdasatmanlfinal" jjob test will start failing once the upcoming Global Workflow PR #2420 merges. This PR fixes the script name in the "gdasatmanlvar" test and adds a new test for "gdasatmanlfv3inc", which will fix the issue with "gdasatmanlfinal".

The "gdasatmanlfv3inc" and "gdasatmanlfinal" tests won't pass yet in GW develop, but they will after #2420 merges GW feature/jediinc2fv3.

This PR is just a verbatim copy of @RussTreadon-NOAA's work, taken from his comment here. I re-ran the new tests, and they also passed for me in GW feature/jediinc2fv3.

@RussTreadon-NOAA
Contributor

While test_gdasapp_atm_jjob_var_inc runs to completion on Orion, it fails on Hera. Below is the Hera traceback:

0: Info     :        Registry status:   0
2: corrupted size vs. prev_size
5: corrupted size vs. prev_size
0:
0: Run: Finishing gdasapp::fv3inc
0: OOPS_STATS

...

0: OOPS_STATS util::Timers::measured                              :     16002.43       1        16002.4301
0: OOPS_STATS ------------------------------------ Timing Statistics -------------------------------------
srun: error: h20c45: task 2: Aborted (core dumped)
srun: Terminating StepId=58208170.0
0: slurmstepd: error: *** STEP 58208170.0 ON h20c45 CANCELLED AT 2024-04-13T10:59:40 ***
srun: error: h20c45: tasks 0,3-4: Terminated
srun: error: h20c45: task 5: Aborted (core dumped)
srun: error: h20c45: task 1: Terminated

The printout

2: corrupted size vs. prev_size
5: corrupted size vs. prev_size

is not in the Orion job log file.

I reran the job on Hera with --mem=0 to request all memory on the Hera node. test_gdasapp_atm_jjob_var_inc still failed in the same way.

test_gdasapp_fv3jedi_fv3inc also failed on Hera despite feature/update_jjob_tests including GDASApp PR #1039. The Hera traceback for test_gdasapp_fv3jedi_fv3inc contains the following

Test     : FV3 Increment:

Test     : ----------------------------------------------------------------------------------------------------
Test     : Increment print | number of fields = 9 | cube sphere face size: C12
Test     : eastward_wind                                | Min:-4.3422836419308410e+00 Max:+1.2320940067737499e+01 RMS:+3.0957235443709130e-01
Test     : northward_wind                               | Min:-4.1090470888107049e+00 Max:+5.4552721209750796e+00 RMS:+3.1062842460180656e-01
Test     : air_temperature                              | Min:-5.2980343087781989e-01 Max:+5.1811022097894011e-01 RMS:+3.5920751813835923e-02
Test     : specific_humidity                            | Min:-2.8092260972819617e-04 Max:+2.9434075393080551e-04 RMS:+1.6532405760382759e-05
Test     : cloud_liquid_ice                             | Min:+0.0000000000000000e+00 Max:+0.0000000000000000e+00 RMS:+0.0000000000000000e+00
Test     : cloud_liquid_water                           | Min:+0.0000000000000000e+00 Max:+0.0000000000000000e+00 RMS:+0.0000000000000000e+00
Test     : ozone_mass_mixing_ratio                      | Min:+0.0000000000000000e+00 Max:+0.0000000000000000e+00 RMS:+0.0000000000000000e+00
Test     : air_pressure_thickness                       | Min:-2.9992886080290191e+00 Max:+1.5291703492039233e+00 RMS:+1.7535872214547940e-01
Test     : hydrostatic_layer_thickness                  | Min:-4.6699236754648155e-01 Max:+7.4693987323735200e-01 RMS:+3.1162055487823255e-02
Test     : ----------------------------------------------------------------------------------------------------
double free or corruption (!prev)
srun: error: h32m52: task 5: Aborted (core dumped)
srun: Terminating StepId=58208119.0
slurmstepd: error: *** STEP 58208119.0 ON h32m52 CANCELLED AT 2024-04-13T10:51:05 ***
srun: error: h32m52: tasks 0,2-4: Terminated
srun: error: h32m52: task 1: Terminated
srun: Force Terminated StepId=58208119.0

Not sure what's going on.

@RussTreadon-NOAA
Contributor

Repeat Hera test on Orion. Interestingly, all tests pass on Orion.

Test project /work2/noaa/da/rtreadon/git/global-workflow/jediinc2fv3/sorc/gdas.cd/build
      Start 1393: test_gdasapp_util_coding_norms
 1/54 Test #1393: test_gdasapp_util_coding_norms ........................   Passed    4.74 sec
      Start 1394: test_gdasapp_util_ioda_example
 2/54 Test #1394: test_gdasapp_util_ioda_example ........................   Passed    8.84 sec
      Start 1395: test_gdasapp_util_prepdata
 3/54 Test #1395: test_gdasapp_util_prepdata ............................   Passed    5.37 sec
      Start 1396: test_gdasapp_util_rads2ioda
 4/54 Test #1396: test_gdasapp_util_rads2ioda ...........................   Passed    0.53 sec
      Start 1397: test_gdasapp_util_ghrsst2ioda
 5/54 Test #1397: test_gdasapp_util_ghrsst2ioda .........................   Passed    0.19 sec
      Start 1398: test_gdasapp_util_smap2ioda
 6/54 Test #1398: test_gdasapp_util_smap2ioda ...........................   Passed    0.19 sec
      Start 1399: test_gdasapp_util_smos2ioda
 7/54 Test #1399: test_gdasapp_util_smos2ioda ...........................   Passed    0.20 sec
      Start 1400: test_gdasapp_util_viirsaod2ioda
 8/54 Test #1400: test_gdasapp_util_viirsaod2ioda .......................   Passed    0.18 sec
      Start 1401: test_gdasapp_util_icecamsr2ioda
 9/54 Test #1401: test_gdasapp_util_icecamsr2ioda .......................   Passed    0.17 sec
      Start 1739: test_gdasapp_check_python_norms
10/54 Test #1739: test_gdasapp_check_python_norms .......................   Passed    3.10 sec
      Start 1740: test_gdasapp_check_yaml_keys
11/54 Test #1740: test_gdasapp_check_yaml_keys ..........................   Passed    2.39 sec
      Start 1741: test_gdasapp_jedi_increment_to_fv3
12/54 Test #1741: test_gdasapp_jedi_increment_to_fv3 ....................   Passed   17.00 sec
      Start 1742: test_gdasapp_setup_cycled_exp
13/54 Test #1742: test_gdasapp_setup_cycled_exp .........................   Passed    3.69 sec
      Start 1743: test_gdasapp_fv3jedi_fv3inc
14/54 Test #1743: test_gdasapp_fv3jedi_fv3inc ...........................   Passed   36.63 sec
      Start 1744: test_gdasapp_convert_bufr_temp_dbuoy
15/54 Test #1744: test_gdasapp_convert_bufr_temp_dbuoy ..................   Passed    2.38 sec
      Start 1745: test_gdasapp_convert_bufr_salt_dbuoy
16/54 Test #1745: test_gdasapp_convert_bufr_salt_dbuoy ..................   Passed    0.33 sec
      Start 1746: test_gdasapp_convert_bufr_temp_mbuoyb
17/54 Test #1746: test_gdasapp_convert_bufr_temp_mbuoyb .................   Passed    0.30 sec
      Start 1747: test_gdasapp_convert_bufr_salt_mbuoyb
18/54 Test #1747: test_gdasapp_convert_bufr_salt_mbuoyb .................   Passed    0.29 sec
      Start 1748: test_gdasapp_convert_bufr_tesacprof
19/54 Test #1748: test_gdasapp_convert_bufr_tesacprof ...................   Passed    0.27 sec
      Start 1749: test_gdasapp_convert_bufr_trkobprof
20/54 Test #1749: test_gdasapp_convert_bufr_trkobprof ...................   Passed    0.28 sec
      Start 1750: test_gdasapp_convert_bufr_sfcships
21/54 Test #1750: test_gdasapp_convert_bufr_sfcships ....................   Passed    0.28 sec
      Start 1751: test_gdasapp_convert_bufr_sfcshipsu
22/54 Test #1751: test_gdasapp_convert_bufr_sfcshipsu ...................   Passed    0.30 sec
      Start 1752: test_gdasapp_soca_nsst_increment_to_mom6
23/54 Test #1752: test_gdasapp_soca_nsst_increment_to_mom6 ..............   Passed   47.27 sec
      Start 1753: test_gdasapp_soca_prep
24/54 Test #1753: test_gdasapp_soca_prep ................................   Passed    6.92 sec
      Start 1754: test_gdasapp_soca_run_clean
25/54 Test #1754: test_gdasapp_soca_run_clean ...........................   Passed    0.14 sec
      Start 1755: test_gdasapp_soca_setup_obsprep
26/54 Test #1755: test_gdasapp_soca_setup_obsprep .......................   Passed   21.09 sec
      Start 1756: test_gdasapp_soca_JGLOBAL_PREP_OCEAN_OBS
27/54 Test #1756: test_gdasapp_soca_JGLOBAL_PREP_OCEAN_OBS ..............   Passed   44.66 sec
      Start 1757: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_PREP
28/54 Test #1757: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_PREP ....   Passed   74.31 sec
      Start 1758: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_BMAT
29/54 Test #1758: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_BMAT ....   Passed   74.27 sec
      Start 1759: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_RUN
30/54 Test #1759: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_RUN .....   Passed   42.24 sec
      Start 1760: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_ECEN
31/54 Test #1760: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_ECEN ....   Passed   42.24 sec
      Start 1761: test_gdasapp_soca_copy_scratch
32/54 Test #1761: test_gdasapp_soca_copy_scratch ........................   Passed    3.30 sec
      Start 1762: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_CHKPT
33/54 Test #1762: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_CHKPT ...   Passed   42.33 sec
      Start 1763: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_POST
34/54 Test #1763: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_POST ....   Passed   10.49 sec
      Start 1764: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY
35/54 Test #1764: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY ....   Passed  140.24 sec
      Start 1765: test_gdasapp_soca_socahybridweights
36/54 Test #1765: test_gdasapp_soca_socahybridweights ...................   Passed   10.40 sec
      Start 1766: test_gdasapp_soca_incr_handler
37/54 Test #1766: test_gdasapp_soca_incr_handler ........................   Passed   10.37 sec
      Start 1767: test_gdasapp_soca_ens_handler
38/54 Test #1767: test_gdasapp_soca_ens_handler .........................   Passed   10.39 sec
      Start 1768: test_gdasapp_snow_create_ens
39/54 Test #1768: test_gdasapp_snow_create_ens ..........................   Passed    6.62 sec
      Start 1769: test_gdasapp_snow_imsproc
40/54 Test #1769: test_gdasapp_snow_imsproc .............................   Passed    5.02 sec
      Start 1770: test_gdasapp_snow_apply_jediincr
41/54 Test #1770: test_gdasapp_snow_apply_jediincr ......................   Passed    7.35 sec
      Start 1771: test_gdasapp_snow_letkfoi_snowda
42/54 Test #1771: test_gdasapp_snow_letkfoi_snowda ......................   Passed   37.91 sec
      Start 1772: test_gdasapp_convert_bufr_adpsfc_snow
43/54 Test #1772: test_gdasapp_convert_bufr_adpsfc_snow .................   Passed    3.70 sec
      Start 1773: test_gdasapp_convert_bufr_adpsfc
44/54 Test #1773: test_gdasapp_convert_bufr_adpsfc ......................   Passed    4.78 sec
      Start 1774: test_gdasapp_convert_gsi_satbias
45/54 Test #1774: test_gdasapp_convert_gsi_satbias ......................   Passed    2.13 sec
      Start 1775: test_gdasapp_setup_atm_cycled_exp
46/54 Test #1775: test_gdasapp_setup_atm_cycled_exp .....................   Passed    1.38 sec
      Start 1776: test_gdasapp_atm_jjob_var_init
47/54 Test #1776: test_gdasapp_atm_jjob_var_init ........................   Passed   45.87 sec
      Start 1777: test_gdasapp_atm_jjob_var_run
48/54 Test #1777: test_gdasapp_atm_jjob_var_run .........................   Passed  107.90 sec
      Start 1778: test_gdasapp_atm_jjob_var_inc
49/54 Test #1778: test_gdasapp_atm_jjob_var_inc .........................   Passed   74.80 sec
      Start 1779: test_gdasapp_atm_jjob_var_final
50/54 Test #1779: test_gdasapp_atm_jjob_var_final .......................   Passed   42.21 sec
      Start 1780: test_gdasapp_atm_jjob_ens_init
51/54 Test #1780: test_gdasapp_atm_jjob_ens_init ........................   Passed   45.75 sec
      Start 1781: test_gdasapp_atm_jjob_ens_run
52/54 Test #1781: test_gdasapp_atm_jjob_ens_run .........................   Passed  298.43 sec
      Start 1782: test_gdasapp_atm_jjob_ens_final
53/54 Test #1782: test_gdasapp_atm_jjob_ens_final .......................   Passed   42.22 sec
      Start 1783: test_gdasapp_aero_gen_3dvar_yaml
54/54 Test #1783: test_gdasapp_aero_gen_3dvar_yaml ......................   Passed    0.40 sec

100% tests passed, 0 tests failed out of 54

Label Time Summary:
gdas-utils    =  20.41 sec*proc (9 tests)
script        =  20.41 sec*proc (9 tests)

Total Test time (real) = 1396.98 sec

What's causing the Hera failure? Orion still runs CentOS 7. Hera runs Rocky 8. Both the Hera and Orion GDASApp builds use spack-stack/1.6.0.

@DavidNew-NOAA
Collaborator Author

@RussTreadon-NOAA Yeah, this is really perplexing. I'm going to take a closer look on Monday.

@RussTreadon-NOAA
Contributor

Recompile Orion installation on Hercules. test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_ECEN and test_gdasapp_atm_jjob_var_inc fail due to job control variables not being set in g-w env/HERCULES.env. Add ocnanalecen and atmanlfv3inc to working copy of HERCULES.env. After this change all 54 test_gdasapp pass on Hercules.

Test project /work/noaa/da/rtreadon/git/global-workflow/jediinc2fv3/sorc/gdas.cd/build
      Start 1393: test_gdasapp_util_coding_norms
 1/54 Test #1393: test_gdasapp_util_coding_norms ........................   Passed    0.97 sec
      Start 1394: test_gdasapp_util_ioda_example
 2/54 Test #1394: test_gdasapp_util_ioda_example ........................   Passed    0.96 sec
      Start 1395: test_gdasapp_util_prepdata
 3/54 Test #1395: test_gdasapp_util_prepdata ............................   Passed    0.43 sec
      Start 1396: test_gdasapp_util_rads2ioda
 4/54 Test #1396: test_gdasapp_util_rads2ioda ...........................   Passed    0.15 sec
      Start 1397: test_gdasapp_util_ghrsst2ioda
 5/54 Test #1397: test_gdasapp_util_ghrsst2ioda .........................   Passed    0.10 sec
      Start 1398: test_gdasapp_util_smap2ioda
 6/54 Test #1398: test_gdasapp_util_smap2ioda ...........................   Passed    0.09 sec
      Start 1399: test_gdasapp_util_smos2ioda
 7/54 Test #1399: test_gdasapp_util_smos2ioda ...........................   Passed    0.10 sec
      Start 1400: test_gdasapp_util_viirsaod2ioda
 8/54 Test #1400: test_gdasapp_util_viirsaod2ioda .......................   Passed    0.17 sec
      Start 1401: test_gdasapp_util_icecamsr2ioda
 9/54 Test #1401: test_gdasapp_util_icecamsr2ioda .......................   Passed    0.11 sec
      Start 1739: test_gdasapp_check_python_norms
10/54 Test #1739: test_gdasapp_check_python_norms .......................   Passed    1.82 sec
      Start 1740: test_gdasapp_check_yaml_keys
11/54 Test #1740: test_gdasapp_check_yaml_keys ..........................   Passed    0.05 sec
      Start 1741: test_gdasapp_jedi_increment_to_fv3
12/54 Test #1741: test_gdasapp_jedi_increment_to_fv3 ....................   Passed    0.31 sec
      Start 1742: test_gdasapp_setup_cycled_exp
13/54 Test #1742: test_gdasapp_setup_cycled_exp .........................   Passed    0.70 sec
      Start 1743: test_gdasapp_fv3jedi_fv3inc
14/54 Test #1743: test_gdasapp_fv3jedi_fv3inc ...........................   Passed    7.08 sec
      Start 1744: test_gdasapp_convert_bufr_temp_dbuoy
15/54 Test #1744: test_gdasapp_convert_bufr_temp_dbuoy ..................   Passed    0.17 sec
      Start 1745: test_gdasapp_convert_bufr_salt_dbuoy
16/54 Test #1745: test_gdasapp_convert_bufr_salt_dbuoy ..................   Passed    0.16 sec
      Start 1746: test_gdasapp_convert_bufr_temp_mbuoyb
17/54 Test #1746: test_gdasapp_convert_bufr_temp_mbuoyb .................   Passed    0.16 sec
      Start 1747: test_gdasapp_convert_bufr_salt_mbuoyb
18/54 Test #1747: test_gdasapp_convert_bufr_salt_mbuoyb .................   Passed    0.16 sec
      Start 1748: test_gdasapp_convert_bufr_tesacprof
19/54 Test #1748: test_gdasapp_convert_bufr_tesacprof ...................   Passed    0.19 sec
      Start 1749: test_gdasapp_convert_bufr_trkobprof
20/54 Test #1749: test_gdasapp_convert_bufr_trkobprof ...................   Passed    0.16 sec
      Start 1750: test_gdasapp_convert_bufr_sfcships
21/54 Test #1750: test_gdasapp_convert_bufr_sfcships ....................   Passed    0.16 sec
      Start 1751: test_gdasapp_convert_bufr_sfcshipsu
22/54 Test #1751: test_gdasapp_convert_bufr_sfcshipsu ...................   Passed    0.18 sec
      Start 1752: test_gdasapp_soca_nsst_increment_to_mom6
23/54 Test #1752: test_gdasapp_soca_nsst_increment_to_mom6 ..............   Passed    1.07 sec
      Start 1753: test_gdasapp_soca_prep
24/54 Test #1753: test_gdasapp_soca_prep ................................   Passed    1.29 sec
      Start 1754: test_gdasapp_soca_run_clean
25/54 Test #1754: test_gdasapp_soca_run_clean ...........................   Passed    0.21 sec
      Start 1755: test_gdasapp_soca_setup_obsprep
26/54 Test #1755: test_gdasapp_soca_setup_obsprep .......................   Passed    7.23 sec
      Start 1756: test_gdasapp_soca_JGLOBAL_PREP_OCEAN_OBS
27/54 Test #1756: test_gdasapp_soca_JGLOBAL_PREP_OCEAN_OBS ..............   Passed   42.57 sec
      Start 1757: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_PREP
28/54 Test #1757: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_PREP ....   Passed   42.14 sec
      Start 1758: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_BMAT
29/54 Test #1758: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_BMAT ....   Passed   42.13 sec
      Start 1759: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_RUN
30/54 Test #1759: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_RUN .....   Passed   42.14 sec
      Start 1760: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_ECEN
31/54 Test #1760: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_ECEN ....   Passed  106.14 sec
      Start 1761: test_gdasapp_soca_copy_scratch
32/54 Test #1761: test_gdasapp_soca_copy_scratch ........................   Passed    0.32 sec
      Start 1762: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_CHKPT
33/54 Test #1762: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_CHKPT ...   Passed   42.13 sec
      Start 1763: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_POST
34/54 Test #1763: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_POST ....   Passed   42.14 sec
      Start 1764: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY
35/54 Test #1764: test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY ....   Passed  170.16 sec
      Start 1765: test_gdasapp_soca_socahybridweights
36/54 Test #1765: test_gdasapp_soca_socahybridweights ...................   Passed   10.12 sec
      Start 1766: test_gdasapp_soca_incr_handler
37/54 Test #1766: test_gdasapp_soca_incr_handler ........................   Passed   10.10 sec
      Start 1767: test_gdasapp_soca_ens_handler
38/54 Test #1767: test_gdasapp_soca_ens_handler .........................   Passed   10.11 sec
      Start 1768: test_gdasapp_snow_create_ens
39/54 Test #1768: test_gdasapp_snow_create_ens ..........................   Passed    0.45 sec
      Start 1769: test_gdasapp_snow_imsproc
40/54 Test #1769: test_gdasapp_snow_imsproc .............................   Passed    1.75 sec
      Start 1770: test_gdasapp_snow_apply_jediincr
41/54 Test #1770: test_gdasapp_snow_apply_jediincr ......................   Passed    3.56 sec
      Start 1771: test_gdasapp_snow_letkfoi_snowda
42/54 Test #1771: test_gdasapp_snow_letkfoi_snowda ......................   Passed    8.31 sec
      Start 1772: test_gdasapp_convert_bufr_adpsfc_snow
43/54 Test #1772: test_gdasapp_convert_bufr_adpsfc_snow .................   Passed    2.25 sec
      Start 1773: test_gdasapp_convert_bufr_adpsfc
44/54 Test #1773: test_gdasapp_convert_bufr_adpsfc ......................   Passed    2.96 sec
      Start 1774: test_gdasapp_convert_gsi_satbias
45/54 Test #1774: test_gdasapp_convert_gsi_satbias ......................   Passed    1.08 sec
      Start 1775: test_gdasapp_setup_atm_cycled_exp
46/54 Test #1775: test_gdasapp_setup_atm_cycled_exp .....................   Passed    0.60 sec
      Start 1776: test_gdasapp_atm_jjob_var_init
47/54 Test #1776: test_gdasapp_atm_jjob_var_init ........................   Passed   44.01 sec
      Start 1777: test_gdasapp_atm_jjob_var_run
48/54 Test #1777: test_gdasapp_atm_jjob_var_run .........................   Passed  106.12 sec
      Start 1778: test_gdasapp_atm_jjob_var_inc
49/54 Test #1778: test_gdasapp_atm_jjob_var_inc .........................   Passed   42.12 sec
      Start 1779: test_gdasapp_atm_jjob_var_final
50/54 Test #1779: test_gdasapp_atm_jjob_var_final .......................   Passed   42.12 sec
      Start 1780: test_gdasapp_atm_jjob_ens_init
51/54 Test #1780: test_gdasapp_atm_jjob_ens_init ........................   Passed   43.90 sec
      Start 1781: test_gdasapp_atm_jjob_ens_run
52/54 Test #1781: test_gdasapp_atm_jjob_ens_run .........................   Passed  266.14 sec
      Start 1782: test_gdasapp_atm_jjob_ens_final
53/54 Test #1782: test_gdasapp_atm_jjob_ens_final .......................   Passed   74.13 sec
      Start 1783: test_gdasapp_aero_gen_3dvar_yaml
54/54 Test #1783: test_gdasapp_aero_gen_3dvar_yaml ......................   Passed    0.32 sec

100% tests passed, 0 tests failed out of 54

Label Time Summary:
gdas-utils    =   3.09 sec*proc (9 tests)
script        =   3.09 sec*proc (9 tests)

Total Test time (real) = 1224.53 sec

Hercules, like Hera, runs Rocky, specifically Rocky Linux 9.1 (Blue Onyx). Recall that modulefiles/GDAS/hera.intel.lua differs from the Orion and Hercules modulefiles in that it does not load all the same modules. Also, the Hera modulefile does not load a Python virtual environment. Does this provide any clues as to why we experience ctest failures on Hera?

@DavidNew-NOAA
Collaborator Author

@RussTreadon-NOAA OK, I figured it out. In the increment converter, I was indexing the height dimension of an array to an index larger than the size of that dimension. I naively copied some Vader code that uses Atlas fieldsets, but it was for a variable on half-levels, so it had to be indexed to nLevels + 1, but ordinary grid-centered variables should be indexed to nLevels. My latest commit fixes that, and now test_gdasapp_fv3jedi_fv3inc passes on Hera.
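
To illustrate the kind of off-by-one described above, here is a minimal sketch. It is not the code in utils/fv3jedi/fv3jedi_fv3inc.h and does not use the Atlas API; `Column`, `sumLevels`, and `loopFits` are hypothetical names, and a plain `std::vector` stands in for the vertical dimension of a field. The assumption is simply that a grid-centered field holds nLevels values per column, while a half-level field holds nLevels + 1.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one vertical column of a field.
// Grid-centered (full-level) fields hold nLevels values;
// half-level (interface) fields hold nLevels + 1 values.
using Column = std::vector<double>;

// Sum the first `count` levels of a column using bounds-checked access.
// std::vector::at throws std::out_of_range rather than touching memory
// past the end of the buffer.
double sumLevels(const Column& column, std::size_t count) {
  double total = 0.0;
  for (std::size_t k = 0; k < count; ++k) {
    total += column.at(k);
  }
  return total;
}

// Returns true if a loop over `count` levels stays inside the column.
bool loopFits(const Column& column, std::size_t count) {
  try {
    sumLevels(column, count);
    return true;
  } catch (const std::out_of_range&) {
    return false;
  }
}
```

With nLevels = 4, `loopFits` accepts a loop bound of nLevels + 1 for a five-element half-level column but rejects it for a four-element grid-centered column. With unchecked indexing, the same overrun is undefined behavior: it may go unnoticed on one system and trip the allocator's heap consistency checks on another. Messages such as "corrupted size vs. prev_size" and "double free or corruption (!prev)" come from those checks in glibc, which may be why the failure appeared on Hera but not on Orion.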

@RussTreadon-NOAA
Contributor

Thank you @DavidNew-NOAA for troubleshooting over the weekend. This is above and beyond effort.

I recompiled GDASApp with the updated utils/fv3jedi/fv3jedi_fv3inc.h. All 54 tests still pass on Hercules and Orion. 53 out of 54 tests pass on Hera. The only Hera failure is

98% tests passed, 1 tests failed out of 54

Label Time Summary:
gdas-utils    =   4.80 sec*proc (9 tests)
script        =   4.80 sec*proc (9 tests)

Total Test time (real) = 1017.26 sec

The following tests FAILED:
        1763 - test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY (Failed)

A check of JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY.out shows the failure to be due to

+ slurm_script[52]: set +u
+ slurm_script[53]: conda activate eva
/var/spool/slurmd/job58232992/slurm_script: line 53: conda: command not found
+ slurm_script[1]: postamble slurm_script 1713126685 127

This failure is not related to this PR.

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Changes look good. ctests using modified code pass.

Approve.

@RussTreadon-NOAA
Contributor

FYI @guillaumevernieres and @AndrewEichmann-NOAA:

test_gdasapp_soca_JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY fails on Hera. JGDAS_GLOBAL_OCEAN_ANALYSIS_VRFY.out contains

+ slurm_script[52]: set +u
+ slurm_script[53]: conda activate eva
/var/spool/slurmd/job58232992/slurm_script: line 53: conda: command not found
+ slurm_script[1]: postamble slurm_script 1713126685 127

@CoryMartin-NOAA CoryMartin-NOAA merged commit c2fdf1e into develop Apr 15, 2024
@CoryMartin-NOAA CoryMartin-NOAA deleted the feature/update_jjob_tests branch April 15, 2024 12:23
danholdaway added a commit that referenced this pull request Apr 15, 2024
* upstream/develop:
  remove seviri from gdas_prototype_3d yaml (#1043)
  Fix GW jjob tests for upcoming GW PR #2420 (#1041)
  Fix test output for fv3jedi_fv3inc.h (#1039)
  Run g-w linker script before ctest for prepoceanobs task (#1034)
  Update femps and fv3-jedi-lm (#1036)
  Add ability for JEDI-to-FV3 increment converter to process ensembles (#1022)
  Add AVHRR/NOAA-15/18/19 assimilation to end-to-end GDASApp validation (#997)
  Catch error when trying to copy missing obs files from DATA to ROTDIR in prepoceanobs (#1028)
DavidNew-NOAA added a commit that referenced this pull request Jan 16, 2026