Merged
2 changes: 1 addition & 1 deletion FV3
Submodule FV3 updated from edcdfc to 8198ce
282 changes: 141 additions & 141 deletions tests/RegressionTests_cheyenne.gnu.log
912 changes: 456 additions & 456 deletions tests/RegressionTests_cheyenne.intel.log
902 changes: 451 additions & 451 deletions tests/RegressionTests_gaea.intel.log
282 changes: 141 additions & 141 deletions tests/RegressionTests_hera.gnu.log
930 changes: 465 additions & 465 deletions tests/RegressionTests_hera.intel.log
848 changes: 424 additions & 424 deletions tests/RegressionTests_jet.intel.log
918 changes: 459 additions & 459 deletions tests/RegressionTests_orion.intel.log
574 changes: 287 additions & 287 deletions tests/RegressionTests_wcoss_cray.log
1,120 changes: 605 additions & 515 deletions tests/RegressionTests_wcoss_dell_p3.log

Large diffs are not rendered by default, so the regression test log diffs listed above are not shown.

10 changes: 5 additions & 5 deletions tests/rt.sh
@@ -300,12 +300,12 @@ elif [[ $MACHINE_ID = jet.* ]]; then

 QUEUE=batch
 COMPILE_QUEUE=batch
-ACCNR=h-nems
+ACCNR=${ACCNR:-h-nems}
 PARTITION=xjet
 DISKNM=/lfs4/HFIP/h-nems/emc.nemspara/RT
-dprefix=/lfs4/HFIP/h-nems/$USER
-STMP=$dprefix/RT_BASELINE
-PTMP=$dprefix/RT_RUNDIRS
+dprefix=${dprefix:-/lfs4/HFIP/$ACCNR/$USER}
+STMP=${STMP:-$dprefix/RT_BASELINE}
+PTMP=${PTMP:-$dprefix/RT_RUNDIRS}

 SCHEDULER=slurm
 cp fv3_conf/fv3_slurm.IN_jet fv3_conf/fv3_slurm.IN
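
The `${VAR:-default}` expansions added in this hunk only supply a value when the variable is unset or empty, so a caller can export `ACCNR`, `dprefix`, `STMP`, or `PTMP` before running `rt.sh` and the Jet block will keep those values. A minimal sketch of that behavior follows. The function name `resolve_jet_paths`, the user name `alice`, and the account `myproj` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Resolve the Jet settings the way the new lines in the diff do.
resolve_jet_paths() {
  ACCNR=${ACCNR:-h-nems}
  dprefix=${dprefix:-/lfs4/HFIP/$ACCNR/$USER}
  STMP=${STMP:-$dprefix/RT_BASELINE}
  PTMP=${PTMP:-$dprefix/RT_RUNDIRS}
  echo "$ACCNR $STMP $PTMP"
}

USER=alice   # hypothetical user name, for illustration only

# Nothing preset: every variable falls back to its default.
default_paths=$(unset ACCNR dprefix STMP PTMP; resolve_jet_paths)

# Only ACCNR preset (hypothetical account): the paths follow it.
custom_paths=$(unset dprefix STMP PTMP; ACCNR=myproj; resolve_jet_paths)

echo "$default_paths"
echo "$custom_paths"
```

Because `dprefix` is now built from `$ACCNR`, overriding the account alone is enough to move `STMP` and `PTMP` under the matching project directory.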
@@ -467,7 +467,7 @@ if [[ $TESTS_FILE =~ '35d' ]] || [[ $TESTS_FILE =~ 'weekly' ]]; then
 TEST_35D=true
 fi

-BL_DATE=20211230
+BL_DATE=20220103
 if [[ $MACHINE_ID = hera.* ]] || [[ $MACHINE_ID = orion.* ]] || [[ $MACHINE_ID = cheyenne.* ]] || [[ $MACHINE_ID = gaea.* ]] || [[ $MACHINE_ID = jet.* ]] || [[ $MACHINE_ID = s4.* ]]; then
 RTPWD=${RTPWD:-$DISKNM/NEMSfv3gfs/develop-${BL_DATE}/${RT_COMPILER^^}}
 else
15 changes: 8 additions & 7 deletions tests/rt_utils.sh
@@ -385,8 +385,10 @@ rocoto_create_compile_task() {

 # serialize WW3 builds. FIXME
 DEP_STRING=""
-if [[ ${MAKE_OPT^^} =~ "WW3=Y" && ${COMPILE_PREV_WW3_NR} != '' ]]; then
-DEP_STRING="<dependency><taskdep task=\"compile_${COMPILE_PREV_WW3_NR}\"/></dependency>"
+if [[ ${COMPILE_PREV_WW3_NR} != '' ]]; then
+if [[ ${MAKE_OPT^^} =~ "-DAPP=ATMW" ]] || [[ ${MAKE_OPT^^} =~ "-DAPP=S2SW" ]] || [[ ${MAKE_OPT^^} =~ "-DAPP=HAFSW" ]] || [[ ${MAKE_OPT^^} =~ "-DAPP=HAFS-ALL" ]] ; then
+DEP_STRING="<dependency><taskdep task=\"compile_${COMPILE_PREV_WW3_NR}\"/></dependency>"
+fi
 fi

 NATIVE=""
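
The new condition no longer looks for `WW3=Y` in `MAKE_OPT`. It checks instead for a fixed list of `-DAPP=` values, presumably the applications that build WW3. The sketch below restates that check as a loop, assuming bash 4 or later for `${var^^}`. The function name `builds_ww3_app` and the option value `-DCCPP_SUITES=example` are hypothetical.

```shell
#!/bin/bash
# Return 0 when the upper-cased build options select one of the apps
# that, per the diff, should wait for the previous WW3 compile.
builds_ww3_app() {
  local make_opt=${1^^}
  local app
  for app in ATMW S2SW HAFSW HAFS-ALL; do
    # A quoted right-hand side makes =~ a literal substring match.
    if [[ $make_opt =~ "-DAPP=${app}" ]]; then
      return 0
    fi
  done
  return 1
}

for opts in "-DAPP=S2SW -DCCPP_SUITES=example" "-DAPP=ATM"; do
  if builds_ww3_app "$opts"; then
    echo "serialize: $opts"
  else
    echo "no dependency: $opts"
  fi
done
```

Since the match is a substring match, any option string that merely begins with one of the listed values also matches. That holds for the condition in the diff as well as for this sketch.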
@@ -401,7 +403,7 @@ rocoto_create_compile_task() {
 BUILD_CORES=24
 NATIVE="<exclusive></exclusive> <envar><name>PATHTR</name><value>&PATHTR;</value></envar>"
 fi
-if [[ ${MACHINE_ID} == jet ]]; then
+if [[ ${MACHINE_ID} == jet.* ]]; then
 BUILD_WALLTIME="01:00:00"
 fi
 if [[ ${MACHINE_ID} == orion.* ]]; then
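
Inside `[[ ... ]]`, an unquoted right-hand side of `==` is treated as a glob pattern. The old pattern `jet` contains no wildcard and so matches only the exact string `jet`, while `jet.*` matches `jet.` followed by anything. The sketch below compares the two. The function names are hypothetical, and the machine IDs `jet.intel` and `hera.intel` are assumed examples of the `machine.compiler` form used elsewhere in these scripts.

```shell
#!/bin/bash
# Old test: no wildcard, so only the exact string "jet" matches.
matches_old() { [[ $1 == jet ]]; }

# New test: "jet." followed by any characters.
matches_new() { [[ $1 == jet.* ]]; }

for id in jet.intel hera.intel; do
  if matches_old "$id"; then old=yes; else old=no; fi
  if matches_new "$id"; then new=yes; else new=no; fi
  echo "$id old=$old new=$new"
done
```

With the old pattern, the one-hour `BUILD_WALLTIME` would never have been applied to a machine ID of that form.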
@@ -421,8 +423,7 @@ rocoto_create_compile_task() {
 <partition>${PARTITION}</partition>
 <cores>${BUILD_CORES}</cores>
 <walltime>${BUILD_WALLTIME}</walltime>
-<stdout>&RUNDIR_ROOT;/compile_${COMPILE_NR}/out</stdout>
-<stderr>&RUNDIR_ROOT;/compile_${COMPILE_NR}/err</stderr>
+<join>&RUNDIR_ROOT;/compile_${COMPILE_NR}.log</join>
 ${NATIVE}
 </task>
 EOF
@@ -459,8 +460,8 @@ rocoto_create_run_task() {
 <partition>${PARTITION}</partition>
 <nodes>${NODES}:ppn=${TPN}</nodes>
 <walltime>00:${WLCLK}:00</walltime>
-<stdout>&RUNDIR_ROOT;/${TEST_NAME}${RT_SUFFIX}/out</stdout>
-<stderr>&RUNDIR_ROOT;/${TEST_NAME}${RT_SUFFIX}/err</stderr>
+<stdout>&RUNDIR_ROOT;/${TEST_NAME}${RT_SUFFIX}.out</stdout>
+<stderr>&RUNDIR_ROOT;/${TEST_NAME}${RT_SUFFIX}.err</stderr>
 ${NATIVE}
 </task>
 EOF
5 changes: 4 additions & 1 deletion tests/run_compile.sh
@@ -75,7 +75,10 @@ if [[ $ROCOTO = 'false' ]]; then
 submit_and_wait job_card
 else
 chmod u+x job_card
-./job_card
+( ./job_card 2>&1 1>&3 3>&- | tee err ) 3>&1 1>&2 | tee out
Contributor

Is this and the similar change in run_test.sh necessary? I ran RT with and without these changes and did not notice any difference in any of the log files.

Collaborator Author

Unless I'm mistaken, you need to redirect via tee to see logs in the job's own log file. Otherwise it only goes to the out and err files. There is similar code (but less reliable) in the non-Rocoto portions of the script.

Collaborator

At some point we should remove all Rocoto code; we do not use it and do not test it. For parallel execution we use ecFlow.

Collaborator Author

I'll retest this. I know I added this change in two locations. Let me check if I really need them in both locations. This may take a whole day due to queue wait times.

Collaborator Author

@SamuelTrahanNOAA SamuelTrahanNOAA Jan 3, 2022

@DusanJovic-NOAA - It is premature to remove Rocoto support since most people outside EMC and NCO have no experience with ecFlow. I can keep it working on Jet and Hera until that interferes with my official duties.

Edit: I meant to say no experience with ecFlow.

Collaborator Author

@SamuelTrahanNOAA SamuelTrahanNOAA Jan 4, 2022

@MinsukJi-NOAA I retested. Yes, the changes are necessary. This is half of the fix for the critical bug in the workflow: the job deleted the out and err files that Rocoto created through its #SBATCH lines, which caused rt_test.sh to conclude that the test had failed (because its grep failed). Now Rocoto and the job_card each have their own log files, and this line ensures the logging goes to both.

+# The above shell redirection copies stdout to "out" and stderr to "err"
+# while still sending them to stdout and stderr. It does this without
+# relying on bash-specific extensions or non-standard OS features.
 fi

 ls -l ${PATHTR}/tests/fv3_${COMPILE_NR}.exe
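
The redirection added to `run_compile.sh` and `run_test.sh` can be tried in isolation. The sketch below applies the same line to a stand-in function instead of `./job_card`. The function `emit` and the scratch directory are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for ./job_card: one line on each stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Work in a scratch directory so "out" and "err" do not clobber anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Same redirection as in the diff, with emit in place of ./job_card:
#   inside the subshell, stderr goes into the pipe to "tee err" and
#   stdout is moved to fd 3; outside, fd 3 feeds "tee out" and the
#   output of "tee err" is sent back to stderr.
( emit 2>&1 1>&3 3>&- | tee err ) 3>&1 1>&2 | tee out

echo "out contains: $(cat out)"
echo "err contains: $(cat err)"
```

One caveat, stated as a general property of shell pipelines and not as a finding about these scripts: without an option such as bash's `pipefail`, the exit status of the whole line is that of the final `tee`, not of the command being logged.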
5 changes: 4 additions & 1 deletion tests/run_test.sh
@@ -238,7 +238,10 @@ else
 submit_and_wait job_card
 else
 chmod u+x job_card
-./job_card
+( ./job_card 2>&1 1>&3 3>&- | tee err ) 3>&1 1>&2 | tee out
+# The above shell redirection copies stdout to "out" and stderr to "err"
+# while still sending them to stdout and stderr. It does this without
+# relying on bash-specific extensions or non-standard OS features.
 fi

 fi