Ensure proper stack size setting on Orion #1086

Merged
BrianCurtis-NOAA merged 2 commits into ufs-community:develop from GeorgeGayno-NOAA:bugfix/orion_ss on Aug 7, 2025

Conversation

@GeorgeGayno-NOAA
Collaborator

@GeorgeGayno-NOAA GeorgeGayno-NOAA commented Jul 29, 2025

DESCRIPTION OF CHANGES:

Remove all stack size settings from the Orion regression test scripts, since they
can cause the scripts to crash. Instead, users (including the role account) will be
required to set a proper stack size in their environment.
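
As a rough illustration of what setting a stack size in the user environment and printing it could look like, here is a minimal sketch. It assumes a bash startup file or wrapper script; the function name `report_stack_size` is hypothetical and this is not the code changed in this PR.

```shell
#!/bin/bash
# Hypothetical helper, not part of rt.sh: print the current soft stack limit.
# bash reports the value in 1024-byte units, or the word "unlimited".
report_stack_size() {
  echo "Soft stack size: $(ulimit -S -s)"
}

# Try to raise the soft limit. This fails when the hard limit is lower,
# so the error is suppressed and the unchanged value is reported instead.
ulimit -S -s unlimited 2>/dev/null || true
report_stack_size
```

Printing the value before the tests start makes it easier to tell a stack-related crash from other failures.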

TESTS CONDUCTED:

  • Run all consistency tests on Orion using both the role account and personal account (George). Done using 88cd6c0.
  • Run all consistency tests on Orion using both the role account and personal account (Brian)

DEPENDENCIES:

None.

DOCUMENTATION:

N/A

ISSUE:

Fixes #1079.

Orion. Add diagnostic print of user stack size.

Fixes ufs-community#1079.
@BrianCurtis-NOAA
Collaborator

I'm timing out on my personal account. Looking into it.

@GeorgeGayno-NOAA
Collaborator Author

When trying to run the rt.sh script under my own account from the cron, I was getting seg faults. I had to adjust my crontab to set the stack size. The crontab apparently does not run my .bashrc file. Here is what I did:

13 09 * * * ulimit -S -s unlimited; /bin/bash -l /work/noaa/da/ggayno/save/ufs_utils.git/UFS_UTILS/reg_tests/rt.sh > /home/ggayno/reg.log 2>&1
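
The crontab entry above relies on resource limits being inherited by child processes: the limit set by `ulimit` in the cron command's shell carries over to the shell that runs rt.sh. A minimal sketch of that behavior follows. The function name `demo_inherited_limit` and the value 4096 are arbitrary choices for the demonstration, and a plain `bash -c` child is used instead of `bash -l` so that no profile files can change the limit.

```shell
#!/bin/bash
# Hypothetical demo, not part of rt.sh. The function body is wrapped in
# parentheses, so it runs in a subshell and the changed limit does not
# leak into the calling shell.
demo_inherited_limit() (
  # Set the soft stack limit to an arbitrary demo value (1024-byte units).
  # This assumes the hard limit is at least 4096.
  ulimit -S -s 4096
  # Start a child shell, much as the cron command starts bash to run rt.sh.
  # The child prints the limit it inherited from its parent.
  bash -c 'echo "child soft stack size: $(ulimit -S -s)"'
)

demo_inherited_limit
```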

@BrianCurtis-NOAA
Collaborator

> When trying to run the rt.sh script under my own account from the cron, I was getting seg faults. I had to adjust my crontab to set the stack size. The crontab apparently does not run my .bashrc file. Here is what I did:
>
> 13 09 * * * ulimit -S -s unlimited; /bin/bash -l /work/noaa/da/ggayno/save/ufs_utils.git/UFS_UTILS/reg_tests/rt.sh > /home/ggayno/reg.log 2>&1

If you change the first line of rt.sh (https://github.com/GeorgeGayno-NOAA/UFS_UTILS/blob/9c21f89b7865716403d24abf360eb2e49d4c474b/reg_tests/rt.sh#L1) to #!/bin/bash -l, that might help pick up the user's .bashrc file.
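
One way to see whether a script actually runs as a login shell is to query bash's read-only `login_shell` option. The sketch below is only an illustration; the function name `shell_mode` is hypothetical and nothing like it is in rt.sh. Note that a bash login shell reads /etc/profile and then the first of ~/.bash_profile, ~/.bash_login, or ~/.profile, so .bashrc is picked up only if one of those files sources it.

```shell
#!/bin/bash
# Hypothetical check, not part of rt.sh: report whether this bash process
# was started as a login shell (for example via "bash -l" or a
# "#!/bin/bash -l" first line).
shell_mode() {
  if shopt -q login_shell; then
    echo "login"
  else
    echo "non-login"
  fi
}

shell_mode
```

Running the script directly and then through `bash -l` should show the difference between the two modes.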

@BrianCurtis-NOAA
Collaborator

Started on  orion-login-2
Commit hash: 9c21f89b7865716403d24abf360eb2e49d4c474b

regrid_sfc consistency tests PASSED
weight_gen consistency tests PASSED
ocnice_prep consistency tests PASSED
cpld_gridgen consistency tests PASSED
chgres_cube consistency tests PASSED
grid_gen consistency tests PASSED
global_cycle consistency tests PASSED
ice_blend consistency tests PASSED
snow2mdl consistency tests PASSED

@BrianCurtis-NOAA
Collaborator

My role-nems run was already posted here: #1079 (comment)

@GeorgeGayno-NOAA
Collaborator Author

GeorgeGayno-NOAA commented Jul 30, 2025

Using 9c21f89, I was able to run all tests (under my personal account) using rt.sh from the cron using this command:

13 09 * * * ulimit -S -s unlimited; /bin/bash -l /work/noaa/da/ggayno/save/ufs_utils.git/UFS_UTILS/reg_tests/rt.sh > /home/ggayno/reg.log 2>&1

and got the following email:

Started on  orion-login-1
Commit hash: 9c21f89b7865716403d24abf360eb2e49d4c474b

regrid_sfc consistency tests PASSED
weight_gen consistency tests PASSED
ocnice_prep consistency tests PASSED
cpld_gridgen consistency tests PASSED
chgres_cube consistency tests PASSED
grid_gen consistency tests PASSED
global_cycle consistency tests PASSED
ice_blend consistency tests PASSED
snow2mdl consistency tests PASSED

I repeated the test, but logged in as role-nems. All tests passed.

@GeorgeGayno-NOAA
Collaborator Author

@BrianCurtis-NOAA - let's merge #1076 first. This is low priority.

@BrianCurtis-NOAA
Collaborator

@GeorgeGayno-NOAA I think our testing on the previous PR has concluded. Did we want to merge this one?

@GeorgeGayno-NOAA
Collaborator Author

> @GeorgeGayno-NOAA I think our testing on the previous PR has concluded. Did we want to merge this one?

We can merge this one next. Let me pull the latest updates from 'develop' to my branch and do a quick retest.

@GeorgeGayno-NOAA
Collaborator Author

Retested my branch on Orion after the merge from 'develop' (88cd6c0). All tests passed under my account and the role account.

@BrianCurtis-NOAA - go ahead and merge this.

@BrianCurtis-NOAA BrianCurtis-NOAA merged commit dd84754 into ufs-community:develop Aug 7, 2025
4 checks passed
@GeorgeGayno-NOAA
Collaborator Author

Oh, since the rt.sh script was updated for Orion, I would have the role account on that machine point to the head of 'develop'.

Development

Successfully merging this pull request may close these issues.

Adjust stack size when running on Orion
