
Feature/detect frontera #2265

Closed
benjamin-cash wants to merge 5 commits into ufs-community:develop from benjamin-cash:feature/detect_frontera

Conversation

@benjamin-cash

benjamin-cash commented May 3, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

Commit Message:

* UFSWM - This only affects detect_machine.sh
  

Priority:

  • Normal

Git Tracking

UFSWM:

Sub component Pull Requests:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@BrianCurtis-NOAA
Collaborator

BrianCurtis-NOAA commented May 3, 2024

Letting @NOAA-EMC/teams/global-workflow-admins know so they can get a PR to bring in these changes as needed.

Thought that team links worked, but alas they don't. @aerorahul, letting you know detect_machine.sh is getting a change.

@benjamin-cash
Author

@BrianCurtis-NOAA - I have another set of small changes in this same line. One is a minor update to modules-setup.sh, and the other is to add ufs_frontera.intel.lua to modulefiles. Would it make sense to just add those into this PR, or should I let this one close out and then open a new one?

@BrianCurtis-NOAA
Collaborator

You can keep making changes here, just let me know when you're done.

@uturuncoglu
Collaborator

@benjamin-cash Are you also planning to activate rt.sh on Frontera? I lost track of what I added to ufs-coastal, but you could check it there if you want: https://github.com/oceanmodeling/ufs-coastal. BTW, let me know if you need help. It would be nice to have Frontera support at the ufs-weather-model level. Thanks for doing it.

@benjamin-cash
Author

@uturuncoglu - If you have rt.sh working for Frontera I would love to get that into ufs-weather-model as well. I think that can be separated from this PR though, so I won't add anything more to this one. (@BrianCurtis-NOAA)

@uturuncoglu
Collaborator

@benjamin-cash Yes, rt.sh is working in UFS Coastal and we are running ufs-coastal specific RTs with it. I synced ufs-coastal with ufs-weather-model a couple of days ago. So, if you look at the diff here, you might see those changes around rt.sh: develop...oceanmodeling:ufs-coastal:feature/coastal_app. This also has changes specific to ufs-coastal, like extra components.

@benjamin-cash
Author

@uturuncoglu - I tried running cpld_control_c192_p8 via the coastal rt.sh, and ran into errors. It looks like in default_var.sh the only variable added for frontera was TPN=56, and none of the other variables like INPES_dflt. Did I miss a step, or would those still need to be updated to run the rest of the tests?
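To illustrate the kind of per-machine block being discussed, here is a minimal sketch of how platform defaults might be grouped. Only `TPN=56` and the name `INPES_dflt` come from this thread. The function `set_machine_defaults`, the companion variable `JNPES_dflt`, and both layout values are assumptions for illustration, not tuned or tested settings for Frontera.

```shell
#!/usr/bin/env bash
# Sketch only: per-machine defaults in one place.
# TPN=56 for frontera comes from the thread; the other numbers are placeholders.
set_machine_defaults() {
  local machine_id="$1"
  case "${machine_id}" in
    frontera)
      TPN=56          # tasks per node, as set in the ufs-coastal fork
      INPES_dflt=3    # placeholder layout value (assumption)
      JNPES_dflt=8    # placeholder layout value (assumption)
      ;;
    *)
      # Fail loudly instead of running with undefined layout variables.
      echo "No defaults defined for machine '${machine_id}'" >&2
      return 1
      ;;
  esac
}

set_machine_defaults frontera
echo "TPN=${TPN} INPES_dflt=${INPES_dflt} JNPES_dflt=${JNPES_dflt}"
```

Returning an error for an unknown machine makes a missing block show up immediately, rather than as a test failure later on.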

@uturuncoglu
Collaborator

@benjamin-cash Yes, that needs to be extended. I have no experience with those numbers. Maybe @BrianCurtis-NOAA could help with it.

@uturuncoglu
Collaborator

@benjamin-cash I just set that number and it works with the coastal app, but other RTs probably use more platform-specific parameters. If we could add the others, that would be great.

@benjamin-cash
Author

@uturuncoglu - Makes sense. I'm going to try just copying in the settings for Derecho and see how far it makes it.

@uturuncoglu
Collaborator

@benjamin-cash Okay. If we could find another platform that is as close as possible to Frontera, it would be a good starting point.

@benjamin-cash
Author

@uturuncoglu - That was enough to at least get the test started, but then it failed because it was looking for the wrong WW3_input_data_* directory. I'll have to track down where that is set.

@uturuncoglu
Collaborator

@benjamin-cash It is probably pointing to my input directory. Is there any place on Frontera where we could stage at least part of the UFS input data? Then maybe we could just put the input files for coupled control p8 there and point to it with the disknm variable in the Frontera part of rt.sh.

@benjamin-cash
Author

There are a couple of options for data. One is that we could store the data on Ranch (Frontera archive system), and then stage the data to $SCRATCH on Frontera and recopy as needed. Or someone who is working on UFS on the system but not storing a lot of data otherwise could keep the files in their $WORK space. Do you know what the data volume is?
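The "stage to $SCRATCH and recopy as needed" option could look roughly like the sketch below. It covers only a local copy between two directories. The function name `stage_input_data` is hypothetical, and retrieving files from Ranch itself would be a separate remote transfer that is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: copy an input-data tree into scratch when it is not already there.
# Both paths are supplied by the caller; nothing here is specific to Ranch.
stage_input_data() {
  local src="$1" dest="$2"
  if [[ ! -d "${src}" ]]; then
    echo "Source directory not found: ${src}" >&2
    return 1
  fi
  # Skip the copy when the destination already holds files.
  if [[ -d "${dest}" && -n "$(ls -A "${dest}")" ]]; then
    echo "Already staged: ${dest}"
    return 0
  fi
  mkdir -p "${dest}"
  # Copy the contents of src (including dotfiles) into dest.
  cp -R "${src}/." "${dest}/"
  echo "Staged ${src} -> ${dest}"
}
```

A call such as `stage_input_data "${WORK}/ufs-input" "${SCRATCH}/ufs-input"` (both paths hypothetical) could be rerun before each test session, since it does nothing when the data is still in place.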

@uturuncoglu
Collaborator

@benjamin-cash The input folder on Derecho is around 275G: /glade/derecho/scratch/epicufsrt/ufs-weather-model/RT/NEMSfv3gfs/input-data-20240501. I think this includes all the data. But if it is too much, maybe we could selectively copy just a couple of folders to run the major tests.

@benjamin-cash
Author

@uturuncoglu - At 275GB we can definitely find somewhere for that to sit. I don't think I have access to Derecho at this point. Could you Globus that directory to $SCRATCH on Frontera and let me know where you put it? I can figure out a more permanent location for it from there.

@uturuncoglu
Collaborator

@benjamin-cash Sure. Let me copy it over. I'll let you know when it is finished.

@uturuncoglu
Collaborator

@benjamin-cash I copied the files to /scratch1/01118/tg803972/RT/NEMSfv3gfs. Let me know if you have any issues accessing it. I think you need to create a develop-20240430 folder under this directory to run any RT. Then maybe we could place baselines for a couple of RTs under develop-20240430. I am not sure whether running the full test suite on Frontera is feasible at this point.
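As a rough illustration of the layout described here, the sketch below derives an input-data directory and a baseline directory from a single root. The function name `rt_paths` and the output variable names are hypothetical and may not match what rt.sh uses. It also assumes the copied input directory keeps its Derecho name, input-data-20240501, which the thread does not confirm.

```shell
#!/usr/bin/env bash
# Sketch only: derive RT directories from one root directory.
# Variable names are illustrative, not taken from rt.sh.
rt_paths() {
  local disk_root="$1" input_date="$2" baseline_date="$3"
  echo "INPUT_DIR=${disk_root}/NEMSfv3gfs/input-data-${input_date}"
  echo "BASELINE_DIR=${disk_root}/NEMSfv3gfs/develop-${baseline_date}"
}

# Root and dates as mentioned in the thread.
rt_paths /scratch1/01118/tg803972/RT 20240501 20240430
```

Creating the baseline folder would then be a single `mkdir -p` on the second path.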

@benjamin-cash
Author

@uturuncoglu - This discussion has wandered pretty far afield from the PR, so I'm going to move the discussion of the rt files to email. :)

@benjamin-cash
Author

@BrianCurtis-NOAA - It looks like this is stuck waiting on reviews to come in (assuming updating didn't break anything just now), any chance you could help nudge this along?

@BrianCurtis-NOAA
Collaborator

@benjamin-cash and @uturuncoglu Just confirming this has been tested on Frontera and works as expected with the tests you are able to run?

If so, @jkbk2004 can make sure to get this combined with another PR as we don't need to worry about baselines (as far as I understand).

@benjamin-cash
Author

Hi @BrianCurtis-NOAA - The changes to detect_machine.sh are something I've used multiple times when I've downloaded the weather model, so they should be good to go. The module changes have been somewhat overtaken by events - we now have spack-stack working via container on Frontera.
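For readers who have not seen this kind of script, hostname-based detection can be sketched as below. This is not the code from the PR. The function name `machine_id_from_host` is hypothetical, and the Frontera hostname pattern and the example login-node name are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: map a fully qualified hostname to a machine ID.
# The pattern below is an assumption, not copied from detect_machine.sh.
machine_id_from_host() {
  local host="$1"
  case "${host}" in
    *.frontera.tacc.utexas.edu) echo "frontera" ;;  # assumed Frontera domain
    *) echo "UNKNOWN" ;;                            # no matching platform
  esac
}

# Example with an assumed login-node name.
machine_id_from_host "login1.frontera.tacc.utexas.edu"
```

Keeping the match in a `case` statement means a new platform is a one-line addition, which fits how small this part of the PR is.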

@uturuncoglu
Collaborator

@benjamin-cash @BrianCurtis-NOAA I am testing on my end too. I'll update you soon.

@uturuncoglu
Collaborator

@benjamin-cash I think there is an issue with the ufs_frontera.intel.lua file. There are some HTML tags in it, so it is probably corrupted.
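A quick check for this kind of corruption might look like the sketch below, which flags a file that appears to be a saved web page rather than plain Lua. The function name `looks_like_html` is hypothetical, and the tag list is a heuristic that is not exhaustive.

```shell
#!/usr/bin/env bash
# Sketch only: detect common HTML markup in a file that should be plain Lua.
# Returns 0 when markup is found and non-zero otherwise.
looks_like_html() {
  local file="$1"
  grep -Eiq '<!DOCTYPE html|</?(html|head|body|div|span|script)[ >]' "${file}"
}
```

It could be used as `if looks_like_html modulefiles/ufs_frontera.intel.lua; then echo "HTML markup found"; fi` before committing a module file downloaded through a browser.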

@benjamin-cash
Author

Hi @uturuncoglu - Yikes, yeah, no idea how I managed to do that. Could you point me to the module file you have tested on Frontera and I will replace?

@uturuncoglu
Collaborator

@benjamin-cash We are using the following with UFS Coastal: https://github.com/oceanmodeling/ufs-weather-model/blob/feature/coastal_app/modulefiles/ufs_frontera.intel.lua but I think you need to change the paths for your installation.

@uturuncoglu
Collaborator

uturuncoglu commented Jul 8, 2024

I did not try to fix yours yet but if you want I could try.

@uturuncoglu
Collaborator

@benjamin-cash BTW, it seems that you don't have any changes on the rt.sh side. Are you planning to make them? UFS Coastal could still maintain its own rt.sh changes.

@jkbk2004
Collaborator

jkbk2004 commented Jul 8, 2024

When this PR is ready, we can combine it with #2335 and #2278.

@benjamin-cash
Author

Hi @uturuncoglu - for this PR the module file was meant to be an exact copy of yours and to use the non-container version of spack-stack on Frontera. I hadn't made any changes to rt.sh in this PR, but maybe it would make sense to fold them in as well.

@uturuncoglu
Collaborator

@benjamin-cash Okay. That makes sense.
