Faster, safer, compilation options targeted for HAFS operational upgrade by SamuelTrahanNOAA · Pull Request #1554 · ufs-community/ufs-weather-model

SamuelTrahanNOAA · 2022-12-31T17:50:45Z

Description

Adds a set of faster, safer, optimization options planned for the HAFS operational upgrade whose code freeze is in January 2023. The FV3 uses the operational HRRR optimization options, without -ip, since that option doesn't speed up the model on WCOSS2, and it does slow down the compilation. To enable this, use -DFASTER=ON when running the ufs-weather-model cmake.

Also, regardless of cmake options, the HYCOM no longer uses -xSSE4.2 option that shouldn't be turned on in any WCOSS2 build.

One test, hafs_regional_storm_following_1nest_atm_ocn_wav, runs with an executable that uses the new compilation options. That means the results from that test will change. The default optimization options, and the actions of other -Dwhatever flags, are not changed at all. Therefore, no other tests should change results.

FV3 Compilation Option Change

The new FV3 compilation options use value-safe optimization via -fp-model precise, rather than -fp-model fast like the default FV3 optimization options. They are far faster because just about every optimization is turned on, other than the ones that are turned off automatically due to not being value safe. Most importantly, -fp-model consistent is turned off. That option is often misunderstood; the only place it serves any real purpose is Jet, a heterogeneous cluster, where it is possible to get the same results across several machines. The cost of that is a massive performance penalty, even with -fp-model fast. Any other floating-point consistency safety already comes in -fp-model precise.

Oh, what was that? You don't see -fp-model fast in the current optimization options?

It is on by default. The Intel compiler has four categories of fp-model options. One of them is the value safety vs. speed trade-off: precise, fast, and fast2. If you don't specify any of those options, you get -fp-model fast. The -fp-model consistent is another category, and really serves no purpose if fp-model precise is enabled. (If you're wondering, the often used -fp-model strict turns on several options, including -fp-model precise.)

These options speed up the HAFS by 12% on WCOSS2.

HYCOM Compilation Option Change

HYCOM uses AMD chips with the Intel compiler, and none of the Intel compiler settings do a great job of optimizing for AMD. The craype module already loads the best platform target settings. Specifying additional ones will slow down the model, or cause it to crash, if they have any effect at all.

This HYCOM compilation option change has only a small effect on the runtime since FV3 is the slow component in our current configuration.

Top of commit queue on: TBD

There are no sub-component PRs.

Input data additions/changes

No changes are expected to input data.
~~There will be new input data.~~
~~Input data will be updated.~~

Anticipated changes to regression tests:

Changes are expected to the following tests: all tests that include "hafs" in their name.

The HYCOM compilation options have changed, so all tests that use HYCOM will have new results. That includes all tests with "hafs" in their name, but no other tests.

Subcomponents involved:

Combined with PR's (If Applicable):

Commit Queue Checklist:

Link PR's from all sub-components involved
Confirm reviews completed in sub-component PR's
Add all appropriate labels to this PR.
Run full RT suite on either Hera/Cheyenne with both Intel/GNU compilers
Add list of any failed regression tests to "Anticipated changes to regression tests" section.

Linked PR's and Issues:

Fixes #1553

Testing Day Checklist:

This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
~~Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.~~

Testing Log (for CM's):

RDHPCS
- Intel
  - Hera
  - Orion
  - Jet
  - Gaea
  - Cheyenne
- GNU
  - Hera
  - Cheyenne
WCOSS2
- Dogwood/Cactus
- Acorn
CI
- Completed
opnReqTest
- N/A
- Log attached to comment

SamuelTrahanNOAA · 2022-12-31T17:54:55Z

It would be great if @BinLiu-NOAA @ZhanZhang-NOAA and @BijuThomas-NOAA could review this, as they're the HAFS developers that helped test these options on WCOSS2.

BrianCurtis-NOAA · 2023-01-01T17:43:27Z

It looks like you ran RT with new baselines. Since it does not look like baselines are changed can you please run against current baselines ./rt.sh -e > rt.out 2>&1 &. We would only expect the new RT to fail. You can attach that log to a comment instead of needing to push it to your repo if you'd like.

Also with any new RT, we need ORT to be run with ./opnReqTest -e -c thr,mpi,dcp,rst,bit,dbg,fhz -n hafs_regional_storm_following_1nest_atm_ocn_wav_supafast > ort.out 2>&1 & it will reject some of the tests as we don't support all ORT with regional applications. So you can keep removing those until it starts officially. Please put that log into a comment in this PR.

SamuelTrahanNOAA · 2023-01-01T18:48:12Z

It looks like you ran RT with new baselines. Since it does not look like baselines are changed can you please run against current baselines ./rt.sh -e > rt.out 2>&1 &. We would only expect the new RT to fail. You can attach that log to a comment instead of needing to push it to your repo if you'd like.

You are incorrect. I ran against old baselines by rsyncing the BL_DATE baseline directory into my directory. Only the new test has a new directory.

BrianCurtis-NOAA · 2023-01-01T21:48:36Z

It looks like you ran RT with new baselines. Since it does not look like baselines are changed can you please run against current baselines ./rt.sh -e > rt.out 2>&1 &. We would only expect the new RT to fail. You can attach that log to a comment instead of needing to push it to your repo if you'd like.

You are incorrect. I ran against old baselines by rsyncing the BL_DATE baseline directory into my directory. Only the new test has a new directory.

OK great. It didn't appear that way with your log, which confused me. Thanks.

BrianCurtis-NOAA · 2023-01-01T21:55:42Z

@jkbk2004 @DeniseWorthen Please review code before commit queue assignment.

jkbk2004 · 2023-01-01T23:56:00Z

So, performance impact is mostly expected on wcoss2, right?

SamuelTrahanNOAA · 2023-01-02T01:06:33Z

So, performance impact is mostly expected on wcoss2, right?

No. It'll be faster everywhere.

SamuelTrahanNOAA · 2023-01-02T14:51:30Z

Wasn't quite awake in my last post. There is no FV3 PR that I forgot to make. All of the build options are set at the ufs-weather-model level; nothing below that overrides them (I checked).

…lag for that

junwang-noaa · 2023-01-02T15:10:20Z

@SamuelTrahanNOAA Have your run the reproducibility tests with the fast compile options? Do we have run-to-run reproducibility? Will the results change with different threads? If I remember correctly, the hafs test does not reproduce with different MPI tasks or in restart runs yet.

SamuelTrahanNOAA · 2023-01-02T15:51:01Z

Have your run the reproducibility tests with the fast compile options?

I did, and so did other people.

Do we have run-to-run reproducibility?

Yes.

If I remember correctly, the hafs test does not reproduce with different MPI tasks

No, it still doesn't. Several people tried several approaches to get this working, but we've run out of time for the implementation.

or in restart runs yet

I don't even know how to set up a coupled wave-ocean-atmosphere restart test.

Will the results change with different threads?

I don't think this has been tested with any optimization options (or lack thereof) because the main focus was MPI task decomposition. Someone else would have to do this test on Dogwood, since Cactus is unusable at the moment, and Acorn is too small.

SamuelTrahanNOAA · 2023-01-02T15:55:16Z

The automatic github tests are failing due to an internal error:

cat: /home/runner/id_file: No such file or directory

I manually checked, and my repo is up-to-date with develop.

SamuelTrahanNOAA · 2023-01-02T15:59:56Z

Have your run the reproducibility tests with the fast compile options?

@junwang-noaa Are you asking me to run all of the tests in rt.conf this way? I don't even know if they'll work. Some may fail to build, getting the dreaded "internal compiler error," which happens more often at higher optimization levels.

My only target for this was the two operational HAFS configurations, due to lack of time; the model took 50% more resources than it should, but now it runs in its window with a minute or three to spare. If people want to use this for other configurations, they'll have to test it first.

I'll try an "all suites" build with the SUPAFAST=ON and see what happens.

DusanJovic-NOAA · 2023-01-11T18:44:35Z

I also ran #1558 branch code on Gaea and cpld_control_c192_p8 test passed against the current baseline.

jkbk2004 · 2023-01-11T18:45:18Z

Yeah, it's running on my side. Hopefully, we can finish today.

jkbk2004 · 2023-01-11T18:47:26Z

@SamuelTrahanNOAA I like to confirm one thing. this pr only changes affect hafs_regional_storm_following_1nest_atm_ocn_wav. no imapct on all other hafs tests, right?

jkbk2004 · 2023-01-11T19:49:01Z

@SamuelTrahanNOAA @BrianCurtis-NOAA All tests are done. the gaea issue was my mistake. All ok now. This pr affects only on hafs_regional_storm_following_1nest_atm_ocn_wav. We are ready to merge the pr.

BrianCurtis-NOAA

@SamuelTrahanNOAA @BrianCurtis-NOAA All tests are done. the gaea issue was my mistake. All ok now. This pr affects only on hafs_regional_storm_following_1nest_atm_ocn_wav. We are ready to merge the pr.

@jkbk2004 We haven't confirmed from @SamuelTrahanNOAA why the hafs tests don't fail comparing to previous baselines. We need that to continue.

BrianCurtis-NOAA · 2023-01-11T20:01:38Z

Compared on WCOSS2 to previous baselines and hafs_regional_atm_ocn, hafs_regional_atm_ocn_wav, hafs_regional_storm_following_1nest_atm_ocn, and hafs_regional_storm_following_1nest_atm_ocn_wav all do not compare sucessfully. They all required new baselines.

jkbk2004 · 2023-01-11T20:50:19Z

Compared on WCOSS2 to previous baselines and hafs_regional_atm_ocn, hafs_regional_atm_ocn_wav, hafs_regional_storm_following_1nest_atm_ocn, and hafs_regional_storm_following_1nest_atm_ocn_wav all do not compare sucessfully. They all required new baselines.

@BrianCurtis-NOAA cmake/Intel.cmake lines 23-27 same as develop branch. so all cases reproduce except hafs_regional_storm_following_1nest_atm_ocn_wav

BinLiu-NOAA · 2023-01-11T20:56:30Z

@jkbk2004, @BrianCurtis-NOAA, there is a HYCOM build option change taking out "-xSSE4.2". So some other HAFS HYCOM coupled RTs could also get result changing.

jkbk2004 · 2023-01-11T21:00:57Z

@jkbk2004, @BrianCurtis-NOAA, there is a HYCOM build option change taking out "-xSSE4.2". So some other HAFS HYCOM coupled RTs could also get result changing.

Let me confirm again with some hafs tests on hera. Weird thing is all other hafs cases not affacted.

DeniseWorthen · 2023-01-11T21:16:40Z

@BinLiu-NOAA Doesn't the removal of -xSSE4.2 only happen for Gaea? Meaning that testing on Hera won't tell you anything about how the tests will compare on Gaea?

BinLiu-NOAA · 2023-01-11T21:27:22Z

@BinLiu-NOAA Doesn't the removal of -xSSE4.2 only happen for Gaea? Meaning that testing on Hera won't tell you anything about how the tests will compare on Gaea?

@DeniseWorthen, the removal of "-xSSE4.2" is meant to happen to at least on WCOSS2 (maybe also on Hera/Orion/Jet). I think @SamuelTrahanNOAA made Gaea to keep using "-xSSE4.2" (since it failed on Gaea after taking it out).

DeniseWorthen · 2023-01-11T21:30:25Z

@BinLiu-NOAA Thanks, I guess I had it reversed!

SamuelTrahanNOAA · 2023-01-11T22:26:05Z

What is the status of this PR?

BinLiu-NOAA · 2023-01-11T22:35:07Z

@SamuelTrahanNOAA, I think we are waiting for @jkbk2004 to double check if the HAFS HYCOM coupling RTs also got result changing on Hera (which may or may not be the case). But, otherwise, it looks all the regression tests have gone through.

jkbk2004 · 2023-01-11T22:53:53Z

Confirms hafs_regional_atm_ocn is not changing on hera. So all hafs cases on hera/orion/gaea/jet/cheyenne same results except hafs_regional_storm_following_1nest_atm_ocn_wav. So, we can move one at least if "-xSSE4.2" handling behaves as expected on wcoss2. @BinLiu-NOAA what do you think?

BinLiu-NOAA · 2023-01-11T22:58:07Z

@jkbk2004, @BrianCurtis-NOAA, @SamuelTrahanNOAA, we would like to remove using "-xSSE4.2" option for all platforms (except on Gaea, because it fails). Even if it turns out to change results, we can just accept the fact of result changing for those HAFS-HYCOM coupled RTs. Thanks!

jkbk2004 · 2023-01-11T23:02:34Z

I agree. we can investigate more across machines. I will create an issue.

jkbk2004 · 2023-01-11T23:03:17Z

@BrianCurtis-NOAA let's go ahead to merge.

BinLiu-NOAA · 2023-01-11T23:11:08Z

I agree. we can investigate more across machines. I will create an issue.

@jkbk2004, Thanks for confirming all HAFS HYCOM coupling RTs (except the one except hafs_regional_storm_following_1nest_atm_ocn_wav which modified to use the FASTER build option) can reproduce the previous baseline results on hera/orion/gaea/jet/cheyenne. And the HAFS HYCOM RTs are expected to change results on WCOSS2 (since on WCOSS2 we are not recommended to this -xSSE4.2 option). And that's why we are taking out this -xSSE4.2 option in this PR. So, for me, all the result changing the corresponding HAFS RTs are expected/understandable. Thus, I do not see any issue to be concerned/created. Thanks!

jkbk2004 · 2023-01-11T23:30:43Z

created an issue #1565

SamuelTrahanNOAA added 5 commits December 22, 2022 23:58

SUPAFAST option sets build flags ideal for HAFS

23c9a79

add test for HAFS with -DSUPAFAST=ON

3775f0f

Merge remote-tracking branch 'community/develop' into merge-supafast

4b35d10

update supafast test

66f096b

bug fixes, and hera.intel passed

4350554

BrianCurtis-NOAA added the No Baseline Change No Baseline Change label Jan 1, 2023

BrianCurtis-NOAA added the Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. label Jan 1, 2023

BrianCurtis-NOAA requested review from BrianCurtis-NOAA, DeniseWorthen and jkbk2004 January 1, 2023 21:53

BrianCurtis-NOAA removed the Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. label Jan 1, 2023

BrianCurtis-NOAA approved these changes Jan 1, 2023

View reviewed changes

BinLiu-NOAA reviewed Jan 2, 2023

View reviewed changes

Comment thread HYCOM-interface/CMakeLists.txt Outdated

BinLiu-NOAA reviewed Jan 2, 2023

View reviewed changes

Comment thread HYCOM-interface/CMakeLists.txt Outdated

BinLiu-NOAA reviewed Jan 2, 2023

View reviewed changes

Comment thread cmake/Intel.cmake

BinLiu-NOAA reviewed Jan 2, 2023

View reviewed changes

Comment thread tests/rt.conf

SamuelTrahanNOAA added 2 commits January 2, 2023 14:52

remove _supafast test

85f9fd8

turn off bad -x option in HYCOM by default; do not require SUPAFAST f…

609f921

…lag for that

jkbk2004 added 2 commits January 11, 2023 14:41

turn off regional_atmaq_debug on gaea: time limit

3e308e7

add gaea.intel RT log: passed

0348070

BrianCurtis-NOAA requested changes Jan 11, 2023

View reviewed changes

jkbk2004 self-requested a review January 11, 2023 23:31

jkbk2004 approved these changes Jan 11, 2023

View reviewed changes

BrianCurtis-NOAA approved these changes Jan 12, 2023

View reviewed changes

jkbk2004 merged commit de1546f into ufs-community:develop Jan 12, 2023

BrianCurtis-NOAA mentioned this pull request Jan 30, 2023

RT Failure using only -n in rt.sh still causes RT to pass. #1416

Closed

grantfirl mentioned this pull request Mar 3, 2025

Introduce prognostic updraft velocity to saSAS and C3 convection schemes #2567

Merged

14 tasks

DusanJovic-NOAA mentioned this pull request Oct 8, 2025

Long initialization times for GFS forecasts #2831

Closed

DusanJovic-NOAA mentioned this pull request Oct 17, 2025

Expand support for FASTER build option to more subcomponents #2943

Closed

Conversation

SamuelTrahanNOAA commented Dec 31, 2022 • edited by jkbk2004 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

FV3 Compilation Option Change

HYCOM Compilation Option Change

Top of commit queue on: TBD

Input data additions/changes

Anticipated changes to regression tests:

Subcomponents involved:

Combined with PR's (If Applicable):

Commit Queue Checklist:

Linked PR's and Issues:

Testing Day Checklist:

Testing Log (for CM's):

Uh oh!

SamuelTrahanNOAA commented Dec 31, 2022

Uh oh!

BrianCurtis-NOAA commented Jan 1, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 1, 2023

Uh oh!

BrianCurtis-NOAA commented Jan 1, 2023

Uh oh!

BrianCurtis-NOAA commented Jan 1, 2023

Uh oh!

jkbk2004 commented Jan 1, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 2, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

SamuelTrahanNOAA commented Jan 2, 2023

Uh oh!

junwang-noaa commented Jan 2, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 2, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 2, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 2, 2023

Uh oh!

DusanJovic-NOAA commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

BrianCurtis-NOAA left a comment

Choose a reason for hiding this comment

Uh oh!

BrianCurtis-NOAA commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

BinLiu-NOAA commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

DeniseWorthen commented Jan 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BinLiu-NOAA commented Jan 11, 2023

Uh oh!

DeniseWorthen commented Jan 11, 2023

Uh oh!

SamuelTrahanNOAA commented Jan 11, 2023

Uh oh!

BinLiu-NOAA commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

BinLiu-NOAA commented Jan 11, 2023

Uh oh!

jkbk2004 commented Jan 11, 2023

Uh oh!

SamuelTrahanNOAA commented Dec 31, 2022 •

edited by jkbk2004

Loading

DeniseWorthen commented Jan 11, 2023 •

edited

Loading