CCPP acceptance: fv3_stochy / fv3_ccpp stochy bit-for-bit identical #205

Merged: climbfuji merged 3 commits into NCAR:master from climbfuji:stochy_bitforbit_prod on Feb 14, 2019

climbfuji (Collaborator) commented Feb 7, 2019

These PRs reduce the optimization of a particular routine deep inside the stochastic physics code to obtain bit-for-bit identical results of the (fv3_control based) regression tests fv3_stochy / fv3_ccpp_stochy (static build) in PROD mode.

climbfuji (Collaborator, Author) commented Feb 8, 2019

Standard regression tests (Theia, Intel 18, REPRO) all pass as expected.

rt_ccpp_hybrid.log
rt_ccpp_ref_create.log
rt_ccpp_standalone.log
rt_ccpp_static.log
rt_full.log

climbfuji (Collaborator, Author) commented

CCPP acceptance tests passed/failed as expected. List of failing tests (one fewer than before):

fv3_ccpp_stretched
fv3_ccpp_stretched_nest
fv3_ccpp_regional_control
fv3_ccpp_regional_restart
fv3_ccpp_regional_quilt
fv3_ccpp_regional_c768
fv3_ccpp_control_debug
fv3_ccpp_stretched_nest_debug
fv3_ccpp_gfdlmp
fv3_ccpp_csawmgshoc
fv3_ccpp_csawmg3shoc127
fv3_ccpp_csawmg
fv3_ccpp_gfdlmp_32bit
fv3_ccpp_cpt

rt_ccpp_ref_for_acceptance_create.log
rt_ccpp_static_for_acceptance.log

climbfuji (Collaborator, Author) commented

Ready to merge!

llpcarson (Contributor) left a comment

approved

grantfirl (Collaborator) left a comment

All I could find were whitespace changes for get_stochy_pattern.F90 (correct?). So, the !DIR$ OPTIMIZE:1 is what lowers optimization for one subroutine within a file? I didn't even know that was possible. Cool.

climbfuji (Collaborator, Author) commented

Yes, !DIR$ OPTIMIZE:N is an Intel-specific compiler directive, ignored by other compilers. It lowers the optimization for this routine only, not for any of the following/preceding subroutines, and also not for any "contained" subroutines. N can be 0,1,2 and I believe even higher. You can also say !DIR$ NOOPTIMIZE instead of !DIR$ OPTIMIZE:0.
These directives are a little confusing, because they work differently. Another one is !DIR$ NOFMA, which disables fused-multiply-adds (FMAs) that come with AVX2 from the point in the file where the directive is found until a !DIR$ FMA is detected. (just for your info)
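
For illustration, a minimal sketch of how the per-routine directive described above might be placed. The module and routine names (`opt_demo`, `sum_squares`, `rescale`) are hypothetical and are not taken from get_stochy_pattern.F90. The sketch assumes the directive goes at the top of the routine it should affect. Because the line starts with `!`, a compiler that does not recognize it can treat it as an ordinary comment.

```fortran
! Minimal sketch: lowering optimization for a single routine.
! All names below are hypothetical and used for illustration only.
module opt_demo
   use iso_fortran_env, only: real64
   implicit none
contains

   ! Routine that should be compiled at a reduced optimization level.
   ! Assumption: the directive sits at the top of the routine it targets.
   subroutine sum_squares(x, total)
!DIR$ OPTIMIZE:1
      real(real64), intent(in)  :: x(:)
      real(real64), intent(out) :: total
      integer :: i
      total = 0.0_real64
      do i = 1, size(x)
         total = total + x(i) * x(i)
      end do
   end subroutine sum_squares

   ! No directive here: this routine is compiled at whatever
   ! optimization level was requested on the command line.
   subroutine rescale(x, factor)
      real(real64), intent(inout) :: x(:)
      real(real64), intent(in)    :: factor
      x = x * factor
   end subroutine rescale

end module opt_demo

program main
   use iso_fortran_env, only: real64
   use opt_demo
   implicit none
   real(real64) :: x(4), total
   x = [1.0_real64, 2.0_real64, 3.0_real64, 4.0_real64]
   call rescale(x, 0.5_real64)
   call sum_squares(x, total)
   print *, total
end program main
```

Keeping the directive inside the one routine that needs it leaves the rest of the file at the PROD optimization level, which appears to be the intent of this PR.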

climbfuji merged commit 154ce4b into NCAR:master on Feb 14, 2019
climbfuji deleted the stochy_bitforbit_prod branch on June 27, 2022