update izumi and implement intel20 workaround #460
eclare108213
left a comment
I am approving this but it needs to be put into an issue for further work.
ice_transport_remap seems to have persistent seg fault issues, but they appear in different places; there's a comment about one of the omp directives seg faulting, and in the past, I've had to unroll a loop in the transport (I no longer remember which one) in order for optimization to not create a seg fault. Is there a particular (set of) variable(s) that need to be allocated, or is it really all of them? Is there a reason to not allocate here all the time? Would it help to move to a vector version of the transport (e.g. the new unstructured-grid code in MPAS)?
@CICE-Consortium/devteam (or anyone reading this):
I can't say for sure whether all the arrays need to be allocated to work around this issue. I'm pretty sure that's not the case, but I did try to pin down whether one or a few arrays were causing the problem, and I couldn't find any single array that created the error on its own. My sense is that it's a memory issue, not a problem with a particular array declaration or usage. I played around a lot with the argument declarations, local variable declarations, and source code. I even tried turning off all the code in the subroutine while leaving all the declarations in place, degrading the interface down to no arguments, and many other things to isolate the issue. I also tried a number of compiler flags to try to improve the memory management. The error was surprisingly robust and came down to these local arrays. What I'll do is merge this PR and start testing across a wider set of machines. I'll also try to find an intel20 compiler on another machine to see whether the problem shows up there too. I think that's a very good idea. Once we have a bit more information, we can decide what the next steps should be.
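For anyone reading this later, here is a minimal, standalone sketch of the kind of change being discussed: a local work array that is either an automatic array (sized by a dummy argument, which compilers commonly place on the stack) or an explicitly allocated array, selected with a preprocessor macro. This is not the code in ice_transport_remap. The macro name `USE_ALLOCATED_WORK`, the module, and the function are hypothetical and chosen only for illustration.

```fortran
! Illustrative sketch only; not taken from CICE.
! Needs preprocessing (e.g. a .F90 file extension).
! USE_ALLOCATED_WORK is a hypothetical macro name, not the one used in the PR.
module work_demo
  implicit none
  integer, parameter :: dbl_kind = selected_real_kind(13)
contains

  ! Returns the sum of i**2 for i = 1..n, using a local work array.
  function sum_squares(n) result(total)
    integer, intent(in) :: n
    real(kind=dbl_kind) :: total
    integer :: i
#ifdef USE_ALLOCATED_WORK
    ! Workaround form: local array allocated explicitly at run time.
    real(kind=dbl_kind), allocatable :: work(:)
    allocate(work(n))
#else
    ! Original form: automatic array sized by the dummy argument n.
    real(kind=dbl_kind) :: work(n)
#endif

    do i = 1, n
      work(i) = real(i, kind=dbl_kind)**2
    end do
    total = sum(work)

#ifdef USE_ALLOCATED_WORK
    deallocate(work)
#endif
  end function sum_squares

end module work_demo

program demo
  use work_demo
  implicit none
  real(kind=dbl_kind) :: s

  s = sum_squares(10)
  print *, s
  ! Both variants should give the same answer: 1 + 4 + ... + 100 = 385.
  if (abs(s - 385.0_dbl_kind) > 1.0e-12_dbl_kind) error stop 'unexpected result'
end program demo
```

Building once with and once without `-DUSE_ALLOCATED_WORK` exercises both variants. The results are identical by construction; only where the work array lives in memory changes, which is consistent with the suspicion above that the failure is a memory issue rather than a problem with any one array.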
PR checklist
Short (1 sentence) summary of your PR:
Update izumi and implement intel20 workaround
Developer(s):
apcraig
Suggest PR reviewers from list in the column to the right.
Please copy the PR test results link or provide a summary of testing completed below.
Full test suite run on izumi
- believe this is a compiler bug
- recommend this be reviewed further and hopefully removed at a later time
- have verified results are reproducible run-to-run
- have reviewed restart read/write diagnostic, they look fine
- requires more debugging
How much do the PR code changes differ from the unmodified code?
Does this PR create or have dependencies on Icepack or any other models?
Does this PR add any new test cases?
Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
Please provide any additional information or relevant details below:
Updated izumi due to a machine upgrade; the old compilers are no longer available.
Added a feature to turn on machine/compiler-dependent cpp definitions via the machine files in the scripts.
Fixed an nt_zbgc_frac initialization error (thought we fixed this before).
Updated icepack; this includes a fix from "Izumi updates" (Icepack#321) that is needed to pass some debug mode tests.