LANL machine files: add badger, remove wolf, pinto #240
|
You can see recent test results here, https://github.com/CICE-Consortium/Test-Results/wiki/icepack_by_hash for comparison. I don't think those are failing elsewhere. Looks like maybe something related to "1x1" build/run or something in serial infrastructure? |
|
All of the Icepack tests are serial, so it's not that. The question is whether I try to debug this now, or if we just go with it and I'll try to figure out what's going on later. My CICE base_suite appears to have run fine. |
|
Of course, everything is 1x1. So both the initial run and the restart run complete, and then the restart comparison fails? Is it worth trying those tests again to see if it was just a hiccup of some kind? I'm OK moving forward with the Icepack release as is. I think the CICE testing is more important in some ways. But we should figure this out if it's repeatable. Also OK holding things up a bit if you want to look into it. |
|
The runs completed and they really are different. For sanity's sake, could you re-run these 3 tests? I'll do the same. |
|
I'm running these tests now on conrad, more soon. |
|
It might be an initialization issue. I turned on the debug option for the tests, and they all passed. I think debug sets -g, which initializes everything to 0. This issue could have appeared when I removed the initializations from the tracer index query routine in #230, and maybe the other machines initialize stuff differently from what badger is doing. Pesky little beast (I saw one here in NM a few weeks ago, actually a pretty amusing critter to watch). Maybe I can find the right place to initialize these indices... |
|
I ran the three tests on conrad on four compilers, all pass. |
|
One of the flags was not initialized with all the rest, but that wasn't the problem. I'll run another base_suite to make sure that change is okay. The test failures were occurring due to -O2, so I backed off to -O1. This is devilishly difficult to debug, so I'll make an issue and maybe we'll figure it out eventually. |
|
Great! |
|
Feel free to merge when you're ready. |
* move write stmts, cleanup unused vars
* replace sil_data_type and nit_data_type with bgc_data_type
* documentation
* use ocn_data_type for SSS
Updated machine files for LANL institutional computing.
This is not ready to merge yet -- see test failures below.
Developer(s): E. Hunke
Please suggest code Pull Request reviewers in the column at right.
Are the code changes bit for bit, different at roundoff level, or more substantial? BFB
To verify that this PR passes the initial QC tests, please include the link to test results
or paste in below the summary block from the bottom of the testing output.
Does this PR create or have dependencies on CICE or any other models?
Is the documentation being updated with this PR? (Y/N) no
If not, does the documentation need to be updated separately at a later time? (Y/N) no?
I don't think we list specific machines in the documentation, other than as examples for how to run various tests, right?
Note: "Documentation" includes information on the wiki and .rst files in doc/source/,
which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.
Unfortunately I don't have a pinto or wolf baseline using the code after the latest nonBFB mods, so there's no comparison across machines. I'm getting 3 failures in the Icepack base_suite. Is this failing for anyone else, or is it due to this new machine I'm using? These are all restart failures, and not all restarts are failing.
#---
PASS badger_intel_restart_col_1x1_pondcesm build
PASS badger_intel_restart_col_1x1_pondcesm initialrun
PASS badger_intel_restart_col_1x1_pondcesm run
FAIL badger_intel_restart_col_1x1_pondcesm test
#---
PASS badger_intel_restart_col_1x1_pondtopo build
PASS badger_intel_restart_col_1x1_pondtopo initialrun
PASS badger_intel_restart_col_1x1_pondtopo run
FAIL badger_intel_restart_col_1x1_pondtopo test
#---
PASS badger_intel_restart_col_1x1_dyn build
PASS badger_intel_restart_col_1x1_dyn initialrun
PASS badger_intel_restart_col_1x1_dyn run
FAIL badger_intel_restart_col_1x1_dyn test
88 of 91 tests PASSED
3 of 91 tests FAILED
0 of 91 tests PENDING
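For anyone re-running these tests and wanting to pull the failures out of a long results listing, here is a minimal Python sketch. It assumes each result line has the three-field shape shown above (`STATUS test_name stage`) and skips everything else, including the `#---` separators and the summary counts. The function name `failed_tests` and the `SAMPLE` string are hypothetical and are not part of the Icepack scripts.

```python
from collections import defaultdict

# Lines copied from the summary block above; in practice this text would
# come from the testing output instead of a hard-coded string.
SAMPLE = """\
#---
PASS badger_intel_restart_col_1x1_pondcesm build
PASS badger_intel_restart_col_1x1_pondcesm initialrun
PASS badger_intel_restart_col_1x1_pondcesm run
FAIL badger_intel_restart_col_1x1_pondcesm test
#---
PASS badger_intel_restart_col_1x1_pondtopo build
PASS badger_intel_restart_col_1x1_pondtopo initialrun
PASS badger_intel_restart_col_1x1_pondtopo run
FAIL badger_intel_restart_col_1x1_pondtopo test
#---
PASS badger_intel_restart_col_1x1_dyn build
PASS badger_intel_restart_col_1x1_dyn initialrun
PASS badger_intel_restart_col_1x1_dyn run
FAIL badger_intel_restart_col_1x1_dyn test
88 of 91 tests PASSED
3 of 91 tests FAILED
0 of 91 tests PENDING
"""


def failed_tests(text):
    """Return {test_name: [failed stages]} for every test with a FAIL line."""
    failures = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines shaped like "<STATUS> <test_name> <stage>";
        # separators and the "N of M tests ..." summary lines are skipped.
        if len(parts) != 3 or parts[0] not in ("PASS", "FAIL"):
            continue
        status, name, stage = parts
        if status == "FAIL":
            failures[name].append(stage)
    return dict(failures)


if __name__ == "__main__":
    for name, stages in failed_tests(SAMPLE).items():
        print(name, "failed at:", ", ".join(stages))
```

Grouping by test name rather than counting lines matches the summary, which counts tests (91) and not individual stages.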