Fix QC test, fix bug in history time axis, fix history output averaging for timestep output#624
Conversation
- Fix bug in history time axis when sec_init is not zero. - Fix issue with time_beg and time_end uninitialized values. - Add support for averaging with histfreq='1' by allowing histfreq_n to be any value in that case. Extend and clean up construct_filename for history files. More could be done, but wanted to preserve backwards compatibility. - Add new calendar_sec2hms to converts daily seconds to hh:mm:ss. Update the calchk calendar unit tester to check this method - Remove abort test in bcstchk, this was just causing problems in regression testing - Remove known problems documentation about problems writing when istep=1. This issue does not exist anymore with the updated time manager. - Add new tests with hist_avg = false. Add set_nml.histinst.
- Add netcdf ststus checks and aborts in ice_read_write.F90 - Check for end of file when reading records in ice_read_write.F90 for ice_read_nc methods - Update set_nml.qc to better specify the test, turn off leap years since we're cycling 2005 data - Add check in c ice.t-test.py to make sure there is at least 1825 files, 5 years of data - Add QC run to base_suite.ts to verify qc runs to completion and possibility to use those results directly for QC validation - Clean up error messages and some indentation in ice_read_write.F90
- Add prod suite including 10 year gx1prod and qc test - Update unit test compare scripts
dabail10
left a comment
There was a problem hiding this comment.
This looks good to go. I just had a few questions about the averaging. Doesn't have to be added here.
| time_beg(ns) = real(time_beg(ns),kind=real_kind) | ||
| endif | ||
| endif | ||
| if (avgct(ns) == c1) then |
There was a problem hiding this comment.
So this just initializes timedbl and time_beg at the beginning of the averaging interval?
There was a problem hiding this comment.
correct. and timedbl is just a temporary local to make going from double to single a little cleaner. time_beg is now initialized whenever hist_avg is on because we should assume it might be written to the history file when hist_avg is on, even for histfreq='1' and histfreq_n=1. This moves away from treating histfreq='1' and histfreq_n=1 as special cases which allows for averaging across timesteps (histfreq='1') and allows a user to cleanly set hist_avg to true even if histfreq_n=1.
|
|
||
| do ns = 1,nstreams | ||
| if (.not. hist_avg .or. histfreq(ns) == '1') then ! write snapshots | ||
| if (.not. hist_avg) then ! write snapshots |
There was a problem hiding this comment.
At some point, I would like hist_avg to be an array of size nstreams. Then we can control the instantaneous versus average on a stream by stream basis.
There was a problem hiding this comment.
That's a good idea, I will make sure that's part of an issue.
|
I just merged master into this branch and testing looks good, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#83068c7a3802948d3d89e36b6c871aa155afb684. The couple failures are expected and either cases still running or 10 year gx1prod tests I just added that will fail until we fix the leap year / cycling issue. |
phil-blain
left a comment
There was a problem hiding this comment.
Hi Tony, thanks a lot for putting in these fixes. I left a few comments/questions below, but in general this looks good.
I had started to fix the hist_avg with histfreq=1 situation myself, and I checked your code against mine just to see if we were doing the same thing. What you propose is more complete than what I had managed to code up (I was dragged into other things). If you are interested, my branch is here: master...phil-blain:hist-averaging-multiple-dt.
Have a good week-end! :)
| endif | ||
| if (avgct(ns) == c1) then | ||
| timedbl = (timesecs-dt)/(secday) | ||
| time_beg(ns) = real(timedbl,kind=real_kind) |
There was a problem hiding this comment.
we could do without timedbl and just do this, no ?
time_beg(ns) = real((timesecs-dt)/(secday), kind=real_kind)Is there any reason that I'm missing why we need a separate variable ?
There was a problem hiding this comment.
We absolutely could implement what you propose.
| if (histfreq(ns) == '1' .and. histfreq_n(ns) == 1) then ! timestep | ||
| write(ncfile,'(a,a,i4.4,a,i2.2,a,i2.2,a,i5.5,a,a)') & | ||
| history_file(1:lenstr(history_file))//trim(cstream),'_inst.', & | ||
| iyear,'-',imonth,'-',iday,'-',msec,'.',trim(suffix) |
There was a problem hiding this comment.
I'm not sure I understand why we need a separate if branch for this first case. Is it not covered by the
else ! instantaneousline at the end of the subroutine ?
I guess the idea is that someone might have hist_avg = true and histfreq =1 and histfreq_n = 1, and in this case we ignore hist_avg (since there is nothing to average) and write instantaneous output instead ?
I'm wondering if it might be more useful to catch that early in ice_init and reset hist_avg to false in that case.. That would simplify the code here.
There was a problem hiding this comment.
With multiple history streams and a single hist_avg variable, we do not want to change hist_avg to false if one of the streams has histfreq='1' and histfreq_n=1. A future goal would be to make hist_avg an array as Dave suggested then we have more options.
| @@ -1,5 +1,5 @@ | |||
| histfreq = 'm','d','1','h','x' | |||
| histfreq_n = 1,2,6,4,1 | |||
| histfreq_n = 1,2,6,4,5 | |||
There was a problem hiding this comment.
what is the rationale here ? 'x' already sets this stream to off, no ? we would want 'y' somewhere in histfreq to test all 5 streams, but we would need a multi-year test...
| cstream = '' | ||
| !echmod ! this was implemented for CESM but it breaks post-processing software | ||
| !echmod ! of other groups (including RASM which uses CESMCOUPLED) | ||
| !echmod if (ns > 1) write(cstream,'(i1.1)') ns-1 |
There was a problem hiding this comment.
these lines (setting an empty cstream, the following comment and the commented line if (ns > 1) write(cstream,'(i1.1)') ns-1 ) dates all the way back to our initial commit 7d740de (initial commit of CICE without icepack to git from LANL consortium branch r1229 May 24, 2017, 2017-05-24). Maybe it would be time to just remove cstream ?
There was a problem hiding this comment.
I thought about removing the cstream but they aren't hurting anyone and I wasn't sure whether maybe some coupled system was using them, so I left them in. We could/should think about removing it, but we'd want to ask outside users whether they are using it.
| if (hist_avg) then | ||
| if (histfreq(ns) == '1' .or. histfreq(ns) == 'h'.or.histfreq(ns) == 'H') then | ||
| ! do nothing | ||
| elseif (new_year) then | ||
| iyear = iyear - 1 | ||
| imonth = 12 | ||
| iday = daymo(imonth) | ||
| elseif (new_month) then | ||
| imonth = mmonth - 1 | ||
| iday = daymo(imonth) | ||
| elseif (new_day) then | ||
| iday = iday - 1 | ||
| endif | ||
| endif |
There was a problem hiding this comment.
When I looked at this code in the context of #551, I did not readily understand why we have to do those substractions here, and why we do not have to do any if histfreq = 1 or h...
Is it because the file names for averaged data are dated at the start of the averaging period ? I think a code comment explaining that would not hurt :)
There was a problem hiding this comment.
This is all part of a rather complex interaction between dates and history frequencies with a certain amount of assumption built-in. Basically, if you are doing annual, monthly, or daily averages, the history will be written on the "new_flag". At that point, you're in the next year, month, or day. So you subtract "1" so when you construct the filename, the name is associated with the year, month, or day being averaged. Of course, all this comes unglued if the histfreq_n is not 1 in terms of the filename. The instantaneous stuff and histfreq='1' just writes out a file with the current date in it because there isn't a general way to create a meaningful filename. Most of what I did tried to preserve what's there and expand how it works. There are many things that could be further improved.
| smoke gx3 1x4 debug,diag1,run2day | ||
| smoke gx3 4x1 debug,diag1,run5day | ||
| restart gx3 8x2 debug | ||
| smoke gx1 64x1 qc,medium |
There was a problem hiding this comment.
ah, seems you changed your mind :)
| @@ -0,0 +1,3 @@ | |||
| # Test Grid PEs Sets BFB-compare | |||
| smoke gx1 64x1 qc,medium | |||
| smoke gx1 64x2 gx1prod,long,run10year No newline at end of file | |||
There was a problem hiding this comment.
there is a missing newline at EOF here
| ! yearmax = 1000 | ||
| yearmax = 100000 |
There was a problem hiding this comment.
here again, this could be folded into "Fix QC issues"
| smoke gx3 4x2 diag1,run5day smoke_gx3_8x2_diag1_run5day | ||
| smoke gx3 4x1 diag1,run5day,thread smoke_gx3_8x2_diag1_run5day | ||
| smoke gx3 1x8 diag1,run5day,evp1d smoke_gx3_8x2_diag1_run5day | ||
| smoke gx3 1x8 diag1,run5day,evp1d |
There was a problem hiding this comment.
what is the justification for removing the b4b compare with smoke_gx3_8x2_diag1_run5day ?
There was a problem hiding this comment.
Because it fails and is expected to fail. We would have to turn on the debug flag for the baseline and test to have a chance of bit-for-bit and that would not happen for all compilers. After watching it fail a few times, I decided to just remove it. We could/should consider adding a debug version of the test though.
DeniseWorthen
left a comment
There was a problem hiding this comment.
Changes look good. I've confirmed that it fixes our issue of the time-axis being set correctly when secs_init is not 0.
* ice_history_write: fix initial condition metadata under 'hist_avg'
When writing averaged history outputs (hist_avg=.true.), this setting
also affects the initial condition. Even if the actual data variables
written to the initial condition are not averaged (they are taken
more or less directly from the restart or the hard-coded defaults,
modulo aggregation over categories), their attributes ('cell_method' and
'time_rep') imply they are averaged, and the 'bound' attribute of the
'time' variable refers to the 'time_bounds' variable.
Make the metadata of the initial condition more correct by:
- not writing the 'time_bounds' variable (and the corresponding 'd2' dimension)
- not writing the 'bounds' attribute of the 'time' variable
- not writing the 'cell_method' attributes of each variable
- writing the 'time_rep' attribute of each variable as 'instantaneous'
instead of 'averaged'.
Do this by checking 'write_ic' at all places where we check for the
value of 'hist_avg' to write the above variables and attributes in each
of the 3 IO backends (binary, netcdf, pio2).
* drivers/{nemo_concepts,standalone}: write initial condition at initial time
In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep
before writing the initial condition, such that the 'time' variable in
the initial condition is not zero; it has a value of 1*dt (the model
time step). The initial condition filename also reflects this, since
'msec' (model seconds) also has a value of 1*dt and is used in
ice_history_shared::construct_filename. This leads to the initial
condition filename not corresponding to the model initialization
date/time but rather 1*dt later.
Since we call 'accum_hist' after initializing the forcing, any forcing
field written to the initial condition has values corresponding to
msec=dt, whereas the ice state corresponds to msec=0, leading to an
inconsistency.
Fix that by calling 'accum_hist' to write the initial condition _before_
calling 'advance_timestep'. Since we now call 'accum_hist' before
initializing the forcing, any forcing field written to the initial
condition have its default, hard-coded value, instead of its value at
time=dt. An improvement would be to read the forcing at time=dt, write
the initial condition, advance the time step, and read the forcing
again, but let's not complicate things too much for now.
(cherry picked from commit 9a55ad9)
Cherry-pick notes:
There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due
to 26d917a (Fix QC test, fix bug in history time axis, fix history
output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that
commit enabled averaging over multiple time steps and thus removed the
assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had
to add additional changes to 'ice_history_write' in the conditions that
are checked before writing the NetCDF attributes since we do not yet
have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge
cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715),
2022-05-10).
Closes: https://gitlab.science.gc.ca/concepts/CICE/issues/19
* ice_history_write: fix initial condition metadata under 'hist_avg'
When writing averaged history outputs (hist_avg=.true.), this setting
also affects the initial condition. Even if the actual data variables
written to the initial condition are not averaged (they are taken
more or less directly from the restart or the hard-coded defaults,
modulo aggregation over categories), their attributes ('cell_method' and
'time_rep') imply they are averaged, and the 'bound' attribute of the
'time' variable refers to the 'time_bounds' variable.
Make the metadata of the initial condition more correct by:
- not writing the 'time_bounds' variable (and the corresponding 'd2' dimension)
- not writing the 'bounds' attribute of the 'time' variable
- not writing the 'cell_method' attributes of each variable
- writing the 'time_rep' attribute of each variable as 'instantaneous'
instead of 'averaged'.
Do this by checking 'write_ic' at all places where we check for the
value of 'hist_avg' to write the above variables and attributes in each
of the 3 IO backends (binary, netcdf, pio2).
* drivers/{nemo_concepts,standalone}: write initial condition at initial time
In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep
before writing the initial condition, such that the 'time' variable in
the initial condition is not zero; it has a value of 1*dt (the model
time step). The initial condition filename also reflects this, since
'msec' (model seconds) also has a value of 1*dt and is used in
ice_history_shared::construct_filename. This leads to the initial
condition filename not corresponding to the model initialization
date/time but rather 1*dt later.
Since we call 'accum_hist' after initializing the forcing, any forcing
field written to the initial condition has values corresponding to
msec=dt, whereas the ice state corresponds to msec=0, leading to an
inconsistency.
Fix that by calling 'accum_hist' to write the initial condition _before_
calling 'advance_timestep'. Since we now call 'accum_hist' before
initializing the forcing, any forcing field written to the initial
condition have its default, hard-coded value, instead of its value at
time=dt. An improvement would be to read the forcing at time=dt, write
the initial condition, advance the time step, and read the forcing
again, but let's not complicate things too much for now.
(cherry picked from commit 9a55ad9)
Cherry-pick notes:
There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due
to 26d917a (Fix QC test, fix bug in history time axis, fix history
output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that
commit enabled averaging over multiple time steps and thus removed the
assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had
to add additional changes to 'ice_history_write' in the conditions that
are checked before writing the NetCDF attributes since we do not yet
have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge
cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715),
2022-05-10).
* ice_history_write: fix initial condition metadata under 'hist_avg'
When writing averaged history outputs (hist_avg=.true.), this setting
also affects the initial condition. Even if the actual data variables
written to the initial condition are not averaged (they are taken
more or less directly from the restart or the hard-coded defaults,
modulo aggregation over categories), their attributes ('cell_method' and
'time_rep') imply they are averaged, and the 'bound' attribute of the
'time' variable refers to the 'time_bounds' variable.
Make the metadata of the initial condition more correct by:
- not writing the 'time_bounds' variable (and the corresponding 'd2' dimension)
- not writing the 'bounds' attribute of the 'time' variable
- not writing the 'cell_method' attributes of each variable
- writing the 'time_rep' attribute of each variable as 'instantaneous'
instead of 'averaged'.
Do this by checking 'write_ic' at all places where we check for the
value of 'hist_avg' to write the above variables and attributes in each
of the 3 IO backends (binary, netcdf, pio2).
* drivers/{nemo_concepts,standalone}: write initial condition at initial time
In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep
before writing the initial condition, such that the 'time' variable in
the initial condition is not zero; it has a value of 1*dt (the model
time step). The initial condition filename also reflects this, since
'msec' (model seconds) also has a value of 1*dt and is used in
ice_history_shared::construct_filename. This leads to the initial
condition filename not corresponding to the model initialization
date/time but rather 1*dt later.
Since we call 'accum_hist' after initializing the forcing, any forcing
field written to the initial condition has values corresponding to
msec=dt, whereas the ice state corresponds to msec=0, leading to an
inconsistency.
Fix that by calling 'accum_hist' to write the initial condition _before_
calling 'advance_timestep'. Since we now call 'accum_hist' before
initializing the forcing, any forcing field written to the initial
condition have its default, hard-coded value, instead of its value at
time=dt. An improvement would be to read the forcing at time=dt, write
the initial condition, advance the time step, and read the forcing
again, but let's not complicate things too much for now.
(cherry picked from commit 9a55ad9)
Cherry-pick notes:
There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due
to 26d917a (Fix QC test, fix bug in history time axis, fix history
output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that
commit enabled averaging over multiple time steps and thus removed the
assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had
to add additional changes to 'ice_history_write' in the conditions that
are checked before writing the NetCDF attributes since we do not yet
have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge
cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715),
2022-05-10).
* ice_history_write: fix initial condition metadata under 'hist_avg'
When writing averaged history outputs (hist_avg=.true.), this setting
also affects the initial condition. Even if the actual data variables
written to the initial condition are not averaged (they are taken
more or less directly from the restart or the hard-coded defaults,
modulo aggregation over categories), their attributes ('cell_method' and
'time_rep') imply they are averaged, and the 'bound' attribute of the
'time' variable refers to the 'time_bounds' variable.
Make the metadata of the initial condition more correct by:
- not writing the 'time_bounds' variable (and the corresponding 'd2' dimension)
- not writing the 'bounds' attribute of the 'time' variable
- not writing the 'cell_method' attributes of each variable
- writing the 'time_rep' attribute of each variable as 'instantaneous'
instead of 'averaged'.
Do this by checking 'write_ic' at all places where we check for the
value of 'hist_avg' to write the above variables and attributes in each
of the 3 IO backends (binary, netcdf, pio2).
* drivers/{nemo_concepts,standalone}: write initial condition at initial time
In CICE_InitMod::cice_init, we call ice_calendar::advance_timestep
before writing the initial condition, such that the 'time' variable in
the initial condition is not zero; it has a value of 1*dt (the model
time step). The initial condition filename also reflects this, since
'msec' (model seconds) also has a value of 1*dt and is used in
ice_history_shared::construct_filename. This leads to the initial
condition filename not corresponding to the model initialization
date/time but rather 1*dt later.
Since we call 'accum_hist' after initializing the forcing, any forcing
field written to the initial condition has values corresponding to
msec=dt, whereas the ice state corresponds to msec=0, leading to an
inconsistency.
Fix that by calling 'accum_hist' to write the initial condition _before_
calling 'advance_timestep'. Since we now call 'accum_hist' before
initializing the forcing, any forcing field written to the initial
condition have its default, hard-coded value, instead of its value at
time=dt. An improvement would be to read the forcing at time=dt, write
the initial condition, advance the time step, and read the forcing
again, but let's not complicate things too much for now.
(cherry picked from commit 9a55ad9)
Cherry-pick notes:
There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due
to 26d917a (Fix QC test, fix bug in history time axis, fix history
output averaging for timestep output (CICE-Consortium#624), 2021-08-19), since that
commit enabled averaging over multiple time steps and thus removed the
assumption that histfreq(ns) = '1' means hist_avg = .false.. I also had
to add additional changes to 'ice_history_write' in the conditions that
are checked before writing the NetCDF attributes since we do not yet
have the 'ice_write_hist_attrs' subroutine added in 078aab4 (Merge
cgridDEV branch including C grid implementation and other fixes (CICE-Consortium#715),
2022-05-10).
Rebase onto CICE6.4.1 notes:
There were some conflicts in io_{netcdf,pio2}/ice_history_write.F90 due
to the same commits mentioned above. They are both included in
CICE6.4.1, but the original cherry-picked commit (9a55ad9 (cicecore:
correct initial condition metadata (CICE-Consortium#818), 2023-03-13)) is not.
The correct resolution was thus to:
- keep ours version for calls to ice_write_hist_attrs
- cherry-pick from 9a55ad9 the changes to ice_write_hist_attrs
- in io_pio2, change 'if (hist_avg)' to 'if (hist_avg .and. .not.
write_ic)'
PR checklist
Fix QC test, fix bug in history time axis, fix history output averaging for timestep output
apcraig
Tests pass bit-for-bit on cheyenne, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#7b5c2b404addb356c75d87f7cb4ae2fa59bfc20b
Fix history features
Fix QC issues
Fix unit tests and testing