fix for problems in ctests using debug mode of GSI release/rrfs1.0.0 #862
|
Comparing this PR, built in release mode, against the rrfs1.0.0 results, both GSI::gsi and GSI::enkf produce identical results. However, this PR is significantly faster for GSI::gsi and slower for GSI::enkf. |
|
@TingLei-daprediction, could you give a number for how much slower GSI::enkf is? |
|
In the runs I already have, the PR EnKF took 631 s while the control took 545 s. |
…me issue when built in debug mode
|
@ShunLiu-NOAA @hu5970 To avoid the additional computational time, I have removed the extra mpi_barrier that served as a workaround for the runtime error in debug-mode GSI::enkf. I also optimized the OpenMP directives in the changes to avoid "false sharing". In the two most recent runs of the control and the PR, the PR took less time. Hence, I think this PR is ready for review before merging. |
|
Thank you @TingLei-NOAA. We need another issue/PR for the GSI optimization test. |
|
@ShunLiu-NOAA I have opened issue #879. |
|
@ShunLiu-NOAA The numbers you asked for: now (with the ensemble-writing step skipped), the PR took 505 seconds of wall-clock time while the control (RRFS1.0.0) took 519 seconds. |
|
@ShunLiu-NOAA An update for runs of the complete EnKF (including the ensemble output step): the PR took 590 seconds and the control 561 seconds. They also produced identical results, judging by the min/max of the ensemble increment mean in their standard output. The difference in wall-clock time most likely comes from real-time system conditions. |
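
For reference, the relative differences implied by the wall-clock times quoted in this thread can be worked out as in the sketch below. The run labels are names made up here for illustration; only the numbers come from the comments above.

```python
# Wall-clock times in seconds quoted in this thread, as (PR, control).
# The dictionary keys are hypothetical labels introduced for this sketch.
timings = {
    "enkf_initial": (631, 545),
    "enkf_no_ensemble_write": (505, 519),
    "enkf_complete": (590, 561),
}


def relative_difference(pr_seconds, control_seconds):
    """Return (PR - control) / control as a percentage."""
    return 100.0 * (pr_seconds - control_seconds) / control_seconds


for label, (pr, control) in timings.items():
    diff = relative_difference(pr, control)
    print(f"{label}: PR {pr} s, control {control} s, difference {diff:+.1f}%")
```

A single pair of runs per configuration cannot by itself separate a code effect from system noise, so these percentages are only a rough guide.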
|
Thank you Ting. |
|
Had a couple of minor comments but overall the changes make sense and look good to me! |
GangZhao-NOAA
left a comment
Hi Ting,
The modifications look good to me.
Thank you!
-Gang
|
@TingLei-NOAA , what is the status of this PR? Should we keep it open, merge it, or close it? |
|
@RussTreadon-NOAA This is up to @ShunLiu-NOAA. We have some concerns about the behavior of rrfs1.0.0 (the base of this PR), which was recently found to be slower than expected on WCOSS2. This has delayed merging this PR into rrfs1.0.0. |
|
Thank you, @TingLei-NOAA, for the update. |
|
@RussTreadon-NOAA I am going to test this PR again with the real-time parallel to check whether it is slower than the current version. |
|
Thank you @ShunLiu-NOAA for testing. Taking a step back, does the RRFS GSI we are preparing to send to NCO successfully run the RRFS in debug mode?
|
Yes. RRFS GSI will be sent to NCO. This PR is to ensure that RRFS GSI can run in debug mode. I tested this PR with the RRFS parallel on WCOSS2, and it performs as expected. We will merge the change into rrfs1.0.0. |
This is a draft PR showing a fix for problems when running ctests with a debug-mode build of GSI at release/rrfs1.0.0.
Resolves #860
Now the working ctests for this implementation are
With this PR, the update runs of all ctests (except for global_enkf) successfully ran to completion.
The global_enkf issue (a known issue) has been investigated in #776.
Also, letkf is not used in the current rrfs workflow.
Hence, the next tests will use real cases from the current rrfsv1 to test GSI::gsi and the regional EnKF (GSI::enkf).