Feature/memcheck #848

Merged
MatthewMasarik-NOAA merged 17 commits into NOAA-EMC:develop from DeniseWorthen:feature/memcheck on Dec 9, 2022

@DeniseWorthen
Contributor

@DeniseWorthen DeniseWorthen commented Nov 6, 2022

Pull Request Summary

Replaces W3_MEMCHECK ifdef and associated lines with utility routine.

Description

  • Replaces existing memcheck ifdef code of the type
#ifdef W3_MEMCHECK
      write(40000+IAPROC,*) 'memcheck_____:', 'WW3_WAVE'
      call getMallocInfo(mallinfos)
      call printMallInfo(IAPROC+40000,mallInfos)
#endif

with

      call print_memcheck(memunit, 'memcheck_____:'//' WW3_WAVE')

where the print_memcheck routine is added to w3servmd.F90. The variable memunit is assigned locally.

  • Lengthens some short lines (i.e., lines of fewer than ~50 characters) that were found incidentally
  • Uses implicit none at the module level for files which were edited.
  • Removes trailing whitespace accidentally committed in PR #841 ("doxygen documentation: 6th subset")

Issue(s) addressed

Related issue #800 and Discussion #551.

Commit Message

Check list

Testing

The changes were tested in UWM and all tests were b4b. The changes were also tested in UWM by turning on the MEMCHECK switch, running on a small number of tasks (5) and for only 2 hours. The same files were generated by the existing MEMCHECK code as well as the changes in this branch. The values of VM and RSS were not identical. The VM values across files were generally <1% different between existing code and these changes. The values of RSS were also not identical (larger differences were seen). To investigate, the current code was built and run twice and similar variation in RSS values was seen.

  • How were these changes tested?
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?)
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)?
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.)
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):

@DeniseWorthen DeniseWorthen marked this pull request as ready for review November 14, 2022 18:59
@MatthewMasarik-NOAA
Contributor

@DeniseWorthen thank you for your very nice work streamlining this important utility for memory monitoring. I'm working on reviewing it now.

@MatthewMasarik-NOAA
Contributor

Good morning @DeniseWorthen, @JessicaMeixner-NOAA and I have done an initial run of the regression tests and we get some errors popping up in ww3_shel/ww3_multi. They seem to be related to the use of IAPROC in the assignment to memunit, where they are now not hidden behind a MEMCHECK #ifdef. Is that something you can look into? We can help, but have some items we're currently working on, so would need to circle back. In the meantime we can help with running regression tests.

@aliabdolali
Contributor

@MatthewMasarik-NOAA Do we have any regtest to check the MEMCHECK switch?
I'd recommend adding @aronroland and @MathieuDutSik who wrote this part of the code as reviewers.

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA Can you point me to a run directory where the errors are being reported?

@MatthewMasarik-NOAA
Contributor

> @MatthewMasarik-NOAA Can you point me to a run directory where the errors are being reported?

@DeniseWorthen sure. On hera, /scratch1/NCEPDEV/climate/Matthew.Masarik/projs/PRs/pr_848/ww3-fix/regtests is one of my test directories. The logs are in the matrix0?.out files for the various regression tests. I believe all (except matrix13.out) have errors. I started with matrix01.out; the first error is what I described.

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA Does IAPROC have a value yet? It doesn't look like it is assigned until after the call to MPI_COMM_RANK (if MPI) or after the first call to print_memcheck (if SHRD). A quick test would be to comment out the first print_memcheck to see if that is the issue.

@JessicaMeixner-NOAA
Collaborator

@DeniseWorthen I tried this:

diff --git a/model/src/ww3_shel.F90 b/model/src/ww3_shel.F90
index 99a1821d..47dda1fe 100644
--- a/model/src/ww3_shel.F90
+++ b/model/src/ww3_shel.F90
@@ -432,7 +432,6 @@ PROGRAM W3SHEL
   !--- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   ! 0.  Set up data structures
   !
-  memunit = 740+IAPROC
 #ifdef W3_OASIS
   OASISED=1
 #endif
@@ -451,13 +450,13 @@ PROGRAM W3SHEL
   CALL W3SETA ( 1, 6, 6 )
   CALL W3SETO ( 1, 6, 6 )
   CALL W3SETI ( 1, 6, 6 )
-
-  call print_memcheck(memunit, 'memcheck_____:'//' WW3_SHEL SECTION 1')
   !
 #ifdef W3_SHRD
   NAPROC = 1
   IAPROC = 1
 #endif
+  memunit = 740+IAPROC
+  call print_memcheck(memunit, 'memcheck_____:'//' WW3_SHEL SECTION 1')
   !
 #ifdef W3_OMPH

And I get past the ww3_shel error Matt reported, and then start getting one in ww3_multi/w3init and I think it's the same type of issue that IAPROC is being used before being set. Could memunit just be set in the print_memcheck routine? IAPROC should be a module variable you could add to the routine. Otherwise, I think it's just a matter of changing where memunit is defined and/or where the early print_memcheck routines are being called from. There isn't a regtest using this feature, so we haven't run into this issue before because it's simply always behind a switch. Honestly not sure how some of the calls worked before given some of these errors.

@MatthewMasarik-NOAA
Contributor

@DeniseWorthen, yes @JessicaMeixner-NOAA effectively did that by moving that first call past the initializations of NAPROC/IAPROC in the SHRD ifdef, and that does take care of the first error. Others then come up in multi. @JessicaMeixner-NOAA may be able to point you to that run she did.

@MatthewMasarik-NOAA
Contributor

@aliabdolali , yes it would be great if @aronroland and @MathieuDutSik would like to review/comment here since they are the original authors.

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA @JessicaMeixner-NOAA I found a similar issue in the yowpdlibmain, where myrank is not assigned until after the call to initMPI. In that case though, myrank=0 so did not cause an issue. I'm not sure why that doesn't seem to be the case w/ iaproc. What value does it have prior to initialization?

The problem w/ assigning memunit in print_memcheck itself is that there are multiple choices for where the file is written. memunit could be converted back to the hard-coded values call print_memcheck(740+IAPROC....) but if IAPROC not initialized is the problem, I don't think that would help.

@DeniseWorthen
Contributor Author

From testing w/in the meshcap, it appears that after the call to w3seto, iaproc is safely initialized to 1. You could try the following change. If it works, only the fort.741 file will have the WW3_SHEL SECTION 1 result though.

diff --git a/model/src/ww3_shel.F90 b/model/src/ww3_shel.F90
index 99a1821d..8898539c 100644
--- a/model/src/ww3_shel.F90
+++ b/model/src/ww3_shel.F90
@@ -432,7 +432,6 @@ PROGRAM W3SHEL
   !--- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   ! 0.  Set up data structures
   !
-  memunit = 740+IAPROC
 #ifdef W3_OASIS
   OASISED=1
 #endif
@@ -452,6 +451,7 @@ PROGRAM W3SHEL
   CALL W3SETO ( 1, 6, 6 )
   CALL W3SETI ( 1, 6, 6 )

+  memunit = 740+IAPROC
   call print_memcheck(memunit, 'memcheck_____:'//' WW3_SHEL SECTION 1')
   !
 #ifdef W3_SHRD

@JessicaMeixner-NOAA
Collaborator

JessicaMeixner-NOAA commented Nov 17, 2022

@DeniseWorthen I think we have to have this change to move to this new paradigm. @aronroland might be able to chime in with how he uses this, if it breaks the functionality he needs, but we can't run the code without some sort of change. Also we have an issue in w3init where https://github.com/DeniseWorthen/WW3/blob/feature/memcheck/model/src/w3initmd.F90#L522 we have memunit = 10000+IAPROC but IAPROC is not necessarily set until https://github.com/DeniseWorthen/WW3/blob/feature/memcheck/model/src/w3initmd.F90#L558

@DeniseWorthen
Contributor Author

Does the existing code work if W3_MEMCHECK is used in the regression tests?

@JessicaMeixner-NOAA
Collaborator

There is not a regression test that tests MEMCHECK and it's not part of our WW3 regression testing as of right now.

@DeniseWorthen
Contributor Author

Understood. I meant if W3_MEMCHECK is enabled, do any of the failing tests (w/ this branch) run successfully?

@JessicaMeixner-NOAA
Collaborator

@DeniseWorthen I have not tried that. I just tried a few tests to see if that fix might work.

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA @JessicaMeixner-NOAA I've made an attempt to run a regression test from the develop branch with MEMCHECK enabled

/scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_RT/w3rt/WW3/regtests/ww3_tp2.1/work_PR1_MPI/

This job fails because MEMCHECK is trying to open a file (fort.5) that is readonly. The printMallInfo should be sent to 740+IAPROC, not IAPROC. I can continue to try to work through the failures, but I don't believe the current develop branch would pass the RTs with MEMCHECK enabled.

#ifdef W3_MEMCHECK
  write(740+IAPROC,*) 'memcheck_____:', 'WW3_SHEL SECTION 1'
  call getMallocInfo(mallinfos)
  call printMallInfo(IAPROC,mallInfos)
#endif

@JessicaMeixner-NOAA
Collaborator

@DeniseWorthen Thanks for reporting this. This doesn't seem super surprising given that MEMCHECK is just used in certain cases and is not included in any regression testing. Right now the biggest barrier to this PR being merged is that it's breaking the WW3 regression tests without MEMCHECK on. @aronroland or the other developers of MEMCHECK might be able to address the failure you reported.

@MathieuDutSik
Contributor

Hello, I was asked to comment on this:

  • Yes, the MEMCHECK was not working before, I knew this but wanted to implement a cleaner solution but still using CPP pragmas. But, ok I understand different solutions will be implemented.
  • No, I did not write the MEMCHECK. It was done by Aron Roland.
  • Yes, it is important to have this functionality. Memory problems frequently happen.
  • Yes, it is important to have regression checks that use this switch.

Finally, let me comment that I regard the elimination of implicit none statement as purely criminal and something even worse than the elimination of ifdef statements. I do not see what we will get out of this except strange bugs.

@MatthewMasarik-NOAA
Contributor

> Finally, let me comment that I regard the elimination of implicit none statement as purely criminal and something even worse than the elimination of ifdef statements. I do not see what we will get out of this except strange bugs.

@MathieuDutSik, there may be a misunderstanding here. The changes regarding implicit none are to move them to be at the module level (vs. for each subroutine), and not removing them altogether.

@MathieuDutSik
Contributor

> @MathieuDutSik, there may be a misunderstanding here. The changes regarding implicit none are to move them to be at the module level (vs. for each subroutine), and not removing them altogether.

Ok, but why mix those two issues in the same PR? Jessica indicated to us that PRs have to be atomic. And in particular, why limit the elimination of implicit none to this file?

@JessicaMeixner-NOAA
Collaborator

@MathieuDutSik We do prefer PRs to be as single-focused as possible. Most importantly, we are a community code, and I appreciate your acceptance that we might not always agree on the best path forward but that we are all working towards the same goal of improving WW3. We ask everyone to be respectful of others' ideas, opinions and contributions. Please email me privately with any follow-up questions/comments/issues with this PR. I know I personally am very grateful for @DeniseWorthen's contributions here!

* iaproc must be initialized and/or set for use in memunit

to 1) and then again after
@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA I believe this latest commit will resolve the issue w/ the regtest failing even when memcheck is not set. I'm not experienced w/ running the regtests for WW3, but I think I did it correctly. Note that the fix will still mean that some memcheck calls will write results only to the first file (ie, 740+1) because they come prior to obtaining (or resetting) iaproc with a call to MPI_COMM_RANK.
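
The ordering described above can be illustrated with a minimal, self-contained sketch. It is not taken from the commit: the program name, the `membase` constant, and the hard-coded rank are stand-ins chosen for illustration, and the MPI_COMM_RANK call is replaced by a plain assignment so the example runs without MPI.

```fortran
program memunit_order
  implicit none
  ! Hypothetical stand-in for the 740 offset used in ww3_shel
  integer, parameter :: membase = 740
  integer :: iaproc, memunit

  ! Give iaproc a safe default before the real rank is known, so that
  ! memunit never depends on an undefined value.
  iaproc  = 1
  memunit = membase + iaproc
  print *, 'early memcheck calls use unit', memunit

  ! Stand-in for the point where the rank is obtained via MPI_COMM_RANK
  ! (WW3 uses a 1-based IAPROC); here rank 3 is simply assumed.
  iaproc  = 3
  memunit = membase + iaproc
  print *, 'later memcheck calls use unit', memunit
end program memunit_order
```

With this ordering, every task writes its early memcheck output to the same first unit, and per-task units only become distinct once the rank has been assigned, which matches the caveat noted in the comment above.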

@MatthewMasarik-NOAA
Contributor

Thanks for your continued work on this @DeniseWorthen. I'll run the regtests to confirm the effects of this update.

@MatthewMasarik-NOAA
Contributor

MatthewMasarik-NOAA commented Nov 21, 2022

@DeniseWorthen I tried running the updated branch but was still getting seg faults in the same areas, ww3_multi / W3INIT, seemingly still due to the state of IAPROC. I then tried putting the W3_SHRD and W3_MPI blocks (where assignment of IAPROC happens in W3INIT) up above W3SETO and the other related routines, though this didn't help. After that, looking back at ww3_multi, I saw that MPI_COMM_RANK is called from there (using IMPROC/NMPROC) prior to the W3INIT call. So I USEed those two variables from WMMDATMD in W3INIT. This got me past a seg fault when not using MEMCHECK. You can see the hack I did at: DeniseWorthen:feature/memcheck v. MatthewMasarik-NOAA:feature/memcheck. This is as far as I got for the moment, so I have not checked with MEMCHECK on to verify logs are being written as desired. I think it's in the right direction though, so I wanted to post.

@MathieuDutSik
Contributor

For the record, the solution that I would have liked to implement is the following:

(A) The first thing would be in w3servmd, the following code (that is without ifdef)
subroutine print_memcheck(iun, msg)
USE MallocInfo_m
integer , intent(in) :: iun
character(len=*) , intent(in) :: msg

! local variables
type(MallInfo_t)        :: mallinfos

write(iun,*) trim(msg)
call getMallocInfo(mallinfos)
call printMallInfo(iun, mallInfos)

end subroutine print_memcheck

(B) The second thing would be in w3macros.h
#ifdef W3_MEMCHECK
#define MACRO_PRINT_MEMCHECK(a, b) call print_memcheck(a, b)
#else
#define MACRO_PRINT_MEMCHECK(a, b)
#endif

(C) The function calls then become instead of
call print_memcheck(memunit, 'memcheck_____:'//' WW3_PDLIB SECTION 14')
the following:
MACRO_PRINT_MEMCHECK(memunit, 'memcheck_____:'//' WW3_PDLIB SECTION 14')

The advantages I see are the following:

  1. The function code is called only if needed. If the W3_MEMCHECK is not selected then no function calls occur.
  2. It is simple to write down in the code, no clumsy ifdef W3_MEMCHECK followed by 5 lines of code.
  3. The prefix MACRO_ clearly indicates the nature of the code.

Of course, you are free to do as you want, but I just wanted to point out that alternative solutions exist.

@aronroland
Collaborator

Hi All,

sorry for being absent for a few days. I just read through all the comments and yes, this is definitely a good way to proceed, wrapping it the way it has been done now. I have optimized the location and the number of the memcheck parts in my latest branch. I never committed it since there are some open questions.

One of them you found, e.g., the undefined IAPROC. If it was not defined, the output was written to STDOUT; I used it at that time for looking at the memory and it was good enough. Another part, which I did not like, is that everything is written to the 740+IAPROC files. I started to subdivide it for my purposes into the IAPROC+10000, 20000, 30000, 40000 files for each of the parts in WW3. Now this is also not optimal. We need some brainstorming on the question of where to write what, and to define file handles that are dedicated to memcheck. As for now and your merge, I would just delete the memcheck part that is not properly initialized. I suggest that we discuss the above issues and I can then streamline my changes ...

Denise, Jessica I like to thank u both for bringing this forward and cleaning the code. Mathieu's suggestion is really great, thanks for pointing this out.

So what would be now from your side the favorable way to proceed?

Aron

@DeniseWorthen
Contributor Author

DeniseWorthen commented Nov 25, 2022

I think one issue I see w/ the macro option is that each file that implements a memcheck call would need to have that macro defined, correct? It also doesn't align with a path forward of removing ifdefs when possible.

Yes, the feature/memcheck branch still requires the use of W3_MEMCHECK ifdefs but those can be removed as the code moves towards implementation of namelist settings. For example, a logical flag, set via namelist, can be passed to the print_memcheck routine; when this flag is false, the routine simply returns w/o doing anything. I didn't make that change here, although I probably should have explained that would be a further required step.
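
As a rough illustration of the namelist-flag idea described above, the sketch below passes a logical into a simplified print routine that returns immediately when the flag is false. Everything specific here is an assumption made for the example: the module and routine names, the `memcheck_nml` group, and the `do_memcheck` variable are hypothetical, and the MallocInfo calls are replaced by a comment so the program builds without WW3.

```fortran
module memcheck_sketch
  implicit none
  private
  public :: print_memcheck_flagged
contains
  subroutine print_memcheck_flagged(iun, msg, do_memcheck)
    integer,          intent(in) :: iun          ! unit to write to
    character(len=*), intent(in) :: msg          ! label for this checkpoint
    logical,          intent(in) :: do_memcheck  ! hypothetical run-time switch

    ! When the switch is off, return without doing any work
    if (.not. do_memcheck) return

    write(iun,*) trim(msg)
    ! In WW3 the getMallocInfo/printMallInfo calls would follow here
  end subroutine print_memcheck_flagged
end module memcheck_sketch

program memcheck_flag_demo
  use, intrinsic :: iso_fortran_env, only: output_unit
  use memcheck_sketch, only: print_memcheck_flagged
  implicit none
  logical           :: do_memcheck
  integer           :: ios
  character(len=64) :: nml_text
  namelist /memcheck_nml/ do_memcheck

  ! Default is off; the namelist may turn it on
  do_memcheck = .false.

  ! An internal file stands in for the model's namelist input file
  nml_text = '&memcheck_nml do_memcheck = .true. /'
  read(nml_text, nml=memcheck_nml, iostat=ios)
  if (ios /= 0) error stop 'could not read memcheck_nml'

  call print_memcheck_flagged(output_unit, 'memcheck_____:'//' DEMO SECTION 1', do_memcheck)
end program memcheck_flag_demo
```

With a switch of this kind the call sites would no longer need to be wrapped in W3_MEMCHECK ifdefs, at the cost of one subroutine call and one logical test per checkpoint when the feature is off.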

@MathieuDutSik
Contributor

> I think one issue I see w/ the macro option is that each file that implements a memcheck call would need to have that macro defined, correct? It also doesn't align with a path forward of removing ifdefs when possible.

Dear Denise,
thank you for replying to my proposal. You are correct in saying that it is contrary to the elimination of ifdef statements. I understand that the path that had been chosen is a complete elimination of ifdef statements.

I only wanted here to explain that there is another solution and that ifdef-based code is not necessarily awful.

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA I was able to run these two tests

./bin/run_test -b slurm -o all -c hera.intel -S -T -s ST4_PR3_UQ_MPI -w work_ST4_PR3_UQ_MPI -m grdset_a -f -p srun -n 24 ../model4 mww3_test_08
./bin/run_test -b slurm -o all -c hera.intel -S -T -i i_lowres_multi -w work_lowres         -m grdset_a -f -p srun -n 24 ../model4 mww3_test_08

in debug mode with commit dd80c1d. Can you re-try your failed cases now?

@MatthewMasarik-NOAA
Contributor

MatthewMasarik-NOAA commented Nov 28, 2022

Hi All, there's been some posts over the holiday weekend, I'd like to address each. It is nice to have the input from @aronroland who wrote the code initially and @MathieuDutSik who is also involved.

@MathieuDutSik I think your solution is a clever macro addition to Denise's function call (and takes the #ifdef out of the function, and puts it around the macro). I'm going to ask that we would hold off using it at this time though, due to the points Denise has made: 1. w3macros.h would need to be included in all files, 2. it still uses the W3_MEMCHECK #ifdefs (the function currently still uses the W3_MEMCHECK #ifdefs as well, though they could eventually be removed, whereas the macro relies on pre-processing for its functionality). That said, Mathieu's suggestion could potentially help with speed. We will be checking the runtimes before and after, and if we find speed is an area needing to be addressed, I'm open to considering it again. For the time being we should continue with the direction Denise is working on.

@aronroland I'm curious about the unit indexing you mentioned: IAPROC + 10000, 20000, etc. Did you find it useful to separate the output based on section, or should we consolidate and just have one file for each proc?

@MatthewMasarik-NOAA
Contributor

> Can you re-try your failed cases now?

@DeniseWorthen , yes. I'll let you know how they go.

@MathieuDutSik
Contributor

> @MathieuDutSik I think your solution is a clever macro addition to Denise's function call (and takes the #ifdef out of the function, and puts it around the macro). I'm going to ask that we would hold off using it at this time though, due to the points Denise has made: 1. w3macros.h would need to be included in all files, 2. it still uses the W3_MEMCHECK #ifdefs (the function currently still uses the W3_MEMCHECK #ifdefs as well, though they could eventually be removed, whereas the macro relies on pre-processing for its functionality). That said, Mathieu's suggestion could potentially help with speed. We will be checking the runtimes before and after, and if we find speed is an area needing to be addressed, I'm open to considering it again. For the time being we should continue with the direction Denise is working on.

Right now, "w3macros.h" is included in 124 of the 136 Fortran F90 files of the source code, so I think the issue of inclusion is a non-issue.

I guess that, following the chosen path of eliminating CPP pragmas, the "w3macros.h" file will be removed from the source code.

@MatthewMasarik-NOAA
Contributor

@DeniseWorthen the regtests came back with no errors. I'm running develop now to confirm no answers changed.

Did your runs have the memcheck output in the expected IAPROC files?

@DeniseWorthen
Contributor Author

@MatthewMasarik-NOAA I tested this in the mesh cap by turning on memcheck in the structured and pdlib cases for both the original code and this feature branch. The same files were produced in both cases. I also did some basic checking on the numbers I obtained when turning the feature on (see PR for notes I made).

@MatthewMasarik-NOAA
Contributor

Awesome. Thank you for that testing and checking the numbers between the two, @DeniseWorthen.

@MatthewMasarik-NOAA
Contributor

@DeniseWorthen I wanted to give you a status update here. I needed to re-run the regtests as the first time around had some flaky behavior, but now everything looks well. All that remains before acceptance is some timing comparisons. I'm currently compiling timing output from the regtests and am going to submit some larger test cases this afternoon. I see both those being complete by Wed.

Contributor

@MatthewMasarik-NOAA MatthewMasarik-NOAA left a comment

Code review

The code edits contain:

  • definition of new subroutine print_memcheck
  • replacement of existing W3_MEMCHECK ifdef-blocks with calls to the new routine
  • whitespace removal
  • alignment formatting
  • wrapped lines formatting
  • and removal of duplicated code.

The encapsulation of the MEMCHECK functionality in print_memcheck provides a substantial reduction in code (5 lines replaced with 1 for each instance), improving readability.

Timing

The timing for the PR was carefully checked against current develop. The develop commit used for this testing as well as for future testing is: 616bf79. Note, the established acceptable threshold for runtime increase is 5%.

Testing was done in two ways:

  1. comparing regtest output from WW3 run_test using option -T to provide timing information. [performed on: hera/intel]
  2. running larger test cases (in time period, grid resolution) and comparing. [performed on: wcoss2/intel]

Summary of regtest timing

There are a large number of regression tests (661) and they are lightweight by design, with many of the tests running in a matter of seconds, or even fractions of seconds. The performance of longer test cases is our primary concern, though the results from these shorter regtests also provide a useful check for the many different configurations.

  • 96% (635/661) were less than the 5% threshold. In fact, many of the values were closer to ~1%, with 40% of the values actually being negative. This likely hints that in these cases runtime increases are near the level of noise.
  • 4% (26/661) exceeded the 5% threshold. However, most of these were tests lasting ~1sec, with differences in the fractions of seconds. The longest test in this group lasted ~1min with a %change of 7%.

Summary of larger test cases

A 1/4 deg tripolar grid was used to run 4 test cases which varied: length of simulation, resources (MPI tasks, threads), and global timestep/input frequency. These simulations were much longer (15min -- 1hr+), and they all had runtime increases of only a fraction of a %, (~< 0.6%).

GRID: 1/4 deg tripolar

TEST01
  * total tasks:            240
  * threads:                4
  * forcing dt / global dt: 300sec
  * sim length:             1day
  > timing develop:         903sec (~15.05min)
  > timing memcheck:        902sec
  > %change:                -0.1%

TEST02
  * total tasks:            480
  * threads:                2
  * forcing dt / global dt: 300sec
  * sim length:             5days
  > timing develop:         4314sec (~1hr 11min)
  > timing memcheck:        4334sec
  > %change:                0.5%

TEST03
  * total tasks:            480
  * threads:                2
  * forcing dt / global dt: 1800sec
  * sim length:             5days
  > timing develop:         1220sec (~20.33min)
  > timing memcheck:        1224sec
  > %change:                0.3%

TEST04
  * total tasks:            240
  * threads:                4
  * forcing dt / global dt: 1800sec
  * sim length:             10days
  > timing develop:         2584sec (~43.07min)
  > timing memcheck:        2599sec
  > %change:                0.6%
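
For reference, the %change figures above can be recomputed from the reported runtimes. The short program below is only a sketch: it assumes %change is defined as 100 * (memcheck - develop) / develop, which is consistent with the reported values but is not stated explicitly in the review, and the program and variable names are chosen for illustration.

```fortran
program timing_change
  implicit none
  character(len=6), parameter :: names(4) = ['TEST01', 'TEST02', 'TEST03', 'TEST04']
  ! Runtimes in seconds, copied from the four test summaries above
  real, parameter :: t_develop(4)  = [903.0, 4314.0, 1220.0, 2584.0]
  real, parameter :: t_memcheck(4) = [902.0, 4334.0, 1224.0, 2599.0]
  ! Acceptable runtime increase quoted in the review, in percent
  real, parameter :: threshold = 5.0
  real    :: pct
  integer :: i

  do i = 1, size(names)
    ! Assumed definition: relative runtime change with respect to develop
    pct = 100.0 * (t_memcheck(i) - t_develop(i)) / t_develop(i)
    ! Columns: test name, percent change, whether it is within the threshold
    write(*, '(a,2x,f5.1,a,2x,l1)') names(i), pct, '%', pct <= threshold
  end do
end program timing_change
```

Rounded to one decimal place, this gives -0.1%, 0.5%, 0.3% and 0.6% for the four tests, all far below the 5% threshold.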

Regression Tests

The regtest matrix was run and compared with current develop. The tests all passed, with exception of only the expected non-b4b tests. These are shown in output of the matrixCompSummary.txt:

**********************************************************************      
********************* non-identical cases ****************************      
**********************************************************************      
mww3_test_03/./work_PR1_MPI_e                     (1 files differ)          
mww3_test_03/./work_PR3_UQ_MPI_e_c                     (1 files differ)     
mww3_test_03/./work_PR3_UNO_MPI_e                     (1 files differ)      
mww3_test_03/./work_PR2_UNO_MPI_e                     (1 files differ)      
mww3_test_03/./work_PR2_UNO_MPI_d2                     (8 files differ)     
mww3_test_03/./work_PR1_MPI_d2                     (6 files differ)         
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (11 files differ)  
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (15 files differ)   
mww3_test_03/./work_PR3_UNO_MPI_d2                     (15 files differ)    
mww3_test_03/./work_PR2_UQ_MPI_d2                     (16 files differ)     
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)       
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)    
mww3_test_03/./work_PR3_UQ_MPI_d2                     (15 files differ)     
ww3_ta1/./work_UPD0F_U                     (0 files differ)                 
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)             
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)             
ww3_tp2.6/./work_ST0                     (1 files differ)                   
ww3_tp2.6/./work_ST4                     (1 files differ)                   
ww3_tp2.6/./work_pdlib                     (1 files differ)                 
ww3_ts4/./work_ug_MPI                     (1 files differ)                  
ww3_ufs1.3/./work_a                     (3 files differ)                    
                                                                            
**********************************************************************      
************************ identical cases *****************************      
**********************************************************************

The full output files are attached:

Conclusion

  • Code review - Improves code readability by encapsulating MEMCHECK functionality in a new routine. Additional edits pertain to related code clean up (whitespace, alignment, etc). PASS.
  • Timing - Regtest timing showed 96% were under the 5% runtime increase threshold, with the remaining 4% being mostly simulations lasting ~1sec, and the longest only ~1min. Results from the larger test cases (1/4 deg tripolar), having much longer runtimes (15min -- 1hr+), showed that in each case the runtime increase was ~< 0.6%. These larger test cases are the most important here and the runtime increases are consistently well below the 5% increase threshold. PASS.
  • Regtests - All tests passed with only the expected non-b4b tests as exceptions. PASS.

Based on these considerations I approve this PR.

@MatthewMasarik-NOAA
Contributor

Please let me know of any questions or concerns on the timing results.

@MatthewMasarik-NOAA
Contributor

Thank you @DeniseWorthen for your hard work. The general code clean up, and encapsulation of MallocInfo subroutine calls are great steps in improving code readability. Thank you also to @aronroland and @MathieuDutSik for giving your perspective (and who are the original authors of the MallocInfo module, which is very useful for tracking memory usage in WW3).

@MatthewMasarik-NOAA MatthewMasarik-NOAA merged commit bcde1d0 into NOAA-EMC:develop Dec 9, 2022

6 participants