Tech doc updates (for master)#209
Conversation
…to reflect: a) The gmtb-scm code repository is now public. b) The EMC NEMS repository is no longer in VLab as it has been moved to GitHub in August 2019. GMTB is forking this repository and no longer mirrorring it. c) Given that the CCPP-enabling capabilities in the FV3, NEMS, and NEMSfv3gfs repositories were committed to EMC's master in July 2019, the primary branch for CCPP development in the GitHub.com/NCAR fork or mirrors is no longer a special GMTB area (gmtb/ccpp) but is now the primary code (master for NEMSfv3gfs and FV3 and develop for NEMS). The difference in terminogy (master versus develop) is due to the transition of EMC's codes to GitHub and use of GitFlow (NEMS is now on GitHub using GitFlow, so develop is the proper origin/upstream). d) A clarification on Section 7.4.1 (Creating a PR) asking developers to, when relevant, note in the PR message the existence of associated PRs in separate repositories.
… reflect changes in the NEMSfv3gfs regression tests implemented when the CCPP-enabling changes to FV3, NEMS, and NEMSfv3gfs were committed to EMC master in July 2019.
Codecov Report
@@ Coverage Diff @@
## master #209 +/- ##
=======================================
Coverage 47.28% 47.28%
=======================================
Files 14 14
Lines 1343 1343
=======================================
Hits 635 635
Misses 708 708Continue to review full report at Codecov.
|
gold2718
left a comment
There was a problem hiding this comment.
At least one typo plus some suggestions and questions.
| ^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, compares the results from the non-CCPP code to the *official baseline* and is called ``rt.conf``. Before running the RT script ``rt.sh`` in the same directory, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. | ||
| The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master is called ``rt.conf`` and runs four types of configurations: IPD PROD, IPD REPRO, CCPP PROD, and CCPP REPRO. For the IPD configurations, CCPP is not used, that is, the code is compiled with ``CCPP=N``. The PROD configurations use the compiler flags used in NCEP operations for for superior performance, while the REPRO configurations remove certain compiler flags to create b4b identical results between CCPP and IPD configurations. Before running the RT script ``rt.sh`` in directory ``./tests``, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. |
There was a problem hiding this comment.
"operations for for superior" ==> "operations for superior"
"location for the RT run directories underneath which directories" really needs a comma
==> "location for the RT run directories, underneath which directories"
I do not see where I would modify the script -- it looks like all environment variables. Did I miss something?
There was a problem hiding this comment.
The mods look fine to me. Good catch with the typo and comma.
There was a problem hiding this comment.
I will address the typo and comma. Variables RUNDIR_ROOT and NEMS_COMPILER can be set in your shell or entered in the script. Variable ACCNR needs to be set in the script. So, I think this part of the text is okay.
There was a problem hiding this comment.
No, export ACCNR=gmtb and then running the script works fine.
| ^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* of non-CCPP runs on *Jet* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Theia, Jet, Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date. | ||
| Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* on *Theia* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Jet*, *Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date. |
There was a problem hiding this comment.
are not guaranteed to be up-to-date
For each GMTB-stored baseline, it is clear which code version was tested? If so, I would state this (along with how to determine the tested code revision).
There was a problem hiding this comment.
Regression test baselines sit in directories .../trunk-YYYYMMDD/... - the procedure is to copy the latest baseline directory over from theia (because it also contains the input data and some o the config files) and then update them by running the tests in "create" mode with the version of the code that corresponds to this date tag. Thus, either the directories are there (and the baseline is up to date), or they aren't there at all.
There was a problem hiding this comment.
The connection between the baseline and the version of the code used to create the baseline is done solely by the date in the directory name of the baseline. In order to obtain the code used to create a baseline dated YYYYMMDD, a person would obtain top of the master as of this date.
To make this more clear, I will change text from "Note that yyyymmdd is the year, month and day the RT was created." to "Note that yyyymmdd is the year, month and day the baseline was created using top of master code."
| | https://github.com/NCAR/NEMS | develop | | ||
| +---------------------------------------------+----------------------+ | ||
| | https://github.com/NCAR/FMS | GFS-FMS | | ||
| +---------------------------------------------+----------------------+ |
There was a problem hiding this comment.
I find it unusual that users are pointed to branches rather than tags. What sort of users are targeted here?
Also, aren't most of the repositories checked out as submodules? Those never create branches (and usually do not even clone the full branch or repository).
There was a problem hiding this comment.
Maybe unusual, but equivalent. We are always keeping the branches listed above in sync, i.e. they head of each of them is guaranteed to work with all the others.
There was a problem hiding this comment.
Maybe we should list branches and tags. The branch information is needed if someone is going to create a pull request for a certain branch of NEMSfv3gfs and its submodules.
There was a problem hiding this comment.
But tags will change much more often than the technical documentation!
There was a problem hiding this comment.
This documentation is for top of master. Target audience is model developers, which will likely be modifying one or more submodules of NEMSfv3gfs. Because they will be creating PR of innovations, they cannot work from fixed tags.
All repositories under NEMSfv3gfs are submodules. Those are ccpp-physics, ccpp-framework, FV3, NEMS etc. Developers will be modifying one or more of these submodules.
Note that we also offer a public release of CCPP with the Single-Column Model. In that case, we point to tags because the public release is stable and does not change.
| https://github.com/NCAR/ccpp-physics | ||
|
|
||
| Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however. The SCM repository is private, to request access please send a message and your GitHub username to gmtb-help@ucar.edu. | ||
| Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories and the SCM repositories are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however. |
There was a problem hiding this comment.
Up on line 63, I think "As for NEMSfv3gfs" would be easier to understand as "As with NEMSfv3gfs"
There was a problem hiding this comment.
I will make this change, tks.
| .. code-block:: console | ||
|
|
||
| git clone -b gmtb/ccpp https://github.com/NCAR/NEMSfv3gfs | ||
| git clone -b https://github.com/NCAR/NEMSfv3gfs |
There was a problem hiding this comment.
Does this work? I thought the -b switch required a branch name.
There was a problem hiding this comment.
Yes, I believe you need to specify the branch name.
There was a problem hiding this comment.
I will remove the "-b", so it will read
git clone https://github.com/NCAR/NEMSfv3gfs
| git clone -b https://github.com/NCAR/NEMSfv3gfs | ||
| cd NEMSfv3gfs | ||
| git submodule init | ||
| git submodule update |
There was a problem hiding this comment.
Is there a reason this recipe is used instead of using --recurse-submodules? Something like:
git clone -b master --recurse-submodules https://github.com/NCAR/NEMSfv3gfs
cd NEMSfv3gfs
There was a problem hiding this comment.
I can answer this one. NEMS contains a nested submodule, which at this point still lives on Vlab. The code of this submodule is not required for the "normal" user (only when running the NEMSCompset regression test version, which is different from the regression tests run through rt.sh). Because it is in Vlab, a --recursive leads to a checkout error. Doing it the way Ligia described it avoids this problem.
JulieSchramm
left a comment
There was a problem hiding this comment.
This update of the tech document looks good to me, with the corrections suggested by Steve and Dom.
- Fix typo and comma use - Make a few sentences more clear - Fix bug in git clone command
| ^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, compares the results from the non-CCPP code to the *official baseline* and is called ``rt.conf``. Before running the RT script ``rt.sh`` in the same directory, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. | ||
| The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master is called ``rt.conf`` and runs four types of configurations: IPD PROD, IPD REPRO, CCPP PROD, and CCPP REPRO. For the IPD configurations, CCPP is not used, that is, the code is compiled with ``CCPP=N``. The PROD configurations use the compiler flags used in NCEP operations for superior performance, while the REPRO configurations remove certain compiler flags to create b4b identical results between CCPP and IPD configurations. Before running the RT script ``rt.sh`` in directory ``./tests``, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories, underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. |
There was a problem hiding this comment.
I do not understand why an environment variable needs to be set in the script.
Is there a reason there cannot be a standard bash environment override such as:
ACCNR=${ACCNR:-""}
with a test for a blank ACCNR located before it is used (say around line 555)?
if [ -z "${ACCNR}" ]; then
echo "ERROR: set ACCNR to a valid account key"
fi
There was a problem hiding this comment.
See my comment somewhere else in this thread. export ACCR=gmtb works just as fine, I am doing this all the time.
There was a problem hiding this comment.
See here: https://github.com/NCAR/NEMSfv3gfs/blob/8a1e235d57a13d40e8ba71d3ce174d62dfa4f265/tests/rt.sh#L569 for theia and some other platforms, no ACCNR is set a priori. The reason is that there used to be a script that was supposed to determine the account number automatically (based on the user's groups and certain rules), but this doesn't work anymore since the transition to slurm (and didn't work so well either beforehand), hence no default.
There was a problem hiding this comment.
I believe the script could be modified (around line 555) to override a shell environment variable. However, at this time I am just documenting the script as it is. I will change the text from "the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are..." to "the user has to set one or more environment variables on the working shell or in the script. The environment variables are..."
There was a problem hiding this comment.
Just to clarify the situation.
On the WCOSS platforms (all phases/partitions), the variable ACCNR is set explicitly in the script and the user has to modify the corresponding lines. On all other platforms (cheyenne, theia, jet, gaea), ACCNR is explicitly NOT set in the host environment, and for one of them there is even a comment as to why:
# DO NOT SET AN ACCOUNT EVERYONE IS NOT A MEMBER OF
# USE AN ENVIRONMENT VARIABLE TO SET ACCOUNT
# ACCNR=cmp
This allows the user to do exactly what the comment says.
This is on purpose I think, because only the code managers and others involved in the actual hand-off to NCO are working on these systems. We should probably restrict our documentation to the user development platforms, in which case all variables can and should be set in the form of environment variables.
gold2718
left a comment
There was a problem hiding this comment.
Approving but I believe the user interface could be improved and simplified with a few lines of code in rt.sh.
…ession test script rt.sh.
|
Let me know if the following language is acceptable. If not, please suggest a language. |
climbfuji
left a comment
There was a problem hiding this comment.
Thanks for your patience with my picky comments on rt.sh. The script may not be the most user-friendly, but it was written (not by me!) for a specific purpose and it does serve this purpose quite well.
Approved, waiting to see if Grant wants to add something.
…Test script rt.sh.
| RUNDIR_ROOT=${RUNDIR_ROOT:-${PTMP}/${USER}/FV3_RT}/rt_$$ | ||
|
|
||
|
|
||
| Non-CCPP vs CCPP Tests |
There was a problem hiding this comment.
I liked this explanation of creating your own baseline. Perhaps it is not the job of CCPP documentation to explain to users how to do so, but is this information found anywhere else? I realize that the specifics of non-CCPP vs CCPP in this section is longer valid, but the underlying instructions regarding personal baselines was nice and I'm sad to see it go.
Updates to the CCPP Technical Documentation to reflect changes in the code management and regression tests of CCPP when used with NEMSfv3gfs. No source code was altered, so no regression tests are needed.