Add initial version of ccpp_track_variables.py#419
Conversation
…ing variable against standard names
…sh out argument parsing routine more
…ree for that Suite: in other words, a list of schemes in the order that they are called, including duplicates and subcycle loops
…etadata converted properly (?)
… of schemes for a given SDF, and a dictionary that associates those schemes with their corresponding .meta files. Last step is to simply parse those .meta files for their variables!
…tadataTable objects for each scheme's .meta file; these objects can then be parsed to get all the information we need for the final step!
…subroutine name, and intent. If partial match, output list of partial matches
…le, variable name, and directory with metadata files, and outputs a list of schemes that use the given variable, along with their intent. Will convert into a more "graph-like" output later, along with other cleanup that is needed before review/testing by the group.
…ive a graphical representation of the calling tree Also some more cleanup: - Change default logging level to WARNING to get rid of unnecessary log messages (debug level remains unchanged) - Change more strings to f strings - Remove more leftover debug printouts - Raise exception if create_metadata_filename_dict fails (esp for no .meta files found) - Remove check_var function (might re-introduce in the future)
…ve duplicate calls
|
Thank you for this PR. This is a very helpful new addition to the CCPP. Going forward, we would like to have a visualization capability but having the script is the foundational step. |
|
@ligiabernardet I haven't included any unit or regression tests yet, but I can make that a priority before opening this PR for review. I'll need to familiarize myself with the existing testing framework first |
christinaholtNOAA
left a comment
There was a problem hiding this comment.
I've heard an awful lot about this tool. Thought I'd pop in and see what it was all about. :)
| if not success: | ||
| logging.error(f'Parsing suite definition file {sdf} failed.') | ||
| success = False | ||
| return (success, suite) |
There was a problem hiding this comment.
In this type of error handling situation, it could be useful to use a try/except block to get more information about how the SDF parsing failed instead of only reporting that it didn't work.
This is also a nice example of how one who isn't regularly checking the value of success in the caller of this function could just power through the rest of main with an un-parsable Suite object.
There was a problem hiding this comment.
This is a limitation of how errors are handled in the existing object definitions; since they handle errors through a "success" flag I don't want to try to mess with that, especially since this code will be superseded in the (hopefully) near-future and need updating regardless.
| raise Exception('Call to import_config failed.') | ||
|
|
||
| # Variables defined by the host model | ||
| (success, _, _) = gather_variable_definitions(config['variable_definition_files'], config['typedefs_new_metadata']) |
There was a problem hiding this comment.
A comment about this magic could be helpful. I assume given its name, it will update the config data structure in place, but it's not obvious. You are also opting (I'm assuming intentionally) to not store the output, which throws a wrench in the obvious nature of this call.
There was a problem hiding this comment.
I'll be honest, I'm not 100% sure exactly what this step is doing, but it is necessary for converting metadata from an old format. I've added a comment trying to be as informative as possible.
…ctures from capgen
…ppropriate locations
… and intent(inout)
|
After addressing some initial concerns (still a few to go) and updating for the most recent commits on There are a few known issues/limitations right now, I think they can be addressed later on but I figured I'd point them out off the bat:
|
…from add_argument calls
… remove re-assignment of arguments and just manage them with "args" object
…" function from ccpp_prebuild
|
@christinaholtNOAA I believe I have resolved or responded to all your comments, please let me know if you have any further comments or questions. |
christinaholtNOAA
left a comment
There was a problem hiding this comment.
This looks pretty nice. Just one remaining thing I'm really not sure about.
| raise Exception('Call to import_config failed.') | ||
|
|
||
| # Variables defined by the host model; this call is necessary because it converts some old | ||
| # metadata formats so they can be used later in the script |
There was a problem hiding this comment.
I looked at gather_variable_definitions and it seems to be a function that doesn't act on any input data structures in place, or have any side effects. Perhaps you are calling this only for the data checking, but I really don't think that this call is doing anything to metadata that is used downstream, so the comment seems misleading. To have it convert metadata, I think you'd need to save its outputs in a data structure. Perhaps update one of the config entries? I'm not sure how that would work since I haven't dug into the details of the objects being acted on here.
climbfuji
left a comment
There was a problem hiding this comment.
Nice job, I've got a few questions and suggestions ...
| with that scheme""" | ||
|
|
||
| metadata_dict = {} | ||
| scheme_filenames=glob.glob(os.path.join(metapath, "*.meta")) |
There was a problem hiding this comment.
Is there a reason why you don't get those from ccpp_prebuild_config.py?
There was a problem hiding this comment.
I don't really like using glob, and also the schemes may not all sit in the same directory. For example, the gsl folks have a chemistry fork where the chemistry metadata is in a separate subdirectory chemistry in the ccpp-physics repo.
There was a problem hiding this comment.
Regarding getting the metadata files from ccpp_prebuild_config.py, I was originally hoping I could eliminate the dependency on that file, and allow users to specify whichever metadata path(s) they would like.
Re: glob, is the issue of using glob specifically or that the current design does not allow users to specify multiple directories? If it is the former I will have to re-think how the metadata files are specified, but if it is the latter I can easily put in a loop to allow multiple directories to be specified on the command line. Let me know if you think the second approach is acceptable.
There was a problem hiding this comment.
Re. 1: that's ok, even if the path(s) contain schemes that are not in use by this model (i.e. not in the prebuild config), the suites won't have those schemes.
Re. 2. There is a general aversion to glob, more so for cmake than for other tools, but nonetheless. If it can be avoided, fine, but if you need it than that's ok for me. Yes, you will need a capability to have one or more metadata paths if you don't use the prebuild config. Note though that you put the burden on the user to know where all the files are located that may be used in the suite.
| success = False | ||
| return success | ||
|
|
||
| # Call tree of all schemes in SDF (with duplicates and subcycles) |
There was a problem hiding this comment.
The first 20 lines are identical with the first twenty lines of the parse routine. Can this be combined to avoid code duplication?
There was a problem hiding this comment.
Maybe have an optional (or mandatory) argument create_call_tree for the parse routine, that switches between what is done in the second half of the parse routine?
There was a problem hiding this comment.
This is a good idea. I initially had the naive idea to try to not modify any existing routines so that I wouldn't have to run as many tests, but it makes more sense the way you suggested. I have implemented this change, let me know how it looks
There was a problem hiding this comment.
This looks great, thanks for making the change.
…hod of Suite class with an optional argument to avoid duplicating existing code while not affecting existing calls to parse method.
|
My main comments have been addressed, so I pass the baton to @gold2718 ... |
gold2718
left a comment
There was a problem hiding this comment.
I have a couple of questions that would help me understand the code but I don't see anything that should hold up getting it in. Sorry this took so long.
| the name of the scheme and the intent of the variable within that scheme""" | ||
|
|
||
| # Create a list of tuples that will hold the in/out information for each scheme | ||
| var_graph=[] |
There was a problem hiding this comment.
Is there a pattern where you use spaces around the = symbol and where you omit them?
There was a problem hiding this comment.
Just regular sloppiness :) I've standardized this to use spaces except when in keyword arguments; I believe that's consistent with PEP8
|
|
||
| # Loop through call tree, find matching filename for scheme via dictionary schemes_in_files, | ||
| # then parse that metadata file to find variable info | ||
| partial_matches = {} |
There was a problem hiding this comment.
Do you have examples of partial matches? How does tracking them help?
There was a problem hiding this comment.
This case is for when a user inputs something like "latent_heat" as their variable, which matches multiple standard names (e.g. latent_heat_of_vaporization_of_water_at_0c, surface_upward_potential_latent_heat_flux, etc.). This makes it easier for users who might not know the exact standard name of the variable they are looking for, or for something like, for example, any variable containing the word "temperature".
The final section in my PR message ("Partial match for variable") gives an example of this.
|
@mkavulich Would you please update this to the latest main in anticipation of merging? |
Initial version of script for tracking which variables are modified by which schemes in a given suite.
This script takes in as input the following arguments:
The script output changes based on the input:
This is a brand new script, so it does not change any existing interfaces. It does add a new variable and method to the Suite class: call_tree is a list of schemes in order (including duplicates) for the given Suite. The method make_call_tree is used to populate the call_tree list; it is not called by the write() method so must be called separately if desired.
Please provide feedback on the format and structure of the program!
I started with roughly the ccpp_prebuild script as a guide, but I am not tied to any particular style or philosophy here.
Testing:
test removed: None
unit tests: None
system tests: None
manual testing: Ran script with a variety of inputs, gives expected output. Examples below:
Running examples
No new python packages or other prerequisites are needed, so you can run on any platform that CCPP can already run on. It is also not necessary to build any code or run prebuild scripts; all you need is the ccpp-framework and ccpp-physics repositories, a model config file (this requirement can hopefully be removed in the future), and the xmls for the suites you wish to analyze.
Here is an example setup based on the ufs-weather-model on Hera:
From here, you can run the following example commands:
Successful output: prints list of schemes that use the specified variable, along with the variable's intent
Unknown variable: script exits with descriptive error message
Partial match for variable: outputs list of partial matches for each scheme