Skip to content

Plotting Timeseries in Python#345

Merged
apcraig merged 11 commits into
CICE-Consortium:masterfrom
mattdturner:update_timeseries
Aug 22, 2019
Merged

Plotting Timeseries in Python#345
apcraig merged 11 commits into
CICE-Consortium:masterfrom
mattdturner:update_timeseries

Conversation

@mattdturner
Copy link
Copy Markdown
Contributor

@mattdturner mattdturner commented Aug 8, 2019

PR checklist

  • Short (1 sentence) summary of your PR:
    This PR replaces the previous timeseries plotting script (timeseries.csh) with a new Python-based script.
  • Developer(s):
    @mattdturner
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    N/A since no CICE code is changed
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:
    The Python script is dependent on the Python packages numpy, logging, os, sys, matplotlib, re, datetime, and argparse.
    - Many of these packages should be included by default with Python installs
    - Once this is finalized and approved, I will make the same changes to the Icepack timeseries script.

@mattdturner
Copy link
Copy Markdown
Contributor Author

This PR is not ready to be merged. I still need to update the documentation. However, here are a few figures that were generated by the script using 2 random 1-year CICE runs.

image

image

The grid lines are an optional argument (enabled by adding --grid to the command line). The script is currently configured to create .png files of size 12" x 8", with a dpi value of 300. If there is anything that should be changed or added as a feature, please let me know.

@phil-blain
Copy link
Copy Markdown
Member

This looks very nice!

@mattdturner
Copy link
Copy Markdown
Contributor Author

Ok, documentation has been updated. I still have the timeseries.csh script included in the repo.

Before merging this, are there any other options that I should include in the script? For example,

  1. Specify file type (png, pdf, etc.) from command line?
  2. Specify figure size (default is 12" x 8") from command line?
  3. Specify dpi (default is 300) from command line?
  4. Specify font size from command line? Would I need to allow specification of axis label font size, legend font size, and title font size separately?
  5. Any other feature requests?

Copy link
Copy Markdown
Contributor

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about timeseries.py. Is this a new file? It looks like it's incomplete ending in the middle of an if statement?

@mattdturner
Copy link
Copy Markdown
Contributor Author

It is a new file. I thought, for now, it would be better to keep the original timeseries.csh script and just add the new timeseries.py script as another option.

The end of the file:

if __name__ == "__main__":
    main()

is not incomplete. While most programming and scripting languages use braces { } or end statements to define a block of code, Python uses indentation. All this block of code is doing is saying to only run main if the main script being run is timeseries.py. If the script is accessed via an import statement, then that call to main is not run.

@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 9, 2019

@mattdturner sounds good, thanks for the clarification. Let me know when you think this is ready to merge, I will do a final review and then merge the PR. As it stands, the changes are fine by me.

Copy link
Copy Markdown
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thank you.

@mattdturner
Copy link
Copy Markdown
Contributor Author

This should be ready to be merged. I can always add new features (if requested) at a later time.

@eclare108213
Copy link
Copy Markdown
Contributor

Thank you @mattdturner. I'm fine with this addition, but since I'm not an expert python user (far from it!), I'd like at least one person who has more experience to give this a try on a different system from the one it was designed on, as a sanity check. @phil-blain @rgrumbine @proteanplanet @apcraig ?

@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 13, 2019

I'm checking out the branch now and trying on cheyenne.

@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 13, 2019

I am having some problems on cheyenne. I am going to take the discussion offline to email with @mattdturner to provide some feedback and try to work thru some of the problems.

Copy link
Copy Markdown
Member

@phil-blain phil-blain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the script in different conda environments and it works correctly with matplotlib 1.5.1, 2.2.3, 2.2.4.

With matplotlib 1.3.1 (system install) I obtain an error fro matplotlib :

INFO:__main__:Log file = /home/phb001/data/site2/cice/runs/brooks_intel_smoke_gx1_44x1_medium_qc.qc_base_20190626/cice.runlog.190630-033320
Traceback (most recent call last):
  File "./timeseries.py", line 275, in <module>
    main()
  File "./timeseries.py", line 272, in main
    plot_timeseries(args.log_dir, field, dtg, arctic, antarctic, expon, grid=args.grid)
  File "./timeseries.py", line 138, in plot_timeseries
    anchored_text = AnchoredText(text_field,loc=2)
  File "/usr/lib/pymodules/python2.7/matplotlib/offsetbox.py", line 1102, in __init__
    propkeys = prop.keys()
AttributeError: 'NoneType' object has no attribute 'keys'

I'm not sure if this is actually caused by the matplotlib version since the different environments also have different versions of numpy...
Maybe to be on the safe side we should list the tested packages versions in the documentation.

# Build the fieldlist based on which fields are passed
fieldlist = []
if args.area:
fieldlist.append('total ice area (km^2)')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if the graph title and the ylabel would format the ^2, ^3 as exponents.

…els) using LaTeX format. Additionally, ensure that the output file has a proper name no matter what directory the script is run from.
@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 19, 2019

I tested the latest and it's working for me on cheyenne. I had to be careful because the system was loading several different python versions to my env, but pip went with only one of them, so I had to make sure I was using a self-consistent setup. I don't seem to have much luck getting python to work. Just a couple other comments which I also made offline,

  • the csh script does not yet have the updated filenames. If we are keeping the csh and python scripts, I think both should use the update png filenames.
  • we should think about having just 1 script for plotting. maybe we can get rid of the csh script now or maybe later, but we should be working towards having just one.
  • do these scripts work on multiple output files? for instance, a long run with restarts in between? is one set of plots for the entire run created?
  • we might want to rethink how the scripts are used/accessed. it might make sense to drop them into each case directory. are these scripts going to be used by the community quite a bit or do we expect them to be used just by the consortium as "our" quick look tool?
  • we might want to update the documentation as bit. the documentation is under testing right now. it also works for just quickly making plots of any case.

I don't know whether the above list should be fixed now or deferred or a bit of both. I think it would be great to make another set of changes if we think there are some additional improvements that could be made.

@mattdturner
Copy link
Copy Markdown
Contributor Author

  • the csh script does not yet have the updated filenames. If we are keeping the csh and python scripts, I think both should use the update png filenames.

I'll implement this today.

  • we should think about having just 1 script for plotting. maybe we can get rid of the csh script now or maybe later, but we should be working towards having just one.

I would prefer only having 1 script as well. My personal inclination is to just have the Python script, but that is partly because Python is my scripting language of choice. I'm open to keeping both until the Python script is more widely tested to ensure that it works properly for many users in many cases.

  • do these scripts work on multiple output files? for instance, a long run with restarts in between? is one set of plots for the entire run created?

Right now, the script just grabs the most recently modified output file. I could add an option to the Python script to either allow the user to pass a specific output file, or an option to plot the data from all output files. Thoughts?

  • we might want to rethink how the scripts are used/accessed. it might make sense to drop them into each case directory. are these scripts going to be used by the community quite a bit or do we expect them to be used just by the consortium as "our" quick look tool?

I'm not sure if the community will be using them much. I think they are very convenient to just quickly get an idea of what is happening during the simulation. But for now, I will set the scripts to automatically be copied into case directories.

  • we might want to update the documentation as bit. the documentation is under testing right now. it also works for just quickly making plots of any case.

I agree. These plotting scripts are not testing-specific. I can move the documentation to the user guide section.

@mattdturner
Copy link
Copy Markdown
Contributor Author

I pushed some new commits.

  • the csh script does not yet have the updated filenames. If we are keeping the csh and python scripts, I think both should use the update png filenames.

Done

  • we should think about having just 1 script for plotting. maybe we can get rid of the csh script now or maybe later, but we should be working towards having just one.

Both scripts are still in the repo for now.

  • do these scripts work on multiple output files? for instance, a long run with restarts in between? is one set of plots for the entire run created?

The Python script has been updated to allow users to pass a specific log file. If a directory is passed, the most recently modified log file is used.

  • we might want to rethink how the scripts are used/accessed. it might make sense to drop them into each case directory.

Done

  • we might want to update the documentation as bit. the documentation is under testing right now. it also works for just quickly making plots of any case.

The documentation for the timeseries scripts has been moved to the Running CICE section (https://mattdturner-cice.readthedocs.io/en/update_timeseries/user_guide/ug_running.html#timeseries-plotting)

@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 22, 2019

I have tested this on cheyenne and it seems to be working. I think this can be merged after maybe one more update to the documentation.

Thanks for moving the documentation, I think that works well. I have a few comments.

  • I like how the script works, but the documentation is a little confusing how the "file", "dir", and "neither" arguments work. Could this be clarified a bit.
    • if you pass in a dir, it will look either in that directory or that directory/logs for a filename like cice.run*. it uses the latest file by timestamp. so you can point to either a case directory or the logs directory directly.
    • if you pass in a path/file, it parses that file assuming the file matches the cice.run* form.
    • if you pass in nothing, it looks like it doesn't work for the csh script. but for the python script, it looks like it is like passing "." as the local dir and looks for a logs file locally or in the logs dir?
  • I would add a sentence that says the plots are only made for a single output file at a time. The ability to plot output from a series of cice.run files is not currently possible but may be added later.
  • I would suggest, in addition to pointing to the code compliance section for setting up the python environment, that you actually add documentation that shows what pip commands to use to load the packages. Something like, "see the code compliance section for additional information about how to setup the python environment, but we recommend using pip as follows,

pip --user install numpy
pip --user install matplotlib
pip --user install datetime

  • Rather than saying "This plotting script can be used to plot the following variables", I would say "The plotting script will plot the following variables by default, but you can also select individual plots by optional argument" or something like that.
  • I might not show the output from the python -h output. I would say use -h to get the latest help information and then describe a few of the current options. That makes the documentation more future-proof.
  • Finally, part of the confusion in the documentation is trying to mix the documentation for the csh and python scripts and not being totally clear about them. What I suggest is to document primarily the python script (partly because we hope the csh script will eventually go away and partly because the python script is more capable). In the first paragraph, I would just say something like "there is a python version and a csh version. both scripts create the same set of plots, but the python script has more capabilities, and it's likely the csh script will be removed in the future. To use the python script....". And focus only on the python script documentation in the following paragraphs. Then in the last paragraph of the section, document the csh script by saying the csh script basically works the same way, give a quick example, and then briefly describe that most other features are missing compared to the python script. Keep the csh script documentation separate, simple, and short, and then easy to remove later.

@mattdturner
Copy link
Copy Markdown
Contributor Author

Copy link
Copy Markdown
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, with the caveat that I haven't tried to use either script recently.

@apcraig
Copy link
Copy Markdown
Contributor

apcraig commented Aug 22, 2019

Thank you for the updates @mattdturner , documentation looks great.

@apcraig apcraig merged commit 0033810 into CICE-Consortium:master Aug 22, 2019
@rgrumbine
Copy link
Copy Markdown
Contributor

rgrumbine commented Aug 23, 2019 via email

@mattdturner mattdturner deleted the update_timeseries branch August 26, 2019 12:02
@phil-blain
Copy link
Copy Markdown
Member

phil-blain commented Aug 30, 2019

I just tried the script on a real regression test (base_suite) and I have a feature request : use different linestyles for the 'test' and 'baseline' lines, so that we can more easily see if some lines are superposed :
total_ice_area_brooks_intel_smoke_gx3_8x2_diag1_run5day test-2019-08-29-cm2m_base-brooks_intel_smoke_gx3_8x2_diag1_run5day base-2019-08-29

Here I think the Arctic test and baseline are actually BFB and are superposed, but it would be cool to have either the baseline or test line in dashed linestyle so we can still see both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants