111 changes: 110 additions & 1 deletion docs/iris/src/developers_guide/graphics_tests.rst
@@ -3,4 +3,113 @@
Graphics tests
**************

TODO: a full description is pending, will be provided for release 1.12.
The only practical way of testing plotting functionality is to check actual
output plots.
For this, a basic 'graphics test' assertion operation is provided by the method
:meth:`iris.tests.IrisTest.check_graphic`: this tests plotted output for a
match against a stored reference.
A "graphics test" is any test which employs this.

At present (Iris version 1.10), such tests include those for the modules
`iris.tests.test_plot` and `iris.tests.test_quickplot`, and also some other
'legacy' style tests (as described in :ref:`developer_tests`).
It is conceivable that new 'graphics tests' of this sort can still be added.
However, as graphics tests are inherently "integration" style rather than true
unit tests, results can differ with the installed versions of dependent
libraries (see below), so this is not recommended except where no alternative
is practical.

Testing actual plot results introduces some significant difficulties:

* Graphics tests are inherently 'integration' style tests, so results will
  often vary with the versions of key dependencies, i.e. the exact versions of
  third-party modules which are installed: obviously, results will depend on
  the matplotlib version, but they can also depend on numpy and other
  installed packages.
* Although it seems possible in principle to accommodate 'small' result changes
  by distinguishing plots which are 'nearly the same' from those which are
  'significantly different', in practice no *automatic* scheme for this can be
  perfect: that is, any calculated tolerance in output matching will allow
  some changes which a human would judge as a significant error (a brief
  sketch of such a tolerance check follows this list).
* Storing a variety of alternative 'acceptable' results as reference images
  can easily lead to uncontrolled increases in the size of the repository,
  given multiple independent sources of variation.
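
As a concrete illustration of the 'tolerance' idea discussed above, perceptual
image hashes can be compared by counting differing bits, and a fixed threshold
then decides whether two plots count as 'the same'. This is only a sketch of
the principle (the threshold value and the use of ``phash`` are assumptions,
not the settings used by Iris)::

    # Sketch: compare two plot images by perceptual hash 'distance'.
    import imagehash
    from PIL import Image


    def plots_roughly_match(image_path_a, image_path_b, max_bit_difference=2):
        # phash produces a perceptual hash; subtracting two hashes gives the
        # number of differing bits (the Hamming distance).
        hash_a = imagehash.phash(Image.open(image_path_a))
        hash_b = imagehash.phash(Image.open(image_path_b))
        return (hash_a - hash_b) <= max_bit_difference

Any such fixed threshold will inevitably pass some changes that a human
reviewer would still consider significant, which is why the accepted results
are curated by eye (see below).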


Graphics Testing Strategy
=========================

Prior to Iris 1.10, all graphics tests compared against a stored reference
image with a small tolerance on pixel values.

From Iris v1.11 onward, we want to support testing Iris against multiple
versions of matplotlib (and some other dependent libraries).
*Reviewer comment (Member):* At some point in this document I think that the word 'dependencies' should be expanded or explained. Maybe just replaced with 'dependent libraries'. Although it is obvious to us what it means, it may not be as clear to a developer looking at this document without the context that we now have.

To make this manageable, we have now rewritten "check_graphic" to allow
multiple alternative 'correct' results without including many more images in
the Iris repository.
This consists of:

* using a perceptual 'image hash' of the outputs (see
  https://github.com/JohannesBuchner/imagehash) as the basis for checking
  test results (a brief sketch of the idea follows this list).
* storing the hashes of 'known accepted results' for each test in a
  database in the repo (which is actually stored in
  ``lib/iris/tests/results/imagerepo.json``).
* storing associated reference images for each hash value in a separate public
  repository, currently https://github.com/SciTools/test-images-scitools,
  allowing human-eye judgement of 'valid equivalent' results.
* a new version of the 'iris/tests/idiff.py' tool, which assists in comparing
  proposed new 'correct' result images with the existing accepted ones.
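
The following sketch illustrates the general hash-and-lookup idea. It is
*not* the actual ``check_graphic`` implementation: the hash size, the exact
layout of ``imagerepo.json`` and the use of an in-memory PNG buffer are
assumptions made for the example::

    # Sketch only: hash the current figure and look it up in the database.
    import io
    import json

    import imagehash
    import matplotlib.pyplot as plt
    from PIL import Image


    def hash_of_current_figure():
        # Render the current matplotlib figure to an in-memory PNG, then
        # compute a perceptual hash of that image.
        buffer = io.BytesIO()
        plt.savefig(buffer, format='png')
        buffer.seek(0)
        return str(imagehash.phash(Image.open(buffer), hash_size=16))


    def result_is_accepted(test_id,
                           repo_path='lib/iris/tests/results/imagerepo.json'):
        # Assumed layout: each test id maps to a list of reference-image
        # URIs whose basenames are the accepted hash values, '<hash>.png'.
        with open(repo_path) as fin:
            repo = json.load(fin)
        accepted_hashes = [uri.rsplit('/', 1)[-1].rsplit('.', 1)[0]
                           for uri in repo.get(test_id, [])]
        return hash_of_current_figure() in accepted_hashes

In practice a comparison like this would normally also allow a small
bit-difference tolerance between hashes (as in the earlier sketch), rather
than requiring an exact string match.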

BRIEF...
There should be sufficient work-flow detail here to allow an Iris developer to:

* understand the new ``check_graphic`` test process
* understand the steps to take and tools to use to add a new graphics test
* understand the steps to take and tools to use to diagnose and fix a
  graphics test failure


Basic workflow
==============
If you notice that a graphics test in the Iris testing suite has failed
following changes in Iris or any of its dependencies, this is the process
you now need to follow:

1. Create a directory in ``iris/lib/iris/tests`` called
   ``result_image_comparison``.
2. From your Iris root directory, run the tests by using the command:
   ``python setup.py test``.
3. Navigate to ``iris/lib/iris/tests`` and run the command:
   ``python idiff.py``.
   This will open a window for you to visually inspect the changes to the
   graphic and then either accept or reject the new result.
4. Upon acceptance of a change or a new image, a copy of the output PNG file
   is added to the reference image repository in
   https://github.com/SciTools/test-images-scitools. The file is named
   according to the image hash value, as ``<hash>.png``.
5. The hash value of the new result is added into the relevant set of 'valid
   result hashes' in the image result database file,
   ``tests/results/imagerepo.json``.
6. The tests must now be re-run, and the 'new' result should be accepted.
   Occasionally there are several graphics checks in a single test, and if
   one of these fails the following ones are not run; in that case you may
   well encounter further graphics test failures in your next runs, and you
   must repeat the process until all the graphics tests pass.
7. To add your changes to Iris, you need to make two pull requests. The first
   should be made to the test-images-scitools repository, and this should
   contain all the newly-generated PNG files copied into the folder named
   'image_files'.
8. The second pull request should be created in the Iris repository, and
   should only include the change to the image results database
   (``tests/results/imagerepo.json``):
   this pull request must contain a reference to the matching one in
   test-images-scitools.

Note: the Iris pull request will not test out successfully in Travis until the
test-images-scitools pull request has been merged: this is because there is
an Iris test which ensures the existence of the reference images (URLs) for all
the targets in the image results database.
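
The following is a sketch of what such an existence check amounts to, not the
actual Iris test (the flat JSON layout and the use of HTTP ``HEAD`` requests
are assumptions)::

    # Sketch: confirm that every image URI referenced in imagerepo.json
    # actually resolves, i.e. the reference image has been merged.
    import json
    import urllib.error
    import urllib.request


    def missing_reference_images(
            repo_path='lib/iris/tests/results/imagerepo.json'):
        with open(repo_path) as fin:
            repo = json.load(fin)
        missing = []
        for test_id, uris in repo.items():
            for uri in uris:
                request = urllib.request.Request(uri, method='HEAD')
                try:
                    urllib.request.urlopen(request)
                except urllib.error.HTTPError:
                    missing.append((test_id, uri))
        return missing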


Fixing a failing graphics test
==============================


Adding a new graphics test
==========================
108 changes: 62 additions & 46 deletions docs/iris/src/developers_guide/pulls.rst
@@ -16,67 +16,83 @@ is merged. Before submitting a pull request please consider this list.
The Iris Check List
====================

* Have you provided a helpful description of the Pull Request? What has
changed and why. This should include:

* the aim of the change - the problem addressed, a link to the issue;
* how the change has been delivered;
* a "What's New" entry, submitted as part of the pull request. See `Contributing a "What's New" entry`_.

* Do new files pass PEP8?

* PEP8_ is the Python source code style guide.
* There is a python module for checking pep8 compliance: python-pep8_
* Have you provided a helpful description of the Pull Request?
I.e. what has changed and why. This should include:
* the aim of the change; the problem addressed; a link to the issue.
* how the change has been delivered.
* a "What's New" entry, submitted as a new file added in the pull request.
See `Contributing a "What's New" entry`_.

* Do all the tests pass locally?

* The Iris tests may be run with ``python setup.py test`` which has a command
line utility included.
* Coding standards, including PEP8_ compliance and copyright message (including
the correct year of the latest change), are tested.

* Has a new test been provided?

* Has iris-test-data been updated?

* iris-test-data_ is a github project containing all the data to support the
tests.
* If this has been updated a reference to the relevant pull request should be
provided.

* Has the documentation been updated to explain the new feature or bug fix?

* with reference to the developer guide on docstrings_
* Have new tests been provided for all additional functionality?

* Have code examples been provided inside the relevant docstrings?

* Has iris-sample-data been updated?

* iris-sample-data_ is a github project containing all the data to support
the gallery and examples.
* Do all modified and new sourcefiles pass PEP8?
* PEP8_ is the Python source code style guide.
* There is a python module for checking pep8 compliance: python-pep8_
* a standard Iris test checks that all sourcefiles meet PEP8 compliance
(see "iris.tests.test_coding_standards.TestCodeFormat").

* Do all modified and new sourcefiles have a correct, up-to-date copyright
header?
* a standard Iris test checks that all sourcefiles include a copyright
message, including the correct year of the latest change
(see "iris.tests.test_coding_standards.TestLicenseHeaders").

* Has the documentation been updated to explain all new or changed features?
* refer to the developer guide on docstrings_

* Have code examples been provided inside docstrings, where relevant?
* these are strongly recommended as concrete (working) examples always
considerably enhance the documentation.
* live test code can be included in docstrings (a brief sketch is given after this checklist).
* See for example :data:`iris.cube.Cube.data`
* Details at http://www.sphinx-doc.org/en/stable/ext/doctest.html
* The documentation tests may be run with ``make doctest``, from within the
``./docs/iris`` subdirectory.

* Have you provided a 'whats new' contribution?
* this should be done for all changes that affect API or behaviour.
See :ref:`whats_new_contributions`

* Does the documentation build without errors?

* The documentation is built using ``make html`` in ``./docs/iris``.

* Do the documentation tests pass?

* ``make doctest``, ``make extest`` in ``./docs/iris``.

* Does this update introduce/change any dependencies? If so:

* Has the travis file been updated to reflect these changes?

* ``./.travis.yml`` is used to manage the continuous integration testing.

* Has ``conda-requirements.txt`` been updated to reflect these changes?
* Has the ``INSTALL`` file been updated to reflect these changes?
* Do the documentation and code-example tests pass?
* Run with ``make doctest`` and ``make extest``, from within the subdirectory
``./docs/iris``.
* note that code examples must *not* trigger deprecation warnings. This is now
checked and will result in an error.
When an existing code example encounters a deprecation, it must be fixed.

* Has the travis file been updated to reflect any dependency updates?
* ``./.travis.yml`` is used to manage the continuous integration testing.
* the files ``./conda-requirements.yml`` and
``./minimal-conda-requirements.yml`` are used to define the software
environments used, using the conda_ package manager.

* Have you provided updates to supporting projects for test or example data?
* the following separate repos are used to manage larger files used by tests
and code examples:
* iris-test-data_ is a github project containing all the data to support the
tests.
* iris-sample-data_ is a github project containing all the data to support
the gallery and examples.
* test-images-scitools_ is a github project containing reference plot images
to support Iris graphics tests: see :ref:`test graphics images`.
* If new files are required by tests or code examples, they must be added to
the appropriate supporting project via a suitable pull-request.
This new 'supporting pull request' should be referenced in the main Iris
pull request, and must be accepted and merged before the Iris one can be.
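
As a brief illustration of the docstring code-example point above, a
doctest-style example inside a docstring looks like this (the function here is
purely hypothetical, not part of Iris)::

    def double(value):
        """Return twice the given value.

        For example:

            >>> double(4)
            8
            >>> double(1.5)
            3.0

        """
        return 2 * value

Sphinx's doctest extension (exercised by ``make doctest``) executes the
``>>>`` lines and compares their output against the text that follows them.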
*Reviewer comment (Member):* I really like your additions to this document. Very clear and instructive.



.. _PEP8: http://www.python.org/dev/peps/pep-0008/
.. _python-pep8: https://pypi.python.org/pypi/pep8
.. _conda: http://conda.readthedocs.io/en/latest/
.. _iris-test-data: https://github.com/SciTools/iris-test-data
.. _iris-sample-data: https://github.com/SciTools/iris-sample-data
.. _test-images-scitools: https://github.com/SciTools/test-images-scitools
.. _docstrings: http://scitools.org.uk/iris/docs/latest/developers_guide/documenting/docstrings.html
.. _Contributing a "What's New" entry: http://scitools.org.uk/iris/docs/latest/developers_guide/documenting/whats_new_contributions.html

21 changes: 11 additions & 10 deletions docs/iris/src/developers_guide/tests.rst
@@ -131,20 +131,21 @@ module(s) under test.

Graphics tests
=================
Certain Iris tests rely on testing plotted results.
This is required for testing the modules :mod:`iris.plot` and
:mod:`iris.quickplot`, but is also used for some other legacy and integration
tests.
Certain Iris tests are based on checking plotted images.
This is the only way of testing the modules :mod:`iris.plot` and
:mod:`iris.quickplot`, but it is also used for some other legacy and
integration-style testcases.

Prior to Iris version 1.10, a single reference image for each test was stored
in the main Iris repository, and a 'tolerant' test was performed against this.
Prior to Iris version 1.10, a single reference image for each testcase was
stored in the main Iris repository, and a 'tolerant' comparison was performed
against this.

From version 1.11 onwards, graphics test outputs are compared against possibly
*multiple* known-good images, of which only a signature is stored.
From version 1.11 onwards, graphics testcase outputs are compared against
possibly *multiple* known-good images, of which only the signature is stored.
This uses a sophisticated perceptual "image hashing" scheme (see:
<https://github.com/JohannesBuchner/imagehash>).
Only imagehash signatures are stored in the Iris repo itself, thus freeing up
valuable space. Meanwhile, the actual reference *images*, which are required
for human-eyes evaluation of proposed new "good results", are all stored
valuable space. Meanwhile, the actual reference *images* -- which are required
for human-eyes evaluation of proposed new "good results" -- are all stored
elsewhere in a separate public repository.
See :ref:`developer_graphics_tests`.