More flexible synthetic data generation, more dataset types #38
…ded tests for data dataset types.
pp-mo
left a comment
Suggested some rework -- hope it's not too troublesome
iris_ugrid/tests/unit/tests/synthetic_data_generator/test_synthetic_data.py
pp-mo
left a comment
Some tiny points to consider, but I think it hangs together now 💐
I see that subtle differences between similar things have made some things a bit repetitive + verbose -- especially the tests -- but so be it.
After all, @bjlittle never shrinks from writing stuff like that -- especially in tests !
# Data addition dependent on this assumption:
assert len(unlimited_dim_names) < 2

# Fill all data variables (both mesh and phenomenon vars) with zeros.
Tiny nit-picky point : not always zeros !
"""Shared data populating behaviour.
Adds placeholder data to the variables in a NetCDF file, accounting for
dimension size, unlimited dimensions and 'dimension coordinates'.
Maybe say "a possible unlimited dimension" ?
file_path = create_file__xios_half_levels_faces(
def create_synthetic_file(self, **create_kwargs):
    # Placeholder that will be overridden by sub-classes.
As it's a placeholder -- i.e. an abstract method -- I don't think providing a "default" operation is a good idea.
You could either
1. go back to using abc-s to enforce overriding
2. provide no routine body, just "pass" (which should cause an error in use, as the caller expects a filepath)
3. raise NotImplementedError(), which is more-or-less what abc would do
Case (1) requires metaclass=abc.ABCMeta in the class def, and then @abstractmethod on this method.
(See iris.coords._DimensionalMetadata for examples)
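For comparison, here is a minimal sketch of the abc route and the NotImplementedError route side by side. The class names and the returned file name are hypothetical, chosen only for illustration; they are not taken from this PR. The practical difference is when the failure shows up: abc refuses to instantiate a class that has not overridden the method, whereas NotImplementedError only fires when the placeholder is actually called.

```python
import abc


class AbstractFileTestBase(metaclass=abc.ABCMeta):
    """abc route: cannot be instantiated until the method is overridden."""

    @abc.abstractmethod
    def create_synthetic_file(self, **create_kwargs):
        """Return the path of a newly generated file."""


class RaisingFileTestBase:
    """NotImplementedError route: fails only when the placeholder is called."""

    def create_synthetic_file(self, **create_kwargs):
        raise NotImplementedError(
            "sub-classes must override create_synthetic_file"
        )


class ConcreteFromAbstract(AbstractFileTestBase):
    """Hypothetical sub-class that supplies the real behaviour."""

    def create_synthetic_file(self, **create_kwargs):
        # Stand-in for a real generator call; the name is illustrative only.
        return "synthetic_file.nc"
```

With these definitions, `AbstractFileTestBase()` raises `TypeError` at construction, `RaisingFileTestBase()` constructs without complaint but raises on `create_synthetic_file()`, and `ConcreteFromAbstract()` behaves normally.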
Cool, I'm familiar with what you're talking about but I wasn't sure of best practice, especially when it comes to constructing a test class. I will use 1 and 2!
There did seem to be some unexpected interaction with py.test so I opted for the simpler NotImplementedError.
Yes, that was mildly frustrating, but no big problem.
This PR shifts tests/synthetic_data_generator to a full package structure, where the generation code is held within __init__.py and various CDL header templates are held within the mesh_headers/ sub-directory.
The generation and associated tests are more agnostic than before, allowing the generation of a greater variety of dataset types.
The currently available dataset types are as follows, and the code also accounts for planned future additions (e.g. data located on edges):