Add esmvalcore.local, a module to search data on the local filesystem
#1835
Codecov Report
@@ Coverage Diff @@
## main #1835 +/- ##
==========================================
- Coverage 91.57% 91.52% -0.05%
==========================================
Files 203 210 +7
Lines 10979 11369 +390
==========================================
+ Hits 10054 10406 +352
- Misses 925 963 +38
|
whoa! That's a biggie - I'll have a test or two, but I suggest we split the review effort: Manu checks the code and I test the thing or the other way round? Or Saskia checks the code, Manu tests, and I merge? 😁 |
|
I can take a look tomorrow |
|
Sorry, I won't have time to review this until next year. Levante is currently under maintenance (and we need to thoroughly test this with actual data/recipes), and I will be at AGU next week and on holidays afterwards. I can take a look in early January if that's necessary 👍 |
|
I'm rather hoping that we can get this merged sooner than January, otherwise I'm worried there will not be enough time left to get #1609 merged and have a few weeks for it to be used by ESMValCore development installation users before it hits the release. Note that the number of changes to the actual file-finding code in this pull request is pretty small: it's mostly renaming a lot of functions from _data_finder.py so they start with an |
|
All right, sounds reasonable. I'll try to have a quick look at the code today, but I would really appreciate it if someone can run some recipes with this (maybe myself on Friday, let's see). I know we have plenty of unit tests, but (1) you modified a lot of them and (2) I don't fully trust them (it happened in the past that changes didn't break unit tests but broke recipes) 😬 |
547b5d7 to
89df2b1
Compare
|
I merged the |
schlunma
left a comment
Thanks Bouwe, the code is much cleaner now! Some comments on the code, did not look at the tests yet.
|
I'll run recipes once the code settles a bit after review, maybe tomorrow or on Thu? 🍺 |
… if frequency=fx
|
@remi-kazeroni The problems you reported are fixed now. Could you have another go please? |
Thanks a lot for addressing these problems @bouweandela! I ran the same set of recipes again and all problems are now fixed, the recipes run fine. 👍 I'm currently running a few more recipes outside of |
|
I ran some further tests and almost all recipes seem to run fine. The main problem I noticed is that the data finder does not work with
[2093965] INFO esmvalcore._recipe:1825 Creating preprocessor task ERA5_native6/vas_Amon
[2093965] INFO esmvalcore._recipe:1214 Creating preprocessor 'default' task for variable 'vas'
[2093965] DEBUG esmvalcore.local:443 Looking for files matching [PosixPath('/work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/ERA5/1/mon/vas/*.nc')]
[2093965] DEBUG esmvalcore._recipe:623 Using input files for variable vas of dataset ERA5:
[2093965] ERROR esmvalcore._recipe_checks:101 No input files found for variable {'short_name': 'vas', 'mip': 'Amon', 'variable_group': 'vas_Amon', 'diagnostic': 'ERA5_native6', 'preprocessor': 'default', 'dataset': 'ERA5', 'project': 'native6', 'tier': 3, 'type': 'reanaly', 'version': 1, 'recipe_dataset_index': 0, 'timerange': '1990/1990', 'alias': 'ERA5', 'original_short_name': 'vas', 'standard_name': 'northward_wind', 'long_name': 'Northward Near-Surface Wind', 'units': 'm s-1', 'modeling_realm': ['atmos'], 'frequency': 'mon', 'start_year': 1990, 'end_year': 1990}
[2093965] ERROR esmvalcore._recipe_checks:107 Looked for files matching: /work/bd0854/DATA/ESMValTool2/RAWOBS/Tier3/ERA5/1/mon/vas/*.nc
[2093965] ERROR esmvalcore._recipe_checks:108 Set 'log_level' to 'debug' to get more information
Regarding the diagnostics: it seems that only one diagnostic is currently using |
|
Thanks for testing again @remi-kazeroni!
The way the ERA5 data is organized on Levante is not compatible with what is specified in the recipe. Here is an example of a file path: note that the version is called I expect similar problems for other recipes which do specify version numbers that do not match how data is organized. In particular, for obs4MIPs where the situation is even hairier, see #1859. |
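To illustrate the kind of mismatch described above, here is a minimal sketch of exact-version matching versus "use the latest". It is not ESMValCore's implementation: the function `select_versions` and the version directory names are hypothetical, and "latest" is assumed to mean the last name in lexicographic order.

```python
from fnmatch import fnmatchcase


def select_versions(available, requested=None):
    """Pick version directories to search (hypothetical helper).

    - requested is None: fall back to the latest version, assumed here to be
      the last name in lexicographic order.
    - otherwise: keep only directories matching the requested version,
      where '*' matches anything.
    """
    if requested is None:
        return sorted(available)[-1:]
    return sorted(v for v in available if fnmatchcase(v, str(requested)))


# Hypothetical version directories found on disk.
on_disk = ["v20190308", "v20200101"]

latest = select_versions(on_disk)                # no version in the recipe
exact = select_versions(on_disk, "v20190308")    # version matches the directory name
mismatch = select_versions(on_disk, 1)           # recipe says 1, disk says otherwise
anything = select_versions(on_disk, "*")         # wildcard version

print(latest, exact, mismatch, anything)
```

In this sketch, a recipe version that does not correspond to any directory name yields an empty list, which is the "No input files found" situation shown in the log.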
There are the recipe loading tests, which extensively patch ESMValCore: https://github.com/ESMValGroup/ESMValTool/blob/main/tests/integration/test_recipes_loading.py I'll have a look at what I can do. |
|
Thanks Bouwe for addressing all my comments so far! I tested this with 5 very complex recipes, and all worked well. I found a tiny issue regarding native IPSL-CM6 data. Would it be ok for you if I push that change directly here? Most likely you don't have the necessary data to test this. It's just one line. |
|
Sure, go ahead! |
Can this be fixed by adapting the recipe? And any idea why this worked before even though we used the wrong |
|
Sorry for messing up Codecov😁 Let's just ignore it, this line hasn't been tested before... |
Yes, we can update the recipe to use version |
schlunma
left a comment
Awesome! Could you please open an issue about that or address it in ESMValGroup/ESMValTool#2958?
From my side, this is ready 👍 (I didn't have a look at the tests at all though).
Thanks Bouwe, cool stuff 🚀
@remi-kazeroni I opened ESMValGroup/ESMValTool#2958 to update the imports from |
|
Created an issue to ask what version number to use for the ERA5 data: ESMValGroup/ESMValTool#2960 |
Thanks for clarifying @bouweandela, makes sense to me. I'll take a look at the other issues/PRs that you opened regarding version numbers. For |
|
Thanks @remi-kazeroni! Did you want to do additional tests with this branch or are you happy with it now? |
Thanks for your great work @bouweandela! I don't think it is necessary to do further recipe testing for this PR. The tests of @schlunma and myself should be enough for now. You can take this as an approval from my side. From @schlunma's review, I see that one should maybe check the tests before this PR can be merged. @valeriupredoi, as one of our specialists, would you maybe have the time for that? 🍻 |
|
wonderful work @bouweandela @schlunma and @remi-kazeroni 🍺 x 3 - I have taken a look at codecov fails and those are simply because we don't have a test for |
Description
Add the esmvalcore.local module that can be used to find files on the local filesystem (previously known as esmvalcore._data_finder). It supports using '*' as a facet value to match anything and also allows searching for a specific version of a file instead of just the latest. These features are needed to support using wildcards and versioned datasets in the recipe in #1609.
Some minor related changes in this pull request are:
- Add the esmvalcore.typing module to define some types that can be used for type hints.
- Change [facet] to {facet} in the finding input data help section, as we've been using curly braces in the rootpaths in config-developer.yml for quite a while now.
- Some tests in tests/integration/test_recipe.py were using ancillary variables that were not meaningful for the preprocessor function they were added to. This is corrected here.
Closes #286
Closes #1825
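As a rough illustration of the wildcard feature described above, the following sketch fills `{facet}` placeholders in a directory template and globs the result, so that a facet value of `'*'` matches anything. It uses only the standard library and is not the code in esmvalcore.local: the function `find_local_files`, the templates, and the file names are assumptions made for this example (the directory layout loosely follows the search path shown in the log earlier in this conversation).

```python
import tempfile
from pathlib import Path


def find_local_files(rootpath, dir_template, file_template, facets):
    """Fill {facet} placeholders and glob (hypothetical helper).

    A facet value of '*' ends up verbatim in the glob pattern and
    therefore matches any directory or file name at that position.
    """
    dirname = dir_template.format(**facets)
    filename = file_template.format(**facets)
    return sorted(Path(rootpath).glob(f"{dirname}/{filename}"))


# Hypothetical templates using the curly-brace notation.
DIR_TEMPLATE = "Tier{tier}/{dataset}/{version}/{frequency}/{short_name}"
FILE_TEMPLATE = "*.nc"

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a small fake data tree to search.
    for rel in (
        "Tier3/ERA5/1/mon/vas/era5_vas_1990.nc",
        "Tier3/ERA5/1/mon/tas/era5_tas_1990.nc",
        "Tier3/OTHER/1/mon/vas/other_vas_1990.nc",
    ):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    facets = {
        "tier": 3,
        "dataset": "ERA5",
        "version": 1,
        "frequency": "mon",
        "short_name": "vas",
    }
    exact = [
        p.relative_to(root).as_posix()
        for p in find_local_files(root, DIR_TEMPLATE, FILE_TEMPLATE, facets)
    ]
    wildcard = [
        p.relative_to(root).as_posix()
        for p in find_local_files(
            root, DIR_TEMPLATE, FILE_TEMPLATE, {**facets, "dataset": "*"}
        )
    ]

print(exact)
print(wildcard)
```

With all facets given, only the ERA5 `vas` file is found; replacing the dataset with `'*'` also picks up the `vas` file of the other dataset.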
Backward incompatible change
If a version of a dataset is specified in the recipe, the tool will now search for exactly that version, instead of simply using the latest version. Therefore it is necessary to make sure that the version number in the directory tree matches the version number in the recipe to find the files.
Before you get started
Checklist
It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.
To help with the number of pull requests: