Added icon to nwp providers #72

Open
wants to merge 12 commits into main


gabrielelibardi

Pull Request

Description

I added ICON to the NWP providers. Specifically, the changes should allow opening an xarray object lazily from a list of .zarr or .zarr.zip paths downloaded from https://huggingface.co/datasets/openclimatefix/dwd-icon-eu.
This pull request addresses issue #66 (comment).
In principle this should also work if one uses remote paths pointing directly to the .zarr.zip files. However, because many requests are made to the Hugging Face server in a short time, this may result in a 429 error. There are ways around this, as mentioned in the issue, but they have not been implemented yet.
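
One common way to cope with 429 responses is to retry with exponential backoff. The sketch below is only an illustration of that idea and is not necessarily one of the workarounds discussed in the issue. `RateLimitError`, `retry_with_backoff` and the `fetch` callable are hypothetical names, not part of ocf-data-sampler or any Hugging Face client.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 (Too Many Requests) response."""


def retry_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch() and retry on RateLimitError with exponential backoff.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...),
    is capped at max_delay, and gets up to 10% random jitter so that
    parallel workers do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            # Give up after the last allowed attempt.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

In practice `fetch` would wrap whatever call opens a remote .zarr.zip file and would translate the server's 429 status into `RateLimitError`; that wiring is left out because it depends on the client library in use.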

Fixes #

How Has This Been Tested?

See the test_load_icon_eu test added to ocf-data-sampler/tests/load/test_load_nwp.py

  • [X] Yes

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

@Sukh-P
Member

Sukh-P commented Oct 28, 2024

Thanks for creating this PR and the great work already done on trying to support ICON data in this library!

One thing to note: if this is merged as is, people may assume the library already supports ICON data. However, until normalisation constants are added and ICON is listed as an NWP provider here, creating samples from it won't work. My suggestion is to either add these in this PR or a subsequent PR, or to clearly document the outstanding work in a GitHub issue or the README. Thanks!

@gabrielelibardi
Author

> Thanks for creating this PR and the great work already done on trying to support ICON data in this library!
>
> One thing to note: if this is merged as is, people may assume the library already supports ICON data. However, until normalisation constants are added and ICON is listed as an NWP provider here, creating samples from it won't work. My suggestion is to either add these in this PR or a subsequent PR, or to clearly document the outstanding work in a GitHub issue or the README. Thanks!

I can compute the std and mean constants. Do you have a script that does this for the other NWPs? How large a sample do you take?

@Sukh-P
Member

Sukh-P commented Nov 1, 2024

> I can compute the std and mean constants. Do you have a script that does this for the other NWPs? How large a sample do you take?

Thanks, that would be great! I don't think we currently have a script on GitHub, so I just created this gist to share some example code showing how I have calculated some of the normalisation stats previously. In the example I used 200 samples, which I think would be fine for this too.
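
For readers without access to the gist, the general idea of estimating per-variable normalisation constants from a random sample can be sketched as follows. This is not the gist's code: it is a minimal pure-Python illustration, and the function name, the input layout (one dict of variable name to values per time step) and the variable names in the usage note are assumptions. A real script would more likely work on xarray or NumPy arrays.

```python
import math
import random


def sample_mean_std(fields, n_samples=200, seed=0):
    """Estimate per-variable mean and std from a random sample of time steps.

    fields: list of dicts, one per time step, mapping a variable name to a
            flat list of float values (hypothetical layout for illustration).
    Returns a dict mapping each variable name to a (mean, std) tuple, where
    std is the population standard deviation over all sampled values.
    """
    rng = random.Random(seed)
    chosen = rng.sample(fields, min(n_samples, len(fields)))

    # Accumulate count, sum and sum of squares for each variable.
    totals = {}
    for field in chosen:
        for name, values in field.items():
            acc = totals.setdefault(name, [0, 0.0, 0.0])
            acc[0] += len(values)
            acc[1] += math.fsum(values)
            acc[2] += math.fsum(v * v for v in values)

    stats = {}
    for name, (count, total, total_sq) in totals.items():
        mean = total / count
        # Guard against tiny negative values caused by rounding.
        variance = max(total_sq / count - mean * mean, 0.0)
        stats[name] = (mean, math.sqrt(variance))
    return stats
```

Calling `sample_mean_std(fields)` with, say, hypothetical `"t_2m"` and `"u_10m"` entries in each dict returns one `(mean, std)` pair per variable, which is the shape of result the normalisation constants need.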

@Sukh-P
Member

Sukh-P commented Jan 30, 2025

Hey @gabrielelibardi, hope you are doing well! Coming back to this one after a while and just wanted to check if you're planning on working on this still? If not maybe we can merge this in as is and then I can create another issue for someone to calculate the normalisation constants for icon-eu, thanks!

@gabrielelibardi
Author

Hey @Sukh-P hope you are also doing well! Still planning to work on this, but a bit busy at the moment with other stuff. You can also merge if nothing more is needed, up to you.

@Sukh-P
Member

Sukh-P commented Jan 31, 2025

> Hey @Sukh-P hope you are also doing well! Still planning to work on this, but a bit busy at the moment with other stuff. You can also merge if nothing more is needed, up to you.

Okay great. I have left a few comments that would be good to address. I don't think we need to add anything else to this PR apart from those, so after that we can merge it.

There are some normalisation constants still to be calculated, as per my previous comments, but I can create a separate issue for that. I may need your input on how someone should go about fetching this data (as I believe you already fetched it from our HF), and then I can point to the gist I linked above for how to calculate them.

Thanks for the input!

gabrielelibardi force-pushed the added_icon_to_nwp_providers branch from c506714 to 9d2a1b2 on February 9, 2025 17:30
@gabrielelibardi
Author

Hey @Sukh-P, I made the changes you suggested. I also added some files to test the loading for ICON (they are a bit heavy, so we can remove them). I also added the script to compute the mean and std_dev under utils; I'm not sure whether I should include it here, put it somewhere else, or just leave it out.

@Sukh-P
Member

Sukh-P commented Feb 12, 2025

> Hey @Sukh-P, I made the changes you suggested. I also added some files to test the loading for ICON (they are a bit heavy, so we can remove them). I also added the script to compute the mean and std_dev under utils; I'm not sure whether I should include it here, put it somewhere else, or just leave it out.

Hey @gabrielelibardi, thanks for this, it looks good and is almost there. For the test, it may be a bit easier to create a small test zarr file for ICON data on the fly, as is done here https://github.com/openclimatefix/ocf-data-sampler/blob/main/tests/conftest.py#L92 for another NWP example. That way fewer files need to be added to the repo, and the test data is easier to modify in the future.


2 participants