
DAS-2214 Add NSIDC IceSAT2 Regression Tests #92

Open · wants to merge 36 commits into base: main
Conversation

@flamingbear (Member) commented on Aug 29, 2024:

Description

This PR adds a new regression test to the repository.

This is a stop-gap to allow NSIDC to test the basic IceSAT2 Collections against Harmony Services (Primarily the Trajectory Subsetter).

This PR adds:

  • a set of regression tests in the test/nsidc-icesat2/ directory
  • a shared_utils directory so that common basic functionality can be shared across tests. Right now a number of basic routines are copied into each test's directory; they can be migrated out slowly (or all at once in an IP sprint).
  • GitHub workflow infrastructure to test and build the nsidc-icesat2 image
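To illustrate the kind of basic routine that might graduate into shared_utils, here is a hypothetical sketch (the function name and shape are illustrative, not the actual contents of utilities.py) of a tolerance-based comparison of nested output structures:

```python
import math


def compare_nested(actual, reference, rel_tol=1e-6):
    """Recursively compare nested dict/list/scalar structures of numbers.

    Returns True when both structures have the same shape and all numeric
    leaves agree within the relative tolerance.
    """
    if isinstance(actual, dict) and isinstance(reference, dict):
        if actual.keys() != reference.keys():
            return False
        return all(compare_nested(actual[k], reference[k], rel_tol) for k in actual)
    if isinstance(actual, (list, tuple)) and isinstance(reference, (list, tuple)):
        if len(actual) != len(reference):
            return False
        return all(compare_nested(a, r, rel_tol) for a, r in zip(actual, reference))
    if isinstance(actual, (int, float)) and isinstance(reference, (int, float)):
        return math.isclose(actual, reference, rel_tol=rel_tol)
    return actual == reference
```

Custom, format-specific comparisons (as discussed below in review) would still live in each test's own directory.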

Jira Issue ID

DAS-2214

Local Test Steps

This can be tested in UAT. To test locally, clone the repository and check out this branch.

Build the image from the test directory:

make nsidc-icesat2-image

Set your UAT environment variables:

# UAT
export HARMONY_HOST_URL=https://harmony.uat.earthdata.nasa.gov
export AWS_ACCESS_KEY_ID=<UATAWSKEY>
export AWS_SECRET_ACCESS_KEY=<UATAWSSECRETKEY>
export EDL_USER=<edl username>
export EDL_PASSWORD=<edl password>
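A notebook cell could fail fast if any of these are missing; a minimal sketch (the helper name is hypothetical and not part of the test harness):

```python
import os

# The variables the UAT run expects, per the export commands above.
REQUIRED_VARS = [
    "HARMONY_HOST_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "EDL_USER",
    "EDL_PASSWORD",
]


def missing_environment_variables(required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]
```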

Run the notebook against UAT

./run_notebooks.sh nsidc-icesat2

Verify it ran successfully. The output should look like:

Running regression tests
Test suite nsidc-icesat2 starting
running test with ghcr.io/nasa/regression-tests-nsidc-icesat2:latest
Waiting for nsidc-icesat2.
Input Notebook:  nsidc-icesat2/NSIDC-ICESAT2_Regression.ipynb
Output Notebook: /workdir/output/nsidc-icesat2/Results.ipynb
Working directory: nsidc-icesat2
Executing:   0%|          | 0/18 [00:00<?, ?cell/s]Executing notebook with kernel: python3
Executing: 100%|██████████| 18/18 [09:26<00:00, 31.50s/cell]
Test suite nsidc-icesat2 succeeded
Tests completed (passed)

and include the line "Test suite nsidc-icesat2 succeeded".
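If this check were automated, it could be a simple scan of the captured log; a hypothetical sketch (not part of run_notebooks.sh):

```python
def suite_succeeded(log_text, suite_name):
    """Check captured run output for the suite's success line."""
    return f"Test suite {suite_name} succeeded" in log_text
```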

PR Acceptance Checklist

  • Acceptance criteria met
  • Tests added/updated (if needed) and passing
  • Documentation updated (if needed)

Commit highlights:

  • When passed a build arg shared_utils, include this directory in the regression test image.
  • Also adds the first shapefile, for Iceland. This is not technically necessary but will be good for DataTree testing.
  • Updates the notebook for the first suite of tests. (I'm 99.9% sure this is outdated and unneeded, but deleting in its own commit so it can be resurrected if needed.)
  • ATL04 data doesn't work in DataTree because it's unalignable.
  • I added a new routine, compare_results_to_reference_file_new. This needs to be replaced when the Trajectory Subsetter switches over to use this, but I don't want to do that with an unreleased xarray version from main.
  • Adds the skeleton of an ATL08 temporal test. (ATL08 showed problems and cannot be run at this time.)
  • Adds version file.
  • Clarifies the shared_utils README.
  • Removes and reorders utility functions.
  • The xarray pin is a workaround because xarray DataTree is not released to the public yet. It can be removed as soon as xarray includes the `open_groups` command by default.
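The open_groups workaround amounts to feature detection on the installed xarray; a hedged sketch of that pattern (the helper name is hypothetical, and the fallback branch is a placeholder for whatever the pinned version provides):

```python
def open_all_groups(xr_module, path):
    """Use xarray's open_groups when the installed version provides it.

    `xr_module` is the imported xarray module. The error branch stands in
    for the workaround this PR uses (pinning an unreleased xarray from main).
    """
    if hasattr(xr_module, "open_groups"):
        return xr_module.open_groups(path)
    raise RuntimeError(
        "xarray.open_groups is unavailable; upgrade xarray or use the pinned version"
    )
```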
@flamingbear flamingbear marked this pull request as ready for review August 29, 2024 19:22
@lyonthefrog (Contributor) left a comment:
I ran the test instructions and the tests passed! My comments are mainly a few questions and a couple minor typos. I'll approve once the questions are resolved.

Review threads on: CHANGELOG.md, test/Dockerfile, README.md, test/shared_utils/README.md (outdated), test/nsidc-icesat2/environment.yaml, test/shared_utils/utilities.py
A Contributor commented:
If we have this file, should we eventually add a reference to this and remove the shared functions from the existing utility files in the other tests? If so, this makes sense to me since it's good to not re-write the same code in multiple places. But I do worry a little bit about what the best approach would be if one of these functions needed to be edited in the future, since changes could affect tests across a few different services/teams. Would it become a case of pinging the Harmony service provider channel to announce that the shared utilities file has been updated so folks can check that it didn't break anything of theirs?

@flamingbear (Member Author) replied:
You nailed it. This is a concern and why I waffled on this a bit. But, there are some basic things that all tests need to do and we probably just need to curate the functions in the shared utils carefully.

@lyonthefrog (Contributor) commented on Sep 5, 2024:
I'm thinking an issue that might be encountered is using the comparison function for different file formats, since I know I hit that when trying to add that function to the sdps-egi-tests repository. But it would be nice to have. Would there be any one "owner" of these utils, or set of rules for what should and should not be added? But I'm likely overthinking this and perhaps the answer is that we trust the folks using this repository to be aware that any change may affect the rest - i.e., just be careful, as you said.

@flamingbear (Member Author) replied:
Yup, I think custom comparisons should live in their own test directory. But yes, this is just a be careful spot I think.

@lyonthefrog lyonthefrog self-requested a review September 5, 2024 15:08
@owenlittlejohns (Member) left a comment:
Looks like you fixed up the pre-commit things that had failed while I was working through this review - nice!

This is nice work, and I like the extraction of things into a shared utility module (although I feel your pain about the conditional shenanigans you had to go through in the Dockerfile).

I haven't given this the big thumbs up because I'm cautious about the sizes of some of the reference files. Are there tests that would give NSIDC enough confidence but using smaller spatial/temporal regions?

Oh - also: the tests ran nicely, and I also checked the HOSS image still worked with the updates to the Dockerfile. Nice!

@@ -42,7 +42,7 @@ echo "harmony host url: ${harmony_host_url}"
 ## e.g. if REGRESSION_TESTS_N2Z_IMAGE environment was set, the value would be used instead of the default.

 image_names=()
-all_tests=(harmony harmony-regression hoss hga n2z swath-projector trajectory-subsetter variable-subsetter regridder hybig geoloco net2cog)
+all_tests=(harmony harmony-regression hoss hga n2z nsidc-icesat2 swath-projector trajectory-subsetter variable-subsetter regridder hybig geoloco net2cog)
A Member commented:
Speculative comment - with the repository_dispatch stuff for TRT-532, I think we can get to a point where this gets cut and we can start to streamline some of this repository.

A Member commented:
This is 57MB - is git lfs enabled for this repository? (Similarly, the ATL04 temporal range file is 43.5 MB, and there's another couple at 35 or 25 MB)

150 MB isn't too bad, but this is exactly the sort of thing I'm worried about with dumping a bunch of different data provider tests in the same repository; if all new regression test suites add this much, then the repository is going to get out of hand quickly.

(Edit: I'm not sure exactly how long it took to pull the latest changes locally, but it was in the 10s of seconds or so - but that might also include net2cog related stuff)

@flamingbear (Member Author) replied:
git lfs is not enabled for the repository. It may have been at one point, but I could be confusing a different repo.

The two large files were discussed with Mikala and there really isn't a way to make them much smaller. "I'm struggling to get ATL03 output much below 60 MB. The granules are just really huge to begin with."

I do think reference data is going to be a problem with this repository. I haven't thought of a better way, but I'm not sure I want to enable git LFS yet. I haven't used it a lot, and I think making it efficient would take rewriting the git history to remove the files currently stored. I am trying to think of alternatives, like a data Docker image from a different location, but haven't stumbled into anything yet. In the past I've looked at git-annex and DVC.org. I'm open to options, but I don't really want to add it to this PR. A fresh clone did take 51 seconds on my machine.

> 150 MB isn't too bad, but this is exactly the sort of thing I'm worried about with dumping a bunch of different data provider tests in the same repository; if all new regression test suites add this much, then the repository is going to get out of hand quickly.

True, but this is not supposed to be done for all providers; this was a stop-gap to get NSIDC to production, but I can't find the actual link for that.

A Member commented:
Rambling stream-of-consciousness ahoy:

One thought Vinny suggested a while back - could the reference files go somewhere like S3? Then the notebook could either do some in-cloud access, or download the reference files when being run, maybe? Problem with in-cloud access: these notebooks then need to run in us-west-2, which will be problematic as we look to lean on running things in GH (truly rambling: CodeDeploy in Harmony AWS account associated with this repository?). Problem with downloading when running: slow at runtime, rather than slow once when compiling a Docker image.


Another (lower-tech) thought - what's the size of the source file? ATL03 is indeed pretty beefy, but some of the granules are up in the several GB range, while others are more like 300 MB. I wanted to check the ATL03 granule, but don't have access to view CMR records in the NSIDC_CUAT provider, so couldn't.

This led me to:

  1. If these tests are going to be automated, we need to make sure that whichever EDL user is running them (credentials probably set by secrets in a couple of GH environments) has the requisite permissions to view/access/download data and metadata records.

  2. The tests currently only define granules for non-production accounts. Are there production equivalents? Don't we need those? (When I couldn't get at the UAT granule for ATL03, I was going to rip the equivalent from production, but that isn't defined, so I was defeated again!)

Sorry - that's probably about 4 comments rolled into one.

@flamingbear (Member Author) commented on Sep 10, 2024:
I did forget the authentication that would be required. (I had a note at one point that I would add the test user to NSIDC's ACLs, I think.)

As for the size, I did go back and forth with Mikala on the size of the reference files, and I can go back to her, but maybe you should if this is a blocker.

@flamingbear (Member Author) replied:

I did ask Mikala for clarification on why these were the smallest reference files she felt comfortable making, and we will see what she says.

@flamingbear (Member Author) commented:
After talking with @asteiker, @owenlittlejohns, and @mikala-nsidc, I will update this PR, configuring the repository to use GitHub's LFS to keep these large files separate from the code.

@flamingbear (Member Author) commented:
To complete:

  • update the repo with Git LFS and store the reference files there
  • update the pinned version of xarray to the latest release
  • clean up Docker requirements related to installing git for the previous pin
