CUMULUS-2688: Update bulk operations to fetch granules by unique columns #3000

Merged
npauzenga merged 27 commits into feature/rds-phase-3 from feature/CUMULUS-2688-bulk-operation
Jul 26, 2022

Conversation

@npauzenga npauzenga commented Jun 17, 2022

Summary:

Addresses CUMULUS-2688

This is PR 2/3 for the above ticket. This PR addresses the changes to bulk operations. When a bulk operation acts on granules, it now needs to accept and fetch them by granuleId + collectionId, not just granuleId, because the unique identifiers for granules changed in the Postgres switchover.

Changes

  • Updates BULK_GRANULE, BULK_GRANULE_DELETE, and BULK_GRANULE_REINGEST operations in bulk-operations to support collectionId
  • Updates integration tests to use unique identifiers
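
To illustrate why the lookup needs both columns, here is a minimal sketch, not the code from this PR. `granuleTable`, `getUniqueGranule`, and `fetchGranulesForBulkOperation` are hypothetical names, the collectionId values are made up for the example, and the in-memory array only stands in for the Postgres table.

```javascript
// Hypothetical in-memory stand-in for the granules table (assumption, not the Cumulus data layer).
// The same granuleId appears under two different collections.
const granuleTable = [
  { granuleId: 'G1', collectionId: 'MOD09GQ___006', status: 'completed' },
  { granuleId: 'G1', collectionId: 'MOD11A1___006', status: 'failed' },
];

// Look up a single granule by the (granuleId, collectionId) pair.
function getUniqueGranule(table, { granuleId, collectionId }) {
  const matches = table.filter(
    (g) => g.granuleId === granuleId && g.collectionId === collectionId
  );
  if (matches.length !== 1) {
    throw new Error(
      `Expected exactly one granule for ${granuleId} in ${collectionId}, found ${matches.length}`
    );
  }
  return matches[0];
}

// A bulk operation would receive a list of { granuleId, collectionId } pairs
// and resolve each one before acting on it.
function fetchGranulesForBulkOperation(table, granules) {
  return granules.map((granule) => getUniqueGranule(table, granule));
}
```

With granuleId alone, `G1` would match two rows in this sketch; adding collectionId narrows it to one.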

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

@@ -6,6 +6,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## Unreleased Phase 3

### Breaking Changes
Contributor Author

Wondering if there should be migration instructions for this... 🤔.

Contributor

Contributor Author

Yeah, good call. I'll update that repo once all three tickets for 2688 are in as they all have changes that'll need to be noted. It looks like we might need a new rds-phase-3 branch there that gets released at the same time as the main cumulus phase 3 release too 🤔

const { reingestGranule, applyWorkflow } = require('../lib/ingest');

const log = new Logger({ sender: '@cumulus/bulk-operation' });

async function applyWorkflowToGranules({
  granuleIds,
  granules,
Contributor Author

These aren't really granules. They're { collectionId, granuleId } objects. If this is confusing I can rename.

Contributor

Yeah, I think it should be something better across the stack, esp at the endpoint level if we're renaming there anyway. I just can't decide on a good name. 🤔

Contributor Author

@npauzenga npauzenga Jul 20, 2022

Yeah that's where I'm getting stuck. It's really granuleUniqueIdsButNotCumulusIds 😄. Though the API doesn't return cumulus_id at all so maybe it would only be slightly confusing on the dev side.

We could do something like granuleAndCollectionIds but that could also be ambiguous because "ID" can mean a couple things in our schema. It highlights the problem of not exposing cumulus_id to these endpoints/users.

The more I think about it the more I'm leaning towards keeping granules. My reasoning (subject to change) is that you could pass an entire API Granule object and it would work. It's just that we really only need granuleId and collectionId.
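
A rough sketch of that reasoning, under the assumption that only the two identifying fields are read: `toGranuleIdentifier` is a hypothetical helper, not something in this PR, and the sample field values are invented.

```javascript
// Hypothetical helper: accepts either a bare { granuleId, collectionId } pair
// or a full API Granule object, and keeps only the two identifying fields.
function toGranuleIdentifier(granule) {
  const { granuleId, collectionId } = granule;
  if (typeof granuleId !== 'string' || typeof collectionId !== 'string') {
    throw new TypeError('granuleId and collectionId are both required');
  }
  return { granuleId, collectionId };
}

// A full granule-like object works the same as a bare pair; extra fields are ignored.
const fullGranule = {
  granuleId: 'G1',
  collectionId: 'MOD09GQ___006',
  status: 'completed',
  files: [],
};
const identifier = toGranuleIdentifier(fullGranule);
```

Because extra fields are ignored, callers could pass whatever granule shape they already have, which is the argument for keeping the name `granules`.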

Contributor

I can't find a convincing reason to disagree.

@npauzenga npauzenga changed the title WIP: Feature/cumulus 2688 bulk operation CUMULUS-2688: Update bulk operations to fetch granules by unique columns Jun 22, 2022
Contributor

@nemreid nemreid left a comment

All the changes look fine, but I'd really like to avoid overloading granules even more in this way. Let's discuss regarding another term when you're available.

Base automatically changed from feature/CUMULUS-2688-new-granule-endpoint to feature/rds-phase-3 July 19, 2022 21:00
@npauzenga npauzenga merged commit fee1280 into feature/rds-phase-3 Jul 26, 2022
@npauzenga npauzenga deleted the feature/CUMULUS-2688-bulk-operation branch July 26, 2022 16:36