Release 16 with master (#3432)
* [CUMULUS-3226] Remove refs to async operations dynamo table (#3331)

* Remove refs to async operations dynamo table

* add changelog entry

* Remove additional async operation table refs

* Update process_dead_letter_archive role

* CUMULUS-3053: removing references to dynamodb from docs (#3322)

* CUMULUS-3053: removing references to dynamodb from docs

* resolving lint errors

* CUMULUS-3285: Updated isAuthBearTokenRequest to handle non-Bearer authorization header (merge to master) (#3350) (#3352)

* CUMULUS-3285: Updated isAuthBearTokenRequest to handle non-Bearer authorization header (#3341)

* Jk rel16/cumulus 3079 (#3357)

* Update endpoints/granules :granuleName parameter to :granuleId (#3333)

* Move PUT logic to PATCH.  Implement error for PUT endpoint

* Add put units/refactor

* Update API client to use PATCH protocol

cumulus's api is moving all existing PUT requests to PATCH.  This
commit updates the api-client to use those, adds 'PATCH' to typings
and updates the test fixtures for the thin unit tests.

* Fix broken API unit

* Update CHANGELOG

* Fix naming/minor refactor

* Fix Dynamodb granule.files default bug

* Fix elasticsearch default array write value

* Update/add PUT endpoint and relevant tests

* Update CHANGELOG

* Update unit test to take advantage of fixed unit helper

* Update API version

* Add api-client method to use updated PUT endpoint

* Add fixed annotations to CHANGELOG

* Update API version unit

* Minor comment refactor

* Add PUT API spec test

* Fix incorrect test label

* Add missing unit tests

* Add removed unit tests

* Jk/cumulus 3072 add header check (#3257)

* Add API version compliance checks to app middleware

* Update middleware/add test defaults/fix typing

* Fix broken unit tests

* cleanup

* Add middleware units

* Add endpoint tests to validate middleware

* Fix unit test

* Remove unneeded type exports

* Remove unneeded test

* Remove async

* Minor formatting fix/changeset reduction

* Add missing ts-check

* Update CHANGELOG

* Fix bad import

* Update 'version' header to 'Cumulus-API-Version'

* Respond to PR feedback

* validateApiCompliance -> requireApiVersion

* Update packages/api/lib/request.js

Co-authored-by: Marc <[email protected]>

* Fix isMinVersionApi expression

* Update granule units/fix prior test fixtures

* Update packages/api/app/middleware.js

Co-authored-by: Marc <[email protected]>

* Fix spacing

* Add api-client required headers

* Update header restriction with new name, fix api-package header
inclusions

* Fix broken unit

* Bring in fix from #3270

* Update packages/api/app/middleware.js

Co-authored-by: Marc <[email protected]>

* isNumber -> Number.isFinite

---------

Co-authored-by: Marc <[email protected]>

* Update PUT to use collectionId endpoint *only*

* Update CHANGELOG

* Fix api-client to call right PUT endpoint

* Remove unneeded log output

* Update tests, add granuleId parameter updates

* Fix bad logic assertion

* Update endpoints/granules :granuleName parameter to :granuleId

* Fix lint 🔔

* Fix lint

* Fix merged unit test

* Fix merged comments

---------

Co-authored-by: Marc <[email protected]>

* Reorder/updated CHANGELOG

* Minor P3 release edits

---------

Co-authored-by: Marc <[email protected]>

* Cumulus 3120 bugfix (#3364)

* [CUMULUS-3226] Remove refs to async operations dynamo table (#3331)

* Remove refs to async operations dynamo table

* add changelog entry

* Remove additional async operation table refs

* Update process_dead_letter_archive role

* Update endpoints/granules :granuleName parameter to :granuleId (#3333)

* Move PUT logic to PATCH.  Implement error for PUT endpoint

* Add put units/refactor

* Update API client to use PATCH protocol

cumulus's api is moving all existing PUT requests to PATCH.  This
commit updates the api-client to use those, adds 'PATCH' to typings
and updates the test fixtures for the thin unit tests.

* Fix broken API unit

* Update CHANGELOG

* Fix naming/minor refactor

* Fix Dynamodb granule.files default bug

* Fix elasticsearch default array write value

* Update/add PUT endpoint and relevant tests

* Update CHANGELOG

* Update unit test to take advantage of fixed unit helper

* Update API version

* Add api-client method to use updated PUT endpoint

* Add fixed annotations to CHANGELOG

* Update API version unit

* Minor comment refactor

* Add PUT API spec test

* Fix incorrect test label

* Add missing unit tests

* Add removed unit tests

* Jk/cumulus 3072 add header check (#3257)

* Add API version compliance checks to app middleware

* Update middleware/add test defaults/fix typing

* Fix broken unit tests

* cleanup

* Add middleware units

* Add endpoint tests to validate middleware

* Fix unit test

* Remove unneeded type exports

* Remove unneeded test

* Remove async

* Minor formatting fix/changeset reduction

* Add missing ts-check

* Update CHANGELOG

* Fix bad import

* Update 'version' header to 'Cumulus-API-Version'

* Respond to PR feedback

* validateApiCompliance -> requireApiVersion

* Update packages/api/lib/request.js

Co-authored-by: Marc <[email protected]>

* Fix isMinVersionApi expression

* Update granule units/fix prior test fixtures

* Update packages/api/app/middleware.js

Co-authored-by: Marc <[email protected]>

* Fix spacing

* Add api-client required headers

* Update header restriction with new name, fix api-package header
inclusions

* Fix broken unit

* Bring in fix from #3270

* Update packages/api/app/middleware.js

Co-authored-by: Marc <[email protected]>

* isNumber -> Number.isFinite

---------

Co-authored-by: Marc <[email protected]>

* Update PUT to use collectionId endpoint *only*

* Update CHANGELOG

* Fix api-client to call right PUT endpoint

* Remove unneeded log output

* Update tests, add granuleId parameter updates

* Fix bad logic assertion

* Update endpoints/granules :granuleName parameter to :granuleId

* Fix lint 🔔

* Fix lint

* Fix merged unit test

* Fix merged comments

---------

Co-authored-by: Marc <[email protected]>

* CUMULUS-3053: removing references to dynamodb from docs (#3322)

* CUMULUS-3053: removing references to dynamodb from docs

* resolving lint errors

* CUMULUS-3285: Updated isAuthBearTokenRequest to handle non-Bearer authorization header (merge to master) (#3350)

* CUMULUS-3285: Updated isAuthBearTokenRequest to handle non-Bearer authorization header (#3341)

* CUMULUS-3121/3120 (#3360)

* CUMULUS-3121/3120 v15.1.0 backport (#3346)

* backport PR

* finalizing docs and changing remaining groups

* PR feedback + testing changes

* PR feedback

* reverting change

* Revert "update changelog"

This reverts commit ae4627c.

* reverting changes

* PR feedback

* removing EgressLambda from doc

* Update xml2js 0.4.22->0.5 strict (#3330) (#3339)

* Update xml2js 0.4.22->0.5 strict

* Address GHSA-776f-qx25-q3cc/update allow list

* Update CHANGELOG

* Update package pins to 0.5.0 for xmljs

Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback

* PR feedback

* PR feedback

* fixing documentation linting

* PR feedback

* PR feedback

* PR feedback

* adding variables to tf-modules/workflow

* PR feedback

* PR feedback

* reverting previous change

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback (docs+changelog)

* fixing docs

* undoing commit

* PR feedback

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* CUMULUS-3121/3120 (#3360)

* CUMULUS-3121/3120 v15.1.0 backport (#3346)

* backport PR

* finalizing docs and changing remaining groups

* PR feedback + testing changes

* PR feedback

* reverting change

* Revert "update changelog"

This reverts commit ae4627c.

* reverting changes

* PR feedback

* removing EgressLambda from doc

* Update xml2js 0.4.22->0.5 strict (#3330) (#3339)

* Update xml2js 0.4.22->0.5 strict

* Address GHSA-776f-qx25-q3cc/update allow list

* Update CHANGELOG

* Update package pins to 0.5.0 for xmljs

Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback

* PR feedback

* PR feedback

* fixing documentation linting

* PR feedback

* PR feedback

* PR feedback

* adding variables to tf-modules/workflow

* PR feedback

* PR feedback

* reverting previous change

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback (docs+changelog)

* fixing docs

* undoing commit

* PR feedback

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* CUMULUS-3121/3120 (#3360)

* CUMULUS-3121/3120 v15.1.0 backport (#3346)

* backport PR

* finalizing docs and changing remaining groups

* PR feedback + testing changes

* PR feedback

* reverting change

* Revert "update changelog"

This reverts commit ae4627c.

* reverting changes

* PR feedback

* removing EgressLambda from doc

* Update xml2js 0.4.22->0.5 strict (#3330) (#3339)

* Update xml2js 0.4.22->0.5 strict

* Address GHSA-776f-qx25-q3cc/update allow list

* Update CHANGELOG

* Update package pins to 0.5.0 for xmljs

Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback

* PR feedback

* PR feedback

* fixing documentation linting

* PR feedback

* PR feedback

* PR feedback

* adding variables to tf-modules/workflow

* PR feedback

* PR feedback

* reverting previous change

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback (docs+changelog)

* fixing docs

* undoing commit

* PR feedback

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* CUMULUS-3121/3120 (#3360)

* CUMULUS-3121/3120 v15.1.0 backport (#3346)

* backport PR

* finalizing docs and changing remaining groups

* PR feedback + testing changes

* PR feedback

* reverting change

* Revert "update changelog"

This reverts commit ae4627c.

* reverting changes

* PR feedback

* removing EgressLambda from doc

* Update xml2js 0.4.22->0.5 strict (#3330) (#3339)

* Update xml2js 0.4.22->0.5 strict

* Address GHSA-776f-qx25-q3cc/update allow list

* Update CHANGELOG

* Update package pins to 0.5.0 for xmljs

Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback

* PR feedback

* PR feedback

* fixing documentation linting

* PR feedback

* PR feedback

* PR feedback

* adding variables to tf-modules/workflow

* PR feedback

* PR feedback

* reverting previous change

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* PR feedback (docs+changelog)

* fixing docs

* undoing commit

* PR feedback

---------

Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>

* fixing CHANGELOG.md

* removing docs + updating changelog

* changing doc link

---------

Co-authored-by: Nate Pauzenga <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>
Co-authored-by: Marc <[email protected]>
Co-authored-by: nasamoduyebo <[email protected]>
Co-authored-by: jennyhliu <[email protected]>

* CUMULUS-3243:Updated granule delete logic (#3338) (#3366) (#3367)

* CUMULUS-3243:Updated granule delete logic (#3338)

* CUMULUS-3243:Updated granule delete logic to delete granule which is not in DynamoDB

* add unit tests

* delete files not in master

* update CHANGELOG

* remove dynamodb from test

* Release 16.0.0 (#3376)

* update docs

* Version up to v16.0.0

* Minor edits/update release doc

* Add PI release version notes

* Minor CL edit

* Update CHANGELOG.md

Co-authored-by: Nate Pauzenga <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Nate Pauzenga <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Nate Pauzenga <[email protected]>

* Address PR feedback

---------

Co-authored-by: Nate Pauzenga <[email protected]>

* [CUMULUS-3172] Update data integrity/migration docs (#3387)

* update docs

* update docs

* add new doc to sidebars

* Update docs/upgrade-notes/rds-phase-3-data-migration-guidance.md

Co-authored-by: Jonathan Kovarik <[email protected]>

* Update docs/upgrade-notes/rds-phase-3-data-migration-guidance.md

Co-authored-by: Jonathan Kovarik <[email protected]>

* Update docs/upgrade-notes/rds-phase-3-data-migration-guidance.md

Co-authored-by: Jonathan Kovarik <[email protected]>

* Update docs/upgrade-notes/rds-phase-3-data-migration-guidance.md

Co-authored-by: Jonathan Kovarik <[email protected]>

---------

Co-authored-by: Jonathan Kovarik <[email protected]>

* Fix lint issues (#3392)

* Jk/cumulus 3307 (#3394)

* Backport CUMULUS-3307/PR #3386

* Fixup

* Re-issue v16 docs (#3400)

* CUMULUS-3223: Fix failed granule stuck in queued (#3373) (#3402)

* CUMULUS-3223:Fix failed granule stuck in queued

* skip sqsQueueExists test

* update getGranuleTemporalInfo

* update test match schema

* update getGranuleTemporalInfo

* remove extra await

* remove skip sqsQueueExists

* update `@cumulus/cumulus-message-adapter-js` to `2.0.5`

* update sfEventSqsToDbRecords to return partial batch failure

* fix typo

* handle process error separately, multiple message test

* update test to process multiple messages

* Jk/cumulus 3315 (#3407)

* CUMULUS-3135 - Update integration test scripts to fail on test timeout (#3401)

* Update integration test scripts to fail on test timeout

* Fixup

* Fixup

* Update script interpreter for test runs

* Fix script

* Fixup

* Fixup

* Update timeout pass/fail conditional

* Jk/cumulus 3135 fix integration tests (#3403)

* Update integration test scripts to fail on test timeout

* Fixup

* Fixup

* Update script interpreter for test runs

* Fix script

* Fixup

* Fixup

* Update api/client and integration test usage of it to fix test failures

* Fix formatting/lint/etc

* Update test/minor fix

---------

Co-authored-by: etcart <[email protected]>

---------

Co-authored-by: etcart <[email protected]>

* v16.0.1 alpha release for testing (#3409)

* bump to 16.0.1-alpha.0

* resolve conflicts

---------

Co-authored-by: Jonathan Kovarik <[email protected]>

* Update serve.js to match main

* bump version

* Add missing method for erasing PG tables (#3419)

* add missing method for erasing PG tables

* Remove duplicate declaration

* Remove duplicate declaration

* Take serve.js and serveUtils.js from main

* Update schema endpoint to utilize P3 api/lib schema

---------

Co-authored-by: Jonathan Kovarik <[email protected]>

* Release 16.0.0 (#3428)

* version bump

* update changelog version

* changelog fixup

* changelog fixup

* update cumulus versions in new lambda

* bump cumulus versions

---------

Co-authored-by: nasamoduyebo <[email protected]>
Co-authored-by: jennyhliu <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>
Co-authored-by: Marc <[email protected]>
Co-authored-by: Naga Nages <[email protected]>
Co-authored-by: etcart <[email protected]>
Co-authored-by: Jonathan Kovarik <[email protected]>
8 people authored Jul 14, 2023
1 parent 0c6e1d1 commit 49f3b08
Showing 278 changed files with 12,578 additions and 393 deletions.
142 changes: 94 additions & 48 deletions CHANGELOG.md
@@ -5,6 +5,7 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## Unreleased

- **CUMULUS-3188**
- Updated QueueGranules to support queueing granules that meet the required API granule schema.

@@ -26,17 +27,34 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Added `example/cumulus-tf/orca_recovery_adapter_workflow.tf`, `OrcaRecoveryAdapterWorkflow` workflow has `OrcaRecoveryAdapter` task
to call the ORCA recovery step-function.
- Updated `example/data/collections/` collection configuration `meta.granuleRecoveryWorkflow` to use `OrcaRecoveryAdapterWorkflow`
- **CUMULUS-3315**
- Updated `@cumulus/api-client/granules.bulkOperation` to remove `ids`
parameter in favor of `granules` parameter, in the form of a
`@cumulus/types/ApiGranule` that requires the following keys: `[granuleId, collectionId]`
- **CUMULUS-3215**
- Create reconciliation reports will properly throw errors and set the async
operation status correctly to failed if there is an error.
- Knex calls relating to reconciliation reports will retry if there is a
connection terminated unexpectedly error
- Improved logging for async operation
- Set default async_operation_image_version to 47
- **CUMULUS-3024**
- Combined unit testing of @cumulus/api/lib/rulesHelpers to a single test file
`api/tests/lib/test-rulesHelpers` and removed extraneous test files.
- **CUMULUS-3209**
- Apply brand color with high contrast settings for both (light and dark) themes.
- Cumulus logo can be seen when scrolling down.
- "Back to Top" button matches the brand color for both themes.
- Update "note", "info", "tip", "caution", and "warning" components to [new admonition styling](https://docusaurus.io/docs/markdown-features/admonitions).
- Add updated arch diagram for both themes.
- **CUMULUS-3203**
- Removed ACL setting of private on S3.multipartCopyObject() call
- Removed ACL setting of private for s3PutObject()
- Removed ACL configuration on sync-granules task
- Update documentation on dashboard deployment to exclude ACL public-read setting
- **CUMULUS-3245**
- Update SQS consumer logic to catch ExecutionAlreadyExists error and
delete SQS message accordingly.
- Add ReportBatchItemFailures to event source mapping start_sf_mapping

### Fixed

- **CUMULUS-3315**
- Update CI scripts to use shell logic/GNU timeout to bound test timeouts
instead of NPM `parallel` package, as timeouts were not resulting in
integration test failure
- **CUMULUS-2625**
- Optimized heap memory and api load in queue-granules task to scale to larger workloads.

@@ -45,9 +63,22 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- The async_operation_image property of cumulus module should be updated to pull
the ECR image for cumuluss/async-operation:47

## [v16.0.0] 2023-05-09

### MIGRATION notes

#### RDS Phase 3
#### PI release version

When updating directly to v16 from prior releases older than v15, please make sure to
read through all prior release notes.

Notable migration concerns since the last PI release version (11.1.x):

- [v14.1.0] - Postgres compatibility update to Aurora PostgreSQL 11.13.
- [v13.1.0] - Postgres update to add `files_granules_cumulus_id_index` to the
`files` table may require manual steps depending on load.

#### RDS Phase 3 migration notes

This release includes updates that remove existing DynamoDB tables as part of
release deployment process. This release *cannot* be properly rolled back in
@@ -80,8 +111,8 @@ endpoints will require a `Cumulus-API-Version` value of at least `2`.
```

Users/clients that do not make use of these endpoints will not be impacted.
### RDS Phase 3

### RDS Phase 3
#### Breaking Changes

- **CUMULUS-2688**
@@ -149,8 +180,6 @@ Users/clients that do not make use of these endpoints will not be impacted.
- Remove DynamoDB logic from `sfEventSqsToDbRecords` lambda
- **CUMULUS-2856**
- Update API/Message write logic to handle nulls as deletion in execution PUT/message write logic
- **CUMULUS-3299**
- Docs: Update and fix links that reference the docs after Docusaurus upgrade

#### Added

@@ -167,7 +196,6 @@ Users/clients that do not make use of these endpoints will not be impacted.
- Add new endpoints to update and delete granules by collectionId as well as
granuleId


#### Removed

- **CUMULUS-2994**
@@ -211,12 +239,8 @@ Users/clients that do not make use of these endpoints will not be impacted.
- Removed `granuleFilesCacheUpdater` lambda
- Removed dynamo files table from `data-persistence` module. *This table and
all of its data will be removed on deployment*.
- **CUMULUS-3290**
- Removed Dynamo references from local API serve.js script
- Updated .python-version to include patch version

### Added

- **CUMULUS-3072**
- Added `replaceGranule` to `@cumulus/api-client/granules` to add usage of the
updated RESTful PUT logic
@@ -226,6 +250,11 @@ Users/clients that do not make use of these endpoints will not be impacted.
- Added support for sha512 as checksumType for LZARDs backup task.

### Changed

- **CUMULUS-3315**
- Updated `@cumulus/api-client/granules.bulkOperation` to remove `ids`
parameter in favor of `granules` parameter, in the form of a
`@cumulus/types/ApiGranule` that requires the following keys: `[granuleId, collectionId]`
- **CUMULUS-3307**
- Pinned cumulus dependency on `pg` to `v8.10.x`
- **CUMULUS-3279**
@@ -240,46 +269,62 @@ Users/clients that do not make use of these endpoints will not be impacted.
after receiving a 404 Not Found Response Error from the `cumulus-api`.
- **CUMULUS-3165**
- Update example/cumulus-tf/orca.tf to use orca v6.0.3
- **CUMULUS-3215**
- Create reconciliation reports will properly throw errors and set the async
operation status correctly to failed if there is an error.
- Knex calls relating to reconciliation reports will retry if there is a
connection terminated unexpectedly error
- Improved logging for async operation
- Set default async_operation_image_version to 47
- **CUMULUS-3024**
- Combined unit testing of @cumulus/api/lib/rulesHelpers to a single test file
`api/tests/lib/test-rulesHelpers` and removed extraneous test files.
- **CUMULUS-3209**
- Apply brand color with high contrast settings for both (light and dark) themes.
- Cumulus logo can be seen when scrolling down.
- "Back to Top" button matches the brand color for both themes.
- Update "note", "info", "tip", "caution", and "warning" components to [new admonition styling](https://docusaurus.io/docs/markdown-features/admonitions).
- Add updated arch diagram for both themes.
- **CUMULUS-3203**
- Removed ACL setting of private on S3.multipartCopyObject() call
- Removed ACL setting of private for s3PutObject()
- Removed ACL configuration on sync-granules task
- Update documentation on dashboard deployment to exclude ACL public-read setting
- **CUMULUS-3245**
- Update SQS consumer logic to catch ExecutionAlreadyExists error and
delete SQS message accordingly.
- Add ReportBatchItemFailures to event source mapping start_sf_mapping

### Fixed

- **CUMULUS-3315**
- Update CI scripts to use shell logic/GNU timeout to bound test timeouts
instead of NPM `parallel` package, as timeouts were not resulting in
integration test failure
- **CUMULUS-3223**
- Update `@cumulus/cmrjs/cmr-utils.getGranuleTemporalInfo` to handle the error when the cmr file s3url is not available
- Update `sfEventSqsToDbRecords` lambda to return [partial batch failure](https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting),
and only reprocess messages when cumulus message can't be retrieved from the execution events.
- Update `@cumulus/cumulus-message-adapter-js` to `2.0.5` for all cumulus tasks

### Removed
## [v15.0.3] 2023-04-28

### Fixed

- **CUMULUS-3243**
- Updated granule delete logic to delete granule which is not in DynamoDB
- Updated granule unpublish logic to handle granule which is not in DynamoDB and/or CMR

## [v15.0.2] 2023-04-25

### Fixed

- **CUMULUS-3120**
- Fixed a bug by adding in `default_log_retention_periods` and `cloudwatch_log_retention_periods`
to Cumulus modules so they can be used during deployment for configuring cloudwatch retention periods, for more information check here: [retention document](https://nasa.github.io/cumulus/docs/configuration/cloudwatch-retention)
- Updated cloudwatch retention documentation to reflect the bugfix changes

## [v15.0.1] 2023-04-20

### Changed

- **CUMULUS-3279**
- Updated core dependencies on `xml2js` to `v0.5.0`
- Forcibly updated downstream dependency for `xml2js` in `saml2-js` to
`v0.5.0`
- Added audit-ci CVE override until July 1 to allow for Core package releases

## Fixed

- **CUMULUS-3285**
- Updated `api/lib/distribution.js isAuthBearTokenRequest` to handle non-Bearer authorization header

- **CUMULUS-3204**
- Removed fetchAllRules from @cumulus/api/lib/rulesHelpers.
- Removed deleteOldEventSourceMappings from @cumulus/api/lib/rulesHelpers and
refactored endpoint logic to use `deleteKinesisEventSources` instead.
### Fixed

- **CUMULUS-3315**
- Update CI scripts to use shell logic/GNU timeout to bound test timeouts
instead of NPM `parallel` package, as timeouts were not resulting in
integration test failure
- **CUMULUS-3223**
- Update `@cumulus/cmrjs/cmr-utils.getGranuleTemporalInfo` to handle the error when the cmr file s3url is not available
- Update `sfEventSqsToDbRecords` lambda to return [partial batch failure](https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting),
and only reprocess messages when cumulus message can't be retrieved from the execution events.
- Update `@cumulus/cumulus-message-adapter-js` to `2.0.5` for all cumulus tasks

## [v15.0.3] 2023-04-28

@@ -7215,7 +7260,8 @@ Note: There was an issue publishing 1.12.0. Upgrade to 1.12.1.
## [v1.0.0] - 2018-02-23
[unreleased]: https://github.com/nasa/cumulus/compare/v15.0.3...HEAD
[unreleased]: https://github.com/nasa/cumulus/compare/v16.0.0...HEAD
[v16.0.0]: https://github.com/nasa/cumulus/compare/v15.0.3...v16.0.0
[v15.0.3]: https://github.com/nasa/cumulus/compare/v15.0.2...v15.0.3
[v15.0.2]: https://github.com/nasa/cumulus/compare/v15.0.1...v15.0.2
[v15.0.1]: https://github.com/nasa/cumulus/compare/v15.0.0...v15.0.1
87 changes: 87 additions & 0 deletions docs/upgrade-notes/rds-phase-3-data-migration-guidance.md
@@ -0,0 +1,87 @@
---
id: rds-phase-3-data-migration-guidance
title: Data Integrity & Migration Guidance (RDS Phase 3 Upgrade)
hide_title: false
---

A few issues were identified as part of the RDS Phase 2 release. These issues could impact Granule data-integrity and are described below along with recommended actions and guidance going forward.

## Issue Descriptions

### Issue 1

[Relevant ticket: CUMULUS-3019](https://bugs.earthdata.nasa.gov/browse/CUMULUS-3019)

Ingesting granules will delete unrelated files from the Files Postgres table. This is due to an issue in our logic to remove excess files when writing granules, and it is fixed in Cumulus versions 13.2.1, 12.0.2, and 11.1.5.

With this bug we believe the data in Dynamo is the most reliable and Postgres is out-of-sync.

### Issue 2

[Relevant ticket: CUMULUS-3024](https://bugs.earthdata.nasa.gov/browse/CUMULUS-3024)

Updating an existing granule either via API or Workflow could result in datastores becoming out-of-sync if a partial granule record is provided. Our update logic operates differently in Postgres and Dynamo/Elastic. If a partial object is provided in an update payload the Postgres record will delete/nullify fields not present in the payload. Dynamo/Elastic will retain existing values and not delete/nullify.

With this bug it’s possible that either Dynamo or PG could be the source of truth. It’s likely that it’s still Dynamo.

### Issue 3

[Relevant ticket: CUMULUS-3024](https://bugs.earthdata.nasa.gov/browse/CUMULUS-3024)

Updating an existing granule with an empty files array in the update payload results in datastores becoming out-of-sync. If an empty array is provided, existing files in Dynamo and Elastic will be removed. Existing files in Postgres will be retained.

With this bug Postgres is the source of truth. Files are retained in PG and incorrectly removed in Dynamo/Elastic.
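
Issues 2 and 3 both come down to the two write paths applying different update semantics to the same payload. The following is a toy sketch of that divergence, not Cumulus code: the function names, the field list, and the sample record are all hypothetical, and the case of a payload with no `files` key at all is not described above and is not modeled.

```python
from copy import deepcopy

# Hypothetical, trimmed-down granule fields used only for this illustration.
GRANULE_FIELDS = ("granuleId", "status", "pdrName", "files")


def postgres_style_update(existing, payload):
    """Toy model of the Postgres write path described in Issues 2 and 3."""
    # Issue 2: any field missing from the payload is nullified.
    updated = {field: payload.get(field) for field in GRANULE_FIELDS}
    # Issue 3: an explicitly empty files array leaves the existing files in place.
    if payload.get("files") == []:
        updated["files"] = list(existing["files"])
    return updated


def dynamo_style_update(existing, payload):
    """Toy model of the Dynamo/Elastic write path described in Issues 2 and 3."""
    # Issue 2: fields missing from the payload keep their existing values.
    merged = deepcopy(existing)
    # Issue 3: an explicitly empty files array overwrites the existing files.
    merged.update(payload)
    return merged


existing = {
    "granuleId": "g1",
    "status": "running",
    "pdrName": "example.PDR",
    "files": ["g1.hdf"],
}

# Issue 2: a partial payload that omits pdrName.
partial_payload = {"granuleId": "g1", "status": "completed", "files": ["g1.hdf"]}
pg_partial = postgres_style_update(existing, partial_payload)
dynamo_partial = dynamo_style_update(existing, partial_payload)

# Issue 3: a full payload whose files array is empty.
empty_files_payload = {**existing, "files": []}
pg_empty = postgres_style_update(existing, empty_files_payload)
dynamo_empty = dynamo_style_update(existing, empty_files_payload)
```

In this model `pg_partial["pdrName"]` ends up `None` while `dynamo_partial["pdrName"]` is unchanged, and `pg_empty["files"]` still lists the file while `dynamo_empty["files"]` is empty, which is why the source of truth differs between the two issues.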

### Issue 4

[Relevant ticket: CUMULUS-3017](https://bugs.earthdata.nasa.gov/browse/CUMULUS-3017)

Updating/putting a granule via framework writes that duplicates a granuleId but has a different collection results in an overwrite of the DynamoDB granule but a *new* granule record for Postgres. This is *intended* behavior after the RDS transition; however, it should not be happening now.

With this bug we believe Dynamo is the source of truth, and ‘excess’ older granules will be left in Postgres. These should be detectable with tooling or a query that finds duplicate granuleIds in the granules table.
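
One possible shape for such a duplicate check is sketched below. It runs against an in-memory SQLite database as a stand-in for PostgreSQL so that it is self-contained; the two-column `granules` schema and the column names `granule_id` and `collection_cumulus_id` are assumptions for illustration and should be checked against your deployment before adapting the query.

```python
import sqlite3

# Assumed, simplified schema: the real granules table has many more columns.
DUPLICATE_GRANULE_IDS_SQL = """
SELECT granule_id, COUNT(*) AS record_count
FROM granules
GROUP BY granule_id
HAVING COUNT(*) > 1
ORDER BY granule_id
"""


def find_duplicate_granule_ids(conn):
    """Return (granule_id, record_count) rows for granuleIds stored more than once."""
    return conn.execute(DUPLICATE_GRANULE_IDS_SQL).fetchall()


# SQLite stands in for PostgreSQL here; the GROUP BY/HAVING query is plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE granules (granule_id TEXT, collection_cumulus_id INTEGER)")
conn.executemany(
    "INSERT INTO granules VALUES (?, ?)",
    [("g1", 1), ("g1", 2), ("g2", 1)],
)

duplicates = find_duplicate_granule_ids(conn)
print(duplicates)  # [('g1', 2)]
```

Each returned row is a candidate for manual review: the query only shows that a granuleId appears under more than one collection, not which record is the stale one.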

### Issue 5

[Relevant ticket: CUMULUS-3024](https://bugs.earthdata.nasa.gov/browse/CUMULUS-3024)

This is a sub-issue of Issue 2 above. Because of the way a PDR name is assigned to a record, if the `pdr` field is missing from the final payload for a granule as part of a workflow message write, the final granule record will not link the PDR to the granule properly in Postgres; the Dynamo record *will* have the linked PDR. This *can* happen when the granule is written prior to completion with the PDR in the payload but downstream only the granule object is included, particularly in multi-workflow ingest scenarios and/or bulk update situations.

## Immediate Actions

1. Re-review the issues described above
- GHRC was able to scope the affected granules to specific collections, which makes the recovery process much easier. This may not be an option for all DAACs.

2. If you have not ingested granules or performed partial granule updates on affected Cumulus versions (questions 1 and 2 on the survey), no action is required. You may update to the latest version of Cumulus.

3. One option to ensure your Postgres data matches Dynamo is running the data-migration lambda (see below for instructions) before updating to the latest Cumulus version if both of the following are true:
- you have ingested granules using an affected Cumulus version
- your DAAC has not had any operations that updated an existing granule with an empty files array (granule.files = [])

4. A second option for DAACs that have ingested data using an affected Cumulus version is to use your DAAC’s recovery tools or reingest the affected granules. This is likely the most certain method for ensuring Postgres contains the correct data but may be infeasible depending on the size of data holdings, etc.

## Guidance Going Forward

1. Before updating to Cumulus version 16.x and beyond, take a snapshot of your DynamoDB instance. The v16 update removes the DynamoDB tables. This snapshot would be for use in unexpected data recovery scenarios only.

2. Cumulus recommends that you establish and follow a database backup/disaster recovery protocol for your RDS database, which should include periodic backups. The frequency will depend on each DAAC’s database architecture, comfort level, datastore size, and time available. [Relevant AWS Docs](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html)

3. Invest future development effort in data validation/integrity tools and procedures. Each DAAC has different requirements here. Each DAAC should maintain procedures for validating their Cumulus datastore against their holdings.

## Running a Granule Migration

[Instructions for running the data-migration operation to sync Granules from DynamoDB to PostgreSQL](./upgrade-rds.md#5-run-the-second-data-migration)

The data-migration2 Lambda (which is invoked asynchronously using `${PREFIX}-postgres-migration-async-operation`) uses Cumulus' Granule upsert logic to write granules from DynamoDB to PostgreSQL. This is particularly notable because granules with a running or queued status will only migrate a subset of their fields:

- status
- timestamp
- updated_at
- created_at

It is recommended that users ensure their granules are in a final state (`completed`, `failed`) before running this data migration. If there are Granules with an incomplete status, it may impact the data migration.

For example, if a Granule in the running status is updated by a workflow or API call (containing an updated status) and fails, that granule will have the original running status, not the intended/updated status. Failed Granule writes/updates should be evaluated and resolved prior to this data migration.
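
A pre-migration check along these lines could be sketched as follows. This is an illustration only: it assumes granule records have already been exported as dictionaries with a `status` key, and it treats `running` and `queued` as the non-final statuses because those are the ones noted above as migrating only a subset of fields. The helper names are hypothetical.

```python
from collections import Counter

# Assumption: these are the statuses that migrate only a subset of fields.
NON_FINAL_STATUSES = {"running", "queued"}


def granules_blocking_migration(granules):
    """Return the granule records whose status is not yet final."""
    return [g for g in granules if g.get("status") in NON_FINAL_STATUSES]


def summarize_statuses(granules):
    """Count granules per status to give a quick overview before migrating."""
    return Counter(g.get("status") for g in granules)


# Hypothetical sample records standing in for an export of the granules table.
sample = [
    {"granuleId": "g1", "status": "completed"},
    {"granuleId": "g2", "status": "running"},
    {"granuleId": "g3", "status": "queued"},
    {"granuleId": "g4", "status": "failed"},
]

blocking = granules_blocking_migration(sample)
summary = summarize_statuses(sample)
```

An empty `blocking` list would suggest that no granule is in a state where only a subset of its fields migrates; any entries should be resolved or accepted as a known risk first.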

Cumulus provides the Cumulus Dead Letter Archive which is populated by the Dead Letter Queue for the sfEventSqsToDbRecords Lambda, which is responsible for Cumulus message writes to PostgreSQL. This may not catch all write failures depending on where the failure happened and workflow configuration but may be a useful tool.

If a Granule record is correct except for the status, Cumulus provides an API to update specific granule fields.
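
As a rough sketch of what such a status-only update might look like, the snippet below builds (but does not send) an HTTP `PATCH` request. The `Cumulus-API-Version` header comes from the v16 release notes; the URL layout `/granules/{collectionId}/{granuleId}`, the request body fields, the helper name, and the example host are assumptions, and authentication is omitted entirely. Check the Cumulus API documentation for the exact contract.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_status_patch_request(api_root, collection_id, granule_id, status):
    """Build a PATCH request that updates only a granule's status.

    The path layout and body fields are assumptions for illustration.
    """
    url = "{}/granules/{}/{}".format(
        api_root.rstrip("/"),
        quote(collection_id, safe=""),
        quote(granule_id, safe=""),
    )
    body = json.dumps(
        {"granuleId": granule_id, "collectionId": collection_id, "status": status}
    ).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # v16 release notes: affected endpoints require a version of at least 2.
            "Cumulus-API-Version": "2",
        },
    )


request = build_status_patch_request(
    "https://cumulus.example.com/", "MOD09GQ___006", "granule-1", "completed"
)
```

Building the request separately from sending it keeps the sketch free of network calls and makes the URL and headers easy to inspect before pointing it at a real deployment.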
6 changes: 3 additions & 3 deletions docs/upgrade-notes/upgrade_rds_phase_3_release.md
@@ -34,9 +34,9 @@ In addition to the above requirements, we suggest users:
and other efforts included in the outcome from CUMULUS-3035/CUMULUS-3071.

- Halt all ingest prior to performing the version upgrade.
- Run load testing/functional testing
- Run load testing/functional testing.

While the majority of the modifications for release 16 are related to DynamoDB removal, we always encourage user engineering teams ensure compatibility at scale with their deployment's engineering configuration prior to promotion to a production environment to ensure a smooth upgrade.
While the majority of the modifications for release 16 are related to DynamoDB removal, we always encourage user engineering teams to ensure compatibility at scale with their deployment's configuration prior to promotion to a production environment to ensure a smooth upgrade.

## Upgrade procedure

@@ -151,6 +151,6 @@ In addition to the above requirements, we suggest users:
module.cumulus.module.postgres_migration_async_operation.aws_security_group.postgres_migration_async_operation[0]
```

Because the AWS resources associated with these security groups can take some time to be properly updated (in testing this was 20-35 minutes), these deletions may cause the deployment to take some time. If for some unexpected reason this takes longer than expected this causes the update to time out, you should be able to continue the deployment by re-running terraform to completion.
Because the AWS resources associated with these security groups can take some time to be properly updated (in testing this was 20-35 minutes), these deletions may cause the deployment to take some time. If this takes longer than expected and causes the update to time out, you should be able to continue the deployment by re-running terraform to completion.

Users may also opt to reassign the affected Network Interfaces away from the Security Group, or to delete the security group manually, if this situation occurs and the deployment time is not acceptable.
2 changes: 1 addition & 1 deletion example/lambdas/asyncOperations/package.json
@@ -1,6 +1,6 @@
{
"name": "@cumulus/test-async-operations",
"version": "15.0.0",
"version": "16.0.0",
"description": "AsyncOperations Test Lambda",
"main": "index.js",
"private": true,