Releases: Sage-Bionetworks/BridgeDownstream
v0.4.0
What's Changed
- [ETL-437] Add script to copy (underscore) S3 objects to new object by @philerooski in #128
- [ETL-424] Update schema mapping with new schemas by @philerooski in #130
- [ETL-426] Add new datasets as Glue tables and update crawlers by @philerooski in #131
- [ETL-453] Add module to update Glue crawlers with additional targets by @philerooski in #134
- Fix syntax mistake in JSON to Parquet test by @philerooski in #133
- [ETL-440] Add script to validate self-referencing records on Synapse by @philerooski in #129
- [ETL-423] Allow for self referencing schemas by @rxu17 in #132
- Add stack for study mtbwrj by @philerooski in #135
- [ETL-458] Fixes for S3 to JSON, Glue tables by @philerooski in #136
- [ETL-454] Add new study: pgdvpj by @rxu17 in #137
- [ETL-455] Add study stack for vwrdjf by @philerooski in #138
- [ETL-314] Add parameter for deployment environment to manage artifacts script by @philerooski in #140
- Clean up "matrix" component of upload-and-deploy workflow by @philerooski in #141
- [PATCH] Clean up "matrix" component of upload-and-deploy workflow by @philerooski in #142
- [ETL-444] Append a 0 (zero) in front of underscore/period records file names by @philerooski in #139
- [ETL-461] Fix crontab by @philerooski in #143
- [ETL-463] Add stack for study gxvwhj by @philerooski in #144
- [ETL-316] Map Android's microphone.json to a dataset identifier by @philerooski in #146
- [ETL-487] Add new expected error for motion.json by @rxu17 in #145
Full Changelog: v0.3.0...v0.4.0
v0.3.0
What's Changed
- [ETL-417] Update crontab to use sharedschema_v1 parquet as reference by @philerooski in #120
- Add helper script used to investigate ETL-408 by @philerooski in #121
- [ETL-425] Bootstrap trigger diffs upon union of sharedschema and archivemetadata by @philerooski in #123
- [ETL-425 fix] Update bootstrap trigger to use correct set logic by @philerooski in #124
- [ETL-432] Only submit post-April 2023 records for test study ccbcwq by @philerooski in #125
- Remove query flag from ccbcwq bootstrap trigger by @philerooski in #126
Full Changelog: v0.2.0...v0.3.0
v0.2.0
What's Changed
- Use v0.1 artifacts by @tthyer in #59
- [ETL-107] Add SQS template and config by @philerooski in #60
- [ETL-117] Lambda polls from SQS queue by @philerooski in #61
- [ETL-109] Remove record ID from partitions by @philerooski in #62
- [ETL-170] Generate test events from Synapse Dataset by @philerooski in #63
- [ETL-172] Update Glue table schemas with fields from new assessment revisions by @philerooski in #64
- [ETL-166] De-dockerize lambda by @philerooski in #66
- [ETL-173] Update architecture diagram by @philerooski in #67
- [ETL-177] Update lambda README and copy relevant info to setup_test_data.py by @philerooski in #68
- Upload lambda template to CFN bucket by @philerooski in #69
- [ETL-176] Create VPC with private subnets by @philerooski in #70
- [ETL-179, ETL-112] Use sceptre-sam-handler to deploy lambda by @philerooski in #71
- [ETL-153] Changes necessary for exporting study data in prod by @philerooski in #65
- [ETL-185] Add parquet diff parameters and add cron job for study hktrrx by @philerooski in #72
- [ETL-197] Support JSON client info by @philerooski in #73
- [ETL-198] Fix parquet diff and simplify bootstrap trigger by @philerooski in #74
- [ETL-199] Activate JSON to Parquet scheduled trigger upon creation by @philerooski in #75
- [ETL-200] Build and push bootstrap trigger image each time we deploy trunk by @philerooski in #76
- [ETL-181] Add pmbfzc study stack to prod by @philerooski in #78
- [ETL-202] Add hktrrx study stack to prod by @philerooski in #77
- Add study pmbfzc to bootstrap trigger crontab by @philerooski in #79
- Use correct parameter set in cron job for pmbfzc by @philerooski in #80
- Add script used in resolving ETL-226 by @philerooski in #82
- [ETL-138] Integrate JSON schemas by @philerooski in #84
- [ETL-237] Update table_columns.yaml to better reflect JSON Schemas by @philerooski in #85
- [ETL-239] Refactor JSON to Parquet job and attempt to match types by @philerooski in #86
- [ETL-232] Update setup_test_data script by @philerooski in #83
- Fix type of assessment revision when referencing archive mapping by @philerooski in #87
- [ETL-244] Update dataset_mapping.json by @philerooski in #89
- [ETL-245] Add script for resolving ETL-245 by @philerooski in #90
- [ETL-32] Schema Validation by @philerooski in #92
- ETL-182 by @mfazza in #81
- [ETL-23] Unit tests by @philerooski in #94
- [ETL-255] Explicit Glue resource permissions by @philerooski in #95
- Fix incorrect dependency in prod parquet bucket by @philerooski in #96
- [ETL-259] Remove ReadWriteAccessArns and ReadOnlyAccessArns parameters by @philerooski in #98
- Revert changes to config synapseBridgeDownstreamUserId parameter by @philerooski in #99
- Add missing glue-job-role dependency to S3 to JSON stack by @philerooski in #100
- [ETL-261] Add --drop-duplicates functionality to the bootstrap trigger script by @philerooski in #101
- [ETL-256] Address unit test concurrency issues by @philerooski in #103
- Add mtb-alpha stack to prod by @philerooski in #104
- [ETL-250] Update bootstrap trigger cron by @philerooski in #105
- [IBCDPE-186] Update README by @philerooski in #106
- [ETL-111] update config template handler by @rxu17 in #108
- [ETL-301] Support new schema for GradualOnsetV1 assessment by @philerooski in #109
- Update tests to use latest archive map format by @philerooski in #110
- [ETL-312] Ignore specific non-severe Android validation errors by @philerooski in #112
- Add additional non-severe Android errors for motion.json by @philerooski in #113
- Add pyarrow dep by @rxu17 in #115
- Create LICENSE by @thomasyu888 in #116
- [ETL-357] Add stack for study ccbcwq by @philerooski in #117
- [ETL-355] Add prod stack for new study jfxqpk by @rxu17 in #118
- [ETL-358] Remove expected validation errors in S3 to JSON S3 validation result by @philerooski in #119
New Contributors
- @mfazza made their first contribution in #81
- @rxu17 made their first contribution in #108
- @thomasyu888 made their first contribution in #116
Full Changelog: v0.1...v0.2.0
v0.1.0
v0.0.0
What's Changed
- Create folder with external storage location and STS by @philerooski in #2
- [ETL-33/55] Ready BridgeDownstream for infra development by @tthyer in #3
- Changes in response to comments on ETL-34 by @philerooski in #4
- Update flow diagram to reflect ETL-34 changes by @philerooski in #5
- bootstrap json dataset by submitting existing archives by @philerooski in #6
- fix end of files by @philerooski in #7
- [ETL-33] CFN stacks: revise bucket, add role, database, classifier by @tthyer in #8
- Etl-55 scripts workflow by @tthyer in #10
- Create ECR repository for use with lambda by @philerooski in #9
- Etl-55 scripts workflow part. 2 by @tthyer in #11
- Etl-66 Glue Jobs by @tthyer in #12
- Etl-68/buckets by @tthyer in #13
- ETL-68 Add workflows stack by @tthyer in #14
- Etl-69 Triggers by @tthyer in #15
- Add glue tables stack by @tthyer in #16
- Etl-70 Glue crawler by @tthyer in #17
- Small revision to README by @tthyer in #19
- Make job names dependent upon stack name by @tthyer in #20
- Etl-76/pipeline fixes by @tthyer in #21
- Add schema change documents describing how to respond to a proposed upstream schema change by @philerooski in #18
- Etl-73 refactor by @tthyer in #22
- [ETL-71] Nested template for individual studies by @tthyer in #23
- Etl-65/synapsify by @tthyer in #24
- Etl-65/lambda by @tthyer in #25
- Etl-84/s3tojson tweaks by @tthyer in #26
- Add dataset mapping and reference it in s3_to_json_s3 script by @philerooski in #27
- Extend backfill json datasets script to accept entity view by @philerooski in #28
- ETL-85: unique job names / stack decoupling by @tthyer in #29
- Add script to build query string for representative appVersion sample by @philerooski in #30
- Initial commit flip_job script by @philerooski in #31
- Initial commit archive_dataset script by @philerooski in #33
- Etl-83 Schema update changes by @tthyer in #35
- fixing storage location handling in setup_test_data by @tthyer in #36
- Remove explicit SerdeInfo config from Glue tables by @tthyer in #37
- Use uploadedOn rather than createdOn for derived year/month/day partition fields by @philerooski in #38
- Improve schema change docs, remove flip_job script by @philerooski in #34
- ETL-99/spark UI by @tthyer in #39
- Config-driven refactor & jinja2 conversions by @tthyer in #40
- Fix info version in config files by @tthyer in #41
- Correct branch in example study sceptre config by @tthyer in #42
- Update sns_to_glue lambda to use Bridge SNS message format and SQS trigger by @philerooski in #43
- Create CODE_OF_CONDUCT.md by @tthyer in #44
- Convert s3 to json s3 job from python shell to spark job by @philerooski in #45
- Make database per study rather than global by @tthyer in #46
- Use exporter 3.0 data and metadata by @philerooski in #48
- Etl-121 by @tthyer in #49
- Fix lambda issue with namespaces by @tthyer in #50
- ETL-121: fixing cleanup workflow by @tthyer in #51
- Test commit by @tthyer in #52
- Another empty test commit by @tthyer in #53
- Fix syntax issue preventing variable from being expanded by @tthyer in #54
- Use github action expression instead of bash environment variable by @tthyer in #55
- Use the correct ref name from the delete event when cleaning up by @tthyer in #56
- [ETL-144] Return None if osName not found in dataset mapping by @philerooski in #57
Full Changelog: https://github.com/Sage-Bionetworks/BridgeDownstream/commits/v0