[destination-redshift] does not name staging S3 files correctly leading to missing data #65954

@jorge-gt3

Description

Connector Name

destination-redshift

Connector Version

3.5.3

What step the error happened?

During the sync

Relevant information

According to the documentation, the S3 Filename pattern setting supports {timestamp:micros}. However, when we used it, the staging file was literally named "{timestamp:micros}" instead of the placeholder being expanded into a dynamic name. This resulted in lost data every time we tried to sync large tables.

Using {timestamp:millis} works as expected.

Additionally, the default pattern is {date}, which leads to the same issue: the staging file name is duplicated, leading to data loss. This was reported back in 2023 but never changed.

https://discuss.airbyte.io/t/redshift-destination-buffer-stream-uploading-duplicate-files-to-s3-staging-area/3313/4

Timestamp should just be the default.
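
To illustrate the failure mode, here is a minimal Python sketch. It is not the connector's actual code: the `RESOLVERS` table, the `expand` and `staged_objects` helpers, and the `%Y_%m_%d` date format are all assumptions made for illustration. It also assumes that uploading a second object under an existing key replaces the first one, which is how S3 behaves without versioning. The sketch shows how an unrecognized placeholder can end up as a literal file name, and how any pattern that resolves to the same name for several flushes leaves only one staged file.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical placeholder table, for illustration only.
# "timestamp:micros" is deliberately absent to mimic the reported behavior.
RESOLVERS = {
    "date": lambda now: now.strftime("%Y_%m_%d"),  # assumed date format
    "timestamp:millis": lambda now: str((now - EPOCH) // timedelta(milliseconds=1)),
}


def expand(pattern, now):
    """Replace known {placeholders}; leave unknown ones untouched."""
    def substitute(match):
        resolver = RESOLVERS.get(match.group(1))
        return resolver(now) if resolver else match.group(0)

    return re.sub(r"\{([^{}]+)\}", substitute, pattern)


def staged_objects(pattern, flush_times):
    """Simulate a bucket as a dict: writing to an existing key overwrites it."""
    bucket = {}
    for index, now in enumerate(flush_times):
        bucket[expand(pattern, now)] = f"batch-{index}"
    return bucket


# Three buffer flushes on the same day, 250 ms apart.
base = datetime(2025, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
flushes = [base + timedelta(milliseconds=250 * i) for i in range(3)]

for pattern in ("{date}", "{timestamp:micros}", "{timestamp:millis}"):
    print(pattern, "->", len(staged_objects(pattern, flushes)), "file(s) staged")
```

In this sketch only the {timestamp:millis} pattern keeps all three batches; the other two patterns collapse to a single key, so earlier batches are overwritten.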

Contribute

  • Yes, I want to contribute
