Replace xitongsys/parquet-go with segmentio/parquet-go #28442

Merged
tobiaszheller merged 1 commit into master from tobiaszheller/athena-segmentio-parquet
Jun 29, 2023

@tobiaszheller
Contributor

It turned out that the problems with Athena engine v3 described in #26053 were caused by Athena v3 being incompatible with parquet files that are written without page indexes.

Athena v3 may support such files in the future, but the suggested workaround is to write those indexes.

xitongsys/parquet-go does not implement writing page indexes, so we decided to switch to another parquet library, github.com/segmentio/parquet-go, which appears to be under more active development and is used by bigger projects despite being pre-v1.

@github-actions github-actions Bot added audit-log Issues related to Teleports Audit Log size/sm labels Jun 28, 2023
@tobiaszheller tobiaszheller added backport/branch/v13 and removed audit-log Issues related to Teleports Audit Log backport size/sm labels Jun 28, 2023
@github-actions github-actions Bot requested review from smallinsky and tigrato June 28, 2023 18:45
Comment thread lib/utils/aws/s3.go Outdated
Comment thread lib/utils/aws/s3.go Outdated
Comment thread lib/utils/aws/s3.go Outdated
Comment thread lib/utils/aws/s3.go Outdated
Contributor

Can we actually add a test to ensure we will always wait for uploads to finish?

Contributor Author

added in 73aaba3
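
For readers following this thread, the requested check can be sketched as below. This is not the code from lib/utils/aws/s3.go or from commit 73aaba3. It is a minimal, stdlib-only illustration that assumes the uploader streams data through an `io.Pipe` to a background goroutine. The names `pipeUploader`, `newPipeUploader`, `closeWaitsForUpload` and `uploadDelay` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"time"
)

// uploadDelay simulates a slow remote upload (hypothetical value).
const uploadDelay = 50 * time.Millisecond

// pipeUploader is a hypothetical stand-in for an S3 upload helper: bytes
// written to it are streamed through an io.Pipe to an upload function that
// runs in a background goroutine.
type pipeUploader struct {
	pw   *io.PipeWriter
	done chan error // receives the upload result exactly once
}

// newPipeUploader starts upload in the background and returns a writer that feeds it.
func newPipeUploader(upload func(io.Reader) error) *pipeUploader {
	pr, pw := io.Pipe()
	u := &pipeUploader{pw: pw, done: make(chan error, 1)}
	go func() {
		err := upload(pr)
		// Unblock any pending Write if the upload stopped reading early.
		pr.CloseWithError(err)
		u.done <- err
	}()
	return u
}

// Write forwards data to the background upload.
func (u *pipeUploader) Write(p []byte) (int, error) { return u.pw.Write(p) }

// Close signals EOF to the upload and blocks until the upload has returned.
// In this sketch it must be called at most once.
func (u *pipeUploader) Close() error {
	if err := u.pw.Close(); err != nil {
		return err
	}
	return <-u.done
}

// closeWaitsForUpload writes payload through a deliberately slow upload and
// reports whether the upload had finished by the time Close returned.
func closeWaitsForUpload(payload []byte) (finished bool, stored []byte, err error) {
	var buf bytes.Buffer
	u := newPipeUploader(func(r io.Reader) error {
		_, copyErr := io.Copy(&buf, r)
		time.Sleep(uploadDelay) // keep working after the last byte was read
		finished = true
		return copyErr
	})
	if _, err := u.Write(payload); err != nil {
		return false, nil, err
	}
	if err := u.Close(); err != nil {
		return false, nil, err
	}
	// Reading finished and buf here is safe: the receive on u.done inside
	// Close happens after the goroutine wrote both.
	return finished, buf.Bytes(), nil
}

func main() {
	finished, stored, err := closeWaitsForUpload([]byte("parquet bytes"))
	fmt.Println(finished, string(stored), err)
}
```

The deliberate delay after the last read is the point of the check: if `Close` only closed the pipe and did not wait on `done`, `finished` would still be false when it returned.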

Comment thread lib/utils/aws/s3.go Outdated
@tobiaszheller tobiaszheller force-pushed the tobiaszheller/athena-segmentio-parquet branch from b2c7b9f to bcb92c9 Compare June 29, 2023 11:06
@tobiaszheller tobiaszheller requested a review from tigrato June 29, 2023 11:44
Comment thread lib/events/athena/consumer_test.go Outdated
Comment on lines 648 to 649
Contributor Author

I have reworked the tests, because what we really want to check is that two parquet files with events are created and that the events are stored in them.
Generating parquet files and comparing them to testdata is already done in the library.

Comment thread lib/utils/aws/s3.go Outdated
@public-teleport-github-review-bot public-teleport-github-review-bot Bot removed the request for review from smallinsky June 29, 2023 13:45
@tobiaszheller tobiaszheller enabled auto-merge June 29, 2023 14:16
@tobiaszheller tobiaszheller added this pull request to the merge queue Jun 29, 2023
@tobiaszheller tobiaszheller removed this pull request from the merge queue due to a manual request Jun 29, 2023
@tobiaszheller tobiaszheller force-pushed the tobiaszheller/athena-segmentio-parquet branch from 858ae30 to de62e69 Compare June 29, 2023 14:23
@tobiaszheller tobiaszheller enabled auto-merge June 29, 2023 14:23
@tobiaszheller tobiaszheller added this pull request to the merge queue Jun 29, 2023
Merged via the queue into master with commit c33b604 Jun 29, 2023
@tobiaszheller tobiaszheller deleted the tobiaszheller/athena-segmentio-parquet branch June 29, 2023 15:16
@public-teleport-github-review-bot

@tobiaszheller See the table below for backport results.

| Branch | Result |
| --- | --- |
| branch/v13 | Failed |

4 participants