[v13] fix: trim large events in Athena querier#37350

Merged
nklaassen merged 2 commits into branch/v13 from nklaassen/v13/s3-large-events
Jan 26, 2024

@nklaassen nklaassen commented Jan 26, 2024

Backport #35402 to branch/v13
Backport #35440 to branch/v13

Backporting both of these to v13 rather late; I didn't realize we had Athena in v13.

Changelog: Fixed querying of large audit events with the Athena backend and added Prometheus metrics for audit event sizes

nklaassen and others added 2 commits January 26, 2024 09:24
Backport #35402 to branch/v13
Fixes #35161

Large events queried from the Athena audit backend will now be trimmed
before they are stored and before they are returned from a query
according to the existing TrimToMaxSize implementations for each event
type already used by the Dynamo and File backends.

The other backends typically trim the event before storing it: for
Dynamo this is due to the 400 KB item size limit, and for the file
backend it's due to the 64 KiB bufio.MaxScanTokenSize.

There is no hard limit to events stored in Parquet files in S3, but
we've been using a 2 GiB limit in the publisher so far.
With this change we will attempt to trim events to 2 GiB before writing
them (if we haven't already run out of memory) instead of just failing.

We've also been using a 1 MiB limit in the querier and just returning an
empty result when an event larger than that is encountered.
With this change we will attempt to trim the event to 1 MiB before
returning it.
The 1 MiB limit ultimately stems from the 4 MB max gRPC message size.

We could just trim to 1 MiB in the publisher, but I'd prefer to preserve
as much of the event data as possible in case we improve the querying
story for large events in the future (and in case the user wants to
query the events directly from S3).
@github-actions github-actions Bot added the audit-log, backport, and size/md labels Jan 26, 2024
@github-actions github-actions Bot requested review from mdwn and rosstimothy January 26, 2024 17:32
@github-actions

The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with changelog: followed by the changelog entries for the PR.

@public-teleport-github-review-bot public-teleport-github-review-bot Bot removed the request for review from mdwn January 26, 2024 18:15
@nklaassen nklaassen added this pull request to the merge queue Jan 26, 2024
Merged via the queue into branch/v13 with commit 4f04174 Jan 26, 2024
@nklaassen nklaassen deleted the nklaassen/v13/s3-large-events branch January 26, 2024 18:55
@camscale camscale mentioned this pull request Feb 6, 2024