
Implement Compaction of Desktop Shared Directory Read/Write Events #57815

Merged
rhammonds-teleport merged 1 commit into master from rhammonds/desktop-audit-combiner
Aug 22, 2025

Conversation

@rhammonds-teleport
Contributor

@rhammonds-teleport rhammonds-teleport commented Aug 12, 2025

Copying files into or out of the shared directory often creates many small read/write operations that clutter the audit log. This PR aims to implement an algorithm that compacts audit events resulting from reads/writes to files within a given shared directory.

The audit compactor caches subsequent reads/writes to a given file for a configurable period of time. Once this timeout expires, the cached entries are examined to determine which operations constitute sequential reads/writes that can be compacted into a single audit event. The reduced set of audit events is then emitted.

Closes #40341

Changelog: Reduce audit log clutter by compacting contiguous shared directory read/write events into a single audit log event.

@rhammonds-teleport rhammonds-teleport force-pushed the rhammonds/desktop-audit-combiner branch from f5b97f6 to 9cd3c2a Compare August 12, 2025 19:44
@rhammonds-teleport rhammonds-teleport changed the title Compaction of Desktop Shared Directory Read/Write Events Implement Compaction of Desktop Shared Directory Read/Write Events Aug 14, 2025
@rhammonds-teleport rhammonds-teleport marked this pull request as ready for review August 14, 2025 14:21
@github-actions github-actions bot requested review from probakowski and tcsc August 14, 2025 14:21
Comment thread lib/srv/desktop/audit_compactor.go Outdated
Comment thread lib/srv/desktop/audit_compactor.go Outdated

func (s *stream) compactEvents() []streamEvent {
Collaborator

Would it make sense to return an iter.Seq here instead?

Comment thread lib/srv/desktop/audit_compactor.go Outdated
Comment thread lib/srv/desktop/audit_compactor.go Outdated
Comment thread lib/srv/desktop/audit_compactor.go Outdated
}
case tdp.SharedDirectoryReadResponse:
s.emit(ctx, audit.makeSharedDirectoryReadResponse(msg))
audit.compactor.handleRead(ctx, audit.makeSharedDirectoryReadResponse(msg))
Collaborator

I might just add a comment to these lines that says something like:

// shared directory audit events can be noisy, so we use a compactor
// in order to bucket them instead of writing straight to the audit log

Collaborator

Same goes for handleWrite below.

Comment thread lib/srv/desktop/audit_compactor.go Outdated
Comment on lines +65 to +72
refreshInterval time.Duration
maxDelayInterval time.Duration
Collaborator

What's the difference between these two?

Contributor Author

I added some comments to clarify. refreshInterval defines how long to wait for the next event to arrive before flushing a given bucket. maxDelayInterval defines the maximum time a bucket should exist before flushing. Basically the latter prevents some pathological read/write scenario from delaying audit events indefinitely.

Collaborator

I see, so event A and event B will be combined assuming they overlap and occur within refreshInterval, so long as the first event in this bucket occurred less than maxDelayInterval ago.

Contributor Author

That's right!

@zmb3
Collaborator

zmb3 commented Aug 15, 2025

Don't forget to link this PR to the issue it closes.

@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from tcsc August 22, 2025 11:09
@rhammonds-teleport rhammonds-teleport force-pushed the rhammonds/desktop-audit-combiner branch from 2750f20 to ba118c6 Compare August 22, 2025 13:24
…eads and writes

Attempt #2. Read requests for a given copy operation can arrive out of order. Adjusted this approach to record all read/write events within a given time period, then find the longest contiguous set of reads/writes that can be joined into a single audit event before emitting audit event(s).
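The "longest contiguous set" step can be sketched as merging sorted, half-open byte ranges; names and details here are illustrative, not the PR's actual implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// rng is a half-open byte range [start, end) touched by one read or write.
type rng struct{ start, end uint64 }

// mergeContiguous sorts recorded ranges by offset and joins runs that are
// exactly contiguous, mirroring the idea of emitting one audit event per
// unbroken run of reads/writes even when requests arrived out of order.
func mergeContiguous(ops []rng) []rng {
	if len(ops) == 0 {
		return nil
	}
	sorted := append([]rng(nil), ops...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })
	merged := []rng{sorted[0]}
	for _, o := range sorted[1:] {
		last := &merged[len(merged)-1]
		if o.start == last.end { // contiguous: extend the current run
			last.end = o.end
		} else { // gap: start a new run (a separate audit event)
			merged = append(merged, o)
		}
	}
	return merged
}

func main() {
	// Out-of-order reads of 4 KiB chunks, with one gap at 8192.
	ops := []rng{{4096, 8192}, {0, 4096}, {12288, 16384}}
	fmt.Println(mergeContiguous(ops)) // two runs: [0,8192) and [12288,16384)
}
```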

bit of cleanup

A bit of code cleanup plus extra test case. Removed timestamps from testing since the compaction algorithm can't really guarantee which segments will get grouped together.

Fix racy tests by bringing assertions into the synctest bubble and using 'flush' for synchronization. Also add more thorough test cases for timer expiration and flush behavior.

lint fixes

Add license header

Replace unnecessary sort with 'slices.Min'

Apply suggestions from code review

Co-authored-by: Zac Bergquist <zac.bergquist@goteleport.com>

Rename 'evnt' -> 'event'

Return iterator instead of slice

Abandon 'stream' terminology in favor of 'fileOperationsBucket'. This might be a better name since the code is basically sorting audit events into a set of 'buckets' and attempting to compact each bucket of events later on.

Add a few comments
@rhammonds-teleport rhammonds-teleport force-pushed the rhammonds/desktop-audit-combiner branch from ba118c6 to e470056 Compare August 22, 2025 13:48
@rhammonds-teleport rhammonds-teleport added this pull request to the merge queue Aug 22, 2025
Merged via the queue into master with commit 37b712c Aug 22, 2025
41 of 42 checks passed
@rhammonds-teleport rhammonds-teleport deleted the rhammonds/desktop-audit-combiner branch August 22, 2025 17:36
@backport-bot-workflows
Contributor

@rhammonds-teleport See the table below for backport results.

Branch Result
branch/v17 Create PR
branch/v18 Create PR

mmcallister pushed a commit that referenced this pull request Sep 22, 2025
…eads and writes (#57815)

Successfully merging this pull request may close these issues.

Consolidate audit events related to directory read/write operations

4 participants