Sasha/exp2 #1022
Conversation
lib/client/bench.go
Outdated
} else {
	done := make(chan bool)
	go func() {
		err := client.SSH(ctx, nil, false)
why bother declaring `err`?
lib/events/auditlog.go
Outdated
-func (l *AuditLog) PostSessionChunk(namespace string, sid session.ID, reader io.Reader) error {
-	if namespace == "" {
+// PostSessionChunks writes a new chunk of session stream into the audit log
+func (l *AuditLog) PostSessionSlice(slice SessionSlice) error {
comment doesn't match the function.
 		return trace.BadParameter("missing parameter Namespace")
 	}
-	sl, err := l.LoggerFor(namespace, sid)
+	if len(slice.Chunks) == 0 {
why bother? the function appears to be safe to run on an empty slice; it won't do anything.
LoggerFor opens a file, so I did not want to open a file on the server if we know the call won't do anything.
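As a sketch of the point above (type and function names are taken from the diff; everything else is a hypothetical stand-in, not Teleport's actual code), the early return keeps the logger from ever opening a file when there is nothing to write:

```go
package main

import "fmt"

// SessionSlice is a hypothetical stand-in for the message type in the diff.
type SessionSlice struct {
	Chunks [][]byte
}

// PostSessionSlice mirrors the guard being discussed: bail out before the
// logger would open a file on the server when the slice carries no chunks.
func PostSessionSlice(slice SessionSlice) error {
	if len(slice.Chunks) == 0 {
		return nil // nothing to write, so never open the logger's file
	}
	// ... l.LoggerFor(...) and writing the chunks would go here ...
	return nil
}

func main() {
	fmt.Println(PostSessionSlice(SessionSlice{})) // no file opened for an empty slice
}
```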
retest this please

retest this please
Fixes #933
Description
This PR contains the implementation of the new log forwarder.
Some important properties of this implementation:
Introduces grpc- and protobuf-based serialization. My plan is to release a grpc-based API in future versions of Teleport; this time, however, I only use protobuf as the payload encoding. I settled on the gogoproto marshaler as it's much faster (see benchmarks here).
Without back-pressure on posting session chunks, the audit log was losing events, because the producer was much faster than the consumer and the buffer was overflowing even when set to a very large value, e.g. 10K. Back-pressure is triggered on large outputs, e.g. `cat /dev/random`, and usually corrects the output with ~1ms delays when the log server flushes the output to disk, causing occasional 10-15ms latency spikes. The throttle is important for keeping the session going if the audit log gets slow, as the session output would otherwise block and time out on every request.
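The back-pressure idea above can be sketched with a bounded Go channel (a simplified model, not the PR's actual code; `deliver` is a made-up name): because the queue has fixed capacity, a producer that outruns the consumer blocks on send instead of overflowing a buffer and dropping events.

```go
package main

import "fmt"

// deliver pushes n chunks through a bounded channel of the given capacity and
// returns how many the consumer received. With back-pressure, the producer
// blocks when the queue is full, so nothing is lost regardless of capacity.
func deliver(n, capacity int) int {
	ch := make(chan []byte, capacity)
	done := make(chan int)
	go func() { // consumer: drains the queue, e.g. flushing output to disk
		count := 0
		for range ch {
			count++
		}
		done <- count
	}()
	for i := 0; i < n; i++ {
		ch <- []byte("output") // blocks while the queue is full
	}
	close(ch)
	return <-done
}

func main() {
	fmt.Println(deliver(10000, 64)) // prints 10000: every chunk arrives
}
```

The trade-off is exactly the one described above: a slow consumer now slows the producer, which is why the flush-induced latency spikes show up in the session output.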
It is important to pack chunks, because `ls / -laR | lolcat` would otherwise generate about 10K requests per second. With this packing approach we reduced that to about 40-50 requests per second, and we can now tune this by setting the queue size and flush buffers.
The current implementation attaches an audit log forwarder per session. To make sure this is not too expensive, I tested with 100 parallel sessions; memory usage was about 100MB.
One could be concerned about the channel and goroutine per audit forwarder, but I would not worry about that for now: the overhead per channel is substantial, but not critical.
The new implementation will make it easy to introduce "audit-mandatory" sessions that freeze if the audit log can't be sent; however, I did not implement those here.
Thoughts
I spent several days doing benchmarks and various load tests. As a result, I have refined the dashboard and added various metrics outputs to the Teleport Prometheus endpoint. Here are some thoughts on overall Teleport performance:
`lolcat`-style high CPU usage is mostly caused by SSH-driven IO. `gogoproto` completely removed the chunk encoding/decoding from the CPU profile picture; once I migrate the events stream to protobuf as well, this will be even less of an issue.