-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
misc: tweak metrics to support service-level benchmarks #908
Conversation
|
||
override fun requestFailed(call: Call, ioe: IOException) = trace(ioe) { "request failed" } | ||
|
||
override fun requestHeadersStart(call: Call) = trace { "sending request headers" } | ||
|
||
override fun requestHeadersEnd(call: Call, request: Request) = trace { "finished sending request headers" } | ||
override fun requestHeadersEnd(call: Call, request: Request) { | ||
if (request.body == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correctness: Shouldn't this be after sending the body? If there is no body this is fine, we should be able to detect that though and either set it here or in responseBodyEnd
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused about this comment, requestTimeEnd
is set in both places: requestHeadersEnd
if there is no body, and requestBodyEnd
if there is a body, so I think it's correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, I just missed it in first review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes what confused me is that this entire file has the events out-of-order (i.e requestHeadersEnd
is defined after requestBodyStart
even though it occurs before in the chain of events). It could be worth reordering these to match the order shown in the OkHttp docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, that would make it easier to parse this file. I can submit a mini-PR after this one gets merged.
|
||
@InternalApi | ||
public object EngineAttributes { | ||
public val TimeToFirstByte: AttributeKey<Duration> = AttributeKey("TimeToFirstByte") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix: docs
val attemptDuration = attemptStart?.elapsedNow() ?: return | ||
metrics.rpcAttemptDuration.recordSeconds(attemptDuration, perRpcAttributes, metrics.provider.contextManager.current()) | ||
|
||
context.executionContext.getOrNull(EngineAttributes.TimeToFirstByte)?.let { ttfb -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Should we clear this after reading it? e.g. what if we retry and it fails before recording TTFB? We'll be using the first TTFB value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right we may have an issue if we fail to record TTFB on a retry. We certainly can clear it after reading it, although this makes me wonder if there's a new set of execution context which is scoped to a single attempt and should be cleared when an attempt is ended.
What would we think about a new optional scope for attributes:
interface MutableAttributes : Attributes {
// ...existing set, remove, and computeIfAbsent methods...
operator fun <T : Any> set(scope: AttributeScope, key: AttributeKey<T>, value: T)
fun <T : Any> removeAll(scope: AttributeScope)
}
Get operations would find an attribute regardless of its scope but removeAll
could be used to clear all attributes with a specific scope (e.g., an attempt).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm maybe. Seems like it may complicate some things but wouldn't know until I tried. Also raises questions like what attributes are visible? If I set an attribute in one scope but not another are they visible to other scopes? Are they independent (i.e. can they shadow each other)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this is likely too complicated without a deeper design. I'll just remove it upon reading.
@@ -94,6 +96,13 @@ internal class OperationTelemetryInterceptor( | |||
if (attempts > 1) { | |||
metrics.rpcRetryCount.add(1L, perRpcAttributes, metrics.provider.contextManager.current()) | |||
} | |||
|
|||
val attemptDuration = attemptStart?.elapsedNow() ?: return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the implementation of smithy.client.attempt_duration to measure attempt duration instead of merely transmission duration
This now includes signing and other things right? The current description of the metric was accurate for what it measured: The time it takes to connect to the service, send the request, and receive the HTTP status code and headers from the response for an operation
. If we are updating the metric definition lets update the description to match.
Can you first explain why this is needed/better? Just want to understand the motivation for the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this now includes identity resolution, endpoint resolution, and signing. The inclusion of EP resolution and signing will likely have minimal impact to the metric but identity resolution is a potentially-expensive operation since it may involve local IO or calls to remote services. Given that the intent is to measure the SDK-added overhead and that identity resolution, EP resolution, and signing are per-attempt, we'd need to know the duration of the entire attempt.
I'll update the metric documentation.
|
||
override fun requestFailed(call: Call, ioe: IOException) = trace(ioe) { "request failed" } | ||
|
||
override fun requestHeadersStart(call: Call) = trace { "sending request headers" } | ||
|
||
override fun requestHeadersEnd(call: Call, request: Request) = trace { "finished sending request headers" } | ||
override fun requestHeadersEnd(call: Call, request: Request) { | ||
if (request.body == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused about this comment, requestTimeEnd
is set in both places: requestHeadersEnd
if there is no body, and requestBodyEnd
if there is a body, so I think it's correct?
Kudos, SonarCloud Quality Gate passed!
|
Issue #
Closes awslabs/aws-sdk-kotlin#968
Description of changes
Various changes necessary to support service-level benchmarks including:
ExecutionContext
smithy.client.attempt_duration
to measure attempt duration instead of merely transmission durationCompanion PR: awslabs/aws-sdk-kotlin#1006
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.