-
Notifications
You must be signed in to change notification settings - Fork 658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metrics API #68
Metrics API #68
Conversation
Comments for Meter More comments Add more comments Fix typos
Please sign the CLA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks like it provides everything that was laid out in the spec, which is amazing!
I think along the lines of open-telemetry/opentelemetry-specification#165, we can consider an API that provides the capabilities necessary but ensures the user experience is idiomatic python and succinct. If I understand correctly, the code I would need to create and record a value today is:
ENVIRONMENT_LABEL = LabelKey("environment", "the environment the app is running in")
measure = meter().create_long_counter("request-count", options=MeasureOptions(
label_keys=[ENVIRONMENT_LABEL]
))
meter().record(
[measure.create_long_measurement(10)], # not sure how to add label values
options=RecordOptions(distributed_context=DistributedContext.get_current())
)
Which feels a bit unwieldy (I also may be completely misunderstanding how the APIs should be used here). Many metrics libraries generally create a measure and then record the measure, such as:
ENVIRONMENT_LABEL = LabelKey("environment", "the environment the app is running in")
measure = meter().create_long_counter("request-count", options=MeasureOptions(
label_keys=[ENVIRONMENT_LABEL]
))
measure.record(10, ["staging"]) # handles creating a measurement object and recording it.
This also makes the measures significantly more powerful, which allows them to be carried around as app variables that might be useful later (e.g. in flask):
app.total_executed_jobs = meter().create_long_counter(...)
def handle_foo():
app.total_executed_jobs.record(1)
Something that is totally possible to accomplish based on the SDK implementations. But I'm hard-pressed to imagine why you wouldn't want a metrics API to expose that sort of an interface.
A :class:`.Measure` | ||
""" | ||
|
||
def record(self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this play into the measures? I'm thinking through the interface and I see something like:
measure = meter().create_double_counter("foobar", label_keys=[ENVIRONMENT])
measure.create_double_measurement(10, "staging")
I see that the contexts are not an explicit argument, although it would be good to include those labels. I guess that would be possible depending on the implementation of the measure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a long conversation about static labels at open-telemetry/oteps#4, not sure if it answers your question though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth discussing whether we need constant labels for measurements since distributed_context
is not explicit. I'll bring it up in the meeting.
A new :class:`.Measurement` | ||
""" | ||
|
||
def create_long_measurement(self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these apis do not provide a way to add the required label_keys as specified in the MetricOptions, which state that it is defining required labels.
Looking at the spec it doesn't call out a required argument for the label values, but it feels like that's needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken from the spec, for measurements, I believe the way to added the required label values (or dimensions) would be through passing in an explicit distributed_context
or from the current context itself. This is done not at the creation of the measurement level but when the measurements are recorded, which is why the creating of measurements do not have a way to pass in specific label values.
Other general thought: is there strong value in not having standard behavior here? There's a lot of complex logic that would have to be re-implemented if someone were to choose a separate sdk. Specifically:
It feels like to me having a defined way that all of this works would reduce confusion if people plugged in different sdks. But maybe the implicit assumption is that people will almost always consume the sdk along with the API, and thus feel confident about consistent behavior by that convention. |
|
||
def __init__(self, | ||
value: str) -> None: | ||
self.value = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just use a string for this instead of introducing a class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing a class gives flexibility in the future if we ever want to add anything to LabelValue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far! I took a first pass, but there's a lot here still to consider. I'm in favor of losing the options classes, losing the constructors, and splitting this up into multiple modules.
There are some problems rendering the docs, you may want to cherry-pick edbc34a and try generating the docs as you're writing them.
A :class:`.Measure` | ||
""" | ||
|
||
def record(self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a long conversation about static labels at open-telemetry/oteps#4, not sure if it answers your question though.
|
||
def create_measure(self, | ||
name: str, | ||
options: typing.Optional['MeasureOptions'] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the type (int or float) an option here when there are separate methods for int/float counters and gauges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the specification, a measurement has both createDouble
and createLong
methods, irregardless of the type of the measure
that was used to create the measurement
. If we remove the type in MeasureOptions
, there will be no validation logic between the measure_type
and the type of measurement
being created. I think we want to prevent a measure
to be able to create any type of measurement
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks to me like the spec was written to match the java client, and as far as I can tell there's no reason for (java's) Measure
to have both createDoubleMeasurement
and createLongMeasurement
.
I'd imagine something like this instead:
class Meter:
def create_double_measure(...)
def create_long_measure(...)
class DoubleMeasure(Measure):
def create_measurement(value)
# returns a double-valued Measurement
class LongMeasure(Measure):
def create_measurement(value)
# returns a long-valued Measurement
and possibly replacing Measurement
with separate DoubleMeasurement
and LongMeasurement
classes.
I don't mean to suggest deviating from the spec. The spec is underdefined here and writing the python client should help us generate spec changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I was focused too much on following the spec. I also think the ability to create both types of measurements and having to validate the type of measure is redundant as well. I will make these changes and propose them in the spec.
self.resource = resource | ||
|
||
|
||
class MeasureType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully this is one of those ugly things we only need in javascript...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should have some other way of representing the type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something here, but ideally we represent differently typed metrics with different classes.
I signed it |
@c24t @toumorokoshi |
@toumorokoshi I've posted some sample code on how measures and metrics would be used in a very simple case (in the PR description). Feel free to make some comments and questions! |
@lzchen the examples would be useful in the repo too, and then we can comment on them in this PR. |
Just linking #48 |
"""Used to create raw :class:`.FloatMeasurement` s.""" | ||
|
||
def create_measurement(self, | ||
value: int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Float here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`.Metric` s are used for recording pre-defined aggregation, or | ||
already aggregated data. This should be used to report metrics like | ||
cpu/memory usage, in which the type of aggregation is already defined, or | ||
simple metrics like "queue_length". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for the spec: where do we define the aggregation behavior a la views in OC?
import typing | ||
|
||
|
||
class CounterTimeSeries: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from IRL conversation: we should probably revisit this class structure, decide if we need separate TS for each type combination, whether we need measurement types before export.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the examples! I think there is still some more simplification that can be done here. But It's not clear to me how much that will violate what the API that is called for in OpenTelemetry would be.
Specifically, I'm thinking about the user interface exposing these six options:
- float gauge
- float counter
- float measure
- int gauge
- int counter
- int measure
The measures themselves I don't see a lot of value for, they need to work in tandem with some sort of aggregator (which OT calls timeseries, which IMO is a little confusing) to actually produce a measurement that will be enqueued and sent to exporters. I can imagine a measurement being a primitive that Gauge / Counters use, but a consumer will probably not use them directly.
Also not sure about the value of calling out the type in the method signature. A similar flexibility could be done by just passing the type into the factory:
create_counter(name, description, cls=int)
Which would further reduce API definitions for both int and float timeseries.
I think there's a lot of work here that would be best to be contributed back to the OT spec. I think it continues to highlight some issues that are worth hashing out across implementations.
@@ -11,3 +11,14 @@ | |||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
|
|||
|
|||
class LabelValue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this just be a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we ever needed to add additional fields to LabelValue, we could do so if we used a class without introducing breaking changes. However, label values might be fixed as strings so making them strings might be appropriate.
opentelemetry-api/src/opentelemetry/metrics/examples/pre_aggregated.py
Outdated
Show resolved
Hide resolved
def create_int_counter(self, | ||
name: str, | ||
description: str, | ||
unit: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random question: why is "unit" it's own field when it could be identified as a label_key?
I feel like not everything measured needs a unit to help understand the value. And even if it did, I don't believe many metric storage systems have the unit as a field they store.
For example, I believe graphite or prometheus would store the "unit" key as a tag or a part of the metric name.
"number", | ||
LABEL_KEYS) | ||
LABEL_VALUES = ["Testing"] | ||
SUM_TIME_SERIES = SUM_METRIC.getOrCreateTimeSeries(LABEL_VALUES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the example! Things are getting a little clearer for me.
What is the value for having the workflow of creating a metric, then creating a time series?
It feels to me like the metric is not valuable without the time series included. Generally, when emitting metrics, I think of a single measurement as an event that writes to a stream, that then gets aggregated by whatever queue system and submitted to the exporter.
If I was a user, I would like things to be as simple as:
METER = Meter()
LABEL_KEYS = [LabelKey("environment",
"the environment the application is running in")]
SUM_METRIC = METER.create_int_counter("sum numbers",
"sum numbers over time",
"number",
LABEL_KEYS)
SUM_METRIC.record(10)
Continued with [#87 ] |
This PR adds the Metrics API package.
this API takes information from:
Example usage for raw measurement:
Example usage for already aggregated metrics:
Example usage for simple pre-defined aggregation metrics: