Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add view to Npgsql duration metrics #641

Closed
JamesNK opened this issue Nov 2, 2023 · 9 comments · Fixed by #658
Closed

Add view to Npgsql duration metrics #641

JamesNK opened this issue Nov 2, 2023 · 9 comments · Fixed by #658
Assignees
Labels
area-integrations Issues pertaining to Aspire Integrations packages

Comments

@JamesNK
Copy link
Member

JamesNK commented Nov 2, 2023

Histogram buckets aren't well configured in OpenTelemetry. The minimum bucket is 5. That's fine if you're measuring in milliseconds, not so good when you're measuring in seconds. There is special hardcoding for known .NET histograms, but 3rd party histograms aren't helped.

Need to add views to PostgresSQL component.

Example:

        builder.AddView("http.server.request.duration",
            new ExplicitBucketHistogramConfiguration
            {
                Boundaries = new double[] { 0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 }
            });
@eerhardt
Copy link
Member

eerhardt commented Nov 2, 2023

This isn't an Aspire specific issue. Why isn't this being handled in OpenTelemetry? The instrument declares that it's units are seconds. This should just work without every instrument consumer needing to make a custom view.

@eerhardt
Copy link
Member

eerhardt commented Nov 2, 2023

Here's the instrument creation: https://github.com/npgsql/npgsql/blob/5c106023a1a192a2fab0726fc7df9ca51e39cb22/src/Npgsql/MetricsReporter.cs#L62-L66

        CommandDuration
            = Meter.CreateHistogram<double>(
                "db.client.commands.duration",
                unit: "s",
                description: "The duration of database commands, in seconds.");

@JamesNK
Copy link
Member Author

JamesNK commented Nov 2, 2023

Three decisions have put us in this situation:

  • OTEL recommends seconds for durations. They changed from ms to s this year to align for Prometheus, who uses seconds everywhere.
  • However, OTEL didn't change their histogram bucket bounds, which are still optimized for milliseconds. They won't change them because it's a breaking change. It's very dumb and means that the smallest bucket is 5 seconds.
  • There isn't a way to configure bounds when creating a histogram in .NET. The feature in OTEL is experimental. I'm going to push for it to happen for .NET 9 - Add "hints" in Metric API to provide things like histogram bounds runtime#63650

A fix is hard coded for known .NET built-in counters in OTEL. Automatically changing all histograms with a unit of seconds might not be what someone wants.

The temporary solution for 3rd parties in .NET 8 is to add custom views.

@JamesNK
Copy link
Member Author

JamesNK commented Nov 2, 2023

Duration metric reporting 5 seconds:

image

The graph is unlikely to report any other value because everything will fall into the 5-second bucket.

@eerhardt
Copy link
Member

eerhardt commented Nov 2, 2023

https://github.com/open-telemetry/semantic-conventions/blob/1eb70c4d74aa904a8c81c5ed10302af8bb3c1b70/docs/database/database-metrics.md?plain=1#L138-L146
says

Name Instrument Type Unit (UCUM) Description
db.client.connections.create_time Histogram ms The time it took to create a new connection

So maybe Npgsql should be updated to ms?

@JamesNK
Copy link
Member Author

JamesNK commented Nov 2, 2023

I think DB conventions just hasn't been updated yet for seconds. They're focusing on stabilizing HTTP.

@eerhardt eerhardt added the area-integrations Issues pertaining to Aspire Integrations packages label Nov 2, 2023
@eerhardt eerhardt added this to the preview-1 week milestone Nov 2, 2023
@eerhardt eerhardt self-assigned this Nov 2, 2023
eerhardt added a commit that referenced this issue Nov 2, 2023
* Restore disabled counter

This was missed in #648

Addresses #645

* Add view to Npgsql's histograms because they are in seconds and not ms.

Fix #641
eerhardt added a commit that referenced this issue Nov 2, 2023
* Restore disabled counter

This was missed in #648

Addresses #645

* Add view to Npgsql's histograms because they are in seconds and not ms.

Fix #641
@eerhardt
Copy link
Member

eerhardt commented Nov 2, 2023

Reopening to backport to preview1.

@eerhardt eerhardt reopened this Nov 2, 2023
@roji
Copy link
Member

roji commented Nov 2, 2023

So maybe Npgsql should be updated to ms?

FWIW I implemented seconds for command duration because of the explicit OTel conventions around this, as @JamesNK wrote above:

When instruments are measuring durations, seconds (i.e. s) SHOULD be used.

joperezr pushed a commit that referenced this issue Nov 2, 2023
* Restore disabled counter

This was missed in #648

Addresses #645

* Add view to Npgsql's histograms because they are in seconds and not ms.

Fix #641
@eerhardt
Copy link
Member

eerhardt commented Nov 2, 2023

Fixed with #660.

@eerhardt eerhardt closed this as completed Nov 2, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Apr 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-integrations Issues pertaining to Aspire Integrations packages
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants