(feat) #3337 REST catalog metrics table #3348
### Metrics Event Data
The `AfterReportMetricsEvent` captures the following data in `additional_properties`:
I am not sure that persisting the metrics is technically an event. My understanding is that an event is something that happens during processing, for example:
- log request
- processing in the middle
- log response
Here there is no processing other than persisting the metrics, and the response is nothing but a 200 OK. Have we considered persisting the metrics reports in a separate table and then joining the two tables?
I think having this as an event opens up the possibility of firing it off to Kafka or any other external sink if we want to gather the metrics there. Likewise, we can plug in another listener to send it to CloudWatch Logs. Also, storing it in the single events table allows a single query to capture the entire audit trail.
The big negative of a separate metrics table is trace correlation complexity. Correlating metrics with other events (e.g., "show me everything that happened for this commit") requires explicit JOINs on otel_trace_id across tables. There is also schema migration overhead, plus the duplicated effort of introducing new Java classes just to persist the reports.
81d0265 to 818c9e4
818c9e4 to 2833e2c
## IMPORTANT - DO NOT MERGE
This draft PR currently contains commits from #3327 as well. When that PR is merged, this PR will be rebased against main.
Checklist
- CHANGELOG.md (if needed)
- site/content/in-dev/unreleased (if needed)
Summary
This PR implements a flexible, configurable metrics persistence system for Apache Polaris that captures Iceberg ScanReport and CommitReport data. The implementation provides multiple storage options to accommodate different use cases, from simple audit logging to advanced analytics.
Motivation
Compute engines (Spark, Trino, Flink) send metrics reports to Polaris after query execution, including:
Previously, these metrics were logged but not persisted, making it impossible to:
Implementation
Metrics Storage Options
This PR introduces four configurable reporter types:
- `default`
- `events`
- `persistence`
- `composite`
New Components
1. EventsMetricsReporter
Persists metrics to the existing events table as JSON, providing a unified audit trail.
- `ScanReport` and `CommitReport` data is stored in `additional_properties`
2. PersistingMetricsReporter
Persists metrics to dedicated tables with typed columns for efficient querying.
- `scan_metrics_reports` and `commit_metrics_reports` tables
3. CompositeMetricsReporter
Delegates to multiple reporters simultaneously, enabling flexible deployment patterns.
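The delegation idea can be sketched as follows. This is only an illustration, not the PR's code: `Report`, `Reporter`, and `CompositeReporter` are hypothetical stand-ins, and swallowing a failing delegate's exception is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeReporterSketch {

  /** Hypothetical stand-in for a scan or commit report. */
  static final class Report {
    final String tableName;
    final String kind;

    Report(String tableName, String kind) {
      this.tableName = tableName;
      this.kind = kind;
    }
  }

  /** Hypothetical reporter interface; the interface used in the PR may differ. */
  interface Reporter {
    void report(Report report);
  }

  /** Forwards each report to every delegate, in the order they were configured. */
  static final class CompositeReporter implements Reporter {
    private final List<Reporter> delegates;

    CompositeReporter(List<Reporter> delegates) {
      this.delegates = List.copyOf(delegates);
    }

    @Override
    public void report(Report report) {
      for (Reporter delegate : delegates) {
        try {
          delegate.report(report);
        } catch (RuntimeException e) {
          // Assumption: one failing target should not block the remaining ones.
          System.err.println("reporter failed: " + e.getMessage());
        }
      }
    }
  }

  public static void main(String[] args) {
    // In-memory lists stand in for the events table and the dedicated tables.
    List<String> eventsTable = new ArrayList<>();
    List<String> dedicatedTable = new ArrayList<>();

    List<Reporter> targets = List.of(
        r -> eventsTable.add(r.kind + ":" + r.tableName),
        r -> { throw new IllegalStateException("sink unavailable"); },
        r -> dedicatedTable.add(r.tableName));

    new CompositeReporter(targets).report(new Report("db.orders", "scan"));
    System.out.println(eventsTable + " " + dedicatedTable);
  }
}
```

Isolating each delegate in its own `try` block means a broken sink only costs that sink's copy of the report; whether failures should instead propagate to the caller is a design choice this sketch does not settle.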
4. MetricsReportCleanupService
Scheduled service for automatic cleanup of old metrics data.
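A minimal sketch of what such a scheduled cleanup could look like, assuming a retention window and a periodic purge. `StoredReport`, `purgeExpired`, the 30-day retention, and the hourly interval are all hypothetical; an in-memory list stands in for the metrics tables, where a real service would presumably issue a DELETE instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MetricsCleanupSketch {

  /** Hypothetical persisted row: an id plus the time the report was stored. */
  static final class StoredReport {
    final String id;
    final Instant createdAt;

    StoredReport(String id, Instant createdAt) {
      this.id = id;
      this.createdAt = createdAt;
    }
  }

  /** Removes reports older than the retention window and returns how many were removed. */
  static int purgeExpired(List<StoredReport> store, Duration retention, Instant now) {
    Instant cutoff = now.minus(retention);
    int before = store.size();
    store.removeIf(r -> r.createdAt.isBefore(cutoff));
    return before - store.size();
  }

  public static void main(String[] args) throws InterruptedException {
    List<StoredReport> store = new CopyOnWriteArrayList<>();
    store.add(new StoredReport("old", Instant.now().minus(Duration.ofDays(40))));
    store.add(new StoredReport("new", Instant.now()));

    // Assumed policy: keep 30 days of reports, check once an hour.
    Duration retention = Duration.ofDays(30);
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(
        () -> System.out.println("purged " + purgeExpired(store, retention, Instant.now())),
        0, 1, TimeUnit.HOURS);

    Thread.sleep(200); // give the first run time to complete
    scheduler.shutdownNow();
  }
}
```

Passing `now` into `purgeExpired` keeps the retention logic independent of the wall clock, which makes it straightforward to unit test.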
Configuration Examples
Option 1: Logging Only (Default)
Option 2: Events Table
Option 3: Dedicated Tables
Option 4: Composite (Multiple Targets)
With Retention Policy
Benefits by Storage Option
Events Table (`type: events`)
Dedicated Tables (`type: persistence`)
- `otel_trace_id` and `otel_span_id` columns
Composite (`type: composite`)
Testing
- `EventsMetricsReporterTest`: Verifies events are correctly created and persisted
- `CompositeMetricsReporterTest`: Verifies delegation to multiple reporters
- `MetricsReportPersistenceTest`: End-to-end persistence with H2 database
Example Queries
Data scanned by user:
Correlate with OpenTelemetry:
Migration Notes
- `type: default` (logging only)
Related PRs
Configuration
No new configuration required. The feature uses existing infrastructure: