(feat: persistence) Add schema-metrics-v1.sql for metrics tables (#3337) #3523

Merged: dimas-b merged 1 commit into apache:main from obelix74:feat-3337-schema-v4 on Feb 17, 2026


@obelix74 obelix74 commented Jan 24, 2026

Add new schema version 4 with tables for storing scan and commit metrics reports as first-class entities.

New tables:

  • scan_metrics_report: Stores scan metrics with trace correlation
  • scan_metrics_report_roles: Junction table for principal roles
  • commit_metrics_report: Stores commit metrics with trace correlation
  • commit_metrics_report_roles: Junction table for principal roles

Key design decisions:

  • PRIMARY KEY (realm_id, report_id) for multi-tenancy
  • Junction tables with CASCADE DELETE for roles
  • Timestamp index for retention cleanup
  • JSONB metadata column for extensibility (Postgres), TEXT for H2
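
A minimal sketch of the design decisions above, run against in-memory SQLite as a stand-in for PostgreSQL/H2. Only the table names, `realm_id`, and `report_id` come from this PR description; every other column, the index name and shape, and the helper functions are assumptions for illustration and may not match the actual schema-metrics-v1.sql.

```python
import sqlite3

# Hypothetical, simplified DDL. Columns other than realm_id and report_id are
# assumptions, and SQLite stands in for PostgreSQL/H2.
DDL = """
CREATE TABLE scan_metrics_report (
    realm_id     TEXT    NOT NULL,
    report_id    TEXT    NOT NULL,
    table_name   TEXT    NOT NULL,
    timestamp_ms INTEGER NOT NULL,
    metadata     TEXT,              -- JSONB on PostgreSQL, TEXT on H2
    PRIMARY KEY (realm_id, report_id)
);
-- Assumed index shape: supports retention cleanup by time within a realm.
CREATE INDEX idx_scan_report_ts ON scan_metrics_report (realm_id, timestamp_ms);
CREATE TABLE scan_metrics_report_roles (
    realm_id  TEXT NOT NULL,
    report_id TEXT NOT NULL,
    role_name TEXT NOT NULL,
    PRIMARY KEY (realm_id, report_id, role_name),
    FOREIGN KEY (realm_id, report_id)
        REFERENCES scan_metrics_report (realm_id, report_id) ON DELETE CASCADE
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(DDL)
    return conn


def purge_older_than(conn, realm_id, cutoff_ms):
    """Retention cleanup: delete old reports; their role rows cascade away."""
    cur = conn.execute(
        "DELETE FROM scan_metrics_report WHERE realm_id = ? AND timestamp_ms < ?",
        (realm_id, cutoff_ms),
    )
    conn.commit()
    return cur.rowcount  # counts report rows only, not cascaded role rows
```

Because the primary key starts with `realm_id`, the same `report_id` can exist in two realms without conflict, and the junction table needs no separate cleanup pass.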

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@singhpk234 singhpk234 requested a review from dimas-b January 24, 2026 00:23

dimas-b commented Jan 26, 2026

@obelix74 @singhpk234 : WDYT about starting an RFC doc + dev thread on this? I believe a structured overview of this feature would be good to set the stage for PRs :) (apologies if I missed it)

singhpk234 commented

@dimas-b there is a dev thread already; please refer to: https://lists.apache.org/thread/c83jnkvlwc2k3swm65cmvl4t0mt7p799
Thanks @obelix74 for writing this up!

obelix74 (author) commented

> @obelix74 @singhpk234 : WDYT about starting an RFC doc + dev thread on this? I believe a structured overview of this feature would be good to set the stage for PRs :) (apologies if I missed it)

I am trying to solve two sets of asks from my product folks with this.

  1. Metrics - what tables were accessed by a client principal
  2. Auditing - which user accessed what data and why

From the metrics perspective, on 1.3.0 today, I want to be able to report on table metrics as follows.

For scan report queries:

  • by table
  • by snapshot
  • by time range
  • by realm
  • by user principal
  • by engine

For commit report queries:

  • by operation type
  • data growth
  • file churn
  • storage analysis

I also want to support operational dashboards with filtering by user, realm, engine name, version, etc.
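
As a rough illustration of the reporting queries listed above, here is a sketch against in-memory SQLite. The column names (`principal_name`, `engine_name`, `operation`, `added_bytes`, and so on) and both helper functions are hypothetical; the real schema-metrics-v1.sql columns may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down columns; the real schema may differ.
conn.executescript("""
CREATE TABLE scan_metrics_report (
    realm_id TEXT, report_id TEXT, table_name TEXT, snapshot_id INTEGER,
    principal_name TEXT, engine_name TEXT, timestamp_ms INTEGER,
    PRIMARY KEY (realm_id, report_id));
CREATE TABLE commit_metrics_report (
    realm_id TEXT, report_id TEXT, table_name TEXT, operation TEXT,
    added_files INTEGER, removed_files INTEGER, added_bytes INTEGER,
    timestamp_ms INTEGER,
    PRIMARY KEY (realm_id, report_id));
""")


def scans_by_table(conn, realm_id, principal, start_ms, end_ms):
    """Scan counts per table for one principal within [start_ms, end_ms)."""
    return conn.execute(
        """SELECT table_name, COUNT(*) FROM scan_metrics_report
           WHERE realm_id = ? AND principal_name = ?
             AND timestamp_ms >= ? AND timestamp_ms < ?
           GROUP BY table_name ORDER BY table_name""",
        (realm_id, principal, start_ms, end_ms),
    ).fetchall()


def growth_by_operation(conn, realm_id):
    """Per operation type: bytes added (growth) and files added + removed (churn)."""
    return conn.execute(
        """SELECT operation, SUM(added_bytes), SUM(added_files + removed_files)
           FROM commit_metrics_report WHERE realm_id = ?
           GROUP BY operation ORDER BY operation""",
        (realm_id,),
    ).fetchall()
```

Every query filters on `realm_id` first, which lines up with the `(realm_id, report_id)` primary key from the PR description; filters by engine or snapshot would follow the same pattern.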

I have not thought about roles in this flow at all, though they may turn out to be useful. @singhpk234 recommended adding roles, so I added them. I normalized the roles tables from an RDBMS perspective, but I didn't realize that other similar fields are already stored as JSON.


dimas-b commented Feb 4, 2026

@obelix74 : please rebase to fix CI


dimas-b commented Feb 4, 2026

Let's hold final review until #3616 is resolved... Intermediate comments are welcome, of course :)

@obelix74 obelix74 force-pushed the feat-3337-schema-v4 branch 3 times, most recently from 7771396 to 65e8cb4 Compare February 5, 2026 23:44
@obelix74 obelix74 requested a review from singhpk234 February 5, 2026 23:50
dimas-b previously approved these changes Feb 6, 2026

@dimas-b dimas-b left a comment


LGTM. Let's give this PR a couple more days in review for other people to comment if they want.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Feb 6, 2026
@dimas-b dimas-b changed the title from "(feat: persistence) Add schema-v4.sql for metrics tables (#3337)" to "(feat: persistence) Add schema-metrics-v1.sql for metrics tables (#3337)" on Feb 6, 2026
dimas-b previously approved these changes Feb 6, 2026

@dimas-b dimas-b left a comment


LGTM 👍

@singhpk234 : WDYT?

dimas-b previously approved these changes Feb 6, 2026
@obelix74 obelix74 requested a review from dimas-b February 12, 2026 20:36
obelix74 pushed a commit to obelix74/polaris that referenced this pull request Feb 12, 2026
…tation

Add checkMetricsPersistenceBootstrapped() to RelationalJdbcProductionReadinessChecks
that properly verifies the MetricsPersistence implementation as requested in PR apache#3523.

The check verifies:
1. The MetricsPersistence implementation is JdbcMetricsPersistence (not NoOpMetricsPersistence)
2. The JdbcMetricsPersistence.supportsMetricsPersistence() returns true (schema >= v4)

This addresses the PR comment from dimas-b suggesting we should test the actual
Metrics Persistence implementation rather than just checking configuration flags.
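
The two-step check described in this commit message can be sketched as follows. The real code is Java in the Polaris repository; the Python classes below are hypothetical stand-ins that only borrow the names from the commit message, and the `schema_bootstrapped` constructor argument and the returned message strings are assumptions.

```python
class NoOpMetricsPersistence:
    """Stand-in for the no-op implementation named in the commit message."""

    def supports_metrics_persistence(self):
        return False


class JdbcMetricsPersistence:
    """Stand-in for the JDBC implementation; the flag is a hypothetical input."""

    def __init__(self, schema_bootstrapped):
        self._schema_bootstrapped = schema_bootstrapped

    def supports_metrics_persistence(self):
        return self._schema_bootstrapped


def check_metrics_persistence_bootstrapped(persistence):
    """Return a list of readiness problems; an empty list means ready."""
    # Step 1: inspect the actual implementation, not a configuration flag.
    if not isinstance(persistence, JdbcMetricsPersistence):
        return ["metrics persistence is not the JDBC implementation"]
    # Step 2: ask the implementation whether the metrics schema is in place.
    if not persistence.supports_metrics_persistence():
        return ["metrics schema has not been bootstrapped"]
    return []
```

Checking the instance rather than a setting is the point raised in review: a flag can be enabled while the backing tables are still missing.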
@obelix74 obelix74 force-pushed the feat-3337-schema-v4 branch from df4d3e7 to ff9d552 Compare February 12, 2026 21:48
This commit adds database schema support for metrics persistence tables
as part of the JDBC persistence backend.

Key changes:
- schema-metrics-v1.sql for PostgreSQL and H2 with metrics tables
- DatabaseType: Add metrics schema resource path support
- JdbcBasePersistenceImpl: Add metrics schema bootstrap capability
- JdbcBootstrapUtils: Support for metrics schema initialization
- QueryGenerator: Add metrics-related query generation
- SchemaOptions: Add metrics schema option for bootstrap command
- BootstrapCommand: Support --metrics flag for schema bootstrap

Testing:
- MetricsPersistenceBootstrapValidationTest: Validates schema bootstrap
- JdbcBootstrapUtilsTest: Tests for bootstrap utilities
- QueryGeneratorTest: Tests for metrics query generation
@obelix74 obelix74 force-pushed the feat-3337-schema-v4 branch from 3e553a3 to 64dca5c Compare February 14, 2026 15:11
@dimas-b dimas-b merged commit 4ffb98a into apache:main Feb 17, 2026
15 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Feb 17, 2026