@adutra
Contributor

@adutra adutra commented Nov 6, 2025

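This PR implements the action items from the following discussion threads:

  • https://lists.apache.org/thread/yx7pkgczl6k7bt4k4yzqrrq9gn7gqk2p
  • https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy
  • https://lists.apache.org/thread/5dpyo0nn2jbnjtkgv0rm1dz8mpt132j9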

Summary of changes:

  • Introduced a PolarisEventType enum holding the 150+ event types.
  • Introduced a PolarisEventMetadata interface as suggested by @adnanhemani, exposing: timestamp, realm ID, principal, request ID, and OTel context.
  • Introduced a PolarisEventMetadataFactory to centralize the logic for gathering the various elements of an event metadata.
  • Modified PolarisEvent to expose 3 new methods:
    • UUID id()
    • PolarisEventType type()
    • PolarisEventMetadata metadata()
  • Persistence of OTel context is done in additional_properties as suggested by @flyrain.
  • Added InMemoryBufferEventListenerIntegrationTest to verify that all contextual data is properly persisted.
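
For readers skimming the thread, the list above can be pictured with a minimal, self-contained sketch. This is not the code from this PR: every type name ending in `Sketch` is a hypothetical stand-in, the enum has two constants instead of 150+, the field types are guesses, and the `otel.` key prefix used when copying the OTel context into the persisted properties is an assumption made only for illustration.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical stand-in for PolarisEventType; the real enum holds 150+ constants.
enum EventTypeSketch {
  BEFORE_CREATE_CATALOG,
  AFTER_CREATE_CATALOG
}

// Hypothetical stand-in for PolarisEventMetadata: contextual data shared by all events.
record EventMetadataSketch(
    Instant timestamp,
    String realmId,
    Optional<String> principalName,
    Optional<String> requestId,
    Map<String, String> openTelemetryContext) {}

// Hypothetical stand-in for PolarisEvent with the three accessors listed above.
interface EventSketch {
  UUID id();

  EventTypeSketch type();

  EventMetadataSketch metadata();
}

// One concrete event; the record components supply id() and metadata().
record BeforeCreateCatalogSketch(UUID id, EventMetadataSketch metadata, String catalogName)
    implements EventSketch {
  @Override
  public EventTypeSketch type() {
    return EventTypeSketch.BEFORE_CREATE_CATALOG;
  }
}

final class EventPersistenceSketch {
  private EventPersistenceSketch() {}

  // Copies the OTel context into the free-form properties stored with the event.
  // The "otel." key prefix is an assumption made for this example.
  static Map<String, String> additionalProperties(EventSketch event) {
    Map<String, String> props = new LinkedHashMap<>();
    event.metadata().openTelemetryContext().forEach((k, v) -> props.put("otel." + k, v));
    return props;
  }
}
```

A listener that persists events could then store the result of `EventPersistenceSketch.additionalProperties(event)` alongside the event row, so that no dedicated column is needed for the tracing context.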

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@adutra
Contributor Author

adutra commented Nov 6, 2025

cc @flyrain @adnanhemani @dimas-b

Contributor

@adnanhemani adnanhemani left a comment

Overall, looks great. Thank you for this, @adutra! One question that I think might help make this a tighter change, but overall no concerns!

  polarisEventListener.onBeforeCreateCatalog(
-     new CatalogsServiceEvents.BeforeCreateCatalogEvent(request.getCatalog().getName()));
+     new CatalogsServiceEvents.BeforeCreateCatalogEvent(
+         PolarisEvent.createEventId(),
Contributor

Wanted to see your thoughts on this: can we collapse Event ID into the EventMetadata as well? I think it makes sense as part of the "metadata" of the event and may help us save many lines of code uniformly across all events.

Contributor Author

I thought about that too, but after some hesitation, I think an ID is a bit more than "metadata". The fact that this specific field must be unique makes it imho a bit special. Wdyt?

That said I don't have strong opinions on this, and I'd be OK to move it to metadata if there is a consensus to do so.

Contributor

I understand your point, it is definitely a bit of a "grey area." I think the amount of cleanliness that we gain from moving it into EventMetadata outweighs the ambiguities here, though. I'm also willing to sway my opinion with the will of the community, but I would vote for merging it into EventMetadata if others feel similarly :)

Contributor Author

@adutra adutra Nov 13, 2025

The more I think about this the more I think we shouldn't move the id to the metadata. Here is what the PolarisEventMetadata would look like:

@PolarisImmutable
public interface PolarisEventMetadata {
  UUID id();
  // other methods omitted for brevity
}

But that creates an ambiguity: is id() the identifier of the metadata object, or of the event object?

I could rename it to eventId() of course, but it still doesn't feel right to me. An identifier should be attached to the object it identifies.

Contributor

I'd say it would be named eventId() in this model. IMO, the rest of the fields in PolarisEventMetadata are also vital parts of the event's identity, but I understand the argument philosophically. It's ultimately your call, but I still don't see the harm in doing this, whereas the benefit is more concise code.

Contributor Author

Fair enough, I moved the id to PolarisEventMetadata. Please take another look at the latest commit 881a477
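
To illustrate the outcome of this thread, here is a minimal sketch of what "ID inside the metadata" can look like. It is an assumption-laden illustration rather than the code in commit 881a477: `MetadataWithIdSketch`, `MetadataFactorySketch` and `CatalogCreatedSketch` are hypothetical names, and only three metadata fields are modeled.

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical, trimmed-down metadata: only the event ID, timestamp and realm ID are modeled.
record MetadataWithIdSketch(UUID eventId, Instant timestamp, String realmId) {}

// Hypothetical factory, loosely modeled on the role described for PolarisEventMetadataFactory.
final class MetadataFactorySketch {
  private final String realmId;

  MetadataFactorySketch(String realmId) {
    this.realmId = realmId;
  }

  // Every call yields fresh metadata, and therefore a fresh event ID.
  MetadataWithIdSketch create() {
    return new MetadataWithIdSketch(UUID.randomUUID(), Instant.now(), realmId);
  }
}

// The event no longer takes a separate ID argument; call sites pass the metadata only.
record CatalogCreatedSketch(MetadataWithIdSketch metadata, String catalogName) {}
```

With this shape a call site shrinks to something like `new CatalogCreatedSketch(factory.create(), name)`, which is the conciseness argument made above; the naming question (`id()` versus `eventId()`) is settled by the accessor name on the metadata.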

@adutra adutra force-pushed the events-api-refactoring branch from 1fd437e to c84c18a Compare November 12, 2025 14:44
@adnanhemani
Contributor

@adutra - I see you force-pushed the last commit so it's rewritten the commit history. I'm assuming it was for rebasing this PR onto main? Was there any other major change? Since it's a larger change, I fear I may have missed something.

@adutra
Contributor Author

adutra commented Nov 13, 2025

> @adutra - I see you force-pushed the last commit so it's rewritten the commit history. I'm assuming it was for rebasing this PR onto main? Was there any other major change? Since it's a larger change, I fear I may have missed something.

I rebased because of conflicts - no functional change.

@dimas-b
Contributor

dimas-b commented Nov 17, 2025

@adutra : there are more conflicts 🤷

@adutra adutra force-pushed the events-api-refactoring branch from c84c18a to 18f92fc Compare November 18, 2025 23:03
@adutra
Contributor Author

adutra commented Nov 18, 2025

> @adutra : there are more conflicts 🤷

re-rebased!

@adutra adutra force-pushed the events-api-refactoring branch from 4c64677 to 881a477 Compare November 25, 2025 15:10
@adutra
Contributor Author

adutra commented Nov 25, 2025

FYI: re-rebased because of more conflicts.

Contributor

@adnanhemani adnanhemani left a comment

LGTM! Thank you for this @adutra!!


/** The unique ID of the event. */
default UUID eventId() {
  return UUID.randomUUID();
Contributor

Same concern as for request IDs, if it is used for each event (likely multiple per request), it might be too slow / too much affected by the system source of randomness... but we can address this later if other people think it's a concern.
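
If this ever shows up in profiles, one possible follow-up is sketched below. `UUID.randomUUID()` is documented to use a cryptographically strong generator; the sketch instead builds a version-4 UUID from `ThreadLocalRandom`, which avoids that dependency at the cost of predictability. `FastEventIds` is a hypothetical name, nothing here is part of this PR, and it would only be appropriate if event IDs never need to be unguessable. Whether the difference matters for Polaris has not been measured in this thread.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of this PR.
final class FastEventIds {
  private FastEventIds() {}

  // Builds a version-4, IETF-variant UUID from a non-cryptographic PRNG.
  static UUID next() {
    ThreadLocalRandom random = ThreadLocalRandom.current();
    long msb = random.nextLong();
    long lsb = random.nextLong();
    msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L; // set the version field to 4
    lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L; // set the IETF variant bits
    return new UUID(msb, lsb);
  }
}
```

The two masks keep the result a well-formed random UUID, so consumers that inspect `version()` or `variant()` see the same values as with `UUID.randomUUID()`.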

 * "UNKNOWN_REALM_ID".
 */
private String getRealmId() {
  return realmContext.isResolvable()
Contributor

Why are we not sure that realm context is always present?

Contributor Author

In some tests, events are generated on the "client side", in which case there is no realm context. (In production code there is always a realm context, though.)
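
The fallback being discussed can be sketched roughly as follows. The snippet under review is truncated, so this is a plain-Java approximation under stated assumptions: `RealmContextSketch` and `RealmIdResolverSketch` are hypothetical names, and an `Optional` stands in for whatever lookup the real factory performs (the `isResolvable()` call in the diff suggests an injected instance handle, but that is not shown here).

```java
import java.util.Optional;

// Hypothetical stand-in for the realm context; only the identifier is modeled.
interface RealmContextSketch {
  String getRealmIdentifier();
}

final class RealmIdResolverSketch {
  static final String UNKNOWN_REALM_ID = "UNKNOWN_REALM_ID";

  // Empty when no realm context is available, e.g. for events produced on the test client side.
  private final Optional<RealmContextSketch> realmContext;

  RealmIdResolverSketch(Optional<RealmContextSketch> realmContext) {
    this.realmContext = realmContext;
  }

  // Returns the realm identifier when a context exists, otherwise the placeholder constant.
  String getRealmId() {
    return realmContext.map(RealmContextSketch::getRealmIdentifier).orElse(UNKNOWN_REALM_ID);
  }
}
```

Keeping the placeholder in a single constant makes it easy for a test to assert that production paths never emit it.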

}

testImplementation(project(":polaris-api-management-model"))
testImplementation(project(":polaris-relational-jdbc"))
Contributor

I might have missed this... How does JDBC get into tests now? 😅

Contributor Author

To test the persistence-in-memory-buffer event listener with H2: see InMemoryBufferEventListenerIntegrationTest.

Contributor

thx!

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Nov 26, 2025
@adutra adutra merged commit 64b13a9 into apache:main Nov 27, 2025
15 checks passed
@adutra adutra deleted the events-api-refactoring branch November 27, 2025 15:16
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Nov 27, 2025
snazy added a commit to snazy/polaris that referenced this pull request Feb 11, 2026
* Do not fail a release when markdown-link-check check fails as it is flaky (apache#3116)

* Source tarball reproducible (apache#3143)

`git --mtime` MUST use the time zone for reproducible builds.

* Skip release e-mail templates from svn dist copy (apache#3147)

* Make pom.xml always reproducible (apache#3145)

In practice, it turned out that there is no guarantee that the `<parent>` element in `pom.xml` files always appears in the same place.

This change ensures that the `<parent>` element always appears at a deterministic location at the top of `pom.xml` files.

* Fix executable POSIX permission in archive files (apache#3146)

The PR apache#2819 accidentally _removed_ the executable POSIX file permission, assuming that not explicitly setting the attributes via `filePermissions` retains the file-system 'x' permission.

This change updates the logic to explicitly check the owner-executable bit and uses `755` or `644` respectively for each individual file in the archive.

* Spark: Initial integration for hudi tables within Polaris  (apache#1862)

* Update actions/setup-python digest to 83679a8 (apache#3157)

* Update actions/stale digest to 5611b9d (apache#3155)

* Fix LICENSE and NOTICE in the distributions and docker images. (apache#3125)

* Remove readEntity() call (apache#3111)

Calling readEntity() is not allowed server-side by some HTTP servers.

* Run CI on release branches (apache#3121)

The release workflows check whether CI passes for the required checks.
This would fail, because CI isn't configured to run on release branches.

This change lets CI run on `release/*` branches.

* adding support to use a kms key for s3 buckets data encryption (AWS only) (apache#2802)

Add catalog-level support for KMS with s3 buckets

* Update plugin jetbrains-changelog to v2.5.0 (apache#3166)

* Update quay.io/keycloak/keycloak Docker tag to v26.4.6 (apache#3163)

* NoSQL: Prepare admin-tool (apache#3134)

No functional changes.

1. Refactor the configuration property to a configuration type.
2. Make `BaseCommand` suitable for non-meta-store-factory use cases.

* Iceberg-Catalog: also set catalog-id for location overlap checks (apache#3136)

* Fix catalog-role creating in `PolarisTestMetaStoreManager` (apache#3122)

`testLookup()` attempts to check for a catalog-role on catalog ID 0, which is an illegal ID for a catalog.

Fix is to move the assertion below the catalog creation.

* Releasy: prepare for Helm 4 (helm package repro) (apache#3088)

Part of apache#3086

* Update Quarkus Platform and Group to v3.30.1 (apache#3168)

* Relax ARN validation logic (apache#3071)

Following up on apache#3005, which allowed a wide range of ARN values in the validation RegEx, remove an additional explicit check for `aws-cn` being present in the ARN as a sub-string.

Update existing unit tests to process `aws-cn` ARNs as common `aws` ARNs.

Note: the old validation code does not look correct because it used to check for `aws-cn` anywhere in the ARN string, not just in its "partition" component.

* docs: Add François as Mentor (apache#3162)

* docs: Add François as Mentor

* update mentor list according to ASF project info

* Event type IDs + event metadata incl. OTel context (apache#2998)

This PR implements the action items from the following discussion threads:

- https://lists.apache.org/thread/yx7pkgczl6k7bt4k4yzqrrq9gn7gqk2p
- https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy
- https://lists.apache.org/thread/5dpyo0nn2jbnjtkgv0rm1dz8mpt132j9

Summary of changes:

- Introduced a `PolarisEventType` enum holding the 150+ event types.
- Introduced a `PolarisEventMetadata` interface as suggested by @adnanhemani, exposing: event ID, timestamp, realm ID, principal, request ID, and OTel context.
- Introduced a `PolarisEventMetadataFactory` to centralize the logic for gathering the various elements of an event metadata.
- Modified `PolarisEvent` to expose 2 new methods:
  - `PolarisEventType type()`
  - `PolarisEventMetadata metadata()`
- Persistence of OTel context is done in `additional_properties` as suggested by @flyrain.
- Added `InMemoryBufferEventListenerIntegrationTest` to verify that all contextual data is properly persisted.

* fix typo in management API yaml (apache#3172)

* Fix homepage Get Started button layout (apache#3169)

Wrap the Get Started button in a div container to prevent it from
becoming inline with text at certain screen widths. Follows Docsy
blocks/cover shortcode pattern.

* fix OPA javadoc referencing `OpaSchemaGenerator` (apache#3153)

`OpaSchemaGenerator` is not on the classpath of `opa/impl/main` so the javadoc tool is not able to resolve a `@link` to it.

Use `@code` instead to avoid build warnings like the following:

* Update dependency com.azure:azure-sdk-bom to v1.3.3 (apache#3179)

* Update dependency com.google.errorprone:error_prone_core to v2.45.0 (apache#3177)

* test: Add Some Spark Client Tests and Update Documentation on Generic Tables (apache#3152)

* Site: Make homepage image full-width (apache#3171)

Add CSS class to allow images to span full viewport width by
canceling out container padding. Apply to homepage hero image
using AsciiDoc role attribute.

* chore(enhancement): gitignore application-local.properties (apache#3175)

* Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.23-6.1764155306 (apache#3186)

* Update quay.io/keycloak/keycloak Docker tag to v26.4.7 (apache#3185)

* Update dependency software.amazon.awssdk:bom to v2.39.6 (apache#3184)

* Testing: increase visibility + make PCC/PMSM accessible (apache#3137)

* `BasePolarisMetaStoreManagerTest`: make `PolarisCallContext` + `PolarisMetaStoreManager` + `PolarisTestMetaStoreManager` accessible by subclasses
* Make constants of `PolarisRestCatalogMinIOIT` accessible

* Update docker.io/prom/prometheus Docker tag to v3.8.0 (apache#3191)

* Update helm/chart-testing-action action to v2.8.0 (apache#2982)

* chore(enhancement): make custom hidden tasks visible in ./gradlew tasks (apache#3176)

* fix type cast warning in PolarisCatalogUtils (apache#3178)

```
plugins/spark/v3.5/spark/src/main/java/org/apache/polaris/spark/utils/PolarisCatalogUtils.java:131: warning: [unchecked] unchecked cast
            scala.collection.immutable.Map$.MODULE$.apply(
                                                         ^
  required: Map<String,String>
  found:    Map
```

* chore(deps): update actions/stale digest to 9971854 (apache#3197)

* fix(deps): update dependency io.smallrye:jandex to v3.5.3 (apache#3193)

* chore(deps): update actions/checkout digest to 8e8c483 (apache#3192)

* added venv to the gitignore (apache#3199)

* CLI: Add Hive federation option (apache#2798)

* chore(deps): update docker.io/jaegertracing/all-in-one docker tag to v1.76.0 (apache#3201)

* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.23-6.1764562148 (apache#3202)

* fix(deps): update quarkus platform and group to v3.30.2 (apache#3198)

* chore(deps): update dependency boto3 to ~=1.42.2 (apache#3126)

* NoSQL: CDI / Quarkus (apache#3135)

* fix(deps): update dependency com.adobe.testing:s3mock-testcontainers to v4.11.0 (apache#3208)

* Update dependency mypy to >=1.19, <=1.19.0 (apache#3180)

* chore(deps): update actions/setup-java digest to f2beeb2 (apache#3206)

* Fix spelling in comments (apache#3212)

* Make each task attempt run in a dedicated CDI request context (apache#3210)

* Make each task attempt run in a dedicated CDI request context

Currently, tasks inherit the CDI context from the requests that
submitted them, but run asynchronously. Therefore, if the original
request context ends, the task may not be able to use the expired
beans for that context.

This change makes each task run in its own dedicated CDI request
context with `RealmContext` explicitly propagated in `TaskExecutorImpl`.

Test-only error handlers are added to `TaskExecutorImpl` to facilitate
detecting task errors during CI.

Fixes apache#3203

* fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.3.0 (apache#3218)

* Last merged commit be3c88b

---------

Co-authored-by: Pierre Laporte <[email protected]>
Co-authored-by: Rahil C <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: fabio-rizzo-01 <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Tamas Mate <[email protected]>
Co-authored-by: Adam Christian <[email protected]>
Co-authored-by: Artur Rakhmatulin <[email protected]>
Co-authored-by: cccs-cat001 <[email protected]>
Co-authored-by: Yufei Gu <[email protected]>
Co-authored-by: Yong Zheng <[email protected]>