Use new Request Context for each realm during implicit bootstrap #3411

dimas-b merged 4 commits into apache:main
Conversation
* Improve code isolation by using `RealmConfig` (like most other code) instead of the lower-level `PolarisConfigurationStore`.
* This also enables proper CDI request-scoped injection in concert with apache#3411.
* Additionally, this enables further code cleanup in `PolarisConfigurationStore`, as discussed in apache#3324, with the goal of using that interface for the backend configuration code while `RealmConfig` becomes the corresponding frontend interface.
* Fix `ServiceProducers.realmConfig()` to build a `RealmConfigImpl` directly from its dependencies (not from `CallContext`) to avoid cycles in CDI.
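The last bullet can be sketched in plain Java. This is a minimal, hypothetical analogue (the class and method names other than `ServiceProducers.realmConfig()` and `RealmConfigImpl` are stand-ins, not the actual Polaris types): the producer builds the realm config from its direct dependencies, so resolving `RealmConfig` no longer routes through a `CallContext`-style holder that could itself depend on `RealmConfig`, which is what creates the CDI cycle.

```java
import java.util.Map;

// Stand-in for the realm identity (hypothetical, for illustration only).
class RealmContext {
  final String realmId;
  RealmContext(String realmId) { this.realmId = realmId; }
}

// Stand-in for the lower-level PolarisConfigurationStore.
class ConfigurationStore {
  private final Map<String, String> values;
  ConfigurationStore(Map<String, String> values) { this.values = values; }
  String lookup(String realmId, String key) { return values.get(realmId + "." + key); }
}

// Stand-in for RealmConfigImpl: constructed from its direct dependencies,
// not from a CallContext holder.
class RealmConfigImpl {
  private final ConfigurationStore store;
  private final RealmContext realmContext;
  RealmConfigImpl(ConfigurationStore store, RealmContext realmContext) {
    this.store = store;
    this.realmContext = realmContext;
  }
  String get(String key) { return store.lookup(realmContext.realmId, key); }
}

public class ServiceProducersSketch {
  // Analogue of a CDI @Produces method for RealmConfig: only direct
  // dependencies are injected, so no dependency cycle can form.
  static RealmConfigImpl realmConfig(ConfigurationStore store, RealmContext ctx) {
    return new RealmConfigImpl(store, ctx);
  }

  public static void main(String[] args) {
    ConfigurationStore store = new ConfigurationStore(Map.of("realm1.FEATURE_X", "true"));
    RealmConfigImpl config = realmConfig(store, new RealmContext("realm1"));
    System.out.println(config.get("FEATURE_X"));
  }
}
```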
result.forEach(
Looks like this function doesn't really need the `Map<>` result anymore; it is only used to populate `unusedRealmSecrets` below.
Makes sense - will fix
I went with @adutra's suggestion... WDYT?
@Identifier("task-executor")
private ExecutorService executor;

PrincipalSecretsResult bootstrapRealm(String realmId, RootCredentialsSet rootCredentialsSet) {
Wouldn't it be better to return a `CompletionStage`, then combine them all and wait for the combined result?
I wanted to maintain the old sequential bootstrap logic... this is not a main call path in real deployments, so I guess efficiency is not critical 🤔
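For reference, the alternative suggested above could look roughly like this. This is a hedged sketch only (the PR keeps the sequential loop, and `SecretsResult`, `bootstrapRealm`, and `bootstrapAll` are stand-in names, not the actual Polaris API): each realm's bootstrap is submitted to the executor, the resulting stages are combined with `CompletableFuture.allOf`, and the caller waits once for the combined result.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelBootstrapSketch {
  // Stand-in for PrincipalSecretsResult.
  record SecretsResult(String realmId) {}

  // Stand-in for the per-realm bootstrap step.
  static SecretsResult bootstrapRealm(String realmId) {
    return new SecretsResult(realmId);
  }

  static Map<String, SecretsResult> bootstrapAll(List<String> realmIds, ExecutorService executor) {
    // One CompletableFuture per realm, keyed by realm id.
    Map<String, CompletableFuture<SecretsResult>> futures =
        realmIds.stream()
            .collect(Collectors.toMap(
                id -> id,
                id -> CompletableFuture.supplyAsync(() -> bootstrapRealm(id), executor)));
    // Combine all stages and wait for the combined result once.
    CompletableFuture.allOf(futures.values().toArray(CompletableFuture[]::new)).join();
    return futures.entrySet().stream()
        .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().join()));
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      System.out.println(bootstrapAll(List.of("realm1", "realm2"), executor).keySet());
    } finally {
      executor.shutdown();
    }
  }
}
```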
RootCredentialsSet.SYSTEM_PROPERTY);

var result = factory.bootstrapRealms(realmIds, rootCredentialsSet);
HashMap<String, PrincipalSecretsResult> result = new HashMap<>();
I think this should be moved to `Bootstrapper` as well. The `Bootstrapper` should take a set of realm ids as input and return a `Map<String, PrincipalSecretsResult>`. WDYT?
Yes, I'll move the map to `Bootstrapper`, but I'll keep sequential execution for now, if you do not mind :)
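The shape agreed in this thread can be sketched as follows, as a minimal illustration under assumptions (stand-in types; `PrincipalSecretsResult` is reduced to a record, and the method names are illustrative): `Bootstrapper` takes a set of realm ids, bootstraps them one at a time, and returns the per-realm results as a map.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class BootstrapperSketch {
  // Stand-in for PrincipalSecretsResult.
  record PrincipalSecretsResult(String realmId) {}

  // Stand-in for the per-realm bootstrap step.
  static PrincipalSecretsResult bootstrapRealm(String realmId) {
    return new PrincipalSecretsResult(realmId);
  }

  // Takes a set of realm ids, returns a Map<String, PrincipalSecretsResult>,
  // keeping the original sequential execution.
  static Map<String, PrincipalSecretsResult> bootstrapRealms(Set<String> realmIds) {
    Map<String, PrincipalSecretsResult> result = new HashMap<>();
    for (String realmId : realmIds) { // sequential, as in the old logic
      result.put(realmId, bootstrapRealm(realmId));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(bootstrapRealms(new LinkedHashSet<>(Set.of("realm1"))).keySet());
  }
}
```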
The implicit (auto) bootstrap calls used to share a single Request Context across potentially many realms. That worked only by coincidence, because `RealmConfig`, for example, is a `RequestScoped` bean. With this change, each realm is bootstrapped in its own dedicated Request Context. This change lays the foundation for future refactoring related to `RealmConfig`.
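The core of the fix can be illustrated with a plain-Java analogue. This is a simulation only: the `RequestContext` class below is a stand-in, while the real code would use the CDI container's request-context management (e.g. activating and terminating the request context around each realm). The point is that each realm gets a fresh context, so request-scoped state such as `RealmConfig` cannot leak from one realm's bootstrap into the next.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerRealmContextSketch {
  // Stand-in for a CDI request context holding request-scoped bean instances.
  static final class RequestContext {
    final Map<String, Object> beans = new HashMap<>();
    boolean active;
    void activate() { active = true; }
    void terminate() { beans.clear(); active = false; }
  }

  static List<String> bootstrapAll(List<String> realmIds) {
    List<String> log = new ArrayList<>();
    for (String realmId : realmIds) {
      RequestContext ctx = new RequestContext(); // fresh context per realm
      ctx.activate();
      try {
        // The realm-scoped "bean" is materialized inside this realm's own context.
        ctx.beans.put("realmConfig", "config-for-" + realmId);
        log.add(realmId + ":" + ctx.beans.get("realmConfig"));
      } finally {
        ctx.terminate(); // no state leaks into the next realm's bootstrap
      }
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(bootstrapAll(List.of("realm1", "realm2")));
  }
}
```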
Force-pushed 9da1e48 to 9873165
runtime/service/src/main/java/org/apache/polaris/service/config/Bootstrapper.java
@Identifier("task-executor")
private ExecutorService executor;

Suggested change:

AsyncExec executor;
This would avoid a dependency from the bootstrapping code on the current tasks implementation.
Not sure I understand this, sorry... Could you explain it another way 🤔
`task-executor` is, IIRC, used to run the tasks. Not a big deal; just thinking it could be nice to use a "generic" pool.
A generic pool would be nice, indeed, but I'm not sure about the CDI implications there 🤔 The `task-executor` pool is known NOT to propagate CDI context to tasks (#3210).
Checklist

- `CHANGELOG.md` (if needed)
- `site/content/in-dev/unreleased` (if needed)