(feat) Python CLI: Switch from Poetry to UV for python package management #3410

Merged
MonkeyCanCode merged 5 commits into apache:main from MonkeyCanCode:python_uv
Jan 13, 2026

Conversation

@MonkeyCanCode (Contributor)

Per the discussion in https://lists.apache.org/thread/2hlh3rvmgo7ol3qn08xmyf85grck2p35, the team preferred uv over poetry. A couple of major changes come with this PR:

  1. Switched from poetry to uv
  2. Installed uv in a virtualenv instead of at the system level, to avoid potential conflicts and keep the system clean
  3. Used hatch instead of poetry-core/setuptools for build management
  4. Used a hatch hook for the openapi code generator

For the next PR, we can do more cleanup to remove unnecessary code/scripts around virtualenv management and let uv handle those natively, if preferred. For now, this stays backward compatible with the existing setup in terms of how the CLI is invoked and how docker-compose is run.
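For readers following along, here is a minimal sketch of what a hatch-based `pyproject.toml` with a custom build hook can look like. The package name, version, and hook script path below are illustrative assumptions, not the exact configuration from this PR:

```toml
# Illustrative sketch only: names and paths are assumptions,
# not the exact configuration from this PR.
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "polaris-cli"    # hypothetical package name
version = "0.1.0"

# hatchling supports a "custom" build hook whose code lives in a
# script inside the project; this is one way to run an OpenAPI code
# generator as part of the build.
[tool.hatch.build.hooks.custom]
path = "build_hook.py"  # hypothetical hook script
```

With a layout like this, `uv build` (or `python -m build`) invokes hatchling, which in turn runs the custom hook before packaging.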

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@kevinjqliu (Contributor) left a comment:

generally lgtm, i would double check the build artifacts to make sure that it's the same as what was packaged previously using poetry. That tripped us up during the pyiceberg migration

@MonkeyCanCode (Contributor, Author)

generally lgtm, i would double check the build artifacts to make sure that it's the same as what was packaged previously using poetry. That tripped us up during the pyiceberg migration

Yes, I did that earlier. Most of them are the same, except that the wheel doesn't have the root-level init.py (which is fine). One thing that caught me earlier was spec, as it got excluded from the hatch build because we have spec in .gitignore.
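For context on the .gitignore interaction: hatchling excludes VCS-ignored files from builds by default, and its `artifacts` option is the documented way to re-include them. A hedged sketch (the `spec` path comes from the comment above; treat the exact form as an assumption rather than the PR's actual config):

```toml
# Sketch: re-include a gitignored directory in hatch-built artifacts.
[tool.hatch.build]
artifacts = [
  "spec/",  # gitignored generated specs that should still ship
]
```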

dimas-b previously approved these changes Jan 10, 2026
@dimas-b (Contributor) left a comment:

LGTM, but please wait for more reviews from people who deal with our python CLI more regularly than me :)

@HonahX (Contributor) left a comment:

Overall LGTM! Thanks for migrating this! I believe this could also make the release process easier in the future.

One thing I noticed is that the spark getting-started example is now missing the ipynb in jupyter lab:

[screenshot omitted]

But this may not be relevant to this PR.

@MonkeyCanCode (Contributor, Author)

Overall LGTM! Thanks for migrating this! I believe this could also make the release process easier in the future.

One thing I noticed is that the spark getting-started example is now missing the ipynb in jupyter lab:

But this may not be relevant to this PR.

So this may be due to a change of the base image that I made a long time back, where ipynb is not part of the jupyterlab library. But do we need this dependency, given that the current getting-started notebook is fully functional?

@kevinjqliu (Contributor) left a comment:

LGTM

@HonahX (Contributor) left a comment:

LGTM! And feel free to ignore my previous comment about getting-started example. It turns out I am still on the old image....

@kevinjqliu (Contributor)

It turns out I am still on the old image....

ha, i ran into this recently too. adding the --build flag helps auto-rebuild: apache/iceberg-python#2885

@MonkeyCanCode (Contributor, Author)

thanks for the review @kevinjqliu, @HonahX, and @dimas-b. @dimas-b, any last concerns before I merge this one?

@MonkeyCanCode merged commit b17fd73 into apache:main on Jan 13, 2026
15 checks passed
The github-project-automation bot moved this from "Ready to merge" to "Done" in Basic Kanban Board on Jan 13, 2026
evindj pushed a commit to evindj/polaris that referenced this pull request Jan 26, 2026
snazy added a commit to snazy/polaris that referenced this pull request Feb 11, 2026
* Flatten events hierarchy (apache#3293)

Co-authored-by: Alexandre Dutra <adutra@apache.org>

* (feat)Python CLI: Switch from Poetry to UV for python package management (apache#3410)

* chore(deps): update dependency uv to v0.9.24 (apache#3430)

* (doc): Fix Polaris getting started doc and docker-compose (apache#3425)

* Fix Polaris getting started doc

* Fix Polaris getting started doc

* [Minor] [Site] fix scheduled meetings table (apache#3423)

* NoSQL: add to config-docs (apache#3397)

Add the NoSQL-specific configuration options to the configuration docs generation module.

* Blog: Add blog for Lance-Polaris integration (apache#3424)

* Add `--hierarchical` to Polaris CLI (apache#3426)

* Add `--hierarchical` to Polaris CLI

Following up on apache#3347 this change adds the `--hierarchical`
option to Polaris CLI in order to allow configuring this
storage flag in Azure-based Catalogs.

* Use new Request Context for each realm during implicit bootstrap (apache#3411)

* Use new Request Context for each realm during implicit bootstrap

The implicit (auto) bootstrap calls used to share Request Context
for potentially many realms. That used to work by coincidence because
`RealmConfig`, for example, is a `RequestScoped` bean.

With this change each realm will be bootstrapped in its own dedicated
Request Context.

This change lays down a foundation for future refactoring related to `RealmConfig`.

* Change nested docs to use title case (apache#3432)

* fix(deps): update dependency com.github.dasniko:testcontainers-keycloak to v4.1.1 (apache#3438)

* Fix Helm doc note section under Gateway API (apache#3436)

* Relax UV version (apache#3437)

* fix(deps): update dependency org.jboss.weld.se:weld-se-core to v6.0.4.final (apache#3439)

* Add free-disk-space action to regtest + spark_client_regtests (apache#3429)

The "Spark Client Regression Tests" CI job requires some disk space to operate. With just a little bit of added "content", the job will fail with `no space left on device` during the `docker compose` invocation building an image. Such errors make it impossible to get the log from the workflow unless you capture it before the workflow runs into the `no space left on device` situation; with no space left, the GitHub workflow infra is unable to capture the logs.

```
 #10 ERROR: failed to copy files: userspace copy failed: write /home/spark/polaris/v3.5/integration/build/2.13/quarkus-build/gen/quarkus-app/lib/main/com.google.http-client.google-http-client-1.47.1.jar: no space left on device
```

This change is a stop-gap solution to prevent this error from happening for now.

* fix(deps): update dependency com.google.cloud:google-cloud-iamcredentials to v2.82.0 (apache#3449)

* Update OPA docker image version (apache#3448)

* Blog: Mapping Legacy and Heterogeneous Datalakes in Apache P… (apache#3417)

* fix(deps): update dependency org.postgresql:postgresql to v42.7.9 (apache#3453)

* chore(deps): update apache/spark docker tag to v3.5.8 (apache#3458)

* fix(deps): update dependency org.apache.spark:spark-sql_2.12 to v3.5.8 (apache#3450)

* site: add blog anchors (apache#3443)

* render anchor

* improve readme

* RAT

* fix(deps): update dependency com.google.cloud:google-cloud-storage-bom to v2.62.0 (apache#3455)

* Update renovate to include docker file with suffix (apache#3454)

* feat: Add trace_id to AWS STS session tags for end-to-end correlation (apache#3414)

* feat: Add trace_id to AWS STS session tags for end-to-end correlation

This change enables deterministic correlation between:
- Catalog operations (Polaris events)
- Credential vending (AWS CloudTrail via STS session tags)
- Metrics reports from compute engines (Spark, Trino, etc.)

Changes:
1. Add traceId field to CredentialVendingContext
   - Marked with @Value.Auxiliary to exclude from cache key comparison
   - Every request has unique trace ID, so including it in equals/hashCode
     would prevent all cache hits
   - Trace ID is for correlation/audit only, not authorization

2. Extract OpenTelemetry trace ID in StorageAccessConfigProvider
   - getCurrentTraceId() extracts trace ID from current span context
   - Populates CredentialVendingContext.traceId for each request

3. Add trace_id to AWS STS session tags
   - AwsSessionTagsBuilder includes trace_id in session tags
   - Appears in CloudTrail logs for correlation with catalog operations
   - Uses 'unknown' placeholder when trace ID is not available

4. Update tests to verify trace_id is included in session tags

This enables operators to correlate:
- Which catalog operation triggered credential vending
- Which data access events in CloudTrail correspond to catalog operations
- Which metrics reports correspond to specific catalog operations

* Update AwsCredentialsStorageIntegrationTest.java

* Review comments

  1. Feature Flag to Disable Trace IDs in Session Tags

   Added a new feature configuration flag INCLUDE_TRACE_ID_IN_SESSION_TAGS in FeatureConfiguration.java:
   polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java (excerpt):

   ```java
   public static final FeatureConfiguration<Boolean> INCLUDE_TRACE_ID_IN_SESSION_TAGS =
       PolarisConfiguration.<Boolean>builder()
           .key("INCLUDE_TRACE_ID_IN_SESSION_TAGS")
           .description("If set to true (and INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL is also true), ...")
           .defaultValue(false)
           .buildFeatureConfiguration();
   ```

   2. Cache Key Correctness Solution

   The solution ensures cache correctness by including trace IDs in cache keys only when they affect the vended credentials:

   Key changes:

     1. `StorageCredentialCacheKey` - Added a new traceIdForCaching() field that is populated only when trace IDs affect credentials:
   polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCacheKey.java (excerpt):

   ```java
   @Value.Parameter(order = 10)
   Optional<String> traceIdForCaching();
   ```

     2. `StorageCredentialCache` - Reads both flags and includes trace ID in cache key only when both are enabled:
   polaris-core/src/main/java/org/apache/polaris/core/storage/cache/StorageCredentialCache.java (excerpt):

   ```java
   boolean includeTraceIdInCacheKey = includeSessionTags && includeTraceIdInSessionTags;
   StorageCredentialCacheKey key = StorageCredentialCacheKey.of(..., includeTraceIdInCacheKey);
   ```

     3. `AwsSessionTagsBuilder` - Conditionally includes trace ID based on the new flag.

     4. Tests - Updated existing tests and added a new test testSessionTagsWithTraceIdWhenBothFlagsEnabled.

   How This Resolves the Cache Correctness vs. Efficiency Trade-off

   | Configuration | Trace ID in Session Tags | Trace ID in Cache Key | Caching Behavior |
   |---------------|--------------------------|----------------------|------------------|
   | Session tags disabled | No | No | Efficient caching |
   | Session tags enabled, trace ID disabled (default) | No | No | Efficient caching |
   | Session tags enabled, trace ID enabled | Yes | Yes | Correct but no caching across requests |

   This design ensures:
     • Correctness: When trace IDs affect credentials, they're included in the cache key
     • Efficiency: When trace IDs don't affect credentials, they're excluded from the cache key, allowing cache hits across requests

* Update CHANGELOG.md

Co-authored-by: Anand Kumar Sankaran <anand.sankaran@workday.com>

* site: Update website for 1.3.0 (apache#3464)

* site: Fix blog diagram with corrected architecture image (apache#3466)

* Site: Add 20260108 Community Meeting (apache#3460)

* CI: CLI Nightly build (apache#3457)

* Fix Helm repository update after release vote (apache#3461)

The Github workflow included a `svn add index.yaml` command which would
be correct if it was a Git repository.  But in SVN, this results in an
error when the file is already under version control.  This line is
unnecessary and a simple `svn commit` results in pushing the changes to
the SVN server.

* Fix typo for the wrong reference (apache#3473)

* chore(deps): update apache/ozone docker tag to v2.1.0 (apache#3364)

* chore(deps): update docker.io/prom/prometheus docker tag to v3.9.1 (apache#3366)

* chore(deps): update quay.io/keycloak/keycloak docker tag to v26.5.1 (apache#3362)

* Last merged commit 1451ce4

---------

Co-authored-by: Oleg Soloviov <40199597+olsoloviov@users.noreply.github.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Danica Fine <danica.fine@gmail.com>
Co-authored-by: Jack Ye <jackye@apache.org>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Maninder <parmar.maninderjit@gmail.com>
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
Co-authored-by: Anand K Sankaran <lists@anands.net>
Co-authored-by: Anand Kumar Sankaran <anand.sankaran@workday.com>
Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>