Conversation
    uri=self.catalog_uri,
    authentication_parameters=auth_params,
    service_identity=service_identity,
    warehouse=self.hadoop_warehouse
It's weird to use hadoop_warehouse. We might use the name warehouse, so that it can be applied to both Hadoop and Hive federation.
Thoughts, suggestions? @HonahX @eric-maynard @MonkeyCanCode
Sorry for the late response. I was away for a couple of days. I don't have a strong preference on this naming. Maybe ask @eric-maynard, as he initially added it in eb6b6ad.
The tentative convention I had in mind was to prefix each argument with the federation type it's specific to: ICEBERG_REMOTE_CATALOG_NAME is specific to the Iceberg federation type, and HADOOP_WAREHOUSE is specific to the Hadoop federation type. I thought it might be unclear what just REMOTE_CATALOG_NAME or WAREHOUSE meant.
To be clear, I'm supportive of re-using these flags across federation types (e.g. across Hive / Hadoop), and indeed I think that if we ever flesh out federation in the way I envisioned at the time, we would need to. There would be a way to federate to a Hive catalog both for Iceberg and non-Iceberg tables, and these would surely share arguments.
My only hesitation is that the CLI's method of handling arguments is a bit brittle, so if we re-use them we should make sure the parsing and the --help display behave the way we expect.
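One way to picture the shared-flag idea from this thread is a minimal `argparse` sketch. It is not the Polaris CLI's actual argument handling: the parser name, the `--warehouse` flag, the `--hadoop-warehouse` alias, and the help text are all assumptions made for illustration. It shows a single destination served by both spellings, so that parsing and the `--help` output can be checked together.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser; every flag name here is hypothetical."""
    parser = argparse.ArgumentParser(prog="catalogs-create-sketch")
    parser.add_argument(
        "--catalog-connection-type",
        choices=["iceberg-rest", "hadoop", "hive"],
    )
    parser.add_argument("--catalog-uri")
    # One shared flag for both federation types. The older, type-prefixed
    # spelling is kept as an alias that writes to the same destination.
    parser.add_argument(
        "--warehouse",
        "--hadoop-warehouse",
        dest="warehouse",
        help="Warehouse location (hadoop and hive connection types)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        [
            "--catalog-connection-type", "hive",
            "--catalog-uri", "thrift://localhost:9083",
            "--hadoop-warehouse", "s3://bucket/warehouse",
        ]
    )
    print(args.warehouse)
```

Whether an alias is desirable is a design choice: it keeps existing invocations working, at the cost of listing two spellings in `--help`.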
Pull Request Overview
This PR adds Hive federation support to the CLI by introducing a new "hive" connection type for external catalogs. This allows users to federate with Hive catalogs alongside the existing Iceberg REST and Hadoop options.
- Added "hive" as a supported catalog connection type in CLI constants and commands
- Updated CLI documentation to reflect the new hive option
- Added validation and configuration logic for Hive connections
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| site/content/in-dev/unreleased/command-line-interface.md | Updated documentation to include "hive" in the list of supported catalog connection types |
| client/python/test/test_cli_parsing.py | Added test cases for Hive federation scenarios with both implicit and OAuth authentication |
| client/python/cli/constants.py | Added HIVE enum value to CatalogConnectionType |
| client/python/cli/command/catalogs.py | Added HiveConnectionConfigInfo import, validation logic, and configuration building for Hive connections |
    f" and {Argument.to_flag_name(Arguments.CATALOG_URI)}")
elif self.catalog_connection_type == CatalogConnectionType.HIVE.value:
    if not self.hadoop_warehouse or not self.catalog_uri:
        raise Exception(f"Missing required argument for connection type 'HIVE':"
nit: --help was updated to list iceberg-rest, hadoop, hive so these logs (and the tests) should follow that same convention
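As a rough illustration of this nit, the sketch below builds the error message from the enum value rather than a hard-coded `'HIVE'`, so the message uses the same spelling that `--help` lists. It is a standalone approximation, not the code in `catalogs.py`: the enum values, the function name `validate_hive_args`, and the flag spellings are assumptions.

```python
from enum import Enum
from typing import Optional


class CatalogConnectionType(Enum):
    # Values assumed to match the spellings listed by --help.
    ICEBERG_REST = "iceberg-rest"
    HADOOP = "hadoop"
    HIVE = "hive"


def validate_hive_args(
    connection_type: str,
    catalog_uri: Optional[str],
    warehouse: Optional[str],
) -> None:
    """Raise ValueError if a hive connection lacks a required argument."""
    if connection_type != CatalogConnectionType.HIVE.value:
        return
    # Hypothetical flag spellings, used only to build the message.
    required = (("--catalog-uri", catalog_uri), ("--hadoop-warehouse", warehouse))
    missing = [flag for flag, value in required if not value]
    if missing:
        raise ValueError(
            "Missing required argument for connection type "
            f"'{CatalogConnectionType.HIVE.value}': " + ", ".join(missing)
        )
```

Deriving the name from the enum keeps the message, the `--help` choices, and the test expectations in step if the spelling ever changes.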
# Conflicts:
#   client/python/apache_polaris/cli/command/catalogs.py
#   site/content/in-dev/unreleased/command-line-interface.md
* Do not fail a release when markdown-link-check check fails as it is flaky (apache#3116)
* Source tarball reproducible (apache#3143)
  `git --mtime` MUST use the time zone for reproducible builds.
* Skip release e-mail templates from svn dist copy (apache#3147)
* Make pom.xml always reproducible (apache#3145)
  It turned out in practice that there's no guarantee that the `<parent>` element in `pom.xml` files always appears at the same place. This change ensures that the `<parent>` element always appears at a deterministic location at the top of `pom.xml` files.
* Fix executable POSIX permission in archive files (apache#3146)
  The PR apache#2819 accidentally _removed_ the executable POSIX file permission, assuming that not explicitly setting the attributes via `filePermissions` retains the file-system 'x' permission. This change updates the logic to explicitly check the owner-executable bit and uses `755` or `644` respectively for each individual file in the archive.
* Spark: Initial integration for hudi tables within Polaris (apache#1862)
* Update actions/setup-python digest to 83679a8 (apache#3157)
* Update actions/stale digest to 5611b9d (apache#3155)
* Fix LICENSE and NOTICE in the distributions and docker images. (apache#3125)
* Remove readEntity() call (apache#3111)
  Calling readEntity() is not allowed server-side by some HTTP servers.
* Run CI on release branches (apache#3121)
  The release workflows check whether CI passes for the required checks. This would fail, because CI isn't configured to run on release branches. This change lets CI run on `release/*` branches.
* adding support to use a kms key for s3 buckets data encryption (AWS only) (apache#2802)
  Add catalog-level support for KMS with s3 buckets
* Update plugin jetbrains-changelog to v2.5.0 (apache#3166)
* Update quay.io/keycloak/keycloak Docker tag to v26.4.6 (apache#3163)
* NoSQL: Prepare admin-tool (apache#3134)
  No functional changes.
  1. Refactor the configuration property to a configuration type.
  2. Make `BaseCommand` suitable for non-meta-store-factory use cases.
* Iceberg-Catalog: also set catalog-id for location overlap checks (apache#3136)
* Fix catalog-role creating in `PolarisTestMetaStoreManager` (apache#3122)
  `testLookup()` attempts to check for a catalog-role on catalog ID 0, which is an illegal ID for a catalog. Fix is to move the assertion below the catalog creation.
* Releasy: prepare for Helm 4 (helm package repro) (apache#3088)
  Part of apache#3086
* Update Quarkus Platform and Group to v3.30.1 (apache#3168)
* Relax ARN validation logic (apache#3071)
  Following up on apache#3005, which allowed a wide range of ARN values in the validation RegEx, remove an additional explicit check for `aws-cn` being present in the ARN as a sub-string. Update existing unit tests to process `aws-cn` ARNs as common `aws` ARNs.
  Note: the old validation code does not look correct because it used to check for `aws-cn` anywhere in the ARN string, not just in its "partition" component.
* docs: Add François as Mentor (apache#3162)
  * docs: Add François as Mentor
  * update mentor list according to ASF project info
* Event type IDs + event metadata incl. OTel context (apache#2998)
  This PR implements the action items from the following discussion threads:
  - https://lists.apache.org/thread/yx7pkgczl6k7bt4k4yzqrrq9gn7gqk2p
  - https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy
  - https://lists.apache.org/thread/5dpyo0nn2jbnjtkgv0rm1dz8mpt132j9

  Summary of changes:
  - Introduced a `PolarisEventType` enum holding the 150+ event types.
  - Introduced a `PolarisEventMetadata` interface as suggested by @adnanhemani, exposing: event ID, timestamp, realm ID, principal, request ID, and OTel context.
  - Introduced a `PolarisEventMetadataFactory` to centralize the logic for gathering the various elements of an event metadata.
  - Modified `PolarisEvent` to expose 3 new methods:
    - `PolarisEventType type()`
    - `PolarisEventMetadata metadata()`
  - Persistence of OTel context is done in `additional_properties` as suggested by @flyrain.
  - Added `InMemoryBufferEventListenerIntegrationTest` to verify that all contextual data is properly persisted.
* fix typo in management API yaml (apache#3172)
* Fix homepage Get Started button layout (apache#3169)
  Wrap the Get Started button in a div container to prevent it from becoming inline with text at certain screen widths. Follows Docsy blocks/cover shortcode pattern.
* fix OPA javadoc referencing `OpaSchemaGenerator` (apache#3153)
  `OpaSchemaGenerator` is not on the classpath of `opa/impl/main` so the javadoc tool is not able to resolve a `@link` to it. Use `@code` instead to avoid build warnings like the following:
* Update dependency com.azure:azure-sdk-bom to v1.3.3 (apache#3179)
* Update dependency com.google.errorprone:error_prone_core to v2.45.0 (apache#3177)
* test: Add Some Spark Client Tests and Update Documentation on Generic Tables (apache#3152)
* Site: Make homepage image full-width (apache#3171)
  Add CSS class to allow images to span full viewport width by canceling out container padding. Apply to homepage hero image using AsciiDoc role attribute.
* chore(enhancement): gitignore application-local.properties (apache#3175)
* Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.23-6.1764155306 (apache#3186)
* Update quay.io/keycloak/keycloak Docker tag to v26.4.7 (apache#3185)
* Update dependency software.amazon.awssdk:bom to v2.39.6 (apache#3184)
* Testing: increase visibility + make PCC/PMSM accessible (apache#3137)
  * `BasePolarisMetaStoreManagerTest`: make `PolarisCallContext` + `PolarisMetaStoreManager` + `PolarisTestMetaStoreManager` accessible by subclasses
  * Make constants of `PolarisRestCatalogMinIOIT` accessible
* Update docker.io/prom/prometheus Docker tag to v3.8.0 (apache#3191)
* Update helm/chart-testing-action action to v2.8.0 (apache#2982)
* chore(enhancement): make custom hidden tasks visible in ./gradlew tasks (apache#3176)
* fix type cast warning in PolarisCatalogUtils (apache#3178)
  ```
  plugins/spark/v3.5/spark/src/main/java/org/apache/polaris/spark/utils/PolarisCatalogUtils.java:131: warning: [unchecked] unchecked cast
  scala.collection.immutable.Map$.MODULE$.apply(
  ^
  required: Map<String,String>
  found: Map
  ```
* chore(deps): update actions/stale digest to 9971854 (apache#3197)
* fix(deps): update dependency io.smallrye:jandex to v3.5.3 (apache#3193)
* chore(deps): update actions/checkout digest to 8e8c483 (apache#3192)
* added venv to the gitignore (apache#3199)
* CLI: Add Hive federation option (apache#2798)
* chore(deps): update docker.io/jaegertracing/all-in-one docker tag to v1.76.0 (apache#3201)
* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.23-6.1764562148 (apache#3202)
* fix(deps): update quarkus platform and group to v3.30.2 (apache#3198)
* chore(deps): update dependency boto3 to ~=1.42.2 (apache#3126)
* NoSQL: CDI / Quarkus (apache#3135)
* fix(deps): update dependency com.adobe.testing:s3mock-testcontainers to v4.11.0 (apache#3208)
* Update dependency mypy to >=1.19, <=1.19.0 (apache#3180)
* chore(deps): update actions/setup-java digest to f2beeb2 (apache#3206)
* Fix spelling in comments (apache#3212)
* Make each task attempt run in a dedicated CDI request context (apache#3210)
  * Make each task attempt run in a dedicated CDI request context

  Currently, tasks inherit the CDI context from the requests that submitted them, but run asynchronously. Therefore, if the original request context ends, the task may not be able to use the expired beans for that context.
  This change makes each task run in its own dedicated CDI request context with `RealmContext` explicitly propagated in `TaskExecutorImpl`.
  Test-only error handlers are added to `TaskExecutorImpl` to facilitate detecting task errors during CI.
  Fixes apache#3203
* fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.3.0 (apache#3218)
* Last merged commit be3c88b

---------

Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>
Co-authored-by: Rahil C <32500120+rahil-c@users.noreply.github.com>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: fabio-rizzo-01 <fabio.rizzocascio@jpmorgan.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Tamas Mate <50709850+tmater@users.noreply.github.com>
Co-authored-by: Adam Christian <105929021+adam-christian-software@users.noreply.github.com>
Co-authored-by: Artur Rakhmatulin <artur.rakhmatulin@gmail.com>
Co-authored-by: cccs-cat001 <56204545+cccs-cat001@users.noreply.github.com>
Co-authored-by: Yufei Gu <yufei@apache.org>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>