Skip to content

Conversation

@adutra
Copy link
Contributor

@adutra adutra commented Aug 8, 2025

This PR contains no added or modified functionality.

The main goal of this change is to facilitate future integration of federated principals:

  • AuthenticatedPolarisPrincipal becomes an interface PolarisPrincipal, as the original class leaks implementation details (references to PrincipalEntity and thus to the storage layer). The new interface does not reference the storage layer. This is one step further towards easy pluggability of authentication in Polaris.

  • The Authenticator.authenticate() method does not return an Optional anymore, as this was ambiguous (returning Optional.empty() vs throwing NotAuthorizedException).

  • Also the Authenticator interface is not generic anymore. This was an artifact of times when there were two kinds of Authenticators in Polaris (one for internal auth, the other for external) and is not necessary anymore.

@adutra
Copy link
Contributor Author

adutra commented Aug 8, 2025

FYI @collado-mike @dennishuo @flyrain

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactoring LGTM 👍 Just one javadoc comment :)

/**
* Returns the unique identifier of the principal.
*
* <p>This identifier is used to uniquely identify the principal within the Polaris system.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unique within a realm, though, right?

dimas-b
dimas-b previously approved these changes Aug 12, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 12, 2025
@collado-mike
Copy link
Contributor

I'd like to take a look at this, but I won't have cycles until Friday. Can we please hold off on merging before then?

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, it's a mostly "boring" change.

I wonder though whether PolarisPrincipal should have the id property removed or become optional (see reason below).

.get();
return Uni.createFrom()
.item(() -> setPrincipalAuthInfo(identity, principalMapper, principalRolesMapper));
.item(() -> setPolarisCredential(identity, principalMapper, principalRolesMapper));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: mind adding a comment that the two mappers may do expensive work, hence justifying the async call here?

Copy link
Contributor Author

@adutra adutra Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right but in this case we need context.runBlocking

Comment on lines +68 to +73
/**
* Returns the unique identifier of the principal.
*
* <p>This identifier is used to uniquely identify the principal within a Polaris realm.
*/
long getId();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We differentiate between persisted (PersistedPolarisPrincipal) and non-persistent ones. I wonder whether this getter should be present at all here. It could at least return an OptionalLong to make it clear that only persisted principals have an ID. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the Resolver requires the principal to have an id, since it resolves principals by id. Thus it seemed to me that it was legit to require from all principals to have at least a numeric identifier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow-up task, we could attempt to modify the Resolver so that it makes principal lookups by name. WDYT @collado-mike ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to move past the Resolver needing to resolve the principal at all. This was largely present because the Resolver fetched the accessible PrincipalRoles based on grant records, but we have moved that requirement out so that we can rely on roles defined in the token instead. I think it's reasonable to change the Resolver to skip resolving the principal and roles entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's an interesting improvement for sure! I will try to do that in a follow-up task 👍

snazy
snazy previously approved these changes Aug 14, 2025
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dimas-b
Copy link
Contributor

dimas-b commented Aug 14, 2025

Shall we mark it as "draft" until @collado-mike reviews? (just to avoid the impression that it's mergeable)

collado-mike
collado-mike previously approved these changes Aug 16, 2025
Copy link
Contributor

@collado-mike collado-mike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for holding off. This looks good to me.

@Override
public Set<String> getActiveRoles(AuthenticatedPolarisPrincipal principal) {
public Set<String> getActiveRoles(PolarisPrincipal principal) {
if (!(principal instanceof PersistedPolarisPrincipal persistedPolarisPrincipal)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh. I thought the variable here was always scoped to within the if block! TIL

adutra added 5 commits August 18, 2025 11:27
The main goal of this change is to facilitate future integration of federated principals:

- `AuthenticatedPolarisPrincipal` becomes an interface `PolarisPrincipal`, as the original class leaks implementation details (references to `PrincipalEntity` and thus to the storage layer). The new interface does not reference the storage layer. This is one step further towards easy pluggability of authentication in Polaris.

- The `Authenticator.authenticate()` method does not return an `Optional` anymore, as this was ambiguous (returning `Optional.empty()` vs throwing `NotAuthorizedException`).

- Also the `Authenticator` interface is not generic anymore. This was an artifact of times when there were two kinds of `Authenticators` in Polaris (one for internal auth, the other for external) and is not necessary anymore.
@adutra adutra dismissed stale reviews from collado-mike and snazy via d2ece28 August 18, 2025 09:30
@adutra
Copy link
Contributor Author

adutra commented Aug 18, 2025

Had to rebase to fix merge conflicts – PTAL again.

@adutra adutra merged commit 96f1459 into apache:main Aug 18, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Aug 18, 2025
@adutra adutra deleted the auth-refactor branch August 18, 2025 13:11
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Refactor Authenticator and PolarisPrincipal (apache#2307)

The main goal of this change is to facilitate future integration of federated principals:

- `AuthenticatedPolarisPrincipal` becomes an interface `PolarisPrincipal`, as the original class leaks implementation details (references to `PrincipalEntity` and thus to the storage layer). The new interface does not reference the storage layer. This is one step further towards easy pluggability of authentication in Polaris.

- The `Authenticator.authenticate()` method does not return an `Optional` anymore, as this was ambiguous (returning `Optional.empty()` vs throwing `NotAuthorizedException`).

- Also the `Authenticator` interface is not generic anymore. This was an artifact of times when there were two kinds of `Authenticators` in Polaris (one for internal auth, the other for external) and is not necessary anymore.

* Add PolarisDiagnostics field to TransactionalMetaStoreManagerImpl (apache#2361)

the ultimate goal is removing the PolarisCallContext parameter from every
PolarisMetaStoreManager interface method, so we make steps towards reducing
its usage first.

* Support HMS Federation (apache#2355)

Supports federating to HiveCatalog using the Iceberg REST library. 
All hive dependencies are added in an independent module, i.e., `polaris-extensions-federation-hive` and can be removed/converted to a compile time flag if necessary. 
Similar to HadoopCatalog, HMS federation support is currently restricted to `IMPLICIT` auth. The underlying authentication can be any form that Hive supports, however Polaris will not store and manage any of these credentials. Again, similar to HadoopCatalog, this version supports federating to a single Hive instance. 

This PR relies on Polaris discovering the `hive-site.xml` file to get the configuration options from the classpath (including `HADOOP_CONF_DIR`).

The spec change has been discussed in the [dev mailing list](https://lists.apache.org/thread/5qktjv6rzd8pghcl6f4oohko798o2p2g), followed by a discussion in the Polaris community sync on Aug 7, 2025. 

Testing: 
Modified the regression test to locally test that Hive federation works as expected. The next step would be to add a regression test once the change is baked into the Polaris docker image (for CI builds).


This PR primarily builds on apache#1305 and apache#1466. Thank you @dennishuo and @eric-maynard for helping out with this!

* Add PolarisDiagnostics field to TransactionWorkspaceMetaStoreManager (apache#2359)

the ultimate goal is removing the `PolarisCallContext` parameter from every
`PolarisMetaStoreManager` interface method, so we make steps towards
reducing its usage first.

* Rat-ignore user-settings for hugo-run-in-docker (apache#2376)

* Modularize generic table federation (apache#2379)

In apache#2369 Iceberg table federation was refactored around the new `IcebergRESTExternalCatalogFactory` type based on discussion in the community sync. This has unblocked the ability to federate to more non-Iceberg catalogs, such as in apache#2355. This PR refactors generic table federation to go through the same mechanism. After this, we can go through and implement generic table federation for the existing `IcebergRESTExternalCatalogFactory` implementations.

* Update community meeting dates (apache#2382)

* Reduce getRealmConfig calls (apache#2337)

Classes with a `CallContext` field should call `getRealmConfig` once and store it as a field as well.
The idea is that long term we would want to stop relying on the `CallContext` itself but instead inject its individual items. Thus we also add `RealmConfig` to `TestServices`.

* Python client: make S3 role-ARN optional and add missing endpoint-internal property (apache#2339)

* fix(deps): update dependency io.prometheus:prometheus-metrics-exporter-servlet-jakarta to v1.4.1 (apache#2377)

* chore(deps): bump s3mock from 3.11.0 to 4.7.0 (apache#2375)

Updates S3Mock testcontainer dependency from 3.11.0 to 4.7.0 and refactors usage into a centralized wrapper class in runtime/test-common.

Changes

    Upgraded S3Mock testcontainer to 4.7.0
    Created S3Mock wrapper class for consistent configuration
    Consolidated S3 config properties generation
    Updated integration tests to use new wrapper

No functional changes to test behavior.

* Nit: extract getResolvedCatalogEntity method in IcebergCatalogHandler (apache#2387)

* Nit: remove transitive dependencies from runtime/server/build.gradle.kts (apache#2385)

* Nit: add methods isExternal and isStaticFacade to CatalogEntity (apache#2386)

* Minor refactor of integration test classes (apache#2384)

This change promotes `CatalogConfig` and `RestCatalogConfig` to top-level, public annotations and introduces a few "hooks" in `PolarisRestCatalogIntegrationBase` that can be overridden by subclasses.

This change is a preparatory work for apache#2280 (S3 remote signing).

* Remove BaseMetaStoreManager.serializeProperties (apache#2374)

similar to 7af85be we should prefer
the existing helper methods on the entity instead

* fix: minor corrections of documentation (apache#2397)

- fixed dead link to catalog definition in Iceberg docs on Entities page
- removed single quotes from credential parameter in the cmdline example for connecting a local spark-sql: env variables need to be resolved in cmdline, they will not be resolved by spark-sql itself.

* chore(deps): update azure/setup-helm action to v4.3.1 (apache#2402)

* Add 1.0.1 release to the website (apache#2400)

* Add PolarisDiagnostics field to AbstractTransactionalPersistence (apache#2372)

The ultimate goal is removing the `PolarisCallContext` parameter from every `PolarisMetaStoreManager` interface method, so we make steps towards reducing its usage first.

* NoSQL: javadoc nit

* Last merged commit fcd4777

---------

Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: Pooja Nilangekar <[email protected]>
Co-authored-by: Eric Maynard <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Artur Rakhmatulin <[email protected]>
Co-authored-by: olsoloviov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants