core: introduce resolvePathsOnly() in Resolver and PolarisResolutionManifest #3427

Open
sungwy wants to merge 2 commits into apache:main from sungwy:resolver-apis

sungwy (Contributor) commented Jan 13, 2026

Introduce a path-only resolution API in the resolver stack without changing handler or authorizer behavior. While working on a PR to move the principal, role, and entity resolution logic into PolarisAuthorizerImpl, I learned that resolution is done for three reasons:

  1. Materializing the concrete entity/ID and hierarchy needed for execution (e.g., delete/rename/move, policy attach/detach, create under namespace).
  2. Existence checks (return not‑found early, avoid operating on missing entities).
  3. Providing resolved targets to the authorizer (entity IDs, types, and parent relationships) so auth decisions are based on authoritative state.

While (2) and (3) can be moved into the PolarisAuthorizer, in line with the suggestion discussed in this PR comment, (1) still needs to be executed in the handlers after a successful authorization.

Hence, introducing this path-only resolution API will allow a follow-up auth refactor to decouple execution-time path resolution from (2) and (3) for authorizable actions that require only entity or path resolution, without attempting to resolve the principal or its associated roles.
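To illustrate the distinction, here is a minimal toy sketch of full versus path-only resolution. It is not the Polaris Resolver: the class `ToyResolver`, its in-memory sets, and the `principalLookups` counter are assumptions made up for this example. Only the method names `resolveAll()` / `resolvePathsOnly()` and the status names come from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model for illustration only; this is not the Polaris Resolver.
class ToyResolver {
  enum Status { SUCCESS, CALLER_PRINCIPAL_DOES_NOT_EXIST, PATH_COULD_NOT_BE_FULLY_RESOLVED }

  private final Set<String> knownPrincipals; // hypothetical principal store
  private final Set<String> knownPaths;      // hypothetical entity store keyed by full path
  private final String callerPrincipal;
  private final List<String> pathsToResolve = new ArrayList<>();
  int principalLookups = 0;                  // how many times the principal store was consulted

  ToyResolver(Set<String> knownPrincipals, Set<String> knownPaths, String callerPrincipal) {
    this.knownPrincipals = knownPrincipals;
    this.knownPaths = knownPaths;
    this.callerPrincipal = callerPrincipal;
  }

  void addPath(String path) {
    pathsToResolve.add(path);
  }

  // Full resolution: caller principal first, then every registered path.
  Status resolveAll() {
    principalLookups++;
    if (!knownPrincipals.contains(callerPrincipal)) {
      return Status.CALLER_PRINCIPAL_DOES_NOT_EXIST;
    }
    return resolvePathsOnly();
  }

  // Path-only resolution: never touches the principal store.
  Status resolvePathsOnly() {
    for (String path : pathsToResolve) {
      if (!knownPaths.contains(path)) {
        return Status.PATH_COULD_NOT_BE_FULLY_RESOLVED;
      }
    }
    return Status.SUCCESS;
  }
}
```

In this toy model, a caller that is unknown to the principal store can still get SUCCESS from `resolvePathsOnly()`, while `resolveAll()` stops at CALLER_PRINCIPAL_DOES_NOT_EXIST.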

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Copilot AI review requested due to automatic review settings January 13, 2026 01:55
github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Jan 13, 2026
Copilot AI left a comment

Pull request overview

This PR introduces a new path-only resolution API in the resolver stack to support an upcoming authorization refactor. The changes add a resolvePathsOnly() method that resolves reference catalogs, top-level entities, and paths without resolving the caller principal or any roles.

Changes:

  • Introduced MAX_RESOLVE_PASSES constant to replace hard-coded loop limit
  • Added resolvePathsOnly() methods to Resolver and PolarisResolutionManifest that skip principal and role resolution
  • Refactored resolveReferenceCatalog() to extract common logic into new resolveReferenceCatalogWithoutRoles() method
  • Added test coverage for the new path-only resolution functionality

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| Resolver.java | Introduced MAX_RESOLVE_PASSES constant, added resolvePathsOnly() and runResolvePassPathsOnly() methods for path-only resolution, refactored catalog resolution to extract resolveReferenceCatalogWithoutRoles() method |
| PolarisResolutionManifest.java | Added resolvePathsOnly() delegation method and updated comments to reference both resolution methods |
| BaseResolverTest.java | Added testResolvePathsOnlySkipsPrincipalAndRoles() test to verify path-only resolution skips principal and role activation |

Comment on lines +281 to +283
// retry until a pass terminates, or we reached the maximum iteration count. Note that we
// should finish normally in no more than few passes so the 1000 limit is really to avoid
// spinning forever if there is a bug.
Copilot AI Jan 13, 2026

The comment mentions "1000 limit" but should reference the constant MAX_RESOLVE_PASSES instead to improve maintainability. Consider changing "the 1000 limit" to "the MAX_RESOLVE_PASSES limit" or just "this limit".
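As a minimal sketch of the pattern under discussion, a bounded retry loop can refer to the named constant in both code and messages, so no literal needs to be repeated in comments. The class `BoundedRetrySketch`, the `PassStatus` enum, and the choice to throw on non-convergence are assumptions for this example, not the actual Resolver code.

```java
import java.util.function.Supplier;

// Illustrative only: a bounded retry loop driven by a named limit.
class BoundedRetrySketch {
  // Safety limit; a correct run is expected to finish in a few passes.
  static final int MAX_RESOLVE_PASSES = 1000;

  enum PassStatus { DONE, RETRY }

  // Runs passes until one reports DONE or the limit is reached.
  // Returns the number of passes executed; throws if the limit is hit (an assumption of this sketch).
  static int runUntilDone(Supplier<PassStatus> pass) {
    int count = 0;
    PassStatus status;
    do {
      status = pass.get();
      count++;
    } while (status == PassStatus.RETRY && count < MAX_RESOLVE_PASSES);

    if (status == PassStatus.RETRY) {
      throw new IllegalStateException(
          "resolution did not terminate within " + MAX_RESOLVE_PASSES + " passes");
    }
    return count;
  }
}
```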

sungwy (Contributor, Author) commented Jan 13, 2026

Hi @adutra and @dimas-b - it took a minute for me to wrap my head around the PolarisResolutionManifest. Here’s a small PR to inch this auth refactoring forward. Could I ask for your help in reviewing if you approve of this direction for refactoring auth?

dimas-b (Contributor) left a comment

Hi @sungwy, the general direction of this PR LGTM 👍 Some non-critical comments below.

* roles. Returns SUCCESS, PATH_COULD_NOT_BE_FULLY_RESOLVED, or ENTITY_COULD_NOT_BE_RESOLVED and
* never returns CALLER_PRINCIPAL_DOES_NOT_EXIST.
*/
public ResolverStatus resolvePathsOnly() {
Contributor

nit: the method name sounds a bit obscure to me, TBH 😅 How about resolveNonIamEntities or resolveResourceEntities or resolveCatalogEntities? I personally prefer the latter, but I suppose it may still be confusing wrt "Catalog Roles". WDYT?

Contributor Author

Hmmm I agree that I'm not a fan of the current name, nor these options :)

But naming is important! Let me give this a bit more thought while we continue the review

* incoming rest request, Once resolved, the request can be authorized.
*/
public class Resolver {
private static final int MAX_RESOLVE_PASSES = 1000;
Contributor

+1000 :)

*
* @return status of the resolve pass
*/
private ResolverStatus runResolvePassPathsOnly() {
Contributor

WDYT about making a Resolver sub-class for the new logic?

Contributor Author

Hi @dimas-b - by that, are you suggesting that the new Resolver sub-class should be chosen over the existing Resolver through a runtime configuration?

Contributor

Yes, something like that... but it depends on how / where we actually change to the new call path... which is not obvious in this PR 🤔

dimas-b (Contributor) commented Jan 14, 2026

@sungwy : how do you envision choosing which Resolver method to use at runtime?

sungwy (Contributor, Author) commented Jan 16, 2026

> @sungwy : how do you envision choosing which Resolver method to use at runtime?

Good question @dimas-b.

Today we effectively hard-code the resolver choice in handlers. There are two call sites:

  • Handlers (CatalogHandler, PolarisAdminService)
    • resolve for existence checks before authorization
    • resolve again to fetch entities for execution after authorization
  • Authorizers (PolarisAuthorizerImpl, OpaPolarisAuthorizer)
    • implicitly depend on fully resolved entities to make authorization decisions

If we introduce a new PolarisAuthorizer API that accepts unresolved AuthorizationTargets and move existence checks into the Authorizer, then:

  • the Authorizer decides whether resolution is needed at all for that callsite, and which entities need resolution
  • unsupported actions can fail fast (e.g. PrincipalRole creation in OpaPolarisAuthorizer) without resolution
  • authorizers that do not depend on Polaris RBAC can skip RBAC-entity resolution entirely

Before refactor (handlers always resolve eagerly):

| Callsite | Existence check (Handler) | Execution fetch (Handler) |
| --- | --- | --- |
| PolarisAuthorizerImpl – RBAC | resolveAll() | resolveAll() |
| PolarisAuthorizerImpl – Catalog | resolveAll() | resolveAll() |
| OpaPolarisAuthorizer – RBAC | resolveAll() | resolveAll() |
| OpaPolarisAuthorizer – Catalog | resolveAll() | resolveAll() |

After refactor (authorizer controls resolution):

| Callsite | Existence check (Authorizer) | Execution fetch (Handler) |
| --- | --- | --- |
| PolarisAuthorizerImpl – RBAC | resolveAll() | resolveAll() |
| PolarisAuthorizerImpl – Catalog | resolvePathsOnly() | resolvePathsOnly() |
| OpaPolarisAuthorizer – RBAC | throw (unsupported) | skipped |
| OpaPolarisAuthorizer – Catalog | resolvePathsOnly() | resolvePathsOnly() |

In summary: by moving existence checks into the Authorizer and standardizing catalog call sites on resolvePathsOnly():

  • OpaPolarisAuthorizer can remain truly non-RBAC-dependent
  • unsupported actions fail early, before any metastore lookups
  • handlers do not need to understand authorization-specific resolution semantics
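The "after refactor" behavior can be sketched as an authorizer-side decision about which resolution mode a call site needs. Everything here is hypothetical and exists only for illustration (`ResolutionPlanSketch`, `TargetKind`, `ResolutionMode`, `SketchAuthorizer`, and the two implementations); it is not the proposed PolarisAuthorizer API, which has yet to be written up.

```java
// Illustrative only: the authorizer, not the handler, picks the resolution mode.
class ResolutionPlanSketch {
  enum TargetKind { RBAC, CATALOG }

  enum ResolutionMode { RESOLVE_ALL, RESOLVE_PATHS_ONLY }

  // Hypothetical interface: which resolution does this authorizer need for a target kind?
  interface SketchAuthorizer {
    ResolutionMode resolutionFor(TargetKind kind);
  }

  // Stand-in for an RBAC-based authorizer: RBAC targets need principals and roles resolved.
  static final class RbacSketchAuthorizer implements SketchAuthorizer {
    @Override
    public ResolutionMode resolutionFor(TargetKind kind) {
      return kind == TargetKind.RBAC
          ? ResolutionMode.RESOLVE_ALL
          : ResolutionMode.RESOLVE_PATHS_ONLY;
    }
  }

  // Stand-in for an external-policy authorizer: RBAC actions fail fast, before any lookup.
  static final class ExternalSketchAuthorizer implements SketchAuthorizer {
    @Override
    public ResolutionMode resolutionFor(TargetKind kind) {
      if (kind == TargetKind.RBAC) {
        throw new UnsupportedOperationException("RBAC actions are not supported");
      }
      return ResolutionMode.RESOLVE_PATHS_ONLY;
    }
  }
}
```

A handler in this sketch would call `resolutionFor(...)` once and then run whichever resolver method the returned mode names, without knowing why the authorizer chose it.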

dimas-b (Contributor) commented Jan 16, 2026

@sungwy : thanks for the refactoring proposal...

I need some more time to think about it 😅 Ideally, I'd like the resolver algorithm to be tunable per persistence impl. Some efficiencies can be found in the relatively new NoSQL impl. (e.g. the hierarchical ID-based lookup via the parents trail is probably not required, paths can be resolved directly).

If other people have comments, I'd be interested to know their opinions too.

sungwy (Contributor, Author) commented Jan 16, 2026

> @sungwy : thanks for the refactoring proposal...
>
> I need some more time to think about it 😅 Ideally, I'd like the resolver algorithm to be tunable per persistence impl. Some efficiencies can be found in the relatively new NoSQL impl. (e.g. the hierarchical ID-based lookup via the parents trail is probably not required, paths can be resolved directly).
>
> If other people have comments, I'd be interested to know their opinions too.

I tried to convey the idea as best as possible here, but I think a proper RFC would be best given the scope of the proposal. I'm working on one, and I'll send it to the mailing list once the doc names the specific call sites I'm thinking of updating.

sungwy (Contributor, Author) commented Jan 16, 2026

> Ideally, I'd like the resolver algorithm to be tunable per persistence impl. Some efficiencies can be found in the relatively new NoSQL impl. (e.g. the hierarchical ID-based lookup via the parents trail is probably not required, paths can be resolved directly).

This is not a requirement I have been thinking of so far... how do you envision the relationship of persistence impl <-> resolver impl <-> auth impl to be in an ideal world?

So far, I've only been thinking of a world where auth impl would affect the resolver's behavior

dimas-b (Contributor) commented Jan 17, 2026

> how do you envision the relationship of persistence impl <-> resolver impl <-> auth impl to be in an ideal world?

Rough sketch :)

  • REST API provides initial "seeds" (generally catalog paths or names) for the Resolver
  • Resolver "outputs" are Entities found from those "seeds".
  • The resolution algorithm is specific to the Persistence impl. (this is where Resolver sub-classes may be needed, as I commented above). The one for JDBC uses the current parent-ID-based loop. The one for NoSQL uses direct lookup by path/name (this one may be implemented later).
  • One of the "seeds" is the Principal + Role Names (provided by existing AuthN code)
  • Whether Principal + Roles needs to be resolved to Entities is indicated by the Authorizer impl. This part is common to all Resolvers and can probably be done in the common base class.
  • Ideally there is only one resolution "action": it produces the complete data set for the service endpoint to perform AuthZ and the actual API impl. logic. Some early, abandoned ideas can be found in "Rework PolarisAuthorizer to use self-contained manifest on input" (#499), but we do not have to do it exactly that way.

WDYT?
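One way to picture the rough sketch above is a common base class that handles the principal seed and delegates path lookup to persistence-specific sub-classes. All names here (`SeedResolverSketch`, `ParentTrailResolver`, `DirectPathResolver`, the in-memory maps, and the root id `0`) are hypothetical stand-ins for illustration; none of them exist in Polaris.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Illustrative only: seeds in, resolved entity ids out, in a single resolution "action".
abstract class SeedResolverSketch {
  private final Map<String, Long> principalIds; // hypothetical principal store

  SeedResolverSketch(Map<String, Long> principalIds) {
    this.principalIds = principalIds;
  }

  // Common part: the authorizer indicates whether the principal seed must be resolved.
  final List<Long> resolve(
      List<List<String>> pathSeeds, String principalSeed, boolean authorizerNeedsPrincipal) {
    List<Long> resolved = new ArrayList<>();
    if (authorizerNeedsPrincipal) {
      resolved.add(require(principalIds.get(principalSeed), principalSeed));
    }
    for (List<String> path : pathSeeds) {
      resolved.add(lookupPath(path));
    }
    return resolved;
  }

  // Persistence-specific part.
  protected abstract long lookupPath(List<String> path);

  protected static long require(Long id, String what) {
    if (id == null) {
      throw new NoSuchElementException("not found: " + what);
    }
    return id;
  }
}

// Walks the path one level at a time, looking up each child by (parent id, name).
final class ParentTrailResolver extends SeedResolverSketch {
  private static final long ROOT_ID = 0L;   // assumed root id for this sketch
  private final Map<String, Long> childIds; // key: "<parentId>/<name>"

  ParentTrailResolver(Map<String, Long> principalIds, Map<String, Long> childIds) {
    super(principalIds);
    this.childIds = childIds;
  }

  @Override
  protected long lookupPath(List<String> path) {
    long parentId = ROOT_ID;
    for (String name : path) {
      parentId = require(childIds.get(parentId + "/" + name), name);
    }
    return parentId;
  }
}

// Resolves the whole path with a single lookup keyed by the full path.
final class DirectPathResolver extends SeedResolverSketch {
  private final Map<String, Long> idsByPath; // key: "ns1/tbl"

  DirectPathResolver(Map<String, Long> principalIds, Map<String, Long> idsByPath) {
    super(principalIds);
    this.idsByPath = idsByPath;
  }

  @Override
  protected long lookupPath(List<String> path) {
    String key = String.join("/", path);
    return require(idsByPath.get(key), key);
  }
}
```

Both sub-classes return the same ids for the same seeds in this sketch; only the number and shape of store lookups differ, which is the part that would be tuned per persistence implementation.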

adutra (Contributor) left a comment

Thanks @sungwy, I like both the direction being taken here and the ideas outlined by @dimas-b here: #3427 (comment)!

We need to find a way to articulate the changes to Resolver not only with the authorizer (the "receiving end") but also with the authenticator (the "sending end"). Your work on external principals in #3250 is imho relevant: if a principal is external, I don't think we should call resolveAll anywhere, or resolveAll should react differently to external principals.

Looking forward to the design doc!

@@ -812,14 +900,10 @@ private List<ResolvedPolarisEntity> resolvePrincipalRolesByName(
*/
private ResolverStatus resolveReferenceCatalog(
Contributor

nit: I'd prefer to have this method renamed to resolveReferenceCatalogAndRoles and the one below to resolveReferenceCatalog.

dimas-b (Contributor) commented Jan 19, 2026

> Your work on external principals in #3250 is imho relevant [...]

Good idea about using the Principal's isExternal / isFederated / type property to make the decision whether to resolve it to a Polaris entity (same for "active" roles).

resolutionManifest.addPath(
new ResolverPath(Arrays.asList(ns.levels()), PolarisEntityType.NAMESPACE), ns));
- ResolverStatus status = resolutionManifest.resolveAll();
+ ResolverStatus status = resolutionManifest.resolvePathsOnly();
Contributor

Do we plan to make the same changes for other classes, like CatalogHandler and PolicyCatalogHandler?
