@fabio-rizzo-01

First draft of the doc page describing realms.

flyrain commented Apr 28, 2025

Comment on lines 49 to 50

I think for Polaris admins this is the most salient point -- "realm" as a concept is essentially one layer above "catalog", which itself is used to segregate different Iceberg Catalogs from one another.

This might be a miss on my end... Originally, I did not put Realm in this set of docs because technically, there is no Realm entity in Polaris. However, it's still a conceptual entity, so I think having it here makes sense.

This PR created a new page for realm only. Two options:

  1. Move the realm-related doc to the entity page here, https://polaris.apache.org/in-dev/unreleased/entities/
  2. Rename the title to Realm instead of Entities

I'm OK with either one, but prefer option 1 to keep the docs compact.

Oh, yeah this looks wrong. @fabio-rizzo-01 can you fix the title or move the content?

dimas-b commented Apr 28, 2025

This PR seems to have a lot of commits that are not relevant to the change it brings 🤔

@dimas-b dimas-b left a comment

Thanks for starting this doc page, @fabio-rizzo-01 !

However, I'm not sure I agree on the approach (as you'll see from my comments). From my POV, it would be preferable to state specifically what a realm means in Polaris, without going into general discussion.

I believe it is important to show that Realm is not part of any API and is defined by a separate HTTP header in a way that does not affect existing API functionality.

Next, I think we need to state that Polaris behaviour in a particular realm is independent of data / config in other realms.

Then, we need to state that at the persistence layer realm data may be isolated in different databases or may be co-located in the same database, but is still segregated using different keys.

Then, we can give examples for how realms can be applied to handle different security domains, or different deployments, or different environments (dev / QA)... etc.

That would be useful for end users from my POV. However, if other people prefer the current approach to describing realms, that would be fine with me too :)
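
To make the HTTP-header point above concrete, here is a minimal sketch of a client selecting a realm per request. The header name `Polaris-Realm`, the base URL, the path and the realm names are all assumptions for illustration (the header name is deployment configuration, and none of these values come from this PR). The request is only built, not sent.

```python
import urllib.request

# Assumed header name, for illustration only; check your deployment's configuration.
REALM_HEADER = "Polaris-Realm"


def build_request(base_url: str, path: str, token: str, realm: str) -> urllib.request.Request:
    """Build (but do not send) a GET request scoped to one realm."""
    headers = {
        "Authorization": f"Bearer {token}",
        REALM_HEADER: realm,  # the realm travels in a header, not in the URL or body
    }
    return urllib.request.Request(base_url + path, headers=headers, method="GET")


# Hypothetical endpoint and realm names: the URL is identical, only the header differs.
dev_request = build_request("http://localhost:8181", "/api/catalog/v1/config", "<token>", "dev")
qa_request = build_request("http://localhost:8181", "/api/catalog/v1/config", "<token>", "qa")
```

Because the realm is not part of the path or payload, existing API calls keep their shape; only the extra header decides which realm serves them.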

The first paragraph talks about realm in Polaris, but here we generalize to "realm in software systems". I'd prefer to stay focused on Polaris.

nit: is <br/> really necessary?

"could define..." - let's talk about what a realm IS in Polaris specifically, not what it could be in general.

From my POV we should define how realms are dealt with in the Polaris runtime, and then give a few examples of how realms can be applied in a particular deployment... However, concrete implications of how realms work should be the primary focus of this doc.

I'd prefer to talk about application of realms (e.g. to deployments) in a separate paragraph after we establish how realms work in Polaris.

"can be crucial" is too broad IMHO.

"can", "allows" - I'm not sure these verbs put the right emphasis. Realms force isolation of authentication and authorization. There's no way in Polaris to share anything across realms.

What does "application" refer to in this context?

What do "services" and "modules" represent in this context?

@fabio-rizzo-01
Contributor Author

@eric-maynard @dimas-b I have updated the document based on your comments, might not be perfect yet. Let me know thx!

Suggested change
A realm in Polaris serves as logical partitioning mechanism within the catalog system. This isolation allows for multitenancy, enabling different teams, environment or organizations to operate independently within the same Polaris deployment.
A realm in Polaris serves as logical partitioning mechanism within the catalog system. This isolation allows for multitenancy, enabling different teams, environments or organizations to operate independently within the same Polaris deployment.

I would remove the example; it does not demonstrate "separation of security concerns across different realms".

I would again remove the example for a few reasons:

  1. It implicitly refers to database-per-realm, but that's not the only agreed way of doing multitenancy at the persistence layer; we also agreed on single-database with realm in the primary key.
  2. It uses H2 and a file system path, which is not a recommended approach.

Contributor Author

I mention that later; I was just trying to give a concrete example. Should I change it to something like jdbc:postgresql://localhost:5432/{realm} ?

That's better 👍
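
As a rough illustration of the database-per-realm variant discussed in this thread, the sketch below fills the `{realm}` placeholder of the URL template quoted above. The helper name `jdbc_url_for` and the realm ID validation rule are hypothetical and not taken from the Polaris codebase.

```python
import re

# Template quoted in the thread; host, port and database naming are only an example.
JDBC_URL_TEMPLATE = "jdbc:postgresql://localhost:5432/{realm}"

# Hypothetical rule: restrict realm IDs so they cannot alter the URL structure.
_REALM_ID = re.compile(r"[A-Za-z0-9_-]+")


def jdbc_url_for(realm: str) -> str:
    """Return the connection URL for one realm (one database per realm)."""
    if not _REALM_ID.fullmatch(realm):
        raise ValueError(f"invalid realm id: {realm!r}")
    return JDBC_URL_TEMPLATE.format(realm=realm)
```

For example, `jdbc_url_for("dev")` yields `jdbc:postgresql://localhost:5432/dev`. This covers only one of the two persistence layouts mentioned by the reviewers; the other keeps all realms in a single database and puts the realm ID in the primary key.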

This is very geared towards EclipseLink; I'd suggest staying more generic.

Contributor Author

I was trying to give a concrete example from the codebase.

This imho should come as the first item, since this is a key component that is triggered even before the authentication phase.

Suggested change
**ConfigurationScope:** Realm identifiers are used in various configurations, such as database paths:
**Configuration Scope:** Realm identifiers are used in various configurations, such as connection strings, feature configurations, etc.

Suggested change
**Isolation:** In methods like `createEntityManagerFactory(@Nonnull RealmContext realmContext)` from `PolarisEclipseLinkPersistenceUnit` interface, the realm context influence how resources are created or managed based on the security policies of that realm.
**Isolation:** In the persistence layer, the realm context influences how resources are created or managed based on the security policies of that realm.

@dimas-b dimas-b left a comment

Thanks for the update @fabio-rizzo-01 ! It looks good to me overall with a couple of small new comments.

I'd exclude "security breaches" from this statement... It's a very strong claim.

I'd say "Principals' credentials" to be more precise. There are also storage credentials, which can be shared across realms ATM.

Please add **Persistence:** Some implementations of Polaris Persistence may use realm IDs in primary keys for Polaris data.

I believe it is important to inform users of this aspect.
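
A minimal sketch of what "realm IDs in primary keys" can look like, using an in-memory SQLite table. The table name, column names and sample values are invented for illustration and do not reflect the actual Polaris schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the realm ID is the leading part of a composite primary key.
conn.execute(
    """
    CREATE TABLE entities (
        realm_id TEXT NOT NULL,
        name     TEXT NOT NULL,
        payload  TEXT,
        PRIMARY KEY (realm_id, name)
    )
    """
)


def put(realm: str, name: str, payload: str) -> None:
    """Store an entity under the given realm."""
    conn.execute("INSERT INTO entities VALUES (?, ?, ?)", (realm, name, payload))


def get(realm: str, name: str):
    """Look up an entity; the realm filter is always part of the query."""
    row = conn.execute(
        "SELECT payload FROM entities WHERE realm_id = ? AND name = ?",
        (realm, name),
    ).fetchone()
    return row[0] if row else None


# The same entity name can exist in two realms without conflict.
put("dev", "catalog_a", "dev settings")
put("qa", "catalog_a", "qa settings")
```

Here all realms share one database, yet a lookup in one realm never sees rows from another because every query is keyed by `realm_id`.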

dimas-b commented Apr 30, 2025

@fabio-rizzo-01 would you mind simplifying this PR to contain just doc-changing commits? The current list of commits looks overwhelming (even though the diff is good).

@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board May 6, 2025
dimas-b previously approved these changes May 6, 2025

@dimas-b dimas-b left a comment

Thanks for your contribution @fabio-rizzo-01 !

From my POV the current state of this PR is ok to merge, but I encourage other reviewers to have another look too.

Obviously, notes about how realm IDs are used at runtime will be subject to ongoing server changes, so we'll have to adjust them as the Server evolves.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board May 6, 2025
pingtimeout previously approved these changes May 6, 2025

@pingtimeout pingtimeout left a comment

One minor suggestion (nit), +1

authorization.

**Isolation:** In methods like `createEntityManagerFactory(@Nonnull RealmContext realmContext)` from `PolarisEclipseLinkPersistenceUnit` interface, the realm context influence how resources are created or managed based on the security policies of that realm.
An example of this is the way a realm name is used to create a database connection url so that you have one database instance per realm, or it can be more granular and applied at primary key level (within the same database instance).

nit

Suggested change
An example of this is the way a realm name is used to create a database connection url so that you have one database instance per realm, or it can be more granular and applied at primary key level (within the same database instance).
An example of this is the way a realm name can be used to create a database connection url so that you have one database instance per realm, when applicable. Or it can be more granular and applied at primary key level (within the same database instance).

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board May 6, 2025

@flyrain flyrain left a comment

Thanks @fabio-rizzo-01 for working on it. Left some comments. We are getting close.

#
Title: Entities
type: docs
weight: 400

Can we change it to 350, so that it will show between the CLI and Entities pages in the left side menu?

Screenshot 2025-05-06 at 1 32 33 PM


**MetaStore and Cache Management:** Realms are used to manage different instances or configurations of metadata stores and caches. An example of this is `LocalPolarisMetaStoreManagerFactory`.

**Persistence:** Some implementations of Polaris Persistence may use realm IDs in primary keys for Polaris data.

We may remove this as it is already covered in line 53.

**Isolation:** In methods like `createEntityManagerFactory(@Nonnull RealmContext realmContext)` from `PolarisEclipseLinkPersistenceUnit` interface, the realm context influence how resources are created or managed based on the security policies of that realm.
An example of this is the way a realm name is used to create a database connection url so that you have one database instance per realm, or it can be more granular and applied at primary key level (within the same database instance).

**MetaStore and Cache Management:** Realms are used to manage different instances or configurations of metadata stores and caches. An example of this is `LocalPolarisMetaStoreManagerFactory`.

Can we remove this line, since MetaStore is part of the persistence layer, which is already covered above?

@fabio-rizzo-01 fabio-rizzo-01 dismissed stale reviews from dimas-b and pingtimeout via 8e83a37 May 7, 2025 08:01
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board May 7, 2025
@flyrain flyrain merged commit 473f379 into apache:main May 7, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board May 7, 2025
flyrain commented May 7, 2025

Thanks @fabio-rizzo-01 for the contribution! Thanks everyone for the review!
