docs: Add Polaris Evolution page
#1890
This doc is a great start! For the "Polaris as a Library" section, I am wondering:
- Should we distinguish guarantees offered by different modules? E.g. should `polaris-core` or any of the API modules offer stronger backwards compatibility guarantees than persistence or service modules?
- A statement like `Maintainers try to keep binary compatibility on the "best effort" basis` effectively contradicts Semver guarantees; it could be good to explicitly define our relationship with Semver here, wdyt? Most people implicitly infer semver when the versioning scheme is not explicitly stated.
- Should we also mention Java version guarantees? Currently `polaris-core`, the API modules, the Spark plugin modules, and some tooling modules all require Java 11, whereas the service modules, the Quarkus modules, the persistence modules and a few tooling modules require Java 21. It's not clear if these requirements are allowed to evolve, and how.
Will clarify later today.
From my POV it is too early to mark some modules as "API" with stronger guarantees than other modules. I think we should strive for that, but in the current state of the project, I think, all code is equally subject to major changes.
Attaching semver semantics to code at this time will probably require a version sequence like 1.0, 2.0, 3.0 for a few releases at least -- this is something I am personally fine with, but I wonder what other people think.
updated and added semver
| community. Polaris attempts to accurately implement the most recent version of this specification, | |
| community. Polaris attempts to accurately implement this specification, |
I think "most recent" is too strong a commitment. There might be legit reasons not to implement the "latest greatest". WDYT?
That's a valid concern. Can we remove the second sentence completely? We may not document it if we don't have a formal strategy to update IRC in Polaris.
updated (slightly different text)
compatible with prior versions
Worth mentioning deprecation-for-removal?
a new version of the API will be introduced
It's also a very strong commitment. It may be a new version or a deprecation or addition.
"Version" can be ambiguous here IMO. It might be a new version of the spec or a new base path (e.g. .../api/v42/...) or both. I mean, it's fine to be vague here, but then it should be clear that it's vague. WDYT?
This is a good time to resolve this ambiguity maybe. I have often wondered under what conditions we'll cut a new version of the spec.
Classic answer is probably: it depends. 🤷
updated / rephrased and expanded a bit
Nit: "ABI" is the term here. But "binary" is much stronger than "source".
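To make the binary-versus-source distinction concrete, here is a minimal sketch (not Polaris code; the class name and the `setLimit` method it alludes to are hypothetical). It uses the JDK's `MethodType` to print the JVM descriptor that a caller is compiled against, before and after a parameter is widened from `int` to `long`.

```java
import java.lang.invoke.MethodType;

// Hypothetical illustration only; nothing here is taken from the Polaris code base.
final class AbiVsSource {

  // JVM descriptor for a hypothetical `void setLimit(int)`.
  // Already-compiled callers reference the method by exactly this descriptor.
  static String descriptorBefore() {
    return MethodType.methodType(void.class, int.class).toMethodDescriptorString();
  }

  // JVM descriptor after the parameter type is widened to `long`.
  static String descriptorAfter() {
    return MethodType.methodType(void.class, long.class).toMethodDescriptorString();
  }

  public static void main(String[] args) {
    System.out.println(descriptorBefore()); // (I)V
    System.out.println(descriptorAfter());  // (J)V
  }
}
```

A call such as `setLimit(42)` still compiles against the widened signature, so the change is source compatible after a rebuild. A jar compiled against the old signature keeps linking to `(I)V` and fails with `NoSuchMethodError` at run time, which is why an ABI promise is the stronger one.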
| ## Polaris as a Library | |
| ## Polaris is not a Library |
The Polaris project is primarily intended for consumption "as is" by end users, while also allowing people to integrate it into their services. But there are no clear boundaries between API/SPI and implementation. Those could be introduced by having a clear separation of these via specific modules - but that's not the case yet.
Binary compatibility (in the sense of "ABI", application binary interface) is a very strong commitment, much stronger than source compatibility (think: rebuild).
From experience, we already had quite some changes throughout the (short) history of the project that count as ABI breaking changes, and quite a few exist as PRs (in progress/review) for "1.x".
My take on this is that, strictly speaking, there is neither a binary nor a source level compatibility guarantee. We should however work towards (more or less) strict API/SPI and implementation boundaries and clearly state the intent, scope and compatibility guarantees for each of the APIs and SPIs.
Java visibility modifiers are not a way to "declare" APIs/SPIs and their guarantees. `public`, `protected` et al. just define the technical visibility; often a broad(er) visibility is required for purely technical reasons, but those are not (necessarily) "public stable APIs". `@VisibleForTesting` means that the technical visibility for something has been changed for testing purposes, but it says absolutely nothing about the "assumed production visibility"; it's also just an annotation, nothing that's enforced, similar to other annotations.
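One possible way to state API intent explicitly, independent of visibility, is a marker annotation. The sketch below is an assumption for illustration: `StableApi`, `CatalogExtensionPoint`, `InternalHelper` and `ApiStability` are hypothetical names and do not exist in Polaris. Like any annotation, it documents intent and is only enforced if some tool checks it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

// Hypothetical marker: the stability promise is declared separately from visibility.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface StableApi {
  String since();
}

// Hypothetical extension point that carries an explicit compatibility promise.
@StableApi(since = "hypothetical-1.0")
interface CatalogExtensionPoint {
  String name();
}

// Has a public method for technical reasons, but carries no stability promise.
final class InternalHelper {
  public static String normalize(String value) {
    return value.trim().toLowerCase(Locale.ROOT);
  }
}

final class ApiStability {
  // True only for types explicitly marked as stable, regardless of their visibility.
  static boolean isStable(Class<?> type) {
    return type.isAnnotationPresent(StableApi.class);
  }
}
```

With this shape, `ApiStability.isStable(CatalogExtensionPoint.class)` is true while `ApiStability.isStable(InternalHelper.class)` is false, even though `InternalHelper.normalize` is `public`. A build-time check could use the same test to flag incompatible changes to marked types only.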
I agree with this comment, but I'd like to keep the original section title :) One way or another Polaris jars / code will be used downstream, so in this section we're just setting expectations for these workflows.
"Polaris as a Library" reads almost like "Polaris is a Library" to me... Maybe "Building Dependencies on Polaris" or "Using Polaris as a Dependency" or some variation?
changed title to "Using Polaris as a Build-Time Dependency" ... WDYT?
singhpk234 left a comment
Thank you for adding the "Managing Polaris Database" section.
This is useful for setting the expectations of users trying to migrate from EclipseLink to JDBC.
Let's remove the "binary compat" sentence entirely.
removed
| * Iceberg REST Catalog API and Generic Tables API (refer to this [link](../polaris-catalog-service/) |
| for their combined Open API definition). |
| * Note: Polaris implementing an optional Iceberg REST Catalog feature that was unimplemented |
| in the previous release is not considered a major change. |
| * Supporting a new revision of the Iceberg REST Catalog spec in a backward-compatible way |
| is not considered a major change. |
| * Changing the implementation of an Iceberg REST Catalog feature / endpoint in a non-backward |
| compatible way is a major change. |
| * [Polaris Management API](../polaris-management-service/) |
| * [Polaris Policies](http://localhost:1313/in-dev/unreleased/policy/) |
Again, as I said here, #1890 (comment). I don't think the REST APIs follow Semver. Can we remove them here?
I'm not sure I follow your suggestion, @flyrain. This is not about identifying a particular API spec revision (v1, v2, etc.) but about changes in Polaris server behaviour.
IMHO, adding support for a new API spec version (e.g. v2) is not a major change from the SemVer perspective, because it does not break backward-compatibility. In that regard, Polaris may support several API spec versions in the same release of Polaris.
On the other hand, if Polaris had a bug in implementing the Catalog REST API (for example) such that fixing it introduced a backward incompatible change in behaviour, why would we not do a major Polaris version bump? I think it would make sense. WDYT?
| ## Semantic Versioning |
| |
| Polaris strives to follow [Semantic Versioning](https://semver.org/) conventions both with |
| respect to Java code and REST APIs. |
Same comment here that REST APIs don't follow semver.
| ## Semantic Versioning |
| |
| Polaris strives to follow [Semantic Versioning](https://semver.org/) conventions both with |
| respect REST APIs (beta and experimental APIs excepted), [Polaris Policies](../policy/) |
| respect REST APIs (beta and experimental APIs excepted), [Polaris Policies](../policy/) | |
| respect to REST APIs (beta and experimental APIs excepted), [Polaris Policies](../policy/) |
Is this true? Is it still semver if you only have a major version?
thx - fixed
Reopening just to confirm the answer is "yes" here, I don't know the details of semver well enough to say myself. We essentially just don't have minor or patch versions?
My take on that is that API revisions are orthogonal to Polaris release versions.
We can have any number of different API revisions (v1, v2, v3) in releases X and Y. The key is what changes between these releases with respect to Polaris behaviour observable via those APIs. If there are no observable changes, there's no need to bump the major version (from the SemVer perspective). If there are observable changes that are not backward-compatible, we have to bump the major version number.
In that regard, Polaris can introduce API v2 in a 1.0.1 release, as long as we do not drop API v1. We can choose to bump the release version to 2.0.0 in that case, but it's not mandated by SemVer.
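The rule described in this comment can be sketched as a small predicate. This is only an illustration under stated assumptions: the class and method names are hypothetical, API revisions are modelled as plain strings, and "observable incompatible change" is reduced to a boolean that a human reviewer would have to supply.

```java
import java.util.Set;

// Hypothetical sketch of the rule above; not part of Polaris.
final class ReleaseVersioning {

  // A major bump is needed when a previously served API revision disappears,
  // or when behaviour observable through the APIs changes incompatibly.
  static boolean majorBumpRequired(
      Set<String> revisionsBefore,
      Set<String> revisionsAfter,
      boolean incompatibleObservableChange) {
    boolean droppedRevision = !revisionsAfter.containsAll(revisionsBefore);
    return droppedRevision || incompatibleObservableChange;
  }

  public static void main(String[] args) {
    // Adding v2 while keeping v1: no major bump required.
    System.out.println(majorBumpRequired(Set.of("v1"), Set.of("v1", "v2"), false));
    // Dropping v1: major bump required.
    System.out.println(majorBumpRequired(Set.of("v1"), Set.of("v2"), false));
  }
}
```

The predicate only answers whether a major bump is mandated. Whether an additive change lands in a patch or a minor release is a separate decision that it deliberately leaves open.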
| any release. Different Polaris jars may have different minimal JRE version requirements. |
| |
| Changes in Java class should be expected at any time regardless of the module name or |
| whether the class / method is `public` or `private`. |
Nit:
| whether the class / method is `public` or `private`. | |
| whether the class / method is `public` or not. |
fixed
Co-authored-by: Eric Maynard <[email protected]>
--------- Co-authored-by: Eric Maynard <[email protected]>
* Cleanup unnecessary files in client/python (apache#1878)
  Cleanup unnecessary files in `client/python`
* Bump version in version.txt
  With the release/1.0.0 branch being cut, we should bump this to reflect the current state of main
* JDBC: Refactor DatabaseOps (apache#1843)
  * removes the databaseType computation from JDBCMetastoreManagerFactory to DbOperations
  * wraps the bootstrap in a transaction !
  * refactor Production Readiness checks for Postgres
* Fix two wrong links in README.md (apache#1879)
* Avoid using org.testcontainers.shaded.** (apache#1876)
* main: Update dependency io.smallrye.config:smallrye-config-core to v3.13.2 (apache#1888)
* main: Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.22-1.1749462970 (apache#1887)
* main: Update dependency boto3 to v1.38.36 (apache#1886)
* fix(build): Fix deprecation warnings in PolarisIntegrationTestExtension (apache#1895)
* Enable patch version updates for maintained Polaris version (apache#1891)
  Polaris 1.x will be a supported/maintained release. It is crucial to apply bug and security fixes to such release branches. Therefore, this change enables patch-version updates for Polaris 1.*
* Add Polaris community meeting record for 2025-06-12 (apache#1892)
* Do not use relative path inside CLI script
  Issue apache#1868 reported that the Polaris script can fail when it's run from an unexpected path. The recent addition of a reference to `./gradlew` looks incorrect here, and should be changed to use an absolute path. Fixes apache#1868
* feat(build): Add Checkstyle plugin and an IllegalImport rule (apache#1880)
* Python CI: pin mypy version to avoid CI failure due to new release (apache#1903)
  Mypy did a new release 1.16.1 and it cause our CI to fail for about 20 minutes due to missing wheel (upload not completed)
  ```
  | Unable to find installation candidates for mypy (1.16.1)
  |
  | This is likely not a Poetry issue.
  |
  | - 14 candidate(s) were identified for the package
  | - 14 wheel(s) were skipped as your project's environment does not support the identified abi tags
  |
  | Solutions:
  | Make sure the lockfile is up-to-date. You can try one of the following;
  |
  | 1. Regenerate lockfile: poetry lock --no-cache --regenerate
  | 2. Update package : poetry update --no-cache mypy
  |
  | If neither works, please first check to verify that the mypy has published wheels available from your configured source that are compatible with your environment- ie. operating system, architecture (x86_64, arm64 etc.), python interpreter.
  |
  ```
  This PR temporarily restrict the mypy version to avoid the similar issue. We may consider bring poetry.lock back to git tracking so we won't automatically update test dependencies all the time
* Remove `.github/CODEOWNERS` (apache#1902)
  As per this [dev-ML discussion](https://lists.apache.org/thread/jjr5w3hslk755yvxy8b3z45c7094cxdn)
* Rename quarkus as runtime (apache#1695)
* Rename runtime/test-commons to runtime/test-common (for consistency with module name) (apache#1906)
* docs: Add `Polaris Evolution` page (apache#1890)
  --------- Co-authored-by: Eric Maynard <[email protected]>
* feat(ci): Split Java Gradle CI in many jobs to reduce execution time (apache#1897)
* Add webpage for Generic Table support (apache#1889)
  * add change
  * add comment
  * address feedback
  * update limitations
  * update docs
  * update doc
  * address feedback
* Improve the parsing and validation of UserSecretReferenceUrns (apache#1840)
  This change addresses all the TODOs found the org.polaris.core.secrets package.
  Main changes:
  - Create a helper to parse, validate and build the URN strings.
  - Use Regex instead of `String.split()`.
  - Add Precondition checks to ensure that the URN is valid and the UserSecretManager matches the expected type.
  - Remove the now unused `GLOBAL_INSTANCE` of the UnsafeInMemorySecretsManager.

  Testing
  - Existing `UnsafeInMemorySecretsManagerTest` captures most of the functional changes.
  - Added `UserSecretReferenceUrnHelperTest` to capture the utilities exposed.
* Reuse shadowJar for spark client bundle jar maven publish (apache#1857)
  * fix spark client
  * fix test failure and address feedback
  * fix error
  * update regression test
  * update classifier name
  * address comment
  * add change
  * update doc
  * update build and readme
  * add back jr
  * udpate dependency
  * add change
  * update
  * update tests
  * remove merge service file
  * update readme
  * update readme
* fix(ci): Remove dummy "build" job from Gradle CI (apache#1911)
  Since apache#1897, the jobs in gradle.yaml changed and the "build" job was split into many smaller jobs. But since it was a required job, it couldn't be removed immediately.
* main: Update Quarkus Platform and Group to v3.23.3 (apache#1797)
  * main: Update Quarkus Platform and Group to v3.23.3
  * Adopt polaris-admin test invocation

  --------- Co-authored-by: Robert Stupp <[email protected]>
* Feature: Rollback compaction on conflict (apache#1285)
  Intention is make the catalog smarter, to revert the compaction commits in case of crunch to let the writers who are actually adding or removing the data to the table succeed. In a sense treating compaction as always a lower priority process.

  Presently the rest catalog client creates the snapshot and asks the Rest Server to apply the snapshot and gives this in a combination of requirement and update. Polaris could apply some basic inference and generate some updates to metadata given a property is enabled at a table level, by saying that It will revert back the commit which was created by compaction and let the write succeed. I had this PR in OSS, which was essentially doing this at the client end, but we think its best if we do this as server end. to support more such clients.

  How to use this

  Enable a catalog level configuration : polaris.config.rollback.compaction.on-conflicts.enabled when this is enabled polaris will apply the intelligence of rollbacking those REPLACE ops snapshot which have the property of polaris.internal.rollback.compaction.on-conflict in their snapshot summary to resolve conflicts at the server end !

  a sample use case is there is a deployment of a Polaris where this config is enabled and there is auto compaction (maintenance job) which is updating the table state, it adds the snapshot summary that polaris.internal.rollback.compaction.on-conflict is true now when a backfill process running for 8 hours want to commit but can't because the compaction job committed before so in this case it will reach out to Polaris and Polaris will see if the snapshot of compation aka replace snapshot has this property if yes roll it back and let the writer succeed !

  Devlist: https://lists.apache.org/thread/8k8t77dgk1vc124fnb61932bdp9kf1lc
* NoSQL: nits
  * `AutoCloseable` for `PersistenceTestExtension`
  * checkstyle adoptions
* fix: unify bootstrap credentials and standardize POLARIS setup (apache#1905)
  - unified formatting across docker, gradle
  - reverted secret to s3cr3t
  - updated docker-compose, README, conftest.py

  use POLARIS for consistency across docker, gradle and others.
* Add doc for rollback config (apache#1919)
* Revert "Reuse shadowJar for spark client bundle jar maven publish (apache#1857)" (apache#1921)
  …857)" This reverts commit 1f7f127.
  The shadowJar plugin actually stops publish the original jar, which is not what spark client intend to publish for the --package usage. Revert it for now, will follow up with a better way to reuse the shadow jar plugin, likely with a separate bundle project
* fix(build): Gradle caching effectively not working (apache#1922)
  Using a `custom()` spotless formatter check effectively disables caching, see `com.diffplug.gradle.spotless.FormatExtension#custom(java.lang.String, com.diffplug.spotless.FormatterFunc)` using `globalState`, which is a `NeverUpToDateBetweenRuns`. This change refactors this to be cachable. We also already have a errorprone rule, so we can get rid entirely of the spotless step.
* Update spark client to use the shaded iceberg-core in iceberg-spark-runtime to avoid spark compatibilities issue (apache#1908)
  * add change
  * add comment
  * update change
  * add comment
  * add change
  * add tests
  * add comment
  * clean up style check
  * update build
  * Revert "Reuse shadowJar for spark client bundle jar maven publish (apache#1857)" This reverts commit 1f7f127.
  * Reuse shadowJar for spark client bundle jar maven publish (apache#1857)
  * fix spark client
  * fix test failure and address feedback
  * fix error
  * update regression test
  * update classifier name
  * address comment
  * add change
  * update doc
  * update build and readme
  * add back jr
  * udpate dependency
  * add change
  * update
  * update tests
  * remove merge service file
  * update readme
  * update readme
  * update checkstyl
  * rebase with main
  * Revert "Reuse shadowJar for spark client bundle jar maven publish (apache#1857)" This reverts commit 40f4d36.
  * update checkstyle
  * revert change
  * address comments
  * trigger tests
* Last merged commit 93938fd

---------
Co-authored-by: Honah (Jonas) J. <[email protected]>
Co-authored-by: Eric Maynard <[email protected]>
Co-authored-by: Prashant Singh <[email protected]>
Co-authored-by: Yufei Gu <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Co-authored-by: Eric Maynard <[email protected]>
Co-authored-by: Yun Zou <[email protected]>
Co-authored-by: Pooja Nilangekar <[email protected]>
Co-authored-by: Seungchul Lee <[email protected]>