Skip to content

Conversation

@guitcastro
Copy link

@guitcastro guitcastro commented Aug 6, 2024

Description

As today, It's impossible to load configurations outside the classpath using eclipse-link. This PR allow do load config files from any path.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I've update the PolarisEclipseLinkMetaStoreTest to ensure that the file is being loaded from outside the classpath.

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue
  • I have signed and submitted the ICLA and if needed, the CCLA. See Contributing for details.

@guitcastro guitcastro requested a review from a team as a code owner August 6, 2024 03:09

try {
InputStream input = this.getClass().getClassLoader().getResourceAsStream(confFile);
FileInputStream input = new FileInputStream(confFile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine but Persistence.createEntityManagerFactory(persistenceUnitName, properties) puts the limitation which can only loads the configuration from a jar. I'm making a change to workaround that. #88

Copy link
Author

@guitcastro guitcastro Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to fix this in the last commit. Please see the this comment in #79

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replied in another thread.

@guitcastro Actually it's not because of the properties. Of course we need the properties because we want to replace realm in the database name - each realm will have its own database.

Persistence.createEntityManagerFactory() will still load configuration file from resource even without properties.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aihuaxu only if you specify the ECLIPSELINK_PERSISTENCE_XML property. I've tested and it worked.

From their docs:

This property is only used by EclipseLink when it is locating the configuration file.
When used within an EJB/Spring container in container managed mode the locating and reading of this file is done by the container and will not use this configuration.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added unit test to ensure config loads from outside the classpath is working.

Copy link
Contributor

@aihuaxu aihuaxu Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guitcastro Can you list the steps how you get it to work? It's possible the unit test found the persistence.xml from the embedded jar. That's why it doesn't fail for you.

I don't see the changes or unit tests in the PR. Did you forget to include that?

I only see
properties.put(ECLIPSELINK_PERSISTENCE_XML, confFile); -- removed

FileInputStream input = new FileInputStream(confFile);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aihuaxu Sorry, I have forgot to push it. Can you please take a look now? I also added a test to ensure that the config is not being loaded from the classpath.

@eric-maynard
Copy link
Contributor

eric-maynard commented Aug 6, 2024

This looks similar to #79, which did not work when tested.

@guitcastro
Copy link
Author

guitcastro commented Aug 6, 2024

This looks similar to #79, which did not work when tested.

I fix this in the last commit. Please see the this comment in #79

@eric-maynard
Copy link
Contributor

Hi @guitcastro, I tested this change and found that it did not work. See my notes below:

I copied the default persistence.xml to my desktop and updated the polaris-server.yml to point to it:

. . .
metaStoreManager:
  type: eclipse-link
  persistence-unit: polaris-dev
  conf-file: /Users/emaynard/Desktop/persistence.xml
. . .

I then modified the default persistence.xml at .../resources/META-INF/persistence.xml to use a different persistence-unit -- polaris-fake instead of polaris-dev. When I attempt to bootstrap, I see this:

jakarta.persistence.PersistenceException: No Persistence provider for EntityManager named polaris-dev
	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:86)
	at io.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkMetaStoreSessionImpl.<init>(PolarisEclipseLinkMetaStoreSessionImpl.java:109)
. . .

If I revert the default persistence.xml back to using polaris-dev, it bootstraps normally.

Based on this, I suspect that your change is in fact just loading the default persistence.xml which is on the classpath, and that it does not actually allow us to load a persistence.xml outside of the classpath. This makes sense when you consider your change is deleting the config that is supposed to point to the location of the persistence.xml. Is it possible you were observing the same in your testing?

@guitcastro
Copy link
Author

Hi @guitcastro, I tested this change and found that it did not work. See my notes below:

I copied the default persistence.xml to my desktop and updated the polaris-server.yml to point to it:

. . .
metaStoreManager:
  type: eclipse-link
  persistence-unit: polaris-dev
  conf-file: /Users/emaynard/Desktop/persistence.xml
. . .

I then modified the default persistence.xml at .../resources/META-INF/persistence.xml to use a different persistence-unit -- polaris-fake instead of polaris-dev. When I attempt to bootstrap, I see this:

jakarta.persistence.PersistenceException: No Persistence provider for EntityManager named polaris-dev
	at jakarta.persistence.Persistence.createEntityManagerFactory(Persistence.java:86)
	at io.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkMetaStoreSessionImpl.<init>(PolarisEclipseLinkMetaStoreSessionImpl.java:109)
. . .

If I revert the default persistence.xml back to using polaris-dev, it bootstraps normally.

Based on this, I suspect that your change is in fact just loading the default persistence.xml which is on the classpath, and that it does not actually allow us to load a persistence.xml outside of the classpath. This makes sense when you consider your change is deleting the config that is supposed to point to the location of the persistence.xml. Is it possible you were observing the same in your testing?

Maybe the /Users/emaynard/Desktop/persistence.xml is not defining a polaris-dev. I just commit two unit test that ensure that the default config is not being loaded. If you test the lasted version it should at least throw an exception instead of loading the default one.

}

@Test
void ensureNotLoadingDefaultClassPersistence() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this test that you're not still loading the file from /META-INF/persistence.xml?

@eric-maynard
Copy link
Contributor

No @guitcastro it was definitely there -- like I mentioned, I copied it directly from the existing default file. Have you done an end-to-end test using this change including bootstrapping Polaris? If so can you share the steps?

PolarisEclipseLinkMetaStoreSessionImpl session =
new PolarisEclipseLinkMetaStoreSessionImpl(
store, Mockito.mock(), () -> "realm", null, "polaris-dev");
store, Mockito.mock(), () -> "realm", tmpFile.toString(), "polaris-dev");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this, Persistence.createEntityManagerFactory() will use the one in the jar, not the /tmp/%s-persistence.xml.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can try to remove persistence.xml from the project and see the issue.

@guitcastro
Copy link
Author

@aihuaxu @eric-maynard You are right, what happened is that my external configuration had a persistence unit also named polaris-dev and got merged with the one in the classpath. Thus my jakarta.persistence.jdbc.url took precedence from the class path and I was able to connect with my database (postgres).

@sfc-gh-aixu
Copy link
Contributor

@aihuaxu @eric-maynard You are right, what happened is that my external configuration had a persistence unit also named polaris-dev and got merged with the one in the classpath. Thus my jakarta.persistence.jdbc.url took precedence from the class path and I was able to connect with my database (postgres).

Yeah. So let's move forward with my workaround solution for now until we have a better one?

@guitcastro guitcastro closed this Aug 7, 2024
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* chore(deps): update dependency mypy to >=1.17, <=1.17.0 (apache#2114)

* Spark 3.5.6 and Iceberg 1.9.1 (apache#1960)

* Spark 3.5.6 and Iceberg 1.9.1

* Cleanup

* Add `pathStyleAccess` to AwsStorageConfigInfo (apache#2012)

* Add `pathStyleAccess` to AwsStorageConfigInfo

This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

* Add TestFileIOFactory helper (apache#2105)

* Add FileIOFactory.wrapExisting helper

* fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.2 (apache#2125)

* fix(deps): update dependency boto3 to v1.39.7 (apache#2124)

* Abstract polaris-runtime-service tests for all persistence implementations (apache#2106)

The NoSQL persistence implementation has to run the Iceberg table & view catalog plus the Polaris specific tests as well. Reusing existing tests is beneficial to avoid a lot of code duplcation.

This change moves the actual tests to `Abstract*` classes and refactors the existing tests to extend those. The NoSQL persistence work extends the same `Abstract*` classes but runs with different Quarkus test profiles.

* Add IMPLICIT authentication support to the CLI (apache#2121)

PRs apache#1925 and apache#1912 were merged around the same time.  This PR connects the two changes and enables the CLI to accept IMPLICIT authentication type. 

Since Hadoop federated catalogs rely purely on IMPLICIT authentication, the CLI parsing test has been updated to reflect the same.

* feat(helm): Add support for external authentication (apache#2104)

* fix(deps): update dependency org.apache.iceberg:iceberg-bom to v1.9.2 (apache#2126)

* fix(deps): update quarkus platform and group to v3.24.4 (apache#2128)

* fix(deps): update dependency boto3 to v1.39.8 (apache#2129)

* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.13.3 (apache#2130)

* Add newIcebergCatalog helper (apache#2134)

creation of `IcebergCatalog` instances was quite redundant as tests
mostly use the same parameters most of the time.

also remove an unused field in 2 other tests.

* Add server and client support for the new generic table `baseLocation` field (apache#2122)

* Use Makefile to simplify setup and commands (apache#2027)

* Use Makefile to simplify setup and commands

* Add targets for minikube state management

* Add podman support and spark plugin build

* Add version target

* Update README.md for Makefile usage and relation to the project

* Fix nit

* Package polaris client as python package (apache#2049)

* Package polaris client as python package

* Package polaris client as python package

* Change owner to spark when copying files from local into Dockerfile

* CI: Address failure from accessing GH API (apache#2132)

CI sometimes fails with this failure:
```
* What went wrong:
Execution failed for task ':generatePomFileForMavenPublication'.
> Unable to process url: https://api.github.com/repos/apache/polaris/contributors?per_page=1000
```

The sometimes failing request fetches the list of contributors to be published in the "root" POM. Unauthorized GH API requests have an hourly(?) limit of 60 requests per source IP. Authorized requests have a much higher rate limit. We do have a GitHub token available in every CI run, which can be used in GH API requests. This change adds the `Authorization` header for the failing GH API request to leverage the higher rate limit and let CI not fail (that often).

* fix(deps): update dependency com.nimbusds:nimbus-jose-jwt to v10.4 (apache#2139)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.0 (apache#2142)

* fix(deps): update dependency software.amazon.awssdk:bom to v2.32.4 (apache#2146)

* fix(deps): update dependency org.xerial.snappy:snappy-java to v1.1.10.8 (apache#2138)

* fix(deps): update dependency org.junit:junit-bom to v5.13.4 (apache#2147)

* fix(deps): update dependency boto3 to v1.39.9 (apache#2137)

* fix(deps): update dependency com.fasterxml.jackson:jackson-bom to v2.19.2 (apache#2136)

* Python client: add support for endpoint, sts-endpoint, path-style-access (apache#2127)

This change adds support for endpoint, sts-endpoint, path-style-access to the Polaris Python client.

Amends apache#1913 and apache#2012

* Remove PolarisEntityManager.getCredentialCache (apache#2133)

`PolarisEntityManager` itself is not using the `StorageCredentialCache` but just hands it out via `getCredentialCache`.
the only caller of `getCredentialCache` is `FileIOUtil.refreshAccessConfig`, which in in turn is only called by `DefaultFileIOFactory` and `IcebergCatalog`.

note that in a follow-up we will likely be able to remove `PolarisEntityManager` usage completely from `IcebergCatalog`.

additional cleanups:
- use `StorageCredentialCache` injection in tests (but we need to invalidate all entries on test start)
- remove unused `UserSecretsManagerFactory` from `PolarisCallContextCatalogFactory`

* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.22-1.1752676419 (apache#2150)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.1 (apache#2152)

* fix(deps): update dependency boto3 to v1.39.10 (apache#2151)

* chore: fix class reference in the javadoc of TableLikeEntity (apache#2157)

* fix(deps): update dependency commons-codec:commons-codec to v1.19.0 (apache#2160)

* fix(deps): update dependency boto3 to v1.39.11 (apache#2159)

* Last merged commit 395459f

---------

Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Yong Zheng <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: Pooja Nilangekar <[email protected]>
Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: Yun Zou <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants