Use the default XMLInputFactory instance #127

Merged
merged 1 commit into from
Jun 14, 2024

gsmet
Contributor

@gsmet gsmet commented Jun 13, 2024

This is part of my work to reduce the consequences of CL leaks in Quarkus.

I think this is the right thing to do, but while it fixes some issues I have in Quarkus, it makes the tests here fail loudly:

[ERROR] io.smallrye.beanbag.maven.MavenFactoryTestCase.testAllTestBeansDiscovered -- Time elapsed: 0.004 s <<< ERROR!
io.smallrye.beanbag.NoSuchBeanException: No matching bean available: type is class io.smallrye.beanbag.maven.beans.Vigna
	at io.smallrye.beanbag.Scope.getBean(Scope.java:259)
	at io.smallrye.beanbag.Scope.requireBean(Scope.java:164)
	at io.smallrye.beanbag.BeanBag.requireBean(BeanBag.java:85)
	at io.smallrye.beanbag.maven.MavenFactoryTestCase.testAllTestBeansDiscovered(MavenFactoryTestCase.java:86)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)

I tried to pass a few different CLs to the tests but it didn't fix anything.

@dmlloyd any chance you could have a look, I know you love class loaders :)

@gsmet
Contributor Author

gsmet commented Jun 13, 2024

If you need help with breaking your tests, just ask :).

Contributor

@gastaldi gastaldi left a comment

The loop is never called with this change

dmlloyd
dmlloyd previously approved these changes Jun 13, 2024
@gsmet
Contributor Author

gsmet commented Jun 13, 2024

Yeah, I’m perfectly aware this doesn’t work in the tests; that's why I asked for some insight.
But it seems to work a lot better for the Quarkus tests, and I’m not sure why the beans wouldn’t be found when loading the services with the CL that we actually want to use.

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

I guess that it's due to not having any such factory with that name available outside of Quarkus (or another environment which specifically provides it).

Is the important part of the change the factory ID, or the class loader? If it's the class loader, could you see if using a factory ID of javax.xml.stream.XMLInputFactory works?

@gastaldi
Contributor

Unfortunately, the tests still fail when changing to javax.xml.stream.XMLInputFactory; I just tested it.

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

I guess this is due to the lack of a fallback class in this case. We probably need a utility method which tries newFactory with a class loader, and then falls back to newFactory() if nothing else works.

@gastaldi
Contributor

This error is thrown when the factory is created:

javax.xml.stream.FactoryConfigurationError: Provider for plexus-components-factory cannot be found
	at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:275)
	at java.xml/javax.xml.stream.XMLInputFactory.newFactory(XMLInputFactory.java:275)
	at io.smallrye.beanbag.sisu.Sisu.addPlexusComponents(Sisu.java:124)
	at io.smallrye.beanbag.sisu.Sisu.lambda$addClassLoader$1(Sisu.java:85)
	at io.smallrye.beanbag.sisu.Sisu.lambda$loadBeans$2(Sisu.java:104)
	at io.smallrye.beanbag.sisu.BeanLoadingTaskRunner.lambda$run$0(BeanLoadingTaskRunner.java:32)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

Yeah there is no nice null return unfortunately, so we have to try/catch (FactoryConfigurationError ignored) each thing we try. 🙁
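
A minimal sketch of that try/catch pattern, assuming a helper that turns the error into a null return. The names `FactoryLookup` and `tryNewFactory` are hypothetical and not part of BeanBag; only `XMLInputFactory.newFactory(String, ClassLoader)` and `FactoryConfigurationError` come from the JDK.

```java
import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper, for illustration only.
final class FactoryLookup {
    private FactoryLookup() {
    }

    /**
     * Looks up a factory under the given factory ID using the given class loader.
     * Returns null instead of throwing when no provider can be found.
     */
    static XMLInputFactory tryNewFactory(String factoryId, ClassLoader classLoader) {
        try {
            return XMLInputFactory.newFactory(factoryId, classLoader);
        } catch (FactoryConfigurationError ignored) {
            // The JDK reports a missing provider with this error rather than with null.
            return null;
        }
    }
}
```

With a helper like this, each lookup attempt can be chained with a plain null check instead of nested try/catch blocks.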

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

As a final fallback, we should probably be calling javax.xml.stream.XMLInputFactory#newDefaultFactory though.

@gastaldi
Contributor

gastaldi commented Jun 13, 2024

Using this makes the tests pass:

    private static XMLInputFactory newXMLInputFactory(ClassLoader classLoader) {
        try {
            return XMLInputFactory.newFactory("plexus-components-factory", classLoader);
        } catch (FactoryConfigurationError e) {
            return XMLInputFactory.newDefaultFactory();
        }
    }

But I am not sure if the first statement will ever work 🫤

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

I'd go with making a createXMLInputFactory(ClassLoader) method to factor out common code.

That method should first try a factoryId of plexus-components-factory, then fall back to trying a factoryId of XMLInputFactory.class.getName() (see the Javadoc of XMLInputFactory for more info), and last fall back to using XMLInputFactory.newDefaultFactory(). And, factor it outside of the loop so that we only ever do it all once instead of once per file.
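
One way the described method could look, as a sketch only: the method name `createXMLInputFactory` is taken from the comment above, while the enclosing class `XmlFactories` is a hypothetical placeholder. This is not the code that was eventually merged.

```java
import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLInputFactory;

// Hypothetical holder class, for illustration only.
final class XmlFactories {
    private XmlFactories() {
    }

    /**
     * Tries the Plexus-specific factory ID first, then the standard factory ID,
     * and finally falls back to the JDK's built-in default implementation.
     */
    static XMLInputFactory createXMLInputFactory(ClassLoader classLoader) {
        String[] factoryIds = { "plexus-components-factory", XMLInputFactory.class.getName() };
        for (String factoryId : factoryIds) {
            try {
                return XMLInputFactory.newFactory(factoryId, classLoader);
            } catch (FactoryConfigurationError ignored) {
                // No provider under this ID; try the next candidate.
            }
        }
        // Last resort: never consults a class loader or a service file.
        return XMLInputFactory.newDefaultFactory();
    }
}
```

The caller would invoke this once before iterating over the component files and reuse the returned factory for every file, so the lookup cost is paid a single time.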

@gastaldi
Contributor

Do we ever set the plexus-components-factory system property in Quarkus? I'm curious to see when that first statement will work.

@dmlloyd
Collaborator

dmlloyd commented Jun 13, 2024

I'm inferring a lot from @gsmet's original change, but I'm guessing that, for whatever reason, the Plexus libraries include their own XML implementation, which in turn is useful for us.

If using XMLInputFactory.class.getName() works fine in the Quarkus situation, then we should just do that and skip the plexus stuff.

@gsmet
Contributor Author

gsmet commented Jun 13, 2024

FWIW, I cannot reproduce the issue anymore, with or without this patch (it was always reproducible when I worked on this, not sure what's going on...).

@aloubyansky
Member

Do we know why exactly this change was necessary in the first place?

@aloubyansky
Member

I'd be -1 on merging it until we clarify it

@dmlloyd dmlloyd dismissed their stale review June 13, 2024 19:15

Until everyone is happy...

@gsmet
Contributor Author

gsmet commented Jun 14, 2024

Yeah sorry, I should have clarified from the get go but I was building experiments on top of experiments and it was hard to get a clear status.

Actually, I'm not reproducing my original issue anymore but with my safeguards in place, I can reproduce the fact that SmallRye BeanBag tries to use a closed CL to load some resources/classes.

The original issue I want to solve is that I want to be able to clean up the resources of the QuarkusClassLoader when we close it (nullify the fields, empty the collections/maps...). We regularly have CL leaks in tests and dev mode, and they are not easy to fix, so we'd better try to reduce the consequences of the leaks, especially since there is no point in keeping all these resources open.

I started a patch cleaning the resources, but then I had very weird side effects which pointed out the fact that closed CLs were still in use in parts of Quarkus.
You can see the WIP patch here: quarkusio/quarkus#41172 (be aware it currently points to a 1.5.1-SNAPSHOT of BeanBag).

I added some code to detect these cases and got several occurrences of SmallRye BeanBag trying to access a closed CL.

For instance:

2024-06-13 19:48:37,143 ERROR [io.sma.bea.sis.BeanLoadingTaskRunner] (main) 4) This class loader has been closed: java.lang.IllegalStateException: This class loader has been closed
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.ensureOpen(QuarkusClassLoader.java:716)
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.loadClass(QuarkusClassLoader.java:495)
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.loadClass(QuarkusClassLoader.java:549)
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.loadClass(QuarkusClassLoader.java:497)
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.getResources(QuarkusClassLoader.java:245)
	at io.quarkus.bootstrap.classloading.QuarkusClassLoader.getResources(QuarkusClassLoader.java:220)
	at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1203)
	at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1228)
	at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1273)
	at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309)
	at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393)
	at java.xml/javax.xml.stream.FactoryFinder$1.run(FactoryFinder.java:350)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
	at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:339)
	at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:310)
	at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:223)
	at java.xml/javax.xml.stream.XMLInputFactory.newInstance(XMLInputFactory.java:166)
	at io.smallrye.beanbag.sisu.Sisu.addPlexusComponents(Sisu.java:123)
	at io.smallrye.beanbag.sisu.Sisu.lambda$addClassLoader$1(Sisu.java:85)
	at io.smallrye.beanbag.sisu.Sisu.lambda$loadBeans$2(Sisu.java:104)
	at io.smallrye.beanbag.sisu.BeanLoadingTaskRunner.lambda$run$0(BeanLoadingTaskRunner.java:32)
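
The kind of guard that yields a trace like the one above can be sketched as follows. This is an illustration under assumptions, not the QuarkusClassLoader code: `GuardedClassLoader` is a hypothetical class, and only the `ensureOpen` name and the exception message are borrowed from the trace.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;

// Hypothetical class loader that refuses to serve requests once closed.
final class GuardedClassLoader extends ClassLoader implements AutoCloseable {
    private volatile boolean closed;

    GuardedClassLoader(ClassLoader parent) {
        super(parent);
    }

    // Fails fast so that any late user of a closed loader becomes visible.
    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("This class loader has been closed");
        }
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        ensureOpen();
        return super.loadClass(name, resolve);
    }

    @Override
    public Enumeration<URL> getResources(String name) throws IOException {
        ensureOpen();
        return super.getResources(name);
    }

    @Override
    public void close() {
        // A real implementation would also release its cached resources here.
        closed = true;
    }
}
```

Any service lookup that still holds a reference to such a loader after `close()` fails immediately with an `IllegalStateException` instead of silently keeping the loader alive.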

With this patch, I don't have the issue anymore, as we use the CL passed to BeanBag (which is a low-level JDK CL) instead of the QuarkusClassLoader.

You can reproduce the issue with the branch I pointed above:

  • Get the PR branch of "Attempt to alleviate QuarkusClassLoader leaks" (quarkusio/quarkus#41172)
  • Build Quarkus
  • Get this BeanBag branch
  • Build BeanBag
  • Run mvn clean install -Djaxp.debug -f extensions/opentelemetry/deployment -Dno-build-cache
  • Everything should work fine (note that the tests take a very long time; you don't need to run them to completion, because when they fail, they fail very fast)

Now:

  • Update independent-projects/bootstrap/pom.xml, adjust smallrye-beanbag.version to 1.5.0
  • Build Quarkus
  • Run mvn clean install -Djaxp.debug -f extensions/opentelemetry/deployment -Dno-build-cache
  • You should get some errors

Now I'm not entirely sure what's specific with the opentelemetry tests because I don't have this issue when running the hibernate-validator extension ones.

@aloubyansky
Member

It looks like the previous impl would use javax.xml.stream.FactoryFinder.class.getClassLoader()

@brunobat

brunobat commented Jun 14, 2024

> Now I'm not entirely sure what's specific with the opentelemetry tests because I don't have this issue when running the hibernate-validator extension ones.

There are 2 different problems with the OpenTelemetry tests:

  1. On Deployment, we have a lot of tests and they store data in memory, which makes it easier to hit an OOM.
  2. On the Vert.x exporter tests, we have 10 different TestContainer tests executed in sequence. TestContainer resources are not cleaned up properly, which also makes it easy to hit an OOM.

Also, OTel brings a lot of jars and classes.

Comment on lines 123 to 124
XMLStreamReader xr = XMLInputFactory.newFactory(XMLInputFactory.class.getName(), classLoader)
.createXMLStreamReader(br);
Collaborator

> Actually, I'm not reproducing my original issue anymore but with my safeguards in place, I can reproduce the fact that SmallRye BeanBag tries to use a closed CL to load some resources/classes.

If it's all about using a closed CL due to service loader shenanigans, then I'd recommend this simpler solution:

Suggested change
- XMLStreamReader xr = XMLInputFactory.newFactory(XMLInputFactory.class.getName(), classLoader)
-         .createXMLStreamReader(br);
+ XMLStreamReader xr = XMLInputFactory.newDefaultFactory()
+         .createXMLStreamReader(br);

This avoids class loading completely, and just directly instantiates the factory from the JDK.
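
As a small self-contained illustration of that call, the sketch below reads element names with a factory obtained from `XMLInputFactory.newDefaultFactory()` (available since Java 9). The class `DefaultFactoryExample`, the method `elementNames`, and the sample XML in the usage note are hypothetical and only serve the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, for illustration only.
final class DefaultFactoryExample {
    private DefaultFactoryExample() {
    }

    /** Returns the local names of all start elements, in document order. */
    static List<String> elementNames(String xml) throws XMLStreamException {
        // Instantiates the JDK's built-in implementation directly; no provider lookup is involved.
        XMLInputFactory factory = XMLInputFactory.newDefaultFactory();
        XMLStreamReader xr = factory.createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (xr.hasNext()) {
                if (xr.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(xr.getLocalName());
                }
            }
        } finally {
            xr.close();
        }
        return names;
    }
}
```

For example, calling `DefaultFactoryExample.elementNames("<component-set><components/></component-set>")` yields the names `component-set` and `components`.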

Contributor Author

Let me give it a try.

Contributor Author

@dmlloyd I adjusted the PR to follow your suggestion. The tests pass and the issue is solved in Quarkus too, so I would say let's go for it.

@aloubyansky I'll let you have a look too.

If things are fine for both of you, I would appreciate a quick release so I can iterate on my side. Thanks!

@gsmet gsmet changed the title from "Use proper class loader when creating the XMLInputFactory" to "Use the default XMLInputFactory instance" Jun 14, 2024
Member

@aloubyansky aloubyansky left a comment

That looks like a win from multiple perspectives, awesome!

@dmlloyd dmlloyd merged commit c74b472 into smallrye:main Jun 14, 2024
1 check passed
@dmlloyd dmlloyd added this to the 1.5.1 milestone Jun 14, 2024
@gsmet
Contributor Author

gsmet commented Jun 14, 2024

@dmlloyd thanks for the pointer and the release, I included the upgrade in my patch.

5 participants