Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException when Quarkus Micrometer extension is used #43471

Closed
mabartos opened this issue Sep 24, 2024 · 13 comments · Fixed by #43645
Closed

ConcurrentModificationException when Quarkus Micrometer extension is used #43471

mabartos opened this issue Sep 24, 2024 · 13 comments · Fixed by #43645
Labels
area/metrics kind/bug Something isn't working triage/upstream Used for issues which are caused by issues in upstream projects/dependency
Milestone

Comments

@mabartos
Copy link
Contributor

mabartos commented Sep 24, 2024

Describe the bug

In Keycloak, we're facing an issue (keycloak/keycloak#33246) that the server is unable to start when metrics are enabled - with the Quarkus Micrometer extension.

AFAIK, it started to happen after the Micrometer 1.13.x upgrade.

It happens only sometimes (after ~10 startups) and is reproducible after multiple start/stops of the Keycloak (Quarkus-based) server.

The related log:

2024-09-24 15:24:07,330 INFO  [org.jgroups.protocols.FD_SOCK2] (ForkJoinPool.commonPool-worker-1) server listening on *.34764
2024-09-24 15:24:07,389 INFO  [org.jgroups.protocols.pbcast.GMS] (ForkJoinPool.commonPool-worker-1) mabartos-12454: no members discovered after 55 ms: creating cluster as coordinator
2024-09-24 15:24:07,406 INFO  [org.infinispan.CLUSTER] (ForkJoinPool.commonPool-worker-1) ISPN000094: Received new cluster view for channel ISPN: [mabartos-12454|0] (1) [mabartos-12454]
2024-09-24 15:24:07,489 INFO  [org.infinispan.CLUSTER] (ForkJoinPool.commonPool-worker-1) ISPN000079: Channel `ISPN` local address is `mabartos-12454`, physical addresses are `[10.202.149.163:50300]`
2024-09-24 15:24:07,901 DEBUG [io.micrometer.common.util.internal.logging.InternalLoggerFactory] (main) Using SLF4J as the default logging framework
2024-09-24 15:24:07,903 DEBUG [io.quarkus.micrometer.runtime.MicrometerRecorder] (main) Discovered global MeterFilters : [io.micrometer.core.instrument.config.MeterFilter$9@15311af2, io.micrometer.core.instrument.config.MeterFilter$9@796434a3, io.micrometer.core.instrument.config.MeterFilter$9@1626612d, io.micrometer.core.instrument.config.MeterFilter$9@542aba71]
2024-09-24 15:24:07,904 DEBUG [io.quarkus.micrometer.runtime.MicrometerRecorder] (main) Discovered MeterFilters for class io.micrometer.prometheus.PrometheusMeterRegistry: []
2024-09-24 15:24:07,904 DEBUG [io.quarkus.micrometer.runtime.MicrometerRecorder] (main) Discovered global MeterRegistryCustomizer : []
2024-09-24 15:24:07,905 DEBUG [io.quarkus.micrometer.runtime.MicrometerRecorder] (main) Configuring Micrometer registries : [class io.micrometer.prometheus.PrometheusMeterRegistry]
2024-09-24 15:24:07,905 DEBUG [io.quarkus.micrometer.runtime.MicrometerRecorder] (main) Discovered MeterRegistryCustomizer for class io.micrometer.prometheus.PrometheusMeterRegistry: []
2024-09-24 15:24:07,907 INFO  [com.arjuna.ats.jbossatx] (main) ARJUNA032014: Stopping transaction recovery manager
ERROR: Failed to start server in (production) mode
Error details:
java.util.ConcurrentModificationException
	at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1605)
	at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1628)
	at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:337)
	at io.micrometer.core.instrument.MeterRegistry$Config.meterFilter(MeterRegistry.java:832)
	at io.quarkus.micrometer.runtime.MicrometerRecorder.applyMeterFilters(MicrometerRecorder.java:233)
	at io.quarkus.micrometer.runtime.MicrometerRecorder.configureRegistries(MicrometerRecorder.java:89)
	at io.quarkus.deployment.steps.MicrometerProcessor$configureRegistry1996010268.deploy_0(Unknown Source)
	at io.quarkus.deployment.steps.MicrometerProcessor$configureRegistry1996010268.deploy(Unknown Source)
	at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
	at io.quarkus.runtime.Application.start(Application.java:101)
	at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:119)
	at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
	at org.keycloak.quarkus.runtime.KeycloakMain.start(KeycloakMain.java:147)
	at org.keycloak.quarkus.runtime.cli.command.AbstractStartCommand.run(AbstractStartCommand.java:51)
	at picocli.CommandLine.executeUserObject(CommandLine.java:2030)
	at picocli.CommandLine.access$1500(CommandLine.java:148)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
	at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
	at picocli.CommandLine.execute(CommandLine.java:2174)
	at org.keycloak.quarkus.runtime.cli.Picocli.parseAndRun(Picocli.java:138)
	at org.keycloak.quarkus.runtime.KeycloakMain.main(KeycloakMain.java:107)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doRun(QuarkusEntryPoint.java:62)
	at io.quarkus.bootstrap.runner.QuarkusEntryPoint.main(QuarkusEntryPoint.java:33)

AFAIK, the bug is on the Micrometer side, but it'd be necessary to upgrade Micrometer and backport it to the Quarkus 3.15 LTS.

The issue is probably touching the addition of meter filters after the creation of Meters described here: micrometer-metrics/micrometer#4920

Issue and PR for the Micrometer: micrometer-metrics/micrometer#5489

Expected behavior

Server properly started

Actual behavior

ConcurrentModificationException thrown during the startup

How to Reproduce?

Sequentially execute Keycloak multiple times (maybe plain Quarkus) with the Quarkus Micrometer extension.

Or execute multiple times this test: https://github.com/keycloak/keycloak/blob/main/quarkus/tests/integration/src/test/java/org/keycloak/it/cli/dist/StartCommandDistTest.java#L134

Output of uname -a or ver

Linux mabartos 6.10.6-100.fc39.x86_64 SMP PREEMPT_DYNAMIC Mon Aug 19 14:35:32 UTC 2024 x86_64 GNU/Linux

Output of java -version

openjdk 21.0.3 2024-04-16 LTS

Quarkus version or git rev

3.14.4

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.9.1 (Red Hat 3.9.1-3)

Additional information

No response

@mabartos mabartos added the kind/bug Something isn't working label Sep 24, 2024
Copy link

quarkus-bot bot commented Sep 24, 2024

/cc @brunobat (micrometer), @ebullient (micrometer)

@brunobat
Copy link
Contributor

brunobat commented Sep 24, 2024

Thanks @mabartos, as soon as the release is performed with your fix will create the upgrade and the backport.

@gsmet
Copy link
Member

gsmet commented Sep 24, 2024

What exactly causes us to have some concurrency issue here? I would have expected the setup to not be concurrent and done once and for all.
Wondering if there would be something to adjust in Quarkus (typically by constraining the order of the recorder methods for instance).

@brunobat
Copy link
Contributor

@gsmet this PR explains it: micrometer-metrics/micrometer#5490

@gsmet
Copy link
Member

gsmet commented Sep 24, 2024

Yes, I have seen the PR. But I'm curious as to why we end up accessing this set concurrently. I would have expected the initialization to be done sequentially and then the collection being used.

@geoand
Copy link
Contributor

geoand commented Sep 25, 2024

I wonder if this can be reproduced in a simpler context... This is very suspect...

@brunobat
Copy link
Contributor

They have a reproducer on the Micrometer side.

preFilterIdToMeterMap is being modified (a new entry added) while it is being iterated over.

@gsmet
Copy link
Member

gsmet commented Sep 26, 2024

OK, FWIW, my question was more to know if we could find a workaround while the issue is being addressed on the Micrometer side. But it seems like we need to wait.

@gsmet gsmet added the triage/upstream Used for issues which are caused by issues in upstream projects/dependency label Sep 26, 2024
@vmuzikar
Copy link

Given Micrometer 1.13 was already part of some Quarkus 3.14 releases, I assume it's not possible to temporarily downgrade back to Micrometer 1.12 due to the breaking changes. Correct?

@brunobat
Copy link
Contributor

Sorry @vmuzikar, we cannot downgrade. We'll need to wait for their fix.

@brunobat
Copy link
Contributor

The Micrometer side PR has been merged. Waiting for realease.

@brunobat brunobat moved this from Todo to In Progress in Quarkus Roadmap/Planning Sep 30, 2024
@vmuzikar
Copy link

vmuzikar commented Oct 2, 2024

@brunobat Micrometer with the fix has been released.

@brunobat
Copy link
Contributor

brunobat commented Oct 2, 2024

Will create the PR now

@github-project-automation github-project-automation bot moved this from In Progress to Done in Quarkus Roadmap/Planning Oct 2, 2024
@quarkus-bot quarkus-bot bot added this to the 3.16 - main milestone Oct 2, 2024
@gsmet gsmet modified the milestones: 3.16.0.CR1, 3.15.2 Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics kind/bug Something isn't working triage/upstream Used for issues which are caused by issues in upstream projects/dependency
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants