Replace ArrayBlockingQueue with jctools queue. #3034
Changes from 6 commits
New file - Gradle build script for the shaded JCTools dependency:

@@ -0,0 +1,22 @@
plugins {
  `java-library`

  id("com.github.johnrengelman.shadow")
}

// This project is not published, it is bundled into :sdk:trace

description = "Internal use only - shaded dependencies of OpenTelemetry SDK for Tracing"
extra["moduleName"] = "io.opentelemetry.sdk.trace.internal"

dependencies {
  implementation("org.jctools:jctools-core")
}

tasks {
  shadowJar {
    minimize()

    relocate("org.jctools", "io.opentelemetry.internal.shaded.jctools")
  }
}
New file - the JcTools accessor class:

@@ -0,0 +1,31 @@
/*
 * Copyright The OpenTelemetry Authors
 * SPDX-License-Identifier: Apache-2.0
 */

package io.opentelemetry.sdk.trace.internal;

import java.util.Queue;
import org.jctools.queues.MessagePassingQueue;
import org.jctools.queues.MpscArrayQueue;

/** Internal accessor of JCTools package for fast queues. */
public final class JcTools {

  /**
   * Returns a new {@link Queue} appropriate for use with multiple producers and a single consumer.
   */
  public static <T> Queue<T> newMpscArrayQueue(int capacity) {
    return new MpscArrayQueue<>(capacity);
  }

  /**
   * Returns the capacity of the {@link Queue}, which must be a JcTools queue. We cast to the
   * implementation so callers do not need to use the shaded classes.
   */
  public static long capacity(Queue<?> queue) {
    return ((MessagePassingQueue<?>) queue).capacity();
  }

  private JcTools() {}
}

Review thread on the cast to MessagePassingQueue:
- You can use MpscArrayQueue right? It does have capacity()
- We're shading it and using this in a test in a different artifact where we don't want to have to reference the shaded class. I could cast to …
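For context, a minimal usage sketch of this accessor (the surrounding class and values are made up for illustration). The returned queue is a plain java.util.Queue, so callers use non-blocking offer/poll rather than BlockingQueue's put/take:

import io.opentelemetry.sdk.trace.internal.JcTools;
import java.util.Queue;

class JcToolsUsageSketch {
  public static void main(String[] args) {
    Queue<String> queue = JcTools.newMpscArrayQueue(2048);

    // Producers (possibly many threads) call offer(), which returns false when the
    // queue is full instead of blocking like BlockingQueue.put().
    boolean accepted = queue.offer("span-1");

    // The single consumer drains with poll(), which returns null when empty.
    String next = queue.poll();

    // capacity() goes through the accessor so callers (e.g. tests in another
    // artifact) never reference the shaded MessagePassingQueue type directly.
    long capacity = JcTools.capacity(queue);

    System.out.println(accepted + " " + next + " capacity=" + capacity);
  }
}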
Benchmark state for the BatchSpanProcessor benchmark:

@@ -42,7 +42,7 @@ public static class BenchmarkState {
     private long exportedSpans;
     private long droppedSpans;

-    @Setup(Level.Iteration)
+    @Setup(Level.Trial)
     public final void setup() {
       sdkMeterProvider = SdkMeterProvider.builder().buildAndRegisterGlobal();
       SpanExporter exporter = new DelayingSpanExporter(delayMs);

Review thread on the @Setup change:
- I think this will mess up the exporter metrics.
- Since collecting metrics clears the current values, I think it might be ok. Either way, either this needs to be …
- Could you please run this benchmark and verify the exportedSpans metric is valid, i.e. current values do get cleared after collecting them at the end of each iteration.
- It does keep going up now - I don't think we should need to restart the BSP for it though. @jkwatson Do you know a nice way to aggregate the metrics into a rate for reporting in the JMH benchmark?
- I don't understand the question. Is something not working about the way things are before this change? Also, should we be doing a … And, honestly, I'm not sure I understand the purpose of this benchmark in the first place. I ran this on the main branch, and for the 20-thread case, we drop almost all of the spans. Is the goal to see if we can make the BSP drop fewer spans if we can improve things?
- I think this change still needs to be reverted @anuraaga
- Ok, I went ahead and changed the shutdown to be per-iteration too then; need one of the changes to make sure threads are closed or JMH complains (at least to me). I don't think we actually wanted to initialize a whole BSP (worker thread, etc.) per iteration though.
- What complaints did you get from JMH?
- Something about threads not all being shut down, waiting XXX seconds for them to shut down. Which is definitely true, since currently we don't call shutdown on every created BSP.
- Ah yes. I had been wondering about that. Good find!
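To make the lifecycle discussion above concrete, here is a rough sketch of the per-iteration setup/teardown pattern the thread converges on. This is an assumed shape only, not the benchmark code from this PR: the class names and the NoopExporter stand-in are hypothetical.

import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.sdk.common.CompletableResultCode;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.data.SpanData;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;
import io.opentelemetry.sdk.trace.export.SpanExporter;
import java.util.Collection;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

public class BatchSpanProcessorLifecycleSketch {

  @State(Scope.Benchmark)
  public static class BenchmarkState {
    SdkTracerProvider tracerProvider;
    Tracer tracer;

    @Setup(Level.Iteration)
    public void setup() {
      // A fresh BatchSpanProcessor (and its worker thread) for every iteration.
      BatchSpanProcessor processor = BatchSpanProcessor.builder(new NoopExporter()).build();
      tracerProvider = SdkTracerProvider.builder().addSpanProcessor(processor).build();
      tracer = tracerProvider.get("benchmark");
    }

    @TearDown(Level.Iteration)
    public void tearDown() {
      // Shutting down the provider shuts down the processor and stops its worker
      // thread; without this, JMH warns about benchmark threads that did not exit.
      tracerProvider.shutdown().join(10, TimeUnit.SECONDS);
    }
  }

  @Benchmark
  public void endSpan(BenchmarkState state) {
    state.tracer.spanBuilder("span").startSpan().end();
  }

  // Minimal exporter that accepts everything; stands in for the PR's DelayingSpanExporter.
  static final class NoopExporter implements SpanExporter {
    @Override
    public CompletableResultCode export(Collection<SpanData> spans) {
      return CompletableResultCode.ofSuccess();
    }

    @Override
    public CompletableResultCode flush() {
      return CompletableResultCode.ofSuccess();
    }

    @Override
    public CompletableResultCode shutdown() {
      return CompletableResultCode.ofSuccess();
    }
  }
}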
BatchSpanProcessor.java:

@@ -17,8 +17,10 @@
 import io.opentelemetry.sdk.trace.ReadableSpan;
 import io.opentelemetry.sdk.trace.SpanProcessor;
 import io.opentelemetry.sdk.trace.data.SpanData;
+import io.opentelemetry.sdk.trace.internal.JcTools;
 import java.util.ArrayList;
 import java.util.Collections;
+import java.util.Queue;
 import java.util.concurrent.ArrayBlockingQueue;
 import java.util.concurrent.BlockingQueue;
 import java.util.concurrent.TimeUnit;

Review comment on the new import: Now this comment https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/trace/src/main/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessor.java#L40 is not relevant any more!
@@ -73,7 +75,7 @@ public static BatchSpanProcessorBuilder builder(SpanExporter spanExporter) {
             scheduleDelayNanos,
             maxExportBatchSize,
             exporterTimeoutNanos,
-            new ArrayBlockingQueue<>(maxQueueSize));
+            JcTools.newMpscArrayQueue(maxQueueSize));

     Thread workerThread = new DaemonThreadFactory(WORKER_THREAD_NAME).newThread(worker);
     workerThread.start();
   }

Review thread on the queue construction:
- Note that MpscArrayQueue rounds the queue size up to a power of 2 for various perf reasons. In my opinion it is better to enforce this so users know how much memory is actually getting allocated.
- Do you mean falling back to …
- I meant enforcing the maxQueueSize to be a power of 2.
- That we can't do - we don't want to lose usability (adding restrictions that can only be conveyed through documentation or error messages) here. Would like to hear more thoughts on whether we should fall back if it's not a power of 2.
- Falling back is not great really, since it is not an efficient solution. How about calling it out in the documentation that the queue size is rounded to the next power of 2?
- I added a note to the builder that some more memory may be allocated, without going too much into implementation detail.
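To make the rounding behaviour discussed above concrete, here is a small sketch using the JcTools accessor introduced in this PR; it assumes the default maxQueueSize of 2048, and the printed values are expectations based on MpscArrayQueue's power-of-two rounding rather than output copied from the PR.

import io.opentelemetry.sdk.trace.internal.JcTools;
import java.util.Queue;

class QueueCapacityRoundingSketch {
  public static void main(String[] args) {
    // The default BatchSpanProcessor queue size is 2048, already a power of two,
    // so the switch to MpscArrayQueue allocates the same number of slots.
    Queue<Object> defaultSized = JcTools.newMpscArrayQueue(2048);

    // A non-power-of-two request is rounded up by MpscArrayQueue; this is the
    // "some more memory may be allocated" note added to the builder docs.
    Queue<Object> oddSized = JcTools.newMpscArrayQueue(3000);

    System.out.println(JcTools.capacity(defaultSized)); // expected: 2048
    System.out.println(JcTools.capacity(oddSized)); // expected: 4096 (next power of two)
  }
}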
@@ -131,7 +133,8 @@ private static final class Worker implements Runnable {
     private final long exporterTimeoutNanos;

     private long nextExportTime;
-    private final BlockingQueue<ReadableSpan> queue;
+    private final Queue<ReadableSpan> queue;

     // When waiting on the spans queue, exporter thread sets this atomic to the number of more
     // spans it needs before doing an export. Writer threads would then wait for the queue to reach
     // spansNeeded size before notifying the exporter thread about new entries.
@@ -149,7 +152,7 @@ private Worker(
         long scheduleDelayNanos,
         int maxExportBatchSize,
         long exporterTimeoutNanos,
-        BlockingQueue<ReadableSpan> queue) {
+        Queue<ReadableSpan> queue) {
       this.spanExporter = spanExporter;
       this.scheduleDelayNanos = scheduleDelayNanos;
       this.maxExportBatchSize = maxExportBatchSize;

Review thread on the capacity check:
- The existing logic is verifying the remainingCapacity(). Do you want to do the same?
- Nah, it only used that method since the JDK only provides that one, but capacity is what we're checking.
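Since java.util.Queue has no remainingCapacity(), an equivalent best-effort check can be derived from the fixed capacity. The helper below is a hypothetical sketch of that idea (not code from this PR); it relies on the JcTools accessor, so the queue must be one created by JcTools.newMpscArrayQueue.

import io.opentelemetry.sdk.trace.internal.JcTools;
import java.util.Queue;

final class QueueCapacityCheck {
  private QueueCapacityCheck() {}

  // Equivalent of BlockingQueue.remainingCapacity() for a bounded JCTools queue:
  // fixed capacity minus current size. size() is a best-effort estimate under
  // concurrent access, which is enough for a check like this.
  static long remainingCapacity(Queue<?> queue) {
    return JcTools.capacity(queue) - queue.size();
  }

  static boolean isFull(Queue<?> queue) {
    return remainingCapacity(queue) <= 0;
  }
}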