-
Notifications
You must be signed in to change notification settings - Fork 1.9k
IGNITE-12108 TCP Communication Metrics ported to a new framework. #6814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * @param mreg Metrics registry. | ||
| */ | ||
| protected GridAbstractCommunicationClient(int connIdx, @Nullable GridNioMetricsListener metricsLsnr) { | ||
| protected GridAbstractCommunicationClient(int connIdx, @Nullable MetricRegistry mreg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why mreg can be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class might be used by components other than TCP SPI. You can see that "GridNioMetricsListener" was nullable as well.
| IgniteBiInClosure<GridNioSession, Integer> msgQueueLsnr, | ||
| boolean readWriteSelectorsAssign, | ||
| @Nullable GridWorkerListener workerLsnr, | ||
| @Nullable MetricRegistry mreg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why mreg can be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me please argue with you about this particular constructor =)
I totally forgot that it's a private constructor and only "GridNioServer.Builder#build" invokes it. There's no point in splitting it because no one will invoke other constructor with fewer parameters. Let's leave it as is.
Just in case - GridAbstractCommunicationClient constructor will be fixed as you asked. I't not about arguing for the sake of arguing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Ilya.
Please, let me know when PR becomes ready to be reviewed.
| return -1; | ||
|
|
||
| return res; | ||
| return (int) mreg.longAdderMetric( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should store LongMetric inside object instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it, can you please explain this comment in more details? Is this about excessive memory allocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about unnecessary Map#get operation.
Every 'mreg.longAdderMetrictranslates toMap#get`.
Why do we need it? We can store the variable, isn't it?
| RECEIVED_MESSAGES_BY_TYPE_METRIC_DESC | ||
| ); | ||
|
|
||
| sentMsgsCntByNodeIdMetricFactory = nodeId -> mmgr.registry(COMMUNICATION_METRICS_GROUP_NAME + "." + nodeId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use MetricUtils#metricName instead of COMMUNICATION_METRICS_GROUP_NAME + "." + nodeId
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| public static final short HANDSHAKE_WAIT_MSG_TYPE = -28; | ||
|
|
||
| /** Communication metrics group name. */ | ||
| public static final String COMMUNICATION_METRICS_GROUP_NAME = "communication.tcp"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use MetricUtils#metricName instead of "communication.tcp".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
|
||
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's declare all constants, first, and methods after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { | ||
| return "sentMessagesByType." + directType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use MetricUtils#metricName instead of this and similar methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
| this.log = rootLog.getLogger(getClass()); | ||
| this.jmx = prepareMBeanServer(); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log, this.cfg)); | ||
| log = rootLog.getLogger(getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"this." was not necessary so I removed it while adding new field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, revert unnecessary changes.
| this.jmx = jmx; | ||
| this.log = rootLog.getLogger(getClass()); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log)); | ||
| log = rootLog.getLogger(getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, revert unnecessary changes.
| this.log = log.getLogger(getClass()); | ||
| this.jmx = prepareMBeanServer(); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log)); | ||
| jmx = prepareMBeanServer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, revert unnecessary changes.
| new GridResourceLoggerInjector(ctx.config().getGridLogger()); | ||
| injectorByAnnotation[GridResourceIoc.ResourceAnnotation.IGNITE_INSTANCE.ordinal()] = | ||
| new GridResourceBasicInjector<>(ctx.grid()); | ||
| injectorByAnnotation[GridResourceIoc.ResourceAnnotation.METRIC_MANAGER.ordinal()] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need new GridResourceSupplierInjector?
Can we use existing GridResourceBasicInejctor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because supplier injector is lazy. In this particular place metric manager might not be initialized yet.
|
|
||
| this.formatter = formatter; | ||
|
|
||
| sentBytesCntMetric = mreg == null ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mreg can't be null here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thank you!
| * Gets sent messages count. | ||
| * | ||
| * @return Sent messages count. | ||
| * @deprecated Will be removed in the next major release and replaced with new metrics API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should inform user - how he(or she) should obtain information previously gathered from deprecated method.
Let's write "Use metric 'xxx.yyy.zzz' instead" where xxx.yyy.zzz the name of the metric providing required number.
Please, apply this to other deprecated method comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need you advice about the comment. Are there any plans to cancel JMX support for these values? I didn't think about it long enough. Maybe just remove that Deprecated annotation.
Anyway, specifying some metrics names here would be wrong, CommunicationSpi is interface after all and some implementations may want to use some other specific metric names, or no metrics at all.
| } | ||
|
|
||
| /** */ | ||
| public static String receivedMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this method to TcpCommunicationMetricsListener.
As I can see it used only there.
| public static final String RECEIVED_MESSAGES_BY_NODE_ID_METRIC_DESC = "Total number of messages received by current node from the given node"; | ||
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this method to TcpCommunicationMetricsListener.
As I can see it used only there.
| } | ||
| }; | ||
| /** Sent bytes count metric.*/ | ||
| private final AtomicLongMetric sentBytesMetric; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep LongAdder as an internal storage for all metrics that was based on it previously.
You can use MetricRegistry#longAdderMetric for it.
|
|
||
| /** Sent bytes count.*/ | ||
| private final LongAdder sentBytesCnt = new LongAdder(); | ||
| /** */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add javadoc for this variable.
| /** All registered metrics. */ | ||
| private final Set<ThreadMetrics> allMetrics = Collections.newSetFromMap(new ConcurrentHashMap<>()); | ||
| /** */ | ||
| private final Function<Short, AtomicLongMetric> rcvdMsgsCntByTypeMetricFactory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add javadoc for this variable.
| @Override protected ThreadMetrics initialValue() { | ||
| ThreadMetrics metrics = new ThreadMetrics(); | ||
| /** */ | ||
| private final Function<Object, AtomicLongMetric> sentMsgsCntByConsistentIdMetricFactory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add javadoc for this variable.
| ConcurrentHashMap<UUID, AtomicLong> rcvdMsgsMetricsByNodeId = new ConcurrentHashMap<>(); | ||
|
|
||
| /** Sent messages count metrics grouped by message node consistent id. */ | ||
| ConcurrentHashMap<Object, AtomicLongMetric> sentMsgsMetricsByConsistentId = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should change the way we collect metrics in this PR.
Let's keep existing behaviour?
This also will simplify PR a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed that because in old implementation we stored basically the same map in each thread separately. That's the obvious answer.
The other thing is that old implementation had concurrency issues that no one cared about. Like broken values visibility or potential ConcurrentModificationExceptions. I wanted new implementation to be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You introduced metrics based on node consistency id.
I think we should avoid introducing new metrics here.
The only goal of this ticket is to migrate existing metrics to the new framework.
Let's
- remove metrics based on consistency id.
- revert changes in
ConnectionKeyandTcpCommunicationSpithat made to support it.
|
@ibessonov Thanks for the PR update. Looks much better now! |
| import org.apache.ignite.plugin.extensions.communication.MessageFormatter; | ||
| import org.jetbrains.annotations.Nullable; | ||
|
|
||
| import static org.apache.ignite.internal.util.nio.GridNioServer.SENT_BYTES_METRIC_DESC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unused import.
|
|
||
| /** | ||
| * Resets metrics for this SPI instance. | ||
| * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unnecessary change.
All other LGTM. Thanks! |
|
@ibessonov As we discussed privately we should resolve potential contention on We also should perform a yardstick check of these changes. |
| private static synchronized MetricRegistry getOrCreateMetricRegistry(GridMetricManager mmgr, UUID nodeId) { | ||
| String regName = MetricUtils.metricName(COMMUNICATION_METRICS_GROUP_NAME, nodeId.toString()); | ||
|
|
||
| for (MetricRegistry mreg : mmgr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can eliminate iteration on all registries if we use GridMetricManager#addMetricRegistryCreationListener to implement registry initialization logic.
mmgr.addMetricRegistryCreationListener(mreg -> {
if (!mreg.name().startsWith(COMMUNICATION_METRICS_GROUP_NAME))
return;
mreg.longAdderMetric(SENT_MESSAGES_BY_NODE_ID_METRIC_NAME, SENT_MESSAGES_BY_NODE_ID_METRIC_DESC);
mreg.longAdderMetric(RECEIVED_MESSAGES_BY_NODE_ID_METRIC_NAME, RECEIVED_MESSAGES_BY_NODE_ID_METRIC_DESC);
});
| res.put(typeName, ((LongMetric)metric).value()); | ||
| } | ||
| } | ||
| catch (NumberFormatException ignore) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's print the warning to log here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just remove it, it's excessive
...e/src/main/java/org/apache/ignite/spi/communication/tcp/TcpCommunicationMetricsListener.java
Show resolved
Hide resolved
|
|
||
| res.put(nodeId, mreg.<LongMetric>findMetric(metricName).value()); | ||
| } | ||
| catch (IllegalArgumentException ignore) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't swallow exception here. Let's remove empty catch block.
…cks; metrics registration moved to listeners.
|
LGTM. @ibessonov Thank you, so much for the valuable contribution! |
No description provided.