Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] Expose jmx endpoint for sentinel #1947

Closed
wants to merge 16 commits into from

Conversation

brotherlu-xcq
Copy link
Collaborator

@brotherlu-xcq brotherlu-xcq commented Jan 6, 2021

Describe what this PR does / why we need it

expose the metric in Sentinel as jmx bean, which can make the Sentinel metric collected by popular monitor tools like prometheus.

Does this pull request fix one issue?

Fixs #1814

Describe how you did it

Describe how to verify it

visit the resource which control by Sentinel, the open your JMX tool. you will find the metric info like below.
image
the root type is Sentinel. the second type is your sentinel application name, the JMX bean name is you resource name. the metric data is same with the MetricNode which is writed into metric file.

Special notes for reviews

@sczyh30 sczyh30 added the area/metrics Issues or PRs related to metrics and monitoring label Jan 6, 2021
@sczyh30 sczyh30 requested review from zhaoyuguang, a team, sczyh30 and jasonjoo2010 January 6, 2021 06:53
@sczyh30 sczyh30 added the kind/feature Category issues or prs related to feature request. label Jan 15, 2021
@sczyh30 sczyh30 requested a review from cdfive January 15, 2021 15:20
Comment on lines 59 to 67
private static final ScheduledExecutorService SCHEDULER = Executors.newScheduledThreadPool(1,
private static final ScheduledExecutorService SCHEDULER = Executors.newScheduledThreadPool(2,
new NamedThreadFactory("sentinel-metrics-record-task", true));

static {
flowRules.set(Collections.<String, List<FlowRule>>emptyMap());
currentProperty.addListener(LISTENER);
startMetricTimerListener();
SCHEDULER.scheduleAtFixedRate(new MetricBeanTimerListener(), 0, 1, TimeUnit.SECONDS);
}
Copy link
Collaborator

@cdfive cdfive Jan 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are two thread to execute, node.metrics() is called separately in MetricBeanTimerListener and MetricTimerListener, is there data inaccuracy in multithreading?
Tried with FlowQpsDemo, the PassQps,BlockQps are always 0 in jconsole.

com.alibaba.csp.sentinel.node.metric.MetricWriter#write
com.alibaba.csp.sentinel.node.metric.jmx.MetricBeanWriter#write
When the two operations are put in a same thread, it works fine and got the expected result in jconsole.

Maybe that's the reason, as the comment mentioned in com.alibaba.csp.sentinel.node.StatisticNode#metrics,
The fetch operation is thread-safe under a single-thread scheduler pool.

@Override
public Map<Long, MetricNode> metrics() {
    // The fetch operation is thread-safe under a single-thread scheduler pool.
    long currentTime = TimeUtil.currentTimeMillis();
    currentTime = currentTime - currentTime % 1000;
    ...

    return metrics;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you to review my code, I will check the codes and sovle it.

@@ -113,7 +113,7 @@
private long lastFetchTime = -1;

@Override
public Map<Long, MetricNode> metrics() {
public synchronized Map<Long, MetricNode> metrics() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdfive because the two timer have different bussiness scenes, so I am not put them in one timer and add the synchronized to ensure the thread-safe. could you please help to review this when you you are free? I ran the local test and can find the data in jvm tool.

Copy link
Collaborator

@cdfive cdfive Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run the FlowQpsDemo, and start jconsole, still can't find Sentinel node in jconsole UI. Then I make a breakpoint at com.alibaba.csp.sentinel.node.metric.jmx.MetricBeanTimerListener#aggregate, never entered.
Then make another breakapoint at com.alibaba.csp.sentinel.node.metric.MetricTimerListener#aggregate,
it entered, and resume running, the first breakpoint entered, and Sentinel node appeared, but the the PassQps,BlockQps are always 0.

I wonder that once the StatisticNode#metrics have been invoked, the data will be changed.

If comment the startMetricBeanTimerListener out and merge the logic of MetricBeanTimerListener#run into MetricTimerListener#run and only invoke StatisticNode#metrics once, can get the expected result in jconsole.

could you please try the demo and give some help? @sczyh30 @brothelul

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try the demo and re-produce the problem you met again, thank you so much!

@brotherlu-xcq brotherlu-xcq mentioned this pull request Jan 21, 2021
@@ -40,11 +40,12 @@ public void run() {

private List<MetricNode> getLastMetrics(ClusterNode node) {
final long currentTime = TimeUtil.currentTimeMillis();
final long minTime = currentTime - currentTime % 1000;
final long maxTime = currentTime - currentTime % 1000;
final long minTime = maxTime - 1000;
return node.rawMetricsInMin(new Predicate<Long>() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdfive I change the poolSize of SCHEDULER into 1, so the two listener will run one by one in thread-safe. And I use the method rawMetricsInMin instead of metrics to get the metrics in last sencod so that the JMX will be same as metrics in file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdfive any idea about this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brothelul Nice work! I can get the expected result in jconsole now.

Comment on lines 70 to 77
/**
* start the MetricBeanTimerListener at the fix rate, the initialDelay is different with the {@link #startMetricTimerListener()},
* so that the two task will start running at different time.
*/
private static void startMetricBeanTimerListener() {
SCHEDULER.scheduleAtFixedRate(new MetricBeanTimerListener(), 0, 1, TimeUnit.SECONDS);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that we can config the interval with csp.sentinel.metric.flush.interval in startMetricTimerListener,
but startMetricBeanTimerListener has one second fixed interval. Do we need to make it configurable? At present, this function is enabled by default, maybe some users want to disable it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's a good idea! I will change it later.

@cdfive
Copy link
Collaborator

cdfive commented Feb 2, 2021

Nice work! With this jmx endpoint, it will be helpful when collecting monitor data.
Any other idea? @sczyh30 @jasonjoo2010

BTW, the issue which this PR is to fix seems not corresponding. Maybe another issue? @brothelul

Does this pull request fix one issue?
Fixs #1848

@brotherlu-xcq
Copy link
Collaborator Author

yes, it's #1814, not #1848. I changed it.

@sczyh30 sczyh30 added this to the 1.8.2 milestone Feb 3, 2021
Set<String> existResource = new HashSet<>();
// set or update the new value
for (MetricNode metricNode : map.values()) {
final String mBeanName = "Sentinel:type=" + appName + ",name=\"" + metricNode.getResource() +"\"";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is recommended to add classification here, so that Prometheus can search according to classification.
ex:
final String mBeanName = "Sentinel:type=" + appName + ",name=\"" + metricNode.getResource() +"\"" + ",classification=\"" + metricNode.getClassification() +"\"";

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sczyh30 @cdfive Do you think so? I am not clear about the work of Prometheus?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been compiled by me for production environment. 😄 You can refer to the mbean of tomcat, which contains the port number.

return;
}
for (MetricBean metricBean : metricBeans) {
final String mBeanName = "Sentinel:type=" + appName + ",name=\"" + metricBean.getResource() +"\"";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here as above.
ex:
final String mBeanName = "Sentinel:type=" + appName + ",name=\"" + metricBean.getResource() +"\"" + ",classification=\"" + metricBean.getClassification() +"\"";

@sczyh30
Copy link
Member

sczyh30 commented Jun 24, 2021

Nice work. IMHO, how about make it an extension module (users could enable it by just adding a dependency of the extension module)? This could decouple metrics exporters from FlowRuleManager and core (actually initializing it in static block of FlowRuleManager is not a good design...)

@SuppressWarnings("PMD.ThreadPoolCreationRule")
private static final ScheduledExecutorService SCHEDULER = Executors.newScheduledThreadPool(1,
new NamedThreadFactory("sentinel-metrics-record-task", true));

static {
flowRules.set(Collections.<String, List<FlowRule>>emptyMap());
currentProperty.addListener(LISTENER);
startMetricBeanTimerListener();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could initialize it via InitFunc SPI mechanism?

@brotherlu-xcq
Copy link
Collaborator Author

Nice work. IMHO, how about make it an extension module (users could enable it by just adding a dependency of the extension module)? This could decouple metrics exporters from FlowRuleManager and core (actually initializing it in static block of FlowRuleManager is not a good design...)

I am really agree with you that export the metric function to other module instead of in core module. I want still work on this feature, and make a design first about it.

@sczyh30 sczyh30 modified the milestones: 1.8.2, 1.8.3 Jun 28, 2021
@brotherlu-xcq
Copy link
Collaborator Author

This function need some change after we discussed. please do not merge it before I re-submit the new implement.

@sczyh30 sczyh30 removed this from the 1.8.3 milestone Sep 15, 2021
@sczyh30
Copy link
Member

sczyh30 commented Sep 15, 2021

#2275 has been merged

@sczyh30 sczyh30 closed this Sep 15, 2021
@brotherlu-xcq brotherlu-xcq deleted the feature-#1848 branch September 15, 2021 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics Issues or PRs related to metrics and monitoring kind/feature Category issues or prs related to feature request.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants