Skip to content

metrics: Remove labels from gauge metrics#4606

Merged
algorandskiy merged 1 commit into
algorand:masterfrom
iansuvak:remove_labels_from_gauges
Oct 6, 2022
Merged

metrics: Remove labels from gauge metrics#4606
algorandskiy merged 1 commit into
algorand:masterfrom
iansuvak:remove_labels_from_gauges

Conversation

@iansuvak
Copy link
Copy Markdown
Contributor

@iansuvak iansuvak commented Sep 29, 2022

Summary

Gauges have optional, currently unused, arguments allowing user to set labels on multiple values per single gauge struct. This has potential to leak memory if unbound datasets were used as labels.

Closes #3040

Test Plan

Current tests pass as well as the new modified one. The format of a line of output from WriteMetric is the same except for the labels added by the test:

old:
metric_test_name1{pid="5824",host="algo-suvak-mbp.lan",host_name="host_one",session_id="AFX-229",pid="123",data_host="host0"}:150
new
gauge_0{host_name="host_one",session_id="AFX-229",pid="5229",host="algo-suvak-mbp.lan"}:606

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Sep 29, 2022

CLA assistant check
All committers have signed the CLA.

@iansuvak iansuvak changed the title Remove labels from gauge metrics Bug-Fix: Remove labels from gauge metrics Sep 29, 2022
@algorandskiy algorandskiy requested a review from winder September 29, 2022 13:37
@algorandskiy algorandskiy changed the title Bug-Fix: Remove labels from gauge metrics metrics: Remove labels from gauge metrics Sep 29, 2022
@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 29, 2022

Codecov Report

Merging #4606 (8436b1c) into master (5be155b) will increase coverage by 0.03%.
The diff coverage is 91.30%.

@@            Coverage Diff             @@
##           master    #4606      +/-   ##
==========================================
+ Coverage   54.20%   54.23%   +0.03%     
==========================================
  Files         402      402              
  Lines       51830    51789      -41     
==========================================
- Hits        28096    28090       -6     
+ Misses      21370    21339      -31     
+ Partials     2364     2360       -4     
Impacted Files Coverage Δ
node/node.go 4.40% <0.00%> (ø)
util/metrics/gauge.go 81.08% <91.66%> (+15.69%) ⬆️
ledger/metrics.go 100.00% <100.00%> (ø)
network/wsNetwork.go 65.10% <100.00%> (+0.28%) ⬆️
catchup/service.go 68.88% <0.00%> (ø)
ledger/tracker.go 74.89% <0.00%> (+0.85%) ⬆️
agreement/proposalManager.go 98.03% <0.00%> (+1.96%) ⬆️
agreement/cryptoVerifier.go 69.71% <0.00%> (+2.11%) ⬆️
... and 1 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@iansuvak
Copy link
Copy Markdown
Contributor Author

Implemented a PR as suggested in the issue with removing labels functionality from gauges but not convinced that this is the right approach. This issue was filed as tech debt following #2508 which removed expiration functionality from gauges which could trigger memory leaks if used with unbound label dataset but the same functionality exists and is actually used in counters.go without expiry either.

My suggestion is to discard the issue and file a new one which would deal with both potential sources of memory leaks in a manner that doesn't remove functionality.

Gauges had optional, unusued, arguments allowing user to set labels on multiple values per single gauge struct. This had potential to leak memory if unbound datasets were used as labels.
@iansuvak iansuvak force-pushed the remove_labels_from_gauges branch from 2a4adc1 to 8436b1c Compare September 29, 2022 14:05
@iansuvak
Copy link
Copy Markdown
Contributor Author

Perhaps a better solution would be to allow labels but require labels to be associated with a enum bitmask so that no unbounded labels are allowed across both gauges and counters?

@cce cce removed the Bug-Fix label Sep 30, 2022
@winder winder requested a review from brianolson September 30, 2022 20:48
Copy link
Copy Markdown
Contributor

@winder winder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me. I tagged Brian, since he's actually using these gauges.

Copy link
Copy Markdown
Contributor

@brianolson brianolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! For bonus points, note that all calls to Set() (there are no calls to Add()) are casting to float64 and that's weird. If they took int64 or uint64 we could use sync/atomic which is faster than mutex.Lock()/Unlock()

@iansuvak
Copy link
Copy Markdown
Contributor Author

iansuvak commented Oct 3, 2022

Yes! For bonus points, note that all calls to Set() (there are no calls to Add()) are casting to float64 and that's weird. If they took int64 or uint64 we could use sync/atomic which is faster than mutex.Lock()/Unlock()

I can do it but Registry.metrics expectsmap[string]float64 I would have to do the cast and mutex locking before writing the metrics out in that case. Should I go ahead and do it? and I assume we should be using uint64 since one of the gauges notes the round number which is uint64

@brianolson
Copy link
Copy Markdown
Contributor

I think the bonus round of making gauges faster through sync/atomic should wait for a second PR and this one should just go in as is.

@algorandskiy algorandskiy merged commit c5ac7a8 into algorand:master Oct 6, 2022
@iansuvak iansuvak deleted the remove_labels_from_gauges branch November 1, 2022 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory leak in unused metric label feature

6 participants