Prevent duplicate tags in router spans added by dynamic attributes#8865
Prevent duplicate tags in router spans added by dynamic attributes#8865
Conversation
✅ Docs preview has no changesThe preview was not built because there were no changes. Build ID: c020198b5ace5c5d780bc300 |
This comment has been minimized.
This comment has been minimized.
| pub(crate) fn insert(&mut self, kv: KeyValue) { | ||
| self.attributes.push(kv); | ||
| // Replace existing attribute with same key, or add new one | ||
| if let Some(existing) = self.attributes.iter_mut().find(|a| a.key == kv.key) { |
There was a problem hiding this comment.
I'm not sure if there's any point of keeping this as a Vec<KeyValue> at this point, should probably be changed to a HashMap<Key, Value>
There was a problem hiding this comment.
I did some brief testing on this against ahash::HashMap and the results are pretty much as expected:
Comparing Vec<KeyValue> vs HashMap<Key, Value> for deduplication
--- Insert single attributes (10 unique keys) ---
Vec: insert 10 unique 10000 iterations, mean: 458.00ns, min: 125.00ns, max: 37.46µs
HashMap: insert 10 unique 10000 iterations, mean: 769.00ns, min: 250.00ns, max: 78.17µs
--- Insert with duplicates (initial 10 + 6 with 2 duplicates) ---
Vec: initial + dynamic with dupes 10000 iterations, mean: 377.00ns, min: 166.00ns, max: 22.63µs
HashMap: initial + dynamic with dupes 10000 iterations, mean: 642.00ns, min: 333.00ns, max: 35.00µs
--- Extend with unique attributes (batch of 20) ---
Vec: extend 20 unique 10000 iterations, mean: 2.57µs, min: 1.88µs, max: 55.83µs
HashMap: extend 20 unique 10000 iterations, mean: 2.12µs, min: 1.79µs, max: 91.17µs
--- Extend with 50% duplicates (batch of 20, 10 unique keys) ---
Vec: extend 20 with 50% dupes 10000 iterations, mean: 1.77µs, min: 1.54µs, max: 54.17µs
HashMap: extend 20 with 50% dupes 10000 iterations, mean: 1.54µs, min: 1.21µs, max: 125.38µs
--- Large scale: 50 unique attributes ---
Vec: extend 50 unique 10000 iterations, mean: 5.29µs, min: 4.83µs, max: 52.42µs
HashMap: extend 50 unique 10000 iterations, mean: 4.33µs, min: 3.83µs, max: 45.08µs
--- Large scale: 100 attributes, 50% duplicates ---
Vec: extend 100 with 50% dupes 10000 iterations, mean: 10.41µs, min: 9.00µs, max: 876.46µs
HashMap: extend 100 with 50% dupes 10000 iterations, mean: 7.34µs, min: 6.58µs, max: 58.17µs
--- Realistic: pre-populate 10, then extend with 6 (2 dupes) ---
Vec: realistic scenario 10000 iterations, mean: 264.00ns, min: 166.00ns, max: 709.00ns
HashMap: realistic scenario 10000 iterations, mean: 306.00ns, min: 208.00ns, max: 458.00ns
--- Small scale: 3 attributes ---
Vec: insert 3 unique 10000 iterations, mean: 66.00ns, min: 0.00ns, max: 21.42µs
HashMap: insert 3 unique 10000 iterations, mean: 83.00ns, min: 0.00ns, max: 209.00ns
=== Benchmark Complete ===
I think the Vec implementation is worth keeping as the complexity of avoiding duplicates is low
carodewig
left a comment
There was a problem hiding this comment.
Overall looks good - great find! I've added a few minor comments around refactoring but they're not critical
| let method = attrs | ||
| .attributes() | ||
| .iter() | ||
| .find(|kv| kv.key.as_str() == "http.method") | ||
| .unwrap(); | ||
| assert_eq!(method.value.as_str(), Cow::Borrowed("POST")); | ||
|
|
||
| let route = attrs | ||
| .attributes() | ||
| .iter() | ||
| .find(|kv| kv.key.as_str() == "http.route") | ||
| .unwrap(); | ||
| assert_eq!(route.value.as_str(), Cow::Borrowed("/new")); | ||
|
|
||
| let status = attrs | ||
| .attributes() | ||
| .iter() | ||
| .find(|kv| kv.key.as_str() == "http.status") | ||
| .unwrap(); | ||
| assert_eq!(status.value.as_str(), Cow::Borrowed("200")); |
There was a problem hiding this comment.
| let method = attrs | |
| .attributes() | |
| .iter() | |
| .find(|kv| kv.key.as_str() == "http.method") | |
| .unwrap(); | |
| assert_eq!(method.value.as_str(), Cow::Borrowed("POST")); | |
| let route = attrs | |
| .attributes() | |
| .iter() | |
| .find(|kv| kv.key.as_str() == "http.route") | |
| .unwrap(); | |
| assert_eq!(route.value.as_str(), Cow::Borrowed("/new")); | |
| let status = attrs | |
| .attributes() | |
| .iter() | |
| .find(|kv| kv.key.as_str() == "http.status") | |
| .unwrap(); | |
| assert_eq!(status.value.as_str(), Cow::Borrowed("200")); | |
| let expected_kvs = [ | |
| ("http.method", "POST"), | |
| ("http.route", "/new"), | |
| ("http.status", "200"), | |
| ]; | |
| for (key, expected_value) in expected_kvs { | |
| let value = attrs.attributes().iter().find(|kv| kv.key.as_str() == key).unwrap().value; | |
| assert_eq!(value.as_str(), Cow::Borrowed(expected_value), "key = {}", key); | |
| } |
Nit: thoughts on something like this? I don't have super strong feelings about it but do think it's a bit easier to read.
(Would also apply to test_event_attributes_extend_replaces_existing)
| if let Some(existing) = self.attributes.iter_mut().find(|a| a.key == kv.key) { | ||
| *existing = kv; | ||
| } else { | ||
| self.attributes.push(kv); | ||
| } |
There was a problem hiding this comment.
Should this just use self.insert? You could annotate the insert function with #[inline] if the function call overhead is concerning.
There was a problem hiding this comment.
Same comment RE the various uses of self.attributes.iter_mut below - minor, but it would be nice to centralize the logic so that we don't inadvertently reintroduce this bug later
There was a problem hiding this comment.
Good catch, I've introduced the upsert_attribute fn to reduce the duplication
carodewig
left a comment
There was a problem hiding this comment.
That refactor made a big difference IMO - looks great!
When dynamic attributes are added via
SpanDynAttribute::insert,SpanDynAttribute::extend,LogAttributes::insert,LogAttributes::extend,EventAttributes::insertorEventAttributes::extendand the key already exists, it will now replace the existing value to avoid duplicate attributes.Checklist
Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.
Exceptions
Note any exceptions here
Notes
Footnotes
It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this. ↩
Configuration is an important part of many changes. Where applicable please try to document configuration examples. ↩
A lot of (if not most) features benefit from built-in observability and
debug-level logs. Please read this guidance on metrics best-practices. ↩Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions. ↩