Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Metrics SDK] Fix hash calculation for nostd::string #2999

Merged
merged 22 commits into from
Jul 17, 2024

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Jul 10, 2024

Fixes #2997

Changes

Step 1:

Added test to validate the hash creation issue, where the Hash is being calculated on the address of key , and not it's value as below:

TEST(AttributesHashMap, HashWithKeyValueIterable)

 // Create mock KeyValueIterable instances with the same content but different variables
  MockKeyValueIterable attributes1({{make_unique_string("k1"), make_unique_string("v1")},
                                    {make_unique_string("k2"), make_unique_string("v2")}});
  MockKeyValueIterable attributes2({{make_unique_string("k1"), make_unique_string("v1")},
                                    {make_unique_string("k2"), make_unique_string("v2")}});
  MockKeyValueIterable attributes3({{make_unique_string("k1"), make_unique_string("v1")},
                                    {make_unique_string("k2"), make_unique_string("v2")},
                                    {make_unique_string("k3"), make_unique_string("v3")}});

Most existing tests directly use key and value as literals, and these literals share a common address from the read-only data segment. This masked the issue, and it is also the most common use case where metric attributes are statically created and used during measurement recording.

CI failure - https://github.com/open-telemetry/opentelemetry-cpp/actions/runs/9870060037/job/27255004557?pr=2999

[ RUN      ] AttributesHashMap.HashWithKeyValueIterable
/home/runner/work/opentelemetry-cpp/opentelemetry-cpp/sdk/test/metrics/attributes_hashmap_test.cc:142: Failure
Expected equality of these values:
  hash1
    Which is: 7942177087367374725
  hash2
    Which is: 7942177087698536463

/home/runner/work/opentelemetry-cpp/opentelemetry-cpp/sdk/test/metrics/attributes_hashmap_test.cc:144: Failure
Expected equality of these values:
  hash1
    Which is: 7942177087367374725
  hash3
    Which is: 7942177087497598195

Step 2:
The fix is to replace the hash calculation:

GetHash(seed, key.data());

with

GetHash(seed, key);

and so directly use the custom hash implementation for nostd::string_view. Though it comes with a cost, as this custom hash-implementation creates an intermediate string. This needs to be revisited.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@lalitb lalitb requested a review from a team July 10, 2024 01:40
Copy link

codecov bot commented Jul 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.66%. Comparing base (497eaf4) to head (bb92dc5).
Report is 101 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2999      +/-   ##
==========================================
+ Coverage   87.12%   87.66%   +0.55%     
==========================================
  Files         200      190      -10     
  Lines        6109     5858     -251     
==========================================
- Hits         5322     5135     -187     
+ Misses        787      723      -64     
Files Coverage Δ
...clude/opentelemetry/sdk/common/attributemap_hash.h 90.00% <100.00%> (+6.13%) ⬆️

... and 120 files with indirect coverage changes

@marcalff
Copy link
Member

marcalff commented Jul 10, 2024

Is there a way to make sure that GetHash(seed, key); is the same, with:

  • const char *key="foo";
  • std::string_view key{"foo"};
  • nostd::string_view key{"foo"};
  • std::string key{"foo"};

Maybe instantiate the GenHash template explicitly for all string types that matter.

@@ -71,7 +71,7 @@ inline size_t GetHashForAttributeMap(
{
return true;
}
GetHash(seed, key.data());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the GetHash function, could we assert that arg should not be a pointer?

Copy link
Member Author

@lalitb lalitb Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add it, but since this is an internal API, it might not be necessary as long as we ensure it's used correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking again, we do need to support the GetHash function with char pointer, as the AttributeValue enum contains it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do need supporter pointer, perhaps create a dedicated specialization for it? Then a difference hash function could be applied to pointers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is done in the latest commit.

@lalitb
Copy link
Member Author

lalitb commented Jul 10, 2024

Is there a way to make sure that GetHash(seed, key); is the same, with:

  • const char *key="foo";
  • std::string_view key{"foo"};
  • nostd::string_view key{"foo"};
  • std::string key{"foo"};

Maybe instantiate the GenHash template explicitly for all string types that matter.

I believe we need template specialization for const char *, as it is the valid AttributeValue type, rest should work with default hashing implementation, I can update the tests to validate it.

@lalitb lalitb changed the title Add more tests to validate cardinality limit [Metrics SDK] Fix hash calculation for nostd::string Jul 15, 2024
@lalitb lalitb added the pr:please-review This PR is ready for review label Jul 15, 2024
@lalitb
Copy link
Member Author

lalitb commented Jul 15, 2024

bazel-asan is failing. Will check that-

//sdk/test/metrics:attributes_hashmap_test FAILED in 0.5s
/home/runner/.cache/bazel/e6b8d6759295e14b76bc8cb98b604748/execroot/_main/bazel-out/k8-fastbuild-asan/testlogs/sdk/test/metrics/attributes_hashmap_test/test.log

@lalitb lalitb added ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve) and removed pr:please-review This PR is ready for review labels Jul 17, 2024
@lalitb lalitb merged commit d8ae09e into open-telemetry:main Jul 17, 2024
51 checks passed
@marcalff
Copy link
Member

LGTM, thanks for the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-merge The PR is ok to merge (has two approves or raised by a maintainer/approver and has one approve)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cardinality limit on metrics (otel_metrics_overflow)
3 participants