Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding semantic-convention attributes for resources #872

Merged
merged 8 commits into from
Jun 24, 2021

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Jun 22, 2021

This PR adds semantics as defined here:
https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/resource/semantic_conventions

This will let us use uniform attributes/events/exceptions names across services.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@lalitb lalitb requested a review from a team June 22, 2021 10:27
@lalitb
Copy link
Member Author

lalitb commented Jun 22, 2021

@maxgolov @ThomsonTan - Semantic convention for the Resources is also done similar to that for traces in #868 , ie using constexpr static methods for class defined in header file. Though this is sdk, and we have the flexibility to define it within the source file, and reference it through extern, I still felt that the constexpr method approach is better for the reasons mentioned here: #868 (comment).

Feel free to comment on a better approach to follow here and/or for traces. thanks.

@codecov
Copy link

codecov bot commented Jun 22, 2021

Codecov Report

Merging #872 (98c8b14) into main (2fd7bd2) will decrease coverage by 0.07%.
The diff coverage is 96.67%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #872      +/-   ##
==========================================
- Coverage   95.50%   95.44%   -0.06%     
==========================================
  Files         156      157       +1     
  Lines        6620     6633      +13     
==========================================
+ Hits         6322     6330       +8     
- Misses        298      303       +5     
Impacted Files Coverage Δ
api/include/opentelemetry/common/string_util.h 100.00% <ø> (ø)
sdk/src/resource/resource.cc 80.00% <83.34%> (-2.60%) ⬇️
.../opentelemetry/sdk/resource/semantic_conventions.h 100.00% <100.00%> (ø)
sdk/test/resource/resource_test.cc 93.46% <100.00%> (-3.48%) ⬇️

{"telemetry.sdk.name", "opentelemetry"},
{"telemetry.sdk.version", OPENTELEMETRY_SDK_VERSION},
{"service.name", "unknown_service"}};
{SemanticConventions::GetAttributeTelemetrySdkLanguage(), "cpp"},
Copy link
Contributor

@maxgolov maxgolov Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly would prefer an API model similar to Qt5 "translation resources", which solves the same problem - just in a different knowledge domain. More on that here.

To elaborate:

  • let's say you have a namespace called MyAttributeNames , and either constexpr or extern that looks like Qt5 tr(name) ... Let's name it attr(name)

  • then in order to obtain the actual name for a given system / naming convention, instead of SemanticConventions::GetAttributeTelemetrySdkLanguage() , you'd write attr("sdk.language") .... Moreover, since you are not gonna be using all attributes at once, it is Okay to declare these as literals, e.g. constexpr const char[] OTEL_ATTR_SDK_LANGUAGE = "sdk.language"; - so the overhead of having these as static inline constexpr is pay-per-play. You'd only use what you use. And in msvc you can use string pooling to merge identical strings.

That way instead of having N getters, which will also present a challenge - should you ever need to project this C++ code to C#, Python or Javascript/Node, etc... you'd have ONE attribute semantic convention "translation" method - called AttributeNames::attr(const char * name) and you'd invoke it as:

using namespace AttributeNames;
...
{attr(OTEL_ATTR_SDK_LANGUAGE), "cpp"}
... // or
{attr("sdk.language"), "cpp"}

Then we should strive the transform to be done at compile time, without generating those 50 methods. It does not scale. It'd scale nicely if you have one method / one utility function, hopefully constexpr. And a table of literals, resources in a classic definition of this world in C++ (as in translation resources, resource compiler, etc.).

Again, tr() from Qt5 seems like the most reasonable way to achieve that.

If you want to enforce , such as - if attribute has not been found in the lookup table, fail compilation (not allowing users to pass invalid semantic convention parameters) - we can use a constexpr hash function that calculates a hash of an argument string. And verifies that the value passed to attr is one of allowed values. You can also do this with constexpr, if you pre-generate the table. Which you should generate anyways.

The model with attr() would also scale nicely, should there be new conventions or new attributes added. Polluting namespace with 50 getters just won't scale.


public:
// service attributes
OTEL_RESOURCE_GENERATE_SEMCONV_METHOD(ServiceName, "service.name");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do this instead. Take this compile-time hash function:

/// <summary>
/// Compile-time constexpr djb2 hash function for strings
/// </summary>
static constexpr uint32_t hashCode(const char *str, uint32_t h = 0)
{
  return (uint32_t)(!str[h] ? 5381 : ((uint32_t)hashCode(str, h + 1) * (uint32_t)33) ^ str[h]);
}

#define CONST_UINT32_T(x) std::integral_constant<uint32_t, (uint32_t)x>::value

#define CONST_HASHCODE(name) CONST_UINT32_T(OPENTELEMETRY_NAMESPACE::utils::hashCode(#name))

Then for every attribute, generate a table of values. At compile time - by listing all of them. Sorry for using this example - something like this:

// Hash function below is computing module tag from literal at compile-time using C++11 constexpr
#define DBG_MODULE_CTAG(name)       CONST_UINT32_T(djb2_constexpr_hash(#name))
#define DBG_MODULE_TAG(name)        djb2_hash(name)
#define DBG_LOG_NAMETAG(name)       { DBG_MODULE_CTAG(name) , #name } ,

So - you created a compile time table that maps your attribute names to unique IDs. You can still refer to original attributes by their aliases. Like ATTR_ServiceName, macro-magic permitting.

The goal is that you can do tricks like switch-case, constexpr, and mapping from attribute to unique attribute id. Then instead of many methods, you end up with attr(name) or attr(id). That's how Qt5 tr() works, and how translations / resources work in Android pretty much. Except that tr() requires string literal, and Android resources are all revolving around getResource(id).

In our model, I think, it'd be more appropriate to perform compile-time mapping from attr("my.attribute.name") -> returns actual attribute name within given semantic convention. Not many getters. One getter. Lookup table. And preferably, compile-time lookup, not method call.

{"telemetry.sdk.name", "opentelemetry"},
{"telemetry.sdk.version", OPENTELEMETRY_SDK_VERSION},
{"service.name", "backend"},
{SemanticConventions::GetAttributeTelemetrySdkLanguage(), "cpp"},
Copy link
Contributor

@maxgolov maxgolov Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually to elaborate on my idea - you probably do not even need to change anything. Keep passing original string literals. Maybe replace them by constants, such as OTEL_ATTR_SDK_LANG. That's it. And even the original form of constant - should be fine.

Then we have an assignment operator in the ResourceAttributes implementation, that could have done the semantic transformation depending on what exporter / system is being used. Then you'd just completely avoid calling getters. You'd pre-configure your ResourceAttributes by specifying destination convention (like, imagine - your ResourceAttributes class can translate from English to French, and you pass in English keys, but they get magically translated to French, because your system "locale" is French).

If you need concrete example, I can send some WIP code to illustrate the model?

Copy link
Member Author

@lalitb lalitb Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @maxgolov for the elaboration.

Copy link
Contributor

@maxgolov maxgolov Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely avoid getters. OR - at least, it all boils down to a single getter method (attr(...) or GetResourceName(...) or resName(...)) - that allows to transform from schema-agnostic common attribute name or attribute id to semantic- / schema- specific attribute name.

I feel like more concrete code example (based off above constexpr hash calculation) would be helpful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxgolov I have done the changes (hopefully) as per your suggestion :)
Have kept both the alternatives for now so that we can compare both and finalize.
semantic_conventions.h -> the suggested change.
semantic_conventions_old.h -> Initial implementation

Copy link
Contributor

@maxgolov maxgolov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you for implementing it that way. This implementation is in alignment with what I strongly believe is reasonable way of doing it! ❤️ We may think about generating the table automatically from schema in a separate follow-up PR, later when this is needed. Manually changing it as a table is also looking much cleaner.

@lalitb
Copy link
Member Author

lalitb commented Jun 23, 2021

Yes, thank you for implementing it that way. This implementation is in alignment with what I strongly believe is reasonable way of doing it! ❤️ We may think about generating the table automatically from schema in a separate follow-up PR, later when this is needed. Manually changing it as a table is also looking much cleaner.

Make sense. Will do some cleanup in this PR - removing old semantic code, and move the hash function to api so that it can be utilized for both traces and resources. And then modify PR #868 similarly.

@lalitb
Copy link
Member Author

lalitb commented Jun 24, 2021

@ThomsonTan - I have resolved the comments from you. I hope it's fine to merge it, as I need to update #868 PR on top of these changes. If you have some other comments, I can take them in separately.

@lalitb lalitb merged commit ae6d810 into open-telemetry:main Jun 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants