Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit semantic convention generator #21

Merged
merged 12 commits into from
Dec 14, 2020

Conversation

justinfoote
Copy link
Member

This PR adds a new semantic convention type: units.
There will only be the one table of units in the spec, but because we're expecting every language to create common constants for metric instrument units (see: #1177, I'd like to make it as easy as possible to generate the constants in code.

The unit yaml looks something like this:

groups:
  - id: units
    type: units 
    brief: Specification of commonly used units.
    members:
    - id: percent
      brief: fraction of a total
      value: "%"
    - id: nanosecond
      brief: time
      value: NS
    - id: connections
      brief: connections
      value: "{connections}"

And the resulting markdown looks like this:

Name Kind of Quantity Unit String
percent fraction of a total %
nanosecond time NS
connections connections {connections}

There are a few concepts to check out:

  • I've created four new classes to define out (currently four) semantic convention types (span, resource, metric, unit). These classes share yaml validation logic, but they each define their own allowed/required keys. The SemanticConvention base class acts as a factory for the four subclasses.
  • I removed the @dataclass annotation from SemanticConvention, and added a custom constructor that takes a single yaml node as an argument.

I've added tests for the new functionality, and I've manually run the markdown generator against the opentelemetry-specification repo, and it doesn't have any side-effects.

@justinfoote justinfoote requested a review from a team November 5, 2020 00:31
Copy link
Member Author

@justinfoote justinfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thisthat - I would especially your feedback on this PR. I see that you and I don't have similar styles, but I tried to match the style of the project at least a little.
I have some future changes in mind for the metrics work.

convention_type.validate_keys(group)
model = convention_type(group)
# Also, validate that the value of the fields is acceptable
model.validate_values()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines appear to have gotten much simpler, but the complexity is moved from here to the constructor for the class that's returned from the parse_semantic_convention_type function.

This is because the required parameters for each type of convention is different, and passing a single yaml node to the constructor allows the subclasses to own their unique implementations.

else: # No parent, sort of current attributes
if parent_attributes or semconv.attributes:
semconv.attrs_by_name = parent_attributes
elif semconv.attributes: # No parent, sort of current attributes
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is added so we only try to sort attributes for semantic conventions that actually have attributes (spans and resources).

@@ -48,3 +48,34 @@ def check_no_missing_keys(yaml, mandatory):
position = yaml.lc.data[list(yaml)[0]]
msg = "Missing keys: {}".format(missing)
raise ValidationError.from_yaml_pos(position, msg)


class ValidatableYamlNode:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the base class for SemanticConvention (and also UnitMember). In the future, I expect it to be the base class for MetricLabel.
It owns validation of keys (that all the mandatory keys are present in a yaml node, and that only allowed keys are present), and the validation of values (that the values of the constructed object are allowed).

It may be worth making SemanticAttribute derive from this class, but I didn't do that work now.

@@ -36,7 +35,8 @@ def testRef(self):
renderer._render_single_file(content, md, output)
with open(self.load_file("markdown/ref/expected.md"), "r") as markdown:
expected = markdown.read()
self.assertEqualWithDiff(expected, output.getvalue())

assert output.getvalue() == expected
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost refactored this whole class, but I settled for just using pytest style assertions, which give much more usable messages (in my opinion). Luckily, it plays well with unittest. :)

Copy link
Member

@thisthat thisthat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your great contribution! 🚀
I noticed that we have different styles but I believe the more we work on this tool, the faster we will converge to a common one :)


@property
def attributes(self):
return list(self.attrs_by_name.values())
return []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should never be called since we expect subclasses to overwrite this method, right?
If so, I would throw an exception to indicate that something fishy is happening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you intend to go this way, consider using https://docs.python.org/3/library/abc.html#abc.abstractmethod and the abc module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is used downstream in the SemanticConventionSet to sort attributes. That refactor felt a little too big and rabbit-holey for this PR, and there's a proposal (#1113) in flight to standardize on "attributes" for metrics as well, so I thought we mightend up undoing the refactor anyway in the very near future.

Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Comment on lines -53 to -55
SPAN = 1
RESOURCE = 2
METRIC = 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep this instead of bare strings here and in main.py?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have this map return something that can be used (like the correct class for the semantic convention). I don't see the value in having the bare strings mapped to integers.

Copy link
Member

@Oberon00 Oberon00 Nov 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posted my comment about this in the wrong place, sorry: #21 (review)

Please don't remove the SemanticConventionType enum. The integer values don't matter, but the enum is safer against typos. Even if we don't use static checking like pylint or mypy (yet), we at least would get a runtime error if we type SemanticConventionType.SAN whereas with == "san" we would just get False.

Also the enum is a nice documentation of the possible types.

Comment on lines 126 to 127
convention_type.validate_keys(group)
model = convention_type(group)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the convention class validate itself in the constructor so you don't have to remember to validate separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this. The logical place to put this would be in ValidatableYamlNode.__init__, because that superclass owns the validation logic. ...but, the subclasses are likely adding attributes in their own init methods, so the correct place to call it would be after the superclass and subclass init methods had completed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe in a factory then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the logic to a factory function.
This triggered some renaming (so I wouldn't have a naming collision between def SemanticConvention and class SemanticConvention), which had some cascading effects.
I removed several import SemanticConvention lines where we were only using the imported class for type declarations, and I refactored references to SemanticConvention.parse to a new parse_semantic_convention_groups function (and in the process I split the parsing concern away from the data struct concern).

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't remove the SemanticConventionType enum. The integer values don't matter, but the enum is safer against typos. Even if we don't use static checking like pylint or mypy (yet), we at least would get a runtime error if we type SemanticConventionType.SAN whereas with == "san" we would just get False.

Also the enum is a nice documentation of the possible types.

@Oberon00
Copy link
Member

I think this PR is too huge to properly review it. It seems to be almost a complete rewrite. If others are fine with still merging this, I won't object, personally I would be much happier with smaller pieces if possible (e.g. only have the base class refactoring in a PR, without the addition of Unit, command line interface changes, etc).

But I think we should decide on this PR soon, as having a huge refactoring looming might discourage any other changes, as they would lead to merge conflicts.

@jmacd
Copy link

jmacd commented Dec 2, 2020

I can't really review this either, but would be happy to see it merged presuming the output is good and others who have looked at it agree.

@arminru
Copy link
Member

arminru commented Dec 9, 2020

Same as @jmacd.

Both @aabmass and @thisthat already reviewed and approved it - thanks for that! If no one objects by tomorrow, let's merge it.

cc @open-telemetry/specs-approvers

@justinfoote
Copy link
Member Author

@open-telemetry/specs-approvers As @arminru suggested above, let's go ahead and merge.

@arminru
Copy link
Member

arminru commented Dec 14, 2020

Thanks for your contribution and for sticking it out, @justinfoote!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants