Add requirement section specification #1211

ocelotl · 2020-11-09T06:17:17Z

Changes

Adds a new way of specifying requirements. More details here. Requirement sections are already added for the metrics API and SDK Markdown files; more can be added later.

Oberon00

The requirement matrix is supposed to (mostly) solve this: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md I guess it would be important to add links from the rows to the corresponding spec item though.

While I think a bit smaller linkable sections would be nice, I think in the current form, this PR makes the spec harder to read (and write).

In my opinion, if we want such a highly structured format, Markdown is not really suitable and we would have to move to something else, e.g. Asciidoc, ReStructuredText, YAML and move the spec from being viewable as-is on GH to be displayed on GH pages after some generation process.

ocelotl · 2020-11-11T04:05:21Z

> The requirement matrix is supposed to (mostly) solve this: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md I guess it would be important to add links from the rows to the corresponding spec item though.

I don't think the requirement matrix solves the issue this PR solves. How does the requirement matrix handle the different RFC 2119 keywords that can show up in the specification? The matrix says nothing about whether a feature is mandatory (a MUST requirement) or optional (a MAY requirement). This makes it impossible for the reader to know how complete an implementation actually is: in a given implementation, a MAY requirement can be marked as + while a MUST requirement is marked as -, but the reader of the matrix can't tell that the implementation lacks a mandatory requirement because that information is not present there.

> While I think a bit smaller linkable sections would be nice, I think in the current form, this PR makes the spec harder to read (and write).

I think it makes the specification easier to read. For example, consider someone who is looking for all the mandatory requirements in a certain document in order to make sure that an implementation has everything that MUST be implemented; anything that MAY be implemented is of no interest to this person. The way the specification is written right now, this person has to read and understand the whole document when looking for MUST requirements, because there is no check that makes sure that everything mandatory is properly specified with a MUST keyword. The specification can say "a span MUST have a name" and the specification can also say "it is mandatory that every span has a name". Both statements are semantically equivalent, but the fact that the second one can appear in the specification as it is now forces the reader to read and understand the whole document to make sure no mandatory requirement has been missed. If the specification were written as this PR suggests, it would be very easy for the reader to find the mandatory requirements, as it would only be a matter of looking for the occurrences of MUST.

I think it makes the specification a bit harder to write. I mean that it is an additional effort for the specification developer to add a requirement section with a proper RFC 2119 keyword in order to make sure that the requirement that is added is clear and specific. Nevertheless, that is how specifications should be 🙂

> In my opinion, if we want such a highly structured format, Markdown is not really suitable and we would have to move to something else, e.g. Asciidoc, ReStructuredText, YAML and move the spec from being viewable as-is on GH to be displayed on GH pages after some generation process.

I also think Markdown is not the perfect file format for this. Nevertheless, the main change I want to introduce in this PR is the strict specification of requirements to make it possible to extract them programmatically and also check them in the same way. The format we use is secondary, so I did not want to introduce a change here that is not central to this PR. I think Markdown is good enough if we use it as suggested here (it seems like it may be necessary to adapt the CI tests for these changes to be accepted, though). GitHub is capable of rendering reStructuredText and AsciiDoc directly without any additional process.
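
The search for MUST described above can be sketched in a few lines of Python. This is only an illustration, not code from this PR: the sample sentences and the helper name `find_keyworded` are assumptions. It shows that a plain keyword scan finds "a span MUST have a name" but misses the equivalent "it is mandatory that every span has a name".

```python
import re

# RFC 2119 keywords; the negated forms come first so that "MUST NOT"
# is reported as "MUST NOT" rather than as "MUST".
RFC2119 = re.compile(
    r"\b(MUST NOT|SHALL NOT|SHOULD NOT|MUST|REQUIRED|SHALL|SHOULD|"
    r"RECOMMENDED|MAY|OPTIONAL)\b"
)


def find_keyworded(sentences):
    """Return (keyword, sentence) pairs for sentences with an uppercase keyword."""
    found = []
    for sentence in sentences:
        match = RFC2119.search(sentence)
        if match:
            found.append((match.group(1), sentence))
    return found


# Hypothetical sample sentences, not taken from the specification.
sentences = [
    "A span MUST have a name.",
    "It is mandatory that every span has a name.",
    "A span MAY have attributes.",
]
hits = find_keyworded(sentences)
for keyword, sentence in hits:
    print(keyword, "|", sentence)
```

Only the MUST and MAY sentences are reported; the "mandatory" sentence goes unnoticed, which is the gap the comment above describes.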

yurishkuro · 2020-11-11T05:25:14Z

> "a span MUST have a name" ... "it is mandatory that every span has a name". Both statements are semantically equivalent

I think such instances MUST be corrected to use MUST.

> strict specification of requirements to make it possible to extract them programmatically and also check them in the same way

You'd be able to enumerate them programmatically, but that's about it, isn't it? Someone still needs to interpret the text.

ocelotl · 2020-11-11T07:05:51Z

> > "a span MUST have a name" ... "it is mandatory that every span has a name". Both statements are semantically equivalent
>
> I think such instances MUST be corrected to use MUST.
>
> > strict specification of requirements to make it possible to extract them programmatically and also check them in the same way
>
> You'd be able to enumerate them programmatically, but that's about it, isn't it? Someone still needs to interpret the text.

Correct, someone still needs to interpret the text. I think enumerating the requirements is an advantage in itself as it allows for the implementation developers to have a clear list of requirements that can be used to measure compliance with the specification.

ocelotl · 2020-11-11T23:09:05Z

For the sake of completeness, here are the supported Github markups, @Oberon00 👍

Oberon00 · 2020-11-12T08:48:16Z

Even though Github supports basic ReStructuredText, the real usefulness of rst would lie in the ability to write custom markup like .. requirement:: which I don't think GitHub supports.

github-actions · 2020-11-20T03:20:57Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2020-11-27T03:23:07Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

ocelotl · 2020-12-01T15:39:38Z

> Even though Github supports basic ReStructuredText, the real usefulness of rst would lie in the ability to write custom markup like .. requirement:: which I don't think GitHub supports.

Yes, but that is actually not central to this PR. Displaying the requirement sections on GitHub is possible with the code in this PR (although the result is less than ideal), but the main idea of this PR is to separate the hard requirements from the rest of the specification into their own sections so that they can be identified quickly and processed by a parser.

ocelotl · 2020-12-01T21:25:17Z

Is it possible to reopen this PR? I don't seem able to do so. @carlosalberto

ocelotl · 2020-12-02T19:44:51Z

Thanks @Oberon00 for reopening!

Looking around I found this. Funny thing, it pretty much includes everything I am suggesting in this PR. This W3C specification says: "It is important for readers to be able to differentiate requirements in the specification from non-requirements in order to either implement or review them."

This is exactly what I am attempting to do here 😅 by separating the specification parts that include an RFC keyword into their own separately labeled sections.

It even mentions this: "It will be easier to extract conformance requirements and better for accessibility." This is precisely what I want with the parsing tool that extracts the requirements into a JSON file.

I'll have to take a closer look at this document; probably everything that I have imagined for this PR is already well defined there.
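
A minimal Python sketch of such an extraction tool follows. The section layout used here (a heading of the form `Requirement: <id>` followed by quoted lines) and all names are assumptions made for illustration; they are not taken from the parser in this PR or from the W3C document.

```python
import json
import re

# Assumed layout: a heading carrying the ID, then "> " quoted lines with the text.
HEADING = re.compile(r"^#+\s*Requirement:\s*(\S+)\s*$")
KEYWORD = re.compile(r"\b(MUST NOT|SHOULD NOT|MUST|SHOULD|MAY)\b")

SAMPLE = """\
Spans describe a single operation.

##### Requirement: span_name

> A span MUST have a name.

The name helps people recognize the operation.

##### Requirement: span_attributes

> A span MAY have attributes, and attribute keys
> MUST NOT be empty.
"""


def extract_requirements(markdown):
    """Collect the quoted text and its keywords for every requirement heading."""
    requirements = {}
    current_id = None
    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            current_id = heading.group(1)
            requirements[current_id] = {"description": "", "keywords": []}
        elif current_id and line.startswith(">"):
            text = line.lstrip(">").strip()
            entry = requirements[current_id]
            entry["description"] = (entry["description"] + " " + text).strip()
        elif current_id and not line.strip():
            # A blank line closes the section once its text has been read.
            if requirements[current_id]["description"]:
                current_id = None
        else:
            current_id = None  # free-form prose is not part of a requirement
    for entry in requirements.values():
        entry["keywords"] = KEYWORD.findall(entry["description"])
    return requirements


requirements = extract_requirements(SAMPLE)
print(json.dumps(requirements, indent=2))
```

The free-form paragraphs around the sections are ignored, so explanatory prose can stay in the document without ending up in the JSON output.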

mattmccleary · 2021-03-02T16:22:13Z

+1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.

Can we start here, which I think is the same suggestion as @ocelotl above?
https://www.ietf.org/rfc/rfc2119.txt

For a short time, I worked on construction specifications, and I used to get my hand slapped for using "must" instead of "shall". As an emerging discipline, software conventions seem less established, and we can learn from the traditional engineering disciplines in tightening up our language.

Oberon00 · 2021-03-02T16:28:16Z

> +1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.

We already have this in theory, see https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance. But there are some places where we use statements like "is", "has", "can" instead of MUST/SHOULD/MAY.

reyang · 2021-03-02T16:32:53Z

> +1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.
>
> Can we start here, which I think is the same suggestion as @ocelotl above?
> https://www.ietf.org/rfc/rfc2119.txt
>
> For a short time, I worked on construction specifications, and I used to get my hand slapped for using "must" instead of "shall". As an emerging discipline, software conventions seem less established, and we can learn from the traditional engineering disciplines in tightening up our language.

@mattmccleary check this https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance.

mattmccleary · 2021-03-02T16:37:41Z

> We already have this in theory, see https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance. But there are some places where we use statements like "is", "has", "can" instead of MUST/SHOULD/MAY.

@Oberon00 @reyang Thank you for the clarification. Digging through the text to button up the language where there's divergence and making the feature matrix auto-generable would be valuable experience for an intern. I will happily volunteer to be a reviewer for this work.

reyang · 2021-03-03T02:29:28Z

@ocelotl do you want to resurrect this? I'm willing to work with you to get this merged. We've discussed this during the 03/02/2021 Spec SIG Mtg that we do need something like this.

I proposed something like ECMA262 then I noticed your PR was doing the same thing.

ocelotl · 2021-03-04T01:55:06Z

> @ocelotl do you want to resurrect this? I'm willing to work with you to get this merged. We've discussed this during the 03/02/2021 Spec SIG Mtg that we do need something like this.
>
> I proposed something like ECMA262 then I noticed your PR was doing the same thing.

@reyang, yes, I personally would love to see this move forward. I'm not sure how much time I can dedicate to this PR now 😞 (sorry if this is a dumb question, but how does ECMA262 do the same thing as this PR?)

That being said, @reyang, @mattmccleary, @Oberon00:

My goal with this PR is to make OpenTelemetry compliant with this:

> It is important for readers to be able to differentiate requirements in the specification from non-requirements in order to either implement or review them.

This means:

1. Define a way to separately format the RFC2119-keyworded requirements in the OpenTelemetry documents.
2. Review the entire current specification to make sure that, everywhere there should be an RFC2119-keyworded requirement, it is written following the previously defined format.

Only reviewing and fixing the current document to make sure that we use an RFC2119 keyword everywhere we should is not enough. With time, we will very likely make the same mistake again of not using the correct RFC2119 keyword where it is needed, and we will end up with the same problem that we have now. We need this separate format for what I call "requirement sections" so that we clearly separate what needs to be written exactly with an RFC2119 keyword from what does not, so that our readers can easily tell what they need to implement to be compliant.

Ok, the previous point 1 is in my opinion not that long or time consuming; it is mostly a matter of finding the right way to format these "requirement sections". @Oberon00 has previously raised very valid points regarding the inability of GitHub to render these sections nicely in HTML when we use Markdown. I have some hope that using a different markup language (reStructuredText, maybe?) may allow us to work around this problem. Of course, rewriting the entire specification in a different markup language can be hard to sell to the whole OpenTelemetry community because it may require developers to learn the differences between a new markup language and GitHub-flavored Markdown.

I think those are the less time-consuming problems that need solving (which does not mean that they will consume little time). The bulk of the work (if we decide to move forward with this PR, of course) will be reading, understanding and rewriting every occurrence of a requirement in the "requirement section" format.

I believe the changes suggested in this PR will make OpenTelemetry significantly better. Requirements will be clearly defined, implementations will be able to check their compliance against all the "requirement sections" introduced in this PR (and so know how compliant they are), and it will be easier for them (and us) to refer to a specific requirement when we communicate with each other. I also want to be honest and tell you all that I understand that this PR aims at a review and rewrite of the whole specification, maybe even using a new markup language to write it, and that is of course a big change that can impact this project.

Sorry for the long post 😅 If someone has any idea on how to minimize the impact of this PR (or can provide resources to do this work (an intern, as @mattmccleary mentioned, maybe?)), please leave a comment below 🙂

Thank you all!

reyang · 2021-03-04T16:45:10Z

@ocelotl here goes my suggestion:

1. I think this PR is in good shape. Instead of waiting for a hypothetical intern to boil the ocean, I think we should focus on making small steps. Once the merge conflict is resolved, we should get it merged.
2. Once this PR gets merged, I can take over if you don't have time. I'll clean up the other docs and add CI enforcement so that CI breaks for folks who don't follow the rule.
3. Figuring out another format could take time, and I don't see it as a blocker. The main effort here is to have the spec written in a more organized way. Once it is better organized, converting it to another format is quite straightforward.

specification/metrics/api.md

yurishkuro · 2021-03-05T22:41:23Z

I feel like there's still a gap between the intention in the original ticket and the proposal in the PR. I think the intent is to support a workflow like this:

1. A language SIG wants to release a new version of the API/SDK.
2. They run the script against the specification, which produces a checklist of all requirements.
3. They certify that each requirement is satisfied by the language implementation, by
   - having a test suite where there is an entry for each requirement, e.g. a unit test function with the name of the requirement id

The PR currently does the second step in this, but doesn't say anything more, specifically how it should be used, other than saying that it assigns unique IDs that can be referenced externally.
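
The third step of this workflow could look roughly like the Python sketch below. It is an assumption-laden illustration, not part of this PR: the requirement IDs, the `test_<requirement id>` naming convention and the helper `uncovered` are all hypothetical.

```python
import unittest

# Hypothetical requirement IDs, as they might appear in the generated checklist.
REQUIREMENT_IDS = ["span_name", "span_attributes"]


class SpanRequirementTests(unittest.TestCase):
    """Each test method is named after the requirement it certifies."""

    def test_span_name(self):
        span = {"name": "GET /users"}  # stand-in for a real span object
        self.assertTrue(span["name"])


def uncovered(requirement_ids, test_case):
    """Return the requirement IDs that have no test_<id> method in test_case."""
    test_names = set(unittest.TestLoader().getTestCaseNames(test_case))
    return [rid for rid in requirement_ids if f"test_{rid}" not in test_names]


missing = uncovered(REQUIREMENT_IDS, SpanRequirementTests)
print("Requirements without a test:", missing)
```

A language SIG could run a check like this before a release and treat a non-empty result as a sign that some requirement has not been certified yet.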

specification/requirements.md

MrAlias

The approach described here will require that normative requirements be manually consolidated into an additional section of the specification by a human instead of being read directly from the specification. This suffers from the same problem discussed about the specification compliance matrix: it will form a derivative work. As such, it will compete for authority with the specification in situations where there are conflicts or stale statements.

My understanding was that we would try to generate an equivalent form of this requirements section using a program, not have a human duplicate the specification into a form more easily parsable by a machine. I'm skeptical whether this moves us closer to a document that is easier for implementers to check their implementation against, or whether it just rewrites the specification from English to a pseudo-machine code and adds translators for other machine languages.

internal/tools/specification_parser/specification_parser.py

specification/requirements.md

MrAlias · 2021-03-08T16:19:30Z

specification/requirements.md

+Finally, it makes the specification developer follow a "testing mindset" while writing requirements. For example,
+when writing a requirement, the specification developers ask themselves "can a test be written for this statement?".
+This helps writing short, concise requirements that are clear for the implementation developers.


This is a recipe for constrained and poor specification writing. The specification is written in English, not a programming language. If this is going to impose upon authors of the English language a restriction that will take away the expressiveness needed to communicate concepts and ideas, it is wrong. It should be engineered the other way: the parser should parse English.

ocelotl · 2021-03-08

The intention of having separate specification sections is not to impose upon the authors a restriction that will take away any expressiveness to communicate ideas. Any resource to communicate ideas can be part of the document (text, images, diagrams, etc.). This PR only intends to define how certain specific sections of the specification are to be written. These sections are the "hard requirements" of the specification, and it is convenient that they are defined in a clear manner. This is also necessary to define precise requirements that can be easily extracted from the specification. Everything else outside these sections can be expressed freely and the only limit is the imagination of the author.

ocelotl · 2021-03-10T03:28:19Z

> I feel like there's still a gap between the intention in the original ticket and the proposal in the PR. I think the intent is to support a workflow like this:
>
> 1. A language SIG wants to release a new version of the API/SDK.
> 2. They run the script against the specification, which produces a checklist of all requirements.
> 3. They certify that each requirement is satisfied by the language implementation, by
>    - having a test suite where there is an entry for each requirement, e.g. a unit test function with the name of the requirement id
>
> The PR currently does the second step in this, but doesn't say anything more, specifically how it should be used, other than saying that it assigns unique IDs that can be referenced externally.

I added a paragraph to better explain what is to be done with the generated JSON files. I hope this makes the overall purpose of this PR clearer.

ocelotl · 2021-03-10T03:40:48Z

> The approach described here will require that normative requirements be manually consolidated into an additional section of the specification by a human instead of being read directly from the specification. This suffers from the same problem discussed about the specification compliance matrix: it will form a derivative work. As such, it will compete for authority with the specification in situations where there are conflicts or stale statements.

The requirement sections introduced in this PR won't compete for authority with the rest of the specification. The requirement sections will be the authority. This is because they will be the part of the specification that is written with a strict set of rules and delimited in clearly marked sections, which can be referenced because each will have a unique identifier. This is also necessary because implementations will be looking specifically at them to know what they should implement. Also, the requirement sections have an advantage over the compliance matrix because they indicate whether the feature is mandatory or not.

> My understanding was that we would try to generate an equivalent form of this requirements section using a program, not have a human duplicate the specification into a form more easily parsable by a machine. I'm skeptical whether this moves us closer to a document that is easier for implementers to check their implementation against, or whether it just rewrites the specification from English to a pseudo-machine code and adds translators for other machine languages.

There is no need to duplicate anything in the requirement sections, nor for them to be a rewrite of the specification from English to another language. The requirement sections are meant to be the short, concise part of the specification that includes one or more BCP 14 keywords, nothing less, nothing more. Any example, explanation or clarification of intention will remain in the rest of the specification, and the two will complement each other: the former to clearly define what is to be implemented, the latter to make it clear for human beings why the specification is how it is.
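
To illustrate why recording the keyword matters, here is a small Python sketch of a keyword-aware compliance check. It is not code from this PR; the `Requirement` type, the IDs and the report layout are assumptions. It reproduces the case raised earlier in the thread, where an optional feature is implemented while a mandatory one is missing.

```python
from dataclasses import dataclass

# Keywords that make a requirement mandatory (RFC 2119).
MANDATORY = {"MUST", "MUST NOT", "SHALL", "SHALL NOT", "REQUIRED"}


@dataclass(frozen=True)
class Requirement:
    id: str       # hypothetical unique identifier of the requirement section
    keyword: str  # the RFC 2119 keyword used in the section


def compliance_report(requirements, implemented_ids):
    """Split the unimplemented requirements into mandatory and non-mandatory."""
    missing = [r for r in requirements if r.id not in implemented_ids]
    missing_mandatory = [r.id for r in missing if r.keyword in MANDATORY]
    missing_non_mandatory = [r.id for r in missing if r.keyword not in MANDATORY]
    return {
        "compliant": not missing_mandatory,
        "missing_mandatory": missing_mandatory,
        "missing_non_mandatory": missing_non_mandatory,
    }


requirements = [
    Requirement("span_name", "MUST"),
    Requirement("span_attributes", "MAY"),
]
# The implementation supports the optional feature but not the mandatory one.
report = compliance_report(requirements, implemented_ids={"span_attributes"})
print(report)
```

A matrix of + and - marks alone would show one feature present and one absent; with the keyword attached, the report can state that the implementation is not compliant.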

carlosalberto · 2021-03-22T22:48:59Z

@reyang Wondering whether a prototype for this is still in the works?

ocelotl requested review from a team November 9, 2020 06:17

github-actions bot assigned yurishkuro Nov 9, 2020

ocelotl force-pushed the test_driven_specification branch from bedcf27 to 6f6cbcd November 9, 2020 06:29

Oberon00 previously requested changes Nov 9, 2020

ocelotl requested a review from Oberon00 November 11, 2020 04:07

github-actions bot added the Stale label Nov 20, 2020

github-actions bot closed this Nov 27, 2020

Oberon00 added the release:after-ga label (Not required before GA release, and not going to work on before GA) Dec 1, 2020

Oberon00 reopened this Dec 1, 2020

Oberon00 removed the Stale label Dec 1, 2020

Base automatically changed from master to main January 27, 2021 21:16

reyang added the area:miscellaneous label (For issues that don't match any other area label) Mar 5, 2021

yurishkuro reviewed Mar 5, 2021

specification/metrics/api.md

yurishkuro reviewed Mar 5, 2021

specification/requirements.md

MrAlias reviewed Mar 8, 2021

ocelotl requested review from MrAlias and yurishkuro March 10, 2021 03:41

ocelotl and others added 17 commits March 15, 2021 17:16

77043a0 Add requirement section specification (Fixes open-telemetry#1210)
7232266 Add docstring
e7d911c Move specification parser to internal tools folder
9bf0f88 Fix lint
b81c3da Update specification/requirements.md (Co-authored-by: Tyler Yahn)
e346dc4 Add suggestions from MrAlias
e07139c Fix test cases
3857025 Fix regex
5738cb9 WIP
fa3730f Fix tests
cb01482 Update specification/metrics/api.md (Co-authored-by: Yuri Shkuro)
b77efe8 Rename markdown file
a2aa633 Fix lint
e9e35c9 Fix docs
ecdf770 Fix command
d829004 Fix JSON paths
3a0621e Add explanatory paragraph

ocelotl force-pushed the test_driven_specification branch from 8c7b257 to 3a0621e March 15, 2021 23:20

jmacd closed this Sep 1, 2021

ocelotl mentioned this pull request Oct 12, 2021

Framework to support W3C test-driven specification #2003

Closed
