Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add requirement section specification #1211

Closed
wants to merge 17 commits into from

Conversation

ocelotl
Copy link
Contributor

@ocelotl ocelotl commented Nov 9, 2020

Fixes #1210

Changes

Adds a new way of specifying requirements. More details here. Already added requirement sections for the metrics api and sdk md files, more can be added later.

@ocelotl ocelotl requested review from a team November 9, 2020 06:17
@ocelotl ocelotl force-pushed the test_driven_specification branch from bedcf27 to 6f6cbcd Compare November 9, 2020 06:29
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requirement matrix is supposed to (mostly) solve this: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md I guess it would be important to add links from the rows to the corresponding spec item though.

While I think a bit smaller linkable sections would be nice, I think in the current form, this PR makes the spec harder to read (and write).

In my opinion, if we want such a highly structured format, Markdown is not really suitable and we would have to move to something else, e.g. Asciidoc, ReStructuredText, YAML and move the spec from being viewable as-is on GH to be displayed on GH pages after some generation process.

@ocelotl
Copy link
Contributor Author

ocelotl commented Nov 11, 2020

The requirement matrix is supposed to (mostly) solve this: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md I guess it would be important to add links from the rows to the corresponding spec item though.

I don't think the requirement matrix solves the issue this PR solves. How does the requirement matrix handle the different RFC 2119 keywords that can show up in the specification? That matrix it says nothing of the feature in the implementation being mandatory (I mean a MUST requirement) or not (I mean a MAY requirement). This makes it not possible for the reader to know how actually complete an implementation is because in a certain implementation, a MAY requirement can be marked as + while a MUST requirement can be marked as - but the requirement matrix reader can't tell that the implementation lacks a mandatory requirement because that information is not present there.

While I think a bit smaller linkable sections would be nice, I think in the current form, this PR makes the spec harder to read (and write).

I think it makes the specification easier to read. For example, let's consider someone who is looking for all the mandatory requirements in a certain document because that person wants to make sure a certain implementation has everything that MUST be implemented, anything that MAY be implemented is of no interest for this person. The way the specification is written right now makes it necessary for this person to read and understand the whole specification document when looking for MUST requirements because there is no check that makes sure that all the things that are mandatory are properly specified with a MUST keyword. The specification can say "a span MUST have a name" and the specification can also say "it is mandatory that every span has a name". Both statements are semantically equivalent but the fact that the second one can be present in the specification as it is now makes it necessary for the reader to read and understand the document of interest completely to make sure no mandatory requirement has been missed. If the specification was written how this PR suggests, it would be very easy for the specification reader to find the mandatory requirements, as it would only be a matter of looking for the occurrences of MUST.

I think it makes the specification a bit harder to write. I mean that it is an additional effort for the specification developer to add a requirement section with a proper RFC 2119 keyword in order to make sure that the requirement that is added is clear and specific. Nevertheless, that is how specifications should be 🙂

In my opinion, if we want such a highly structured format, Markdown is not really suitable and we would have to move to something else, e.g. Asciidoc, ReStructuredText, YAML and move the spec from being viewable as-is on GH to be displayed on GH pages after some generation process.

I also think Markdown is not the perfect file format for this. Nevertheless, the main change I want to introduce in this PR is the strict specification of requirements to make it possible to extract them programatically and also check them in the same way. The format we use is secondary, so I did not want to introduce a change that is not central to this PR here. I think Markdown is good enough if we use it as it is suggested here (it seems like it may be necessary to adapt the CI tests for these changes to be accepted, though). Github is capable of rendering RestructuredText and Asciidoc directly without any additional process.

@ocelotl ocelotl requested a review from Oberon00 November 11, 2020 04:07
@yurishkuro
Copy link
Member

"a span MUST have a name" ... "it is mandatory that every span has a name". Both statements are semantically equivalent

I think such instances MUST be corrected to use MUST.

strict specification of requirements to make it possible to extract them programmatically and also check them in the same way

You'd be able to enumerate them programmatically, but that's about it, isn't it? Someone still needs to interpret the text.

@ocelotl
Copy link
Contributor Author

ocelotl commented Nov 11, 2020

"a span MUST have a name" ... "it is mandatory that every span has a name". Both statements are semantically equivalent

I think such instances MUST be corrected to use MUST.

strict specification of requirements to make it possible to extract them programmatically and also check them in the same way

You'd be able to enumerate them programmatically, but that's about it, isn't it? Someone still needs to interpret the text.

Correct, someone still needs to interpret the text. I think enumerating the requirements is an advantage in itself as it allows for the implementation developers to have a clear list of requirements that can be used to measure compliance with the specification.

@ocelotl
Copy link
Contributor Author

ocelotl commented Nov 11, 2020

The requirement matrix is supposed to (mostly) solve this: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md I guess it would be important to add links from the rows to the corresponding spec item though.

I don't think the requirement matrix solves the issue this PR solves. How does the requirement matrix handle the different RFC 2119 keywords that can show up in the specification? That matrix it says nothing of the feature in the implementation being mandatory (I mean a MUST requirement) or not (I mean a MAY requirement). This makes it not possible for the reader to know how actually complete an implementation is because in a certain implementation, a MAY requirement can be marked as + while a MUST requirement can be marked as - but the requirement matrix reader can't tell that the implementation lacks a mandatory requirement because that information is not present there.

While I think a bit smaller linkable sections would be nice, I think in the current form, this PR makes the spec harder to read (and write).

I think it makes the specification easier to read. For example, let's consider someone who is looking for all the mandatory requirements in a certain document because that person wants to make sure a certain implementation has everything that MUST be implemented, anything that MAY be implemented is of no interest for this person. The way the specification is written right now makes it necessary for this person to read and understand the whole specification document when looking for MUST requirements because there is no check that makes sure that all the things that are mandatory are properly specified with a MUST keyword. The specification can say "a span MUST have a name" and the specification can also say "it is mandatory that every span has a name". Both statements are semantically equivalent but the fact that the second one can be present in the specification as it is now makes it necessary for the reader to read and understand the document of interest completely to make sure no mandatory requirement has been missed. If the specification was written how this PR suggests, it would be very easy for the specification reader to find the mandatory requirements, as it would only be a matter of looking for the occurrences of MUST.

I think it makes the specification a bit harder to write. I mean that it is an additional effort for the specification developer to add a requirement section with a proper RFC 2119 keyword in order to make sure that the requirement that is added is clear and specific. Nevertheless, that is how specifications should be

In my opinion, if we want such a highly structured format, Markdown is not really suitable and we would have to move to something else, e.g. Asciidoc, ReStructuredText, YAML and move the spec from being viewable as-is on GH to be displayed on GH pages after some generation process.

I also think Markdown is not the perfect file format for this. Nevertheless, the main change I want to introduce in this PR is the strict specification of requirements to make it possible to extract them programatically and also check them in the same way. The format we use is secondary, so I did not want to introduce a change that is not central to this PR here. I think Markdown is good enough if we use it as it is suggested here (it seems like it may be necessary to adapt the CI tests for these changes to be accepted, though). Github is capable of rendering RestructuredText and Asciidoc directly without any additional process.

For the sake of completeness, here are the supported Github markups, @Oberon00 👍

@Oberon00
Copy link
Member

Oberon00 commented Nov 12, 2020

Even though Github supports basic ReStructuredText, the real usefulness of rst would lie in the ability to write custom markup like .. requirement:: which I don't think GitHub supports.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 20, 2020
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Nov 27, 2020
@ocelotl
Copy link
Contributor Author

ocelotl commented Dec 1, 2020

Even though Github supports basic ReStructuredText, the real usefulness of rst would lie in the ability to write custom markup like .. requirement:: which I don't think GitHub supports.

Yes, but that is actually not central to this PR. The way that we can display the requirement sections in Github is possible with the code in this PR (although it is less than ideal), but the main idea of this PR is to separate the hard requirements into their own sections from the rest of the specification so that they can be identified quickly and processed by a parser.

@ocelotl
Copy link
Contributor Author

ocelotl commented Dec 1, 2020

Is it possible to reopen this PR? I don't seem able to do so. @carlosalberto

@Oberon00 Oberon00 added the release:after-ga Not required before GA release, and not going to work on before GA label Dec 1, 2020
@Oberon00 Oberon00 reopened this Dec 1, 2020
@Oberon00 Oberon00 removed the Stale label Dec 1, 2020
@ocelotl
Copy link
Contributor Author

ocelotl commented Dec 2, 2020

Thanks @Oberon00 for reopening!

Looking around I found this. Funny thing, it pretty much includes everything I am suggesting in this PR. This W3C specification says: It is important for readers to be able to differentiate requirements in the specification from non-requirements in order to either implement or review them.

This is exactly what I am attempting to do here 😅 by separating the specification parts that include an RFC Keyword into their own separately-labeled sections.

It even mentions this: It will be easier to extract conformance requirements and better for accessibility. This is precisely what I want with the parsing tool that extracts the requirements into a JSON file.

I'll have to take a better look at this document, probably everything that I have imagined for this PR is already well defined there.

Base automatically changed from master to main January 27, 2021 21:16
@mattmccleary
Copy link
Contributor

mattmccleary commented Mar 2, 2021

+1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.

Can we start here, which I think is the same suggestion as @ocelotl above?
https://www.ietf.org/rfc/rfc2119.txt

For a short time, I worked on construction specifications, and I used to get my hand slapped for using "must" instead of "shall". As an emerging discipline, software conventions seem less established, and we can learn from the traditional engineering disciplines in tightening up our language.

@Oberon00
Copy link
Member

Oberon00 commented Mar 2, 2021

+1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.

We already have this in theory, see https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance. But there are some places where we use statements like "is", "has", "can" instead of MUST/SHOULD/MAY.

@reyang
Copy link
Member

reyang commented Mar 2, 2021

+1 for aligning our "shalls", "musts", "required", "recommended", etc. to clearly defined standards.

Can we start here, which I think is the same suggestion as @ocelotl above?
https://www.ietf.org/rfc/rfc2119.txt

For a short time, I worked on construction specifications, and I used to get my hand slapped for using "must" instead of "shall". As an emerging discipline, software conventions seem less established, and we can learn from the traditional engineering disciplines in tightening up our language.

@mattmccleary check this https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance.

@mattmccleary
Copy link
Contributor

mattmccleary commented Mar 2, 2021

We already have this in theory, see https://github.com/open-telemetry/opentelemetry-specification#notation-conventions-and-compliance. But there are some places where we use statements like "is", "has", "can" instead of MUST/SHOULD/MAY.

@Oberon00 @reyang Thank you for the clarification. Digging through the text to button-up the language where there's divergence and making the feature matrix auto-generable would be valuable experience for an intern. I will happily volunteer to be a reviewer for this work.

@reyang
Copy link
Member

reyang commented Mar 3, 2021

@ocelotl do you want to resurrect this? I'm willing to work with you to get this merged. We've discussed this during the 03/02/2021 Spec SIG Mtg that we do need something like this.

I proposed something like ECMA262 then I noticed your PR was doing the same thing.

@ocelotl
Copy link
Contributor Author

ocelotl commented Mar 4, 2021

@ocelotl do you want to resurrect this? I'm willing to work with you to get this merged. We've discussed this during the 03/02/2021 Spec SIG Mtg that we do need something like this.

I proposed something like ECMA262 then I noticed your PR was doing the same thing.

@reyang, yes, I personally would love to see this move forward. I'm not sure how much time I can dedicate to this PR now 😞 (sorry if this is a dumb question, but how does ECMA262 does the same thing as this PR?)

That being said, @reyang, @mattmccleary, @Oberon00:

My goal with this PR is to make OpenTelemetry compliant with this:

It is important for readers to be able to differentiate requirements in the specification from non-requirements in order to either implement or review them.

This means:

  1. Define a way to separately format the RFC2119-keyworded requirements in the OpenTelemetry documents.
  2. Review all the current specification to make sure everywhere there should be an RFC2119-keyworded requirement, it is written following the previously defined format.

Only reviewing and fixing the current document to make sure that we are using an RFC2119 keyword everywhere it should is not enough. With time, we will very likely make the same mistake of not using the correct RFC2119 keyword where it is needed again and will end up with the same problem that we have now. We need this separate format for what I call "requirement sections" so that we clearly separate what needs to be exactly written with an RFC2119 keyword from what does not so that our readers can easily tell what they need to implement to be compliant.

Ok, the previous point 1 is in my opinion not that long or time consuming, it is mostly a matter of finding the right way to format these "requirement sections". @Oberon00 has previously raised very valid points regarding the inability of Github to render nicely these sections in HTML when we use markdown. I have some hope that using a different markup language (RestructuredText, maybe?) may allow us to work around this problem. Of course, rewriting the entire specification in a different markup language can be something hard to sell to the whole OpenTelemetry community because it may require developers to learn the differences between a new markup language and Github-flavored markdown.

I think those are the less time consuming problems that need solving (which does not mean that they will consume little time). The bulk of the work (if we decide to move forward with this PR, of course) will be reading, understanding and rewriting every occurrence of a requirement in the "requirement section" format.

I believe the changes suggested in this PR will make OpenTelemetry significantly better. Requirements will be clearly defined, implementations would be able to compare their compliance against all the "requirement sections" (and so, they will be able to know how compliant they are) in this PR and it will be easier for them (and us) to refer to a specific requirement when we communicate with each other. I also want to be honest and tell you all that I understand that this PR aims a review and rewrite of the whole specification and maybe even using a new markup language to write it, and that is of course a big change that can impact this project.

Sorry for the long post 😅 If someone has any idea on how to minimize the impact of this PR (or can provide resources to do this work (an intern, as @mattmccleary mentioned, maybe?)), please leave a comment below 🙂

Thank you all!

@reyang
Copy link
Member

reyang commented Mar 4, 2021

@ocelotl here goes my suggestion:

  1. I think this PR has a good shape, instead of waiting for a "hypothesis" intern to boil the ocean, I think we should focus on making small steps. Once the merge conflict is resolved, we should get it merged.
  2. Once this PR get merged, I can take over if you don't have time. I'll clean up the other docs, and put CI enforcement to make sure folks get CI break if they don't follow the rule.
  3. Figuring out another format could take time, and I don't see it as a blocker. The main effort here is to have the spec written in a more organized way. Once it is better organized, converting it to another format is quite straightforward.

@reyang reyang added the area:miscellaneous For issues that don't match any other area label label Mar 5, 2021
@yurishkuro
Copy link
Member

I feel like there's still a gap between the intention in the original ticket and the proposal in the PR. I think the intent is to support a workflow like this:

  • a language SIG wants to release a new version of the API/SDK
  • they run the script against specification that produces a checklist of all requirements
  • they certify that each requirement is satisfied by the language implementation, by
    • having a test suite where there is an entry for each requirement, e.g. a unit test function with the name of the requirement id

The PR currently does the second step in this, but doesn't say anything more, specifically how it should be used, other than saying that it assigns unique IDs that can be references externally.

Copy link
Contributor

@MrAlias MrAlias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach described here will require that normative requirements be manually consolidated into an additional section of the specification by a human instead of directly reading them from the specification. This suffers from the same problem discussed about the specification compliance matrix, it will form a derivative work. As such, it will compete for authority with the specification in situations where there are conflicts or stale statements.

My understanding was that we would try to generate a equivalent form of this requirements section using a program not having a human duplicate the specification into a more easily parsable form for a machine. I'm skeptical if this moves us closer to a document that is easier for implementers to check their implementation with or it is just rewrites the specification from English to a psudo-machine code and adds translators for other machine languages.

specification/requirements.md Outdated Show resolved Hide resolved
Comment on lines 116 to 118
Finally, it makes the specification developer follow a "testing mindset" while writing requirements. For example,
when writing a requirement, the specification developers ask themselves "can a test be written for this statement?".
This helps writing short, concise requirements that are clear for the implementation developers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a recipe for constrained and poor specification writing. The specification is written in English not a programming language. If this is going to impose upon authors of the English language a restriction that will take away the expressiveness needed to communicate concepts and ideas it is wrong. It should be engineered the other way, the parser should parse English.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention of having separate specification sections is not to impose upon the authors a restriction that will take away any expressiveness to communicate ideas. Any resource to communicate ideas can be part of the document (text, images, diagrams, etc.). This PR only intends to define how certain specific sections of the specification are to be written, these sections are the "hard requirements" of the specification and it is convenient that they are defined in a clear manner. This is also necessary to define precise requirements that can be easily extracted from the specification. Everything else outside these sections can be expressed freely and the only limit is the imagination of the author.

@ocelotl
Copy link
Contributor Author

ocelotl commented Mar 10, 2021

I feel like there's still a gap between the intention in the original ticket and the proposal in the PR. I think the intent is to support a workflow like this:

  • a language SIG wants to release a new version of the API/SDK

  • they run the script against specification that produces a checklist of all requirements

  • they certify that each requirement is satisfied by the language implementation, by

    • having a test suite where there is an entry for each requirement, e.g. a unit test function with the name of the requirement id

The PR currently does the second step in this, but doesn't say anything more, specifically how it should be used, other than saying that it assigns unique IDs that can be references externally.

I added a paragraph to better explain what is to be done with the generated JSON files. I hope this makes the overall purpose of this PR more clear.

@ocelotl
Copy link
Contributor Author

ocelotl commented Mar 10, 2021

This approach described here will require that normative requirements be manually consolidated into an additional section of the specification by a human instead of directly reading them from the specification. This suffers from the same problem discussed about the specification compliance matrix, it will form a derivative work. As such, it will compete for authority with the specification in situations where there are conflicts or stale statements.

The requirement sections introduced in this PR won't compete for authority with the rest of the specification. The requirement sections will be the authority. This is because they will be the part of the specification that is written with a strict set of rules and is clearly delimited in a clearly marked section, that can be referenced as they will have an unique identifier. This is also necessary because implementation will be specifically looking at them to know what they should implement. Also, the requirement sections have advantages over the compliance matrix because they indicate if the feature is mandatory or not.

My understanding was that we would try to generate a equivalent form of this requirements section using a program not having a human duplicate the specification into a more easily parsable form for a machine. I'm skeptical if this moves us closer to a document that is easier for implementers to check their implementation with or it is just rewrites the specification from English to a psudo-machine code and adds translators for other machine languages.

There is no need to duplicate anything in the requirement sections nor for them to be a rewrite of the specification from English to another language. The requirement sections are meant to be the short, concise part of the specification that includes one or more BCP 14 keywords, nothing less, nothing more. Any example, explanation or clarification of intention will remain in the rest of the specification and they will complement each other, the former to clearly define what is to be implemented, the latter to make it clear for human beings why the specification is how it is.

@ocelotl ocelotl requested review from MrAlias and yurishkuro March 10, 2021 03:41
@ocelotl ocelotl force-pushed the test_driven_specification branch from 8c7b257 to 3a0621e Compare March 15, 2021 23:20
@carlosalberto
Copy link
Contributor

@reyang Wondering whether a prototype for this is still on the works?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:miscellaneous For issues that don't match any other area label release:after-ga Not required before GA release, and not going to work on before GA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it possible to automatically list all specification requirements
8 participants