Add requirement section specification

Fixes open-telemetry#1210
ocelotl · Mar 4, 2021 · 28f2467
1 parent 5554131
commit 28f2467
Showing 7 changed files with 394 additions and 6 deletions.
diff --git a/.gitignore b/.gitignore
@@ -24,3 +24,10 @@ bin
 
 # Misspell binary
 internal/tools/bin
+.tools
+
+# Python bytecode cache
+__pycache__
+
+# JSON files generated by the specification parser
+*.json
diff --git a/README.md b/README.md
@@ -24,6 +24,7 @@ Technical committee holds regular meetings, notes are held
 ## Table of Contents
 
 - [Overview](specification/overview.md)
+- [Requirements](specification/requirements.md)
 - [Glossary](specification/glossary.md)
 - [Versioning and stability for OpenTelemetry clients](specification/versioning-and-stability.md)
 - [Library Guidelines](specification/library-guidelines.md)

diff --git a/specification/metrics/api.md b/specification/metrics/api.md
@@ -108,12 +108,19 @@ can be configured at run time.
 
 ### Behavior of the API in the absence of an installed SDK
 
-In the absence of an installed Metrics SDK, the Metrics API MUST consist only
-of no-ops. None of the calls on any part of the API can have any side effects
-or do anything meaningful. Meters MUST return no-op implementations of any
-instruments. From a user's perspective, calls to these should be ignored without raising errors
-(i.e., *no* `null` references MUST be returned in languages where accessing these results in errors).
-The API MUST NOT throw exceptions on any calls made to it.
+###### requirement: only_no_ops
+> The metrics API MUST consist only of no-ops. None of the calls on any part of the API can have any side effects or do anything meaningful.
+
+###### requirement: meters_return_no_ops
+> Meters MUST return no-op implementations of any instruments.
+
+From a user's perspective, calls to these should be ignored without raising errors.
+
+###### requirement: no_null_references
+> Null references MUST NOT be returned in languages where accessing them results in errors.
+
+###### requirement: no_api_exceptions
+> The API MUST NOT throw exceptions on any calls made to it.
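
The four requirements above can be illustrated with a minimal sketch. It assumes a hypothetical Python API: the names `NoOpMeter`, `NoOpCounter`, `create_counter` and `add` are illustrative only and are not taken from the specification or from any OpenTelemetry implementation.

```python
class NoOpCounter:
    """Hypothetical instrument that accepts measurements and discards them."""

    def add(self, amount, labels=None):
        # Intentionally empty: no side effects and no exceptions.
        return None


class NoOpMeter:
    """Hypothetical meter handed out when no SDK is installed."""

    def create_counter(self, name, description="", unit=""):
        # Always return a usable object, never a null reference.
        return NoOpCounter()


meter = NoOpMeter()
counter = meter.create_counter("requests")
# The call is silently ignored instead of raising an error.
counter.add(1, {"route": "/"})
```

Returning a real object whose methods do nothing is one way to keep user code free of null checks. An implementation may achieve the same behavior differently.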
 
 ### Measurements
 

diff --git a/specification/requirements.md b/specification/requirements.md
@@ -0,0 +1,131 @@
+# Requirements
+
+<details>
+<summary>
+Table of Contents
+</summary>
+<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` -->
+
+<!-- toc -->
+
+- [Requirement Sections](#requirement-sections)
+  * [Requirement Sections Format](#requirement-sections-format)
+    + [Key Format](#key-format)
+    + [Description Format](#description-format)
+    + [Example](#example)
+  * [Purpose of the requirement sections](#purpose-of-the-requirement-sections)
+  * [Running the specification parser](#running-the-specification-parser)
+
+<!-- tocstop -->
+
+</details>
+
+This document explains how the OpenTelemetry specification requirements are written.
+
+## Requirement Sections
+
+This specification is written in several [Markdown](https://github.github.com/gfm/) files.
+
+Each of these files can contain any resource to explain and describe its part of the specification.
+Examples of these resources are images, diagrams, code, regular text, etc.
+
+The Markdown documents that make up the OpenTelemetry specification also include sections
+named _requirement sections_ that follow a specific format. These sections are the part of
+the document that formally describes each requirement of the OpenTelemetry specification.
+
+Each of these requirement sections has 2 components:
+
+1. A unique **key**, a string that identifies the requirement section in a Markdown document
+2. A **description**, a string that MUST contain one and only one of the [RFC 2119 keywords](https://tools.ietf.org/html/rfc2119).
+
+### Requirement Sections Format
+
+Each of these sections is also written in Markdown syntax so that it integrates
+with the rest of the document.
+
+#### Key Format
+
+The key of every requirement section MUST be unique in the document that contains it. This key
+MUST be written in this manner:
+
+```
+###### requirement: unique_key_identifier
+```
+
+The first six `#` symbols create a Markdown heading for the requirement section. The `requirement: `
+string that follows indicates that this particular heading is part of a requirement section and not
+just any six-`#` Markdown heading. The string that follows, shown here as `unique_key_identifier`, MUST be
+unique in the document that contains it. The characters that make up this string SHOULD only be
+alphanumeric characters and underscores.
+
+#### Description Format
+
+The description of every requirement section MUST be written as a
+[block quote](https://github.github.com/gfm/#block-quotes) immediately following a requirement section key:
+
+```
+> The span MUST have an identifier.
+>
+> More text can be placed here as well.
+```
+
+One and only one RFC 2119 keyword MUST appear somewhere in the description.
+
+#### Example
+
+Here is a small example that shows how a Markdown document can have requirement sections to describe
+its specific requirements:
+
+```
+# Some title for some OpenTelemetry concept
+
+This part describes some OpenTelemetry concept. It can include examples, images, diagrams, etc.
+
+After the concept is described, its specific requirements are written in requirement sections:
+
+###### requirement: concept_identifier
+> The concept MUST have an identifier.
+
+###### requirement: concept_documentation
+> The concept SHOULD be documented in every implementation.
+```
+
+### Purpose of the requirement sections
+
+The idea behind writing the requirements in this manner is to make it easy for the reader to find all the
+requirements included in an OpenTelemetry specification document. In this way, it is also easy to find all
+the requirements a certain implementation must comply with. With all the requirements available for the
+implementation developer, it is easy to make a list of test cases, one for every requirement section, and
+to test the implementation against these test cases to measure compliance with the specification. This is
+why the key must be unique, so that it can be used to form a name for the particular testing function for
+that requirement.
+
+With the requirements specified in this way, it is also easier for the specification and implementation
+developers to refer to a certain requirement unequivocally, which makes communication between developers
+clearer.
+
+It is also possible to parse the Markdown documents and extract from them a list of the requirements in a
+certain format. A parser is provided that does this and produces a JSON document for every Markdown document
+that includes at least one requirement section. With these JSON files, a testing schema can be produced for
+every implementation that can help developers know how compliant with the specification the implementation is.
+
+The parser can also work as a checker that makes sure that every requirement section is compliant with this
+specification. It can even be incorporated into the CI of the repository that hosts the OpenTelemetry
+specification in order to reject any change that adds a non-compliant requirement section.
+
+Finally, it makes the specification developer follow a "testing mindset" while writing requirements. For example,
+when writing a requirement, the specification developers ask themselves "can a test be written for this statement?".
+This helps produce short, concise requirements that are clear to implementation developers.
+
+### Running the specification parser
+
+The included specification parser can be run from the root directory of the OpenTelemetry specification
+repository like this:
+
+```
+python specification_parser/specification_parser.py
+```
+
+This will recursively look for Markdown files in the `specification` directory. For every Markdown file that has at
+least one requirement section, it will generate a corresponding JSON file with the key, description and RFC 2119
+keyword of every requirement section.
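
As a rough illustration of the format described in `requirements.md`, the sketch below extracts the key and description of each requirement section from a string. The regular expression is the one used by `specification_parser.py` in this commit. The helper name `extract_sections` and the `EXAMPLE` text (a shortened copy of the Example section) are assumptions made for illustration.

```python
from re import finditer

# Same pattern as in specification_parser.py: a six-# heading that starts
# with "requirement: ", followed by one or more block quote lines.
REQUIREMENT_PATTERN = (
    r"###### requirement:\s(?P<key>[_\w]+)\n"
    r"(?P<description>(>.*\n?)+)"
)

EXAMPLE = """# Some title for some OpenTelemetry concept

This part describes some OpenTelemetry concept.

###### requirement: concept_identifier
> The concept MUST have an identifier.

###### requirement: concept_documentation
> The concept SHOULD be documented in every implementation.
"""


def extract_sections(text):
    """Return {key: description} for every requirement section in text."""
    return {
        match.group("key"): match.group("description").strip()
        for match in finditer(REQUIREMENT_PATTERN, text)
    }


sections = extract_sections(EXAMPLE)
```

The description keeps its leading `>` marker, which matches the expectations in the test file of this commit.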
diff --git a/specification_parser/specification_parser.py b/specification_parser/specification_parser.py
@@ -0,0 +1,118 @@
+from re import finditer, findall
+from json import dumps
+from os.path import curdir, abspath, join, splitext
+from os import walk
+
+
+def find_markdown_file_paths(root):
+    markdown_file_paths = []
+
+    for root_path, _, file_paths in walk(root):
+        for file_path in file_paths:
+            absolute_file_path = join(root_path, file_path)
+
+            _, file_extension = splitext(absolute_file_path)
+
+            if file_extension == ".md":
+                markdown_file_paths.append(absolute_file_path)
+
+    return markdown_file_paths
+
+
+def parse_requirements(markdown_file_paths):
+    requirements = {}
+
+    for markdown_file_path in markdown_file_paths:
+
+        with open(markdown_file_path, "r") as markdown_file:
+
+            text = markdown_file.read()
+
+            requirement_matches = [
+                requirement_match.groupdict() for requirement_match in (
+                    finditer(
+                        r"###### requirement:\s(?P<key>[_\w]+)\n"
+                        r"(?P<description>(>.*\n?)+)",
+                        text
+                    )
+                )
+            ]
+
+        if not requirement_matches:
+            continue
+
+        json_file_path = "".join([splitext(markdown_file_path)[0], ".json"])
+
+        requirements[json_file_path] = {}
+
+        for requirement in requirement_matches:
+
+            requirement_key = requirement["key"]
+
+            assert (
+                requirement_key not in
+                requirements[json_file_path].keys()
+            ), "Repeated requirement key {} found in {}".format(
+                requirement_key, markdown_file_path
+            )
+
+            requirement_description = requirement["description"].strip()
+
+            RFC_2119_keyword_matches = []
+
+            for RFC_2119_keyword_regex in [
+                r"MUST NOT",
+                r"MUST(?! NOT)",
+                r"SHOULD NOT",
+                r"SHOULD(?! NOT)",
+                r"MAY"
+            ]:
+                RFC_2119_keyword_matches.extend(
+                    findall(
+                        RFC_2119_keyword_regex,
+                        requirement_description
+                    )
+                )
+
+            requirement_key_path = "{}:{}".format(
+                markdown_file_path, requirement_key
+            )
+
+            assert (
+                len(RFC_2119_keyword_matches) != 0
+            ), "No RFC 2119 keywords were found in {}".format(
+                requirement_key_path
+            )
+
+            assert (
+                len(RFC_2119_keyword_matches) == 1
+            ), "Repeated RFC 2119 keywords were found in {}".format(
+                requirement_key_path
+            )
+
+            requirements[json_file_path][requirement_key] = {}
+
+            requirements[json_file_path][requirement_key]["description"] = (
+                requirement_description
+            )
+            requirements[json_file_path][requirement_key][
+                "RFC 2119 Keyword"
+            ] = RFC_2119_keyword_matches[0]
+
+    return requirements
+
+
+def write_json_specifications(requirements):
+    for json_absolute_file_path, requirement_sections in requirements.items():
+
+        with open(json_absolute_file_path, "w") as json_file:
+            json_file.write(dumps(requirement_sections, indent=4))
+
+
+if __name__ == "__main__":
+    write_json_specifications(
+        parse_requirements(
+            find_markdown_file_paths(
+                join(abspath(curdir), "specification")
+            )
+        )
+    )
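
The keyword check in `parse_requirements` runs five separate patterns and validates the result with `assert`, which Python skips when run with `-O`. A possible alternative, sketched here and not the approach taken in this commit, uses a single alternation with word boundaries and raises `ValueError`. The names `KEYWORD_PATTERN` and `single_keyword` are hypothetical.

```python
from re import compile as compile_regex

# Longer keywords come first so that "MUST NOT" is not matched as "MUST".
# The \b anchors keep words such as "MAYBE" from counting as "MAY".
KEYWORD_PATTERN = compile_regex(
    r"\b(MUST NOT|MUST|SHOULD NOT|SHOULD|MAY)\b"
)


def single_keyword(description):
    """Return the only RFC 2119 keyword in description, or raise ValueError."""
    keywords = KEYWORD_PATTERN.findall(description)
    if len(keywords) != 1:
        raise ValueError(
            "expected exactly one RFC 2119 keyword, found {}".format(
                len(keywords)
            )
        )
    return keywords[0]
```

Like the parser in this commit, the sketch only covers MUST, MUST NOT, SHOULD, SHOULD NOT and MAY. RFC 2119 also defines REQUIRED, SHALL, SHALL NOT, RECOMMENDED and OPTIONAL, which neither version recognizes.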
diff --git a/specification_parser/specification_parser_test.py b/specification_parser/specification_parser_test.py
@@ -0,0 +1,80 @@
+from unittest import TestCase
+from os.path import abspath, curdir, join
+
+from specification_parser import (
+    find_markdown_file_paths,
+    parse_requirements,
+)
+
+
+class TestSpecificationParser(TestCase):
+
+    @classmethod
+    def setUpClass(cls):
+        cls.current_directory_path = abspath(curdir)
+        cls.test_specification_md_path = join(
+            cls.current_directory_path, "test_specification.md"
+        )
+
+    def test_find_markdown_file_paths(self):
+
+        self.assertEqual(
+            find_markdown_file_paths(self.current_directory_path),
+            [self.test_specification_md_path]
+        )
+
+    def test_parse_requirements(self):
+
+        assert parse_requirements([self.test_specification_md_path]) == {
+            # Derive the expected key from the current directory instead of
+            # hardcoding one developer's absolute path.
+            join(
+                self.current_directory_path, "test_specification.json"
+            ): {
+                "testable_section_0": {
+                    "description": "> This MUST be done.",
+                    "RFC 2119 Keyword": "MUST"
+                },
+                "testable_section_1": {
+                    "description": "> This MUST NOT be done.",
+                    "RFC 2119 Keyword": "MUST NOT"
+                },
+                "testable_section_2": {
+                    "description": "> This SHOULD be done.",
+                    "RFC 2119 Keyword": "SHOULD"
+                },
+                "testable_section_3": {
+                    "description": "> This SHOULD NOT be done.",
+                    "RFC 2119 Keyword": "SHOULD NOT"
+                },
+                "testable_section_4": {
+                    "description": "> This MAY be done.",
+                    "RFC 2119 Keyword": "MAY"
+                },
+                "testable_section_5": {
+                    "description": "> This **MAY** be done 5.",
+                    "RFC 2119 Keyword": "MAY"
+                },
+                "testable_section_6": {
+                    "description": "> This *MAY* be done 6.",
+                    "RFC 2119 Keyword": "MAY"
+                },
+                "testable_section_7": {
+                    "description": (
+                        "> This *MAY* be done 7.\n> This is section 7."
+                    ),
+                    "RFC 2119 Keyword": "MAY"
+                },
+                "testable_section_8": {
+                    "description": "> This *MAY* be done 8.",
+                    "RFC 2119 Keyword": "MAY"
+                },
+                "testable_section_9": {
+                    "description": (
+                        "> This *MAY* be done 9.\n>\n> This is section 9.\n"
+                        "> 1. Item 1\n> 2. Item 2\n>    1. Item 2.1\n"
+                        ">    2. Item 2.2"
+                    ),
+                    "RFC 2119 Keyword": "MAY"
+                }
+            }
+        }