diff --git a/docs/antora-playbook.yml b/docs/antora-playbook.yml
index a45e8f6d0..456716c57 100644
--- a/docs/antora-playbook.yml
+++ b/docs/antora-playbook.yml
@@ -77,6 +77,11 @@ ui:
 antora:
   extensions:
+    - require: '@sntke/antora-mermaid-extension' # <1>
+      mermaid_library_url: https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs # <2>
+      script_stem: header-scripts # <3>
+      mermaid_initialize_options: # <4>
+        start_on_load: true
     - require: '@antora/lunr-extension' # https://gitlab.com/antora/antora-lunr-extension
       index_latest_only: true
 asciidoc:
diff --git a/docs/local-antora-playbook.yml b/docs/local-antora-playbook.yml
index ce3bae891..c0f2788ee 100644
--- a/docs/local-antora-playbook.yml
+++ b/docs/local-antora-playbook.yml
@@ -74,6 +74,11 @@ ui:
 antora:
   extensions:
+    - require: '@sntke/antora-mermaid-extension' # <1>
+      mermaid_library_url: https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs # <2>
+      script_stem: header-scripts # <3>
+      mermaid_initialize_options: # <4>
+        start_on_load: true
     - require: '@antora/lunr-extension' # https://gitlab.com/antora/antora-lunr-extension
       index_latest_only: true
 asciidoc:
diff --git a/docs/modules/ROOT/pages/contribute.adoc b/docs/modules/ROOT/pages/contribute.adoc
index 145fa50d6..8eea4e5e4 100644
--- a/docs/modules/ROOT/pages/contribute.adoc
+++ b/docs/modules/ROOT/pages/contribute.adoc
@@ -1,5 +1,275 @@
-= Contribute
+= Contributor's Guide
+
+This page contains information for contributors to the MrDocs project.
+It is intended to provide an overview of the codebase and the process of adding new features.
+
+== Codebase Overview
+
+The MrDocs codebase is divided into several modules:
+
+[mermaid]
+....
+graph TD
+    CL[Command Line Arguments] --> P
+    CF[Configuration File] --> P
+    P[Options] --> E
+    P --> CD
+    P --> G
+    CD[Compilation Database] --> E
+    E[Extract Symbols] -->|Corpus| G
+    G[Generator] --> D(Documentation)
+....
+
+This section provides an overview of each module and how they interact with each other in the MrDocs codebase.
+
+[#options]
+=== Parsing Options
+
+MrDocs options affect the behavior of the compilation database, how symbols are extracted, and how the documentation is generated.
+They are parsed from the command line and configuration file.
+
+The main entry point of MrDocs is the `DoGenerateAction` function in `src/tool/GenerateAction.cpp`.
+It loads the options, creates the compilation database, and runs the extraction and generation steps.
+The options are formed from a combination of command line arguments and configuration file settings.
+
+==== Command Line Options
+
+Command line and common options are defined in `src/tool/ToolArgs.hpp`.
+The `ToolArgs` class uses the `llvm::cl` library to define and parse the command line arguments.
+
+==== Configuration File
+
+Common options are defined in `mrdocs/Config.hpp`.
+The `Config` class represents all public options that could be defined in a configuration file.
+It also provides a representation that plugins can use to access public options from the command line or configuration file.
+
+The function `clang::mrdocs::loadConfig` is also provided to parse all public options from a YAML configuration file.
+
+Internally, MrDocs uses the derived `clang::mrdocs::ConfigImpl` class (`src/lib/Lib/ConfigImpl.hpp`) to also store the private representation of parsed options, such as filters.
+
+==== Finalizing Options
+
+Common options are stored in the `Config` class, while the `ToolArgs` class stores common options and the command line options.
+For instance, the `config` option can only be set from the command line, as it would be illogical to expect the location of the configuration file to be defined in the configuration file itself.
+On the other hand, the `output` option can be set from both the command line and the configuration file so that the user can define a default output location in the configuration file.
+
+Thus, after the command line and configuration file options are parsed, they are finalized in the `DoGenerateAction` function by calling `ToolArgs::apply`, which overrides the configuration file options in `Config` with the command line options, when applicable.
+
+As a last step, `DoGenerateAction` converts the public `Config` settings into a `ConfigImpl` object, which is used by the rest of the program with the parsed options.
+
+[#extract_symbols]
+=== Extracting Symbols
+
+At this stage, the Clang frontend is used to parse the source code and generate an AST.
+The AST information is extracted and stored in a `Corpus` object (`mrdocs/Corpus.hpp`).
+
+[#compilation_database]
+==== Compilation Database
+
+The second step in `DoGenerateAction` is to create a `CompilationDatabase` object, so we can extract symbols from its source files.
+There are multiple possible sources for this file according to the configuration options: the file might be read directly from the path specified in the options, or it might be generated by MrDocs from build scripts.
+
+Whatever the source, a derived `MrDocsCompilationDatabase` object (`src/lib/Lib/MrDocsCompilationDatabase.hpp`) is created to represent the compilation database.
+The difference between the original `CompilationDatabase` and the `MrDocsCompilationDatabase` is that the latter includes a number of pre-processing steps to filter and transform compilation commands.
+
+For each compilation command:
+
+* Command line arguments are adjusted
+** Warnings are suppressed
+** Additional defines are added
+** Implicit include directories are added
+** Unrecognized arguments are removed
+* Paths are normalized
+* Non-C++ files are filtered out
+
+[#info_nodes]
+==== Info Nodes
+
+MrDocs represents each C++ symbol or construct as an `Info` node (`mrdocs/Metadata/Info.hpp`).
+MrDocs currently defines the following `Info` nodes:
+
+[c-preprocessor]
+====
+
+[cols="1,3,2"]
+|===
+| Name | Description | Declaration
+
+#define INFO_PASCAL_AND_DESC(Type, Desc) | `pass:[Type]pass:[Info]` | Desc | `mrdocs/Metadata/pass:[Type].hpp`
+
+include::partial$InfoNodes.inc[]
+
+|===
+====
+
+An `Info` node can represent not only direct AST symbols but also {cpp} constructs that need to be inferred from these symbols.
+Nodes in the first category will typically be created in the initial extraction step, and nodes in the second category will be created in the finalization step.
+
+When defining a new `Info` type, it is important to consider how this type will be supported in all other modules of the codebase, including the AST visitor, the bitcode writer, generators, tests, and the documentation.
+The script `.github/check_info_nodes_support.sh` will attempt to infer whether most of these features have been implemented for each node type.
+
+==== Clang LibTooling
+
+MrDocs uses Clang to extract `Info` objects from the {cpp} AST.
+Clang offers two https://clang.llvm.org/docs/Tooling.html[interfaces] to access the C++ AST: the https://clang.llvm.org/doxygen/group__CINDEX.html[`LibClang`] and https://clang.llvm.org/docs/LibTooling.html[`LibTooling`] libraries.
+MrDocs uses the latter, as it provides full control over the AST traversal process at the cost of an unstable API.
+
+In LibTooling, once we have a <<compilation_database,compilation database>>, we can create a `ClangTool` object to run the Clang frontend on a set of source files.
+
+[source,c++]
+----
+// Run the Clang frontend over every file in `sourceFiles`
+clang::tooling::ClangTool Tool(compilationDatabase, sourceFiles);
+// `SyntaxOnlyAction` parses the code and builds the AST without further processing
+auto actionFactory = clang::tooling::newFrontendActionFactory<clang::SyntaxOnlyAction>();
+return Tool.run(actionFactory.get());
+----
+
+The `clang::tooling::ClangTool::run` method takes a `clang::tooling::ToolAction` object that defines how to process the AST.
+The action object usually comes from a `clang::tooling::FrontendActionFactory`.
+In the example above, the `SyntaxOnlyAction` is used to parse the source code and generate the AST without any further processing.
+
+In MrDocs, this process happens in `clang::mrdocs::CorpusImpl::build` (`src/lib/Lib/CorpusImpl.cpp`), where we call `Tool.run` for each object in the database with our custom `ASTAction` action and `ASTActionFactory` factory (`src/lib/AST/ASTVisitor.cpp`).
+
+==== AST Traversal
+
+While `ASTAction` is the entry point for processing the AST, the real work is done by the `ASTVisitor` class.
+As the AST is generated, it is traversed by the `ASTVisitor` class.
+
+The entry point of this class is `ASTVisitor::build`, which recursively calls `ASTVisitor::traverseDecl` for the root `clang::TranslationUnitDecl` node of the translation unit.
+During the AST traversal stage, the complete AST generated by the Clang frontend is walked beginning with this root `TranslationUnitDecl` node.
+
+Each Clang node is converted into a `<<info_nodes,Info>>` node, which is then stored with any relevant information in a `mrdocs::Corpus` object.
+
+==== USR Generation
+
+It is during this stage that USRs (Unified Symbol Resolution identifiers) are generated and hashed with SHA-1 to form the 160-bit `SymbolID` for an entity.
+Except for built-in types, *all* entities referenced in the corpus will be traversed and assigned a `SymbolID`, including those from the standard library.
+This is necessary to generate the full interface for user-defined types.
+
+==== Bitcode
+
+To maximize the size of the code base MrDocs is capable of processing, `Info` types generated during traversal are serialized to a compressed bitcode representation.
+
+The `ASTVisitor` reports each new `Info` object to the `BitcodeExecutionContext` (`src/lib/Lib/ExecutionContext.cpp`), which serializes it to the bitcode file.
+
+==== Finalizing the Corpus
+
+After running the AST traversal on all translation units, `CorpusImpl::build` runs finalization steps on the `Corpus` object.
+At this point, we process C++ constructs that are not directly represented in the AST.
+
+The first finalization step happens in `BitcodeExecutionContext::reportEnd` (`src/lib/Lib/ExecutionContext.cpp`), where the `Info` objects with the same `SymbolID` are merged.
+The merging step is necessary as there may be multiple declarations of the same entity.
+For instance, a function may be declared at different points in the code base, and each declaration might have different attributes or comments.
+At this step, the doc comments are also finalized.
+Each `Info` object has a pointer to its `Javadoc` object (`mrdocs/Metadata/Javadoc.hpp`), which is a representation of the documentation comments.
+
+After AST traversal and `Info` merging, the result is stored as a map of `Info` objects indexed by their respective `SymbolID`.
+A second finalization step is then performed in `clang::mrdocs::finalize` (`src/lib/Metadata/Finalize.cpp`), where any references to `SymbolID` objects that don't exist are removed.
+This is necessary because the AST traversal will generate references to entities that should be filtered out and are therefore not present in the corpus.
+
+At this point, the `Corpus` object contains representations of all entities in the code base, and further semantic {cpp} constructs that are not directly represented in the AST can be inferred.
+
+=== Generators
+
+Documentation generators may traverse this structure by calling `Corpus::traverse` with a visitor derived from `Corpus::Visitor` and the `SymbolID` of the entity to visit (e.g. the global namespace).
+
+Documentation generators are responsible for traversing the corpus and generating documentation in the desired format.
+
+The API for documentation generators is defined in `mrdocs/Generator.hpp`.
+
+=== Directory Layout
+
+The MrDocs codebase is organized as follows:
+
+==== `include/`—The main include directory
+
+This directory contains the public headers for the MrDocs library.
+
+* `include/mrdocs/`—The core library headers
+** `include/mrdocs/ADT/`—Data Structures
+** `include/mrdocs/Dom/`—The Document Object Model for Abstract Trees
+** `include/mrdocs/Metadata/`—`Info` nodes and metadata classes
+** `include/mrdocs/Support/`—Various utility classes
+
+==== `src/`—The main source directory
+
+This directory contains the source code for the MrDocs library and private headers.
+
+* `src/lib/`—The core library
+** `src/lib/AST/`—The AST traversal code
+** `src/lib/Dom/`—The Document Object Model for Abstract Trees
+** `src/lib/Gen/`—Generators
+** `src/lib/Lib/`—The core library classes
+** `src/lib/Metadata/`—`Info` nodes and metadata classes
+** `src/lib/Support/`—Various utility classes
+* `src/test/`—The test directory
+* `src/test_suite/`—The library used for testing
+* `src/tool/`—The main program
+
+==== `share/`—Shared resources
+
+This directory contains shared resources for the documentation generators and utilities for developers.
+Its subdirectories are installed in the `share` directory of the installation.
+
+* `share/`—Shared resources for the documentation generators
+* `share/cmake/`—CMake modules to generate the documentation
+* `share/gdb/`—GDB pretty printers
+* `share/mrdocs/`—Shared resources for the documentation generators
+
+==== `docs/`—Documentation
+
+This directory contains the documentation for the MrDocs project.
+The documentation is written in AsciiDoc and can be built using the Antora tool.
+
+* `docs/`—Documentation configuration files and scripts
+** `docs/modules/`—The documentation AsciiDoc files
+** `docs/extensions/`—Antora extensions for the documentation
+
+==== `third-party/`—Helpers for third-party libraries
+
+This directory contains build scripts and configuration files for third-party libraries.
+
+* `third-party/`—Third-party libraries
+** `third-party/llvm/`—CMake presets for LLVM
+** `third-party/duktape/`—CMake scripts for Duktape
+** `third-party/lua/`—A bundled Lua interpreter
+
+== Coding Standards
+
+=== Paths
+
+The AST visitor and metadata all use forward slashes to represent file pathnames, even on Windows.
+This is so the generated reference documentation does not vary based on the platform.
+
+=== Exceptions
+
+Errors thrown by the program should always have type `Exception`.
+Objects of this type are capable of transporting an `Error` object.
+This is important for the scripting to work; exceptions are used to propagate errors from library code to scripts and back to the invoking code.
+In exceptional cases, these exceptions should be left uncaught.
+The tool installs an uncaught exception handler that prints a stack trace and exits the process immediately.
+
+=== Testing
+
+All new features should be accompanied by tests.
+The `mrdocs-test` target is used to run the test suites.
+This target has its entry point in `src/test/TestMain.cpp`, which can take two paths:
+
+* Golden testing: When input paths are provided to the test executable via the command line, the test suite runs `DoTestAction()`, which iterates over all files in `test-files`, comparing the XML generated from each input source file with the expected XML output file.
+* Unit testing: When no input paths are provided, all unit tests will be run via `unit_test_main()`, defined by our test-suite library in `src/test_suite/test_suite.cpp`.
+
+The fixtures for golden testing are defined in `test-files/golden-tests`, where files in each directory have the following format:
+
+* `mrdocs.yml`: Basic configuration options for all files in this directory.
+* `.cpp`: The input source file to extract symbols from.
+* `.xml`: The expected XML output file generated with the XML generator.
+* `.bad.xml`: The test output file generated when the test fails.
+* `.yml`: Extra configuration options for this specific file.
+
+== Contributing
+
+If you find a bug or have a feature request, please open an issue on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/issues
+
+If you would like to contribute a feature or bug fix, please open a pull request on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/pulls
+
+If you would like to discuss a feature or bug fix before opening a pull request, discussions happen in the `#mrdocs` channel on the Cpplang Slack: https://cpplang.slack.com/
\ No newline at end of file
diff --git a/docs/modules/ROOT/partials/InfoNodes.inc b/docs/modules/ROOT/partials/InfoNodes.inc
new file mode 120000
index 000000000..4c1f22215
--- /dev/null
+++ b/docs/modules/ROOT/partials/InfoNodes.inc
@@ -0,0 +1 @@
+../../../../include/mrdocs/Metadata/InfoNodes.inc
\ No newline at end of file