chore: fix absolute path algorithms
vinniefalco committed Jun 10, 2023
1 parent e427070 commit 9c0f220
Showing 9 changed files with 205 additions and 206 deletions.
34 changes: 25 additions & 9 deletions docs/modules/ROOT/pages/design-notes.adoc
Original file line number Diff line number Diff line change
@@ -2,24 +2,40 @@

== AST Traversal

During the AST traversal stage, the complete AST (generated by the clang frontend) is walked beginning with the root `TranslationUnitDecl` node.
It is during this stage that USRs (universal symbol references) are generated and hashed with SHA1 to form the 160 bit `SymbolID` for an entity.
With the exception of built-in types, *all* entities referenced in the corpus will be traversed and be assigned a `SymbolID`; including those from the standard library.
This is necessary to generate the full interface for user-defined types.
During the AST traversal stage, the complete AST (generated by the clang frontend)
is walked beginning with the root `TranslationUnitDecl` node. It is during this
stage that USRs (universal symbol references) are generated and hashed with SHA1
to form the 160 bit `SymbolID` for an entity. With the exception of built-in types,
*all* entities referenced in the corpus will be traversed and be assigned a `SymbolID`;
including those from the standard library. This is necessary to generate the
full interface for user-defined types.

== Bitcode

AST traversal is performed in parallel on a per-translation-unit basis.
To maximize the size of the code base MrDox is capable of processing, `Info` types generated during traversal are serialized to a compressed bitcode representation.
Once AST traversal is complete for all translation units, the bitcode is deserialized back into `Info` types, and then merged to form the corpus.
The merging step is necessary as there may be multiple identical definitions of the same entity (e.g. for class types, templates, inline functions, etc), as well as functions declared in one translation unit & defined in another.
To maximize the size of the code base MrDox is capable of processing, `Info`
types generated during traversal are serialized to a compressed bitcode representation.
Once AST traversal is complete for all translation units, the bitcode is deserialized
back into `Info` types, and then merged to form the corpus. The merging step is necessary
as there may be multiple identical definitions of the same entity (e.g. for class types,
templates, inline functions, etc), as well as functions declared in one translation
unit & defined in another.
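
The merging step can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Info` record is reduced to three fields, a plain string stands in for the `SymbolID`, and `merge` and `buildCorpus` are hypothetical names, not MrDox functions.

```cpp
#include <algorithm>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified record; the real `Info` types carry much more.
struct Info
{
    std::string id;                       // stand-in for the 160-bit SymbolID
    std::optional<std::string> definedIn; // file holding the definition, if seen
    std::vector<std::string> declaredIn;  // files holding declarations
};

// Fold `src` into `dst`. Both must describe the same entity.
void merge(Info& dst, Info const& src)
{
    // Keep whichever definition location was seen first.
    if(! dst.definedIn)
        dst.definedIn = src.definedIn;
    // Append declaration locations that are not already recorded.
    for(auto const& file : src.declaredIn)
    {
        auto const first = dst.declaredIn.begin();
        auto const last = dst.declaredIn.end();
        if(std::find(first, last, file) == last)
            dst.declaredIn.push_back(file);
    }
}

// Combine per-translation-unit results into one map keyed by id.
std::map<std::string, Info>
buildCorpus(std::vector<std::vector<Info>> const& perTU)
{
    std::map<std::string, Info> corpus;
    for(auto const& tu : perTU)
    {
        for(auto const& info : tu)
        {
            auto [it, inserted] = corpus.try_emplace(info.id, info);
            if(! inserted)
                merge(it->second, info);
        }
    }
    return corpus;
}
```

In this sketch, a function declared in one translation unit and defined in another ends up as a single entry that carries both the declaration and the definition location.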

== The Corpus

After AST traversal and `Info` merging, the result is stored as a map of `Info`s indexed by their respective `SymbolID`s. Documentation generators may traverse this structure by calling `Corpus::traverse` with a `Corpus::Visitor` derived visitor and the `SymbolID` of the entity to visit (e.g. the global namespace).
After AST traversal and `Info` merging, the result is stored as a map of `Info`s
indexed by their respective `SymbolID`s. Documentation generators may traverse
this structure by calling `Corpus::traverse` with a `Corpus::Visitor` derived
visitor and the `SymbolID` of the entity to visit (e.g. the global namespace).
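
A rough sketch of that shape follows. The types are simplified stand-ins written for this note, so the member names, the `visit` signature, and the `NameCollector` helper are assumptions rather than the actual `Corpus` interface.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified types; the real MrDox classes differ.
using SymbolID = std::array<std::uint8_t, 20>; // 160 bits

struct Info
{
    SymbolID id;
    std::string name;
    std::vector<SymbolID> children;
};

struct Visitor
{
    virtual ~Visitor() = default;
    // Return false to skip the children of `info`.
    virtual bool visit(Info const& info) = 0;
};

class Corpus
{
    std::map<SymbolID, Info> infos_;

public:
    void insert(Info info)
    {
        SymbolID const key = info.id;
        infos_.insert_or_assign(key, std::move(info));
    }

    // Visit the entity with the given id, then its children, depth first.
    void traverse(Visitor& v, SymbolID const& id) const
    {
        auto const it = infos_.find(id);
        if(it == infos_.end())
            return;
        if(! v.visit(it->second))
            return;
        for(auto const& child : it->second.children)
            traverse(v, child);
    }
};

// Example visitor that records the name of every entity it sees.
struct NameCollector : Visitor
{
    std::vector<std::string> names;

    bool visit(Info const& info) override
    {
        names.push_back(info.name);
        return true;
    }
};
```

A generator would start the walk from the `SymbolID` of the global namespace and emit output from inside `visit`.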

== Namespaces

Namespaces do not have a source location.
This is because there can be many namespaces.
We probably don't want to store any javadocs for namespaces either.

== Paths

The AST visitor and metadata all use forward slashes to represent file
pathnames, even on Windows. This is so the generated reference documentation
does not vary based on the platform.
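
As a minimal sketch of that conversion, assuming it only needs to swap separator characters: the function below reuses the name `makePosixStyle` from this commit, but the body is an illustration and not necessarily what the library does.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Illustrative stand-in: copy the path and replace every backslash
// with a forward slash. Other characters are left untouched.
std::string makePosixStyle(std::string_view pathName)
{
    std::string result(pathName);
    std::replace(result.begin(), result.end(), '\\', '/');
    return result;
}
```

With this, a Windows-style input such as `C:\src\lib\file.hpp` and its forward-slash spelling produce the same string, which is what keeps the generated reference identical across platforms.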
98 changes: 64 additions & 34 deletions include/mrdox/Support/Path.hpp
@@ -20,40 +20,6 @@
namespace clang {
namespace mrdox {

/** Append a trailing native separator if not already present.
*/
MRDOX_DECL
std::string_view
makeDirsy(
std::string& dirName);

/** Return a native absolute path representing a path.
This function returns an absolute path using
native separators given a relative or absolute
path.
If the input path is relative, it is first made
absolute by resolving it against the configuration's
working directory.
The input path can use native, POSIX, or Windows
separators.
The returned path will have a trailing separator.
*/
MRDOX_DECL
std::string
makeAbsoluteDirectory(
std::string_view dirName,
std::string_view workingDir);

MRDOX_DECL
std::string
makeFilePath(
std::string_view dirName,
std::string_view fileName);

//------------------------------------------------

struct AnyFileVisitor
@@ -105,6 +71,42 @@ forEachFile(

namespace files {

/** Return true if pathName ends in a separator.
*/
MRDOX_DECL
bool
isDirsy(
std::string_view pathName) noexcept;

/** Return a normalized path.
This function returns a new path based on
applying the following changes to the passed
path:
@li "." and ".." are resolved
@li Separators made uniform
@return The normalized path.
@param pathName The relative or absolute path.
*/
MRDOX_DECL
std::string
normalizePath(
std::string_view pathName);

/** Return the parent directory.
If the parent directory is defined, the returned
path will always have a trailing separator.
*/
MRDOX_DECL
std::string
getParentDir(
std::string_view pathName);

/** Return the filename part of the path.
*/
MRDOX_DECL
@@ -126,6 +128,34 @@ std::string
makeDirsy(
std::string_view pathName);

/** Return an absolute path from a possibly relative path.
Relative paths are resolved against the
current working directory of the process.
@return The absolute path, or an error if
any occurred.
*/
MRDOX_DECL
Expected<std::string>
makeAbsolute(
std::string_view pathName);

/** Return an absolute path from a possibly relative path.
*/
MRDOX_DECL
std::string
makeAbsolute(
std::string_view pathName,
std::string_view workingDir);

/** Convert all backward slashes to forward slashes.
*/
MRDOX_DECL
std::string
makePosixStyle(
std::string_view pathName);

MRDOX_DECL
std::string
appendPath(
Expand Down
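
The "dirsy" helpers declared in `Path.hpp` above can be approximated in a few lines. This is a sketch under two assumptions that the header does not state: paths use forward slashes only, and an empty path is left empty by `makeDirsy`. The bodies are illustrative and not the library's implementation.

```cpp
#include <string>
#include <string_view>

// True if the path ends in a separator.
bool isDirsy(std::string_view pathName) noexcept
{
    return ! pathName.empty() && pathName.back() == '/';
}

// Append a trailing separator if one is not already present.
// Assumption: an empty path stays empty rather than becoming "/".
std::string makeDirsy(std::string_view pathName)
{
    std::string result(pathName);
    if(! result.empty() && result.back() != '/')
        result.push_back('/');
    return result;
}

// Return the parent directory with a trailing separator, or an
// empty string when the path has no directory part.
std::string getParentDir(std::string_view pathName)
{
    // Ignore one trailing separator so "a/b/" and "a/b" agree.
    if(pathName.size() > 1 && pathName.back() == '/')
        pathName.remove_suffix(1);
    auto const pos = pathName.rfind('/');
    if(pos == std::string_view::npos)
        return {};
    return std::string(pathName.substr(0, pos + 1));
}
```

Under these assumptions, the working directory of a configuration file at `/home/user/mrdox.yml` comes out as `/home/user/`, already in the trailing-separator form that the rest of the code expects.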
3 changes: 1 addition & 2 deletions source/AST/ASTVisitor.cpp
@@ -551,8 +551,7 @@ shouldExtract(
FileFilter());

FileFilter& ff = it->second;
File_ = loc.getFilename();
convert_to_slash(File_);
File_ = files::makePosixStyle(loc.getFilename());

// file has not been previously visited
if(inserted)
2 changes: 1 addition & 1 deletion source/AST/ASTVisitor.hpp
@@ -41,7 +41,7 @@ class ASTVisitor
public:
struct FileFilter
{
llvm::SmallString<0> prefix;
std::string prefix;
bool include = true;
};

69 changes: 19 additions & 50 deletions source/ConfigImpl.cpp
@@ -14,6 +14,7 @@
#include "Support/Error.hpp"
#include "Support/Path.hpp"
#include "Support/YamlFwd.hpp"
#include <mrdox/Support/Path.hpp>
#include <clang/Tooling/AllTUsExecution.h>
#include <llvm/Support/FileSystem.h>
#include <llvm/Support/Path.h>
@@ -100,41 +101,18 @@ construct(
if( concurrency == 0)
concurrency = llvm::thread::hardware_concurrency();

// fix source-root
auto temp = normalizedPath(sourceRoot_);
makeDirsy(temp, path::Style::posix);
sourceRoot_ = temp.str();
// This has to be forward slash style
sourceRoot_ = files::makePosixStyle(files::makeDirsy(
files::makeAbsolute(sourceRoot_, workingDir)));

// fix input files
// adjust input files
for(auto& name : inputFileIncludes_)
name = normalizedPath(name).str();
name = files::makePosixStyle(
files::makeAbsolute(name, workingDir));

return Error::success();
}

llvm::SmallString<0>
ConfigImpl::
normalizedPath(
llvm::StringRef pathName)
{
namespace path = llvm::sys::path;

llvm::SmallString<0> result;
if(! path::is_absolute(pathName))
{
result = workingDir;
path::append(result, path::Style::posix, pathName);
path::remove_dots(result, true, path::Style::posix);
}
else
{
result = pathName;
path::remove_dots(result, true);
convert_to_slash(result);
}
return result;
}

ConfigImpl::
ConfigImpl()
: threadPool_(
@@ -162,16 +140,16 @@ bool
ConfigImpl::
shouldVisitFile(
llvm::StringRef filePath,
llvm::SmallVectorImpl<char>& prefixPath) const noexcept
std::string& prefixPath) const noexcept
{
namespace path = llvm::sys::path;

llvm::SmallString<32> temp;
SmallPathString temp;
temp = filePath;
if(! path::replace_path_prefix(temp, sourceRoot_, "", path::Style::posix))
return false;
Assert(files::isDirsy(sourceRoot_));
prefixPath.assign(sourceRoot_.begin(), sourceRoot_.end());
makeDirsy(prefixPath);
return true;
}
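
The `Assert(files::isDirsy(sourceRoot_))` in the hunk above guards a prefix test, and a small sketch shows why the trailing separator matters. This is a hypothetical free-function version that compares prefixes directly; the real member function reads `sourceRoot_` from `ConfigImpl` and goes through `path::replace_path_prefix`.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical free-function variant of the check, for illustration.
bool shouldVisitFile(
    std::string_view filePath,
    std::string_view sourceRoot,
    std::string& prefixPath)
{
    // The root must end in '/'. Without it, a root of "/src/lib"
    // would also match "/src/library/x.hpp".
    assert(! sourceRoot.empty() && sourceRoot.back() == '/');

    // substr clamps the count, so a short filePath is safe here.
    if(filePath.substr(0, sourceRoot.size()) != sourceRoot)
        return false;

    prefixPath.assign(sourceRoot.begin(), sourceRoot.end());
    return true;
}
```

Because the root already carries its separator, the prefix can be copied out as-is, which matches the removal of the extra `makeDirsy(prefixPath)` call in this commit.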

@@ -218,31 +196,22 @@ loadConfigFile(
namespace fs = llvm::sys::fs;
namespace path = llvm::sys::path;

SmallPathString temp(configFilePath);
path::remove_dots(temp, true, path::Style::native);

// ensure configFilePath is a regular file
fs::file_status stat;
if(auto ec = fs::status(temp, stat))
return Error("fs::status(\"{}\") returned \"{}\"", temp, ec);
if(stat.type() != fs::file_type::regular_file)
return Error("\"{}\" is not a regular file", temp);
auto temp = files::normalizePath(configFilePath);

// load the file into a string
auto text = llvm::MemoryBuffer::getFile(temp);
// load the config file into a string
auto absPath = files::makeAbsolute(temp);
if(! absPath)
return absPath.getError();
auto text = files::getFileText(*absPath);
if(! text)
return Error("MemoryBuffer::getFile(\"{}\") returned \"{}\"", temp, text.getError());
return text.getError();

// calculate the working directory
SmallPathString workingDir(temp);
path::remove_filename(workingDir);
if(auto ec = fs::make_absolute(workingDir))
return Error("fs::make_absolute(\"{}\") returned \"{}\"", workingDir, ec);
makeDirsy(workingDir);
auto workingDir = files::getParentDir(*absPath);

// attempt to create the config
auto config = std::make_shared<ConfigImpl>();
if(auto err = config->construct(workingDir, (*text)->getBuffer(), extraYaml))
if(auto err = config->construct(workingDir, *text, extraYaml))
return err;
return config;
}
6 changes: 1 addition & 5 deletions source/ConfigImpl.hpp
@@ -63,10 +63,6 @@ class ConfigImpl
llvm::StringRef configYaml,
llvm::StringRef extraYaml);

llvm::SmallString<0>
normalizedPath(
llvm::StringRef pathName);

public:
ConfigImpl();

@@ -111,7 +107,7 @@ class ConfigImpl
bool
shouldVisitFile(
llvm::StringRef filePath,
llvm::SmallVectorImpl<char>& prefix) const noexcept;
std::string& prefix) const noexcept;

/** A diagnostic handler for reading YAML files.
*/
12 changes: 6 additions & 6 deletions source/GenerateAction.cpp
@@ -15,9 +15,9 @@
#include "AST/AbsoluteCompilationDatabase.hpp"
#include <mrdox/Generators.hpp>
#include <mrdox/Support/Report.hpp>
#include <mrdox/Support/Path.hpp>
#include <clang/Tooling/AllTUsExecution.h>
#include <clang/Tooling/JSONCompilationDatabase.h>
#include <llvm/Support/Path.h>
#include <cstdlib>

namespace clang {
@@ -26,8 +26,6 @@ namespace mrdox {
Error
DoGenerateAction()
{
namespace path = llvm::sys::path;

auto& generators = getGenerators();

// Calculate additional YAML settings from command line options.
@@ -50,16 +48,18 @@ DoGenerateAction()
return Error("the compilation database path argument is missing");
if(InputPaths.size() > 1)
return Error("got {} input paths where 1 was expected", InputPaths.size());
llvm::StringRef compilationsPath = InputPaths.front();
auto compilationsPath = files::normalizePath(InputPaths.front());
std::string errorMessage;
auto jsonCompilations = tooling::JSONCompilationDatabase::loadFromFile(
compilationsPath, errorMessage, tooling::JSONCommandLineSyntax::AutoDetect);
if(! jsonCompilations)
return Error(std::move(errorMessage));

// Calculate the working directory
llvm::SmallString<240> workingDir(compilationsPath);
path::remove_filename(workingDir);
auto absPath = files::makeAbsolute(compilationsPath);
if(! absPath)
return absPath.getError();
auto workingDir = files::getParentDir(*absPath);

// Convert relative paths to absolute
AbsoluteCompilationDatabase compilations(