[C++20][Modules] Implement P1857R3 Modules Dependency Discovery#107168
[C++20][Modules] Implement P1857R3 Modules Dependency Discovery#107168
Conversation
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
…tructures to `IdentifierLoc` (#135808) I found this issue when I working on #107168. Currently we have many similiar data structures like: - `std::pair<IdentifierInfo *, SourceLocation>`. - Element type of `ModuleIdPath`. - `IdentifierLocPair`. - `IdentifierLoc`. This PR unify these data structures to `IdentifierLoc`, moved `IdentifierLoc` definition to SourceLocation.h, and deleted other similer data structures. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
…like data structures to `IdentifierLoc` (#135808) I found this issue when I working on llvm/llvm-project#107168. Currently we have many similiar data structures like: - `std::pair<IdentifierInfo *, SourceLocation>`. - Element type of `ModuleIdPath`. - `IdentifierLocPair`. - `IdentifierLoc`. This PR unify these data structures to `IdentifierLoc`, moved `IdentifierLoc` definition to SourceLocation.h, and deleted other similer data structures. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
… data structures to `IdentifierLoc` (#136077) This PR reland #135808, fixed some missed changes in LLDB. I found this issue when I working on #107168. Currently we have many similiar data structures like: - std::pair<IdentifierInfo *, SourceLocation>. - Element type of ModuleIdPath. - IdentifierLocPair. - IdentifierLoc. This PR unify these data structures to IdentifierLoc, moved IdentifierLoc definition to SourceLocation.h, and deleted other similer data structures. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
…` pair-like data structures to `IdentifierLoc` (#136077) This PR reland llvm/llvm-project#135808, fixed some missed changes in LLDB. I found this issue when I working on llvm/llvm-project#107168. Currently we have many similiar data structures like: - std::pair<IdentifierInfo *, SourceLocation>. - Element type of ModuleIdPath. - IdentifierLocPair. - IdentifierLoc. This PR unify these data structures to IdentifierLoc, moved IdentifierLoc definition to SourceLocation.h, and deleted other similer data structures. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
… data structures to `IdentifierLoc` (llvm#136077) This PR reland llvm#135808, fixed some missed changes in LLDB. I found this issue when I working on llvm#107168. Currently we have many similiar data structures like: - std::pair<IdentifierInfo *, SourceLocation>. - Element type of ModuleIdPath. - IdentifierLocPair. - IdentifierLoc. This PR unify these data structures to IdentifierLoc, moved IdentifierLoc definition to SourceLocation.h, and deleted other similer data structures. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
Signed-off-by: yronglin <yronglin777@gmail.com>
dbc377e to
d300a2b
Compare
|
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang-modules Author: None (yronglin) ChangesImplement P1857R3 Modules Dependency Discovery.
At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff:
Otherwise the token is treated as an identifier. Additionally:
Need add more test Patch is 134.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107168.diff 43 Files Affected:
diff --git a/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp b/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
index d872020c2d8a3..22a3eb97f938b 100644
--- a/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
+++ b/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
@@ -65,7 +65,7 @@ class PragmaAnnotateHandler : public PragmaHandler {
Token Tok;
PP.LexUnexpandedToken(Tok);
if (Tok.isNot(tok::eod))
- PP.Diag(Tok, diag::ext_pp_extra_tokens_at_eol) << "pragma";
+ PP.Diag(Tok, diag::ext_pp_extra_tokens_at_eol) << "#pragma";
if (HandledDecl) {
DiagnosticsEngine &D = PP.getDiagnostics();
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index 723f5d48b4f5f..f975a63b369b5 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -466,6 +466,8 @@ def err_pp_embed_device_file : Error<
def ext_pp_extra_tokens_at_eol : ExtWarn<
"extra tokens at end of #%0 directive">, InGroup<ExtraTokens>;
+def ext_pp_extra_tokens_at_module_directive_eol : ExtWarn<
+ "extra tokens at end of '%0' directive">, InGroup<ExtraTokens>;
def ext_pp_comma_expr : Extension<"comma operator in operand of #if">;
def ext_pp_bad_vaargs_use : Extension<
@@ -496,7 +498,7 @@ def warn_cxx98_compat_variadic_macro : Warning<
def ext_named_variadic_macro : Extension<
"named variadic macros are a GNU extension">, InGroup<VariadicMacros>;
def err_embedded_directive : Error<
- "embedding a #%0 directive within macro arguments is not supported">;
+ "embedding a %select{#|C++ }0%1 directive within macro arguments is not supported">;
def ext_embedded_directive : Extension<
"embedding a directive within macro arguments has undefined behavior">,
InGroup<DiagGroup<"embedded-directive">>;
@@ -983,6 +985,18 @@ def warn_module_conflict : Warning<
InGroup<ModuleConflict>;
// C++20 modules
+def err_pp_expected_module_name_or_header_name : Error<
+ "expected module name or header name">;
+def err_pp_expected_semi_after_module_or_import : Error<
+ "'%select{module|import}0' directive must end with a ';' on the same line">;
+def err_module_decl_in_header : Error<
+ "module declaration must not come from an #include directive">;
+def err_pp_cond_span_module_decl : Error<
+ "preprocessor conditionals shall not span a module declaration">;
+def err_pp_module_expected_ident : Error<
+ "expected a module name after '%select{module|import}0'">;
+def err_pp_unsupported_module_partition : Error<
+ "module partitions are only supported for C++20 onwards">;
def err_header_import_semi_in_macro : Error<
"semicolon terminating header import declaration cannot be produced "
"by a macro">;
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 3aa36ad59d0b9..c06e2f090b429 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1760,8 +1760,8 @@ def ext_bit_int : Extension<
} // end of Parse Issue category.
let CategoryName = "Modules Issue" in {
-def err_unexpected_module_decl : Error<
- "module declaration can only appear at the top level">;
+def err_unexpected_module_import_decl : Error<
+ "%select{module|import}0 declaration can only appear at the top level">;
def err_module_expected_ident : Error<
"expected a module name after '%select{module|import}0'">;
def err_attribute_not_module_attr : Error<
@@ -1782,8 +1782,6 @@ def err_module_fragment_exported : Error<
def err_private_module_fragment_expected_semi : Error<
"expected ';' after private module fragment declaration">;
def err_missing_before_module_end : Error<"expected %0 at end of module">;
-def err_unsupported_module_partition : Error<
- "module partitions are only supported for C++20 onwards">;
def err_import_not_allowed_here : Error<
"imports must immediately follow the module declaration">;
def err_partition_import_outside_module : Error<
diff --git a/clang/include/clang/Basic/IdentifierTable.h b/clang/include/clang/Basic/IdentifierTable.h
index 54540193cfcc0..add6c6ac629a1 100644
--- a/clang/include/clang/Basic/IdentifierTable.h
+++ b/clang/include/clang/Basic/IdentifierTable.h
@@ -179,6 +179,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
LLVM_PREFERRED_TYPE(bool)
unsigned IsModulesImport : 1;
+ // True if this is the 'module' contextual keyword.
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned IsModulesDecl : 1;
+
// True if this is a mangled OpenMP variant name.
LLVM_PREFERRED_TYPE(bool)
unsigned IsMangledOpenMPVariantName : 1;
@@ -215,8 +219,9 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
IsCPPOperatorKeyword(false), NeedsHandleIdentifier(false),
IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
- IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
- IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}
+ IsModulesDecl(false), IsMangledOpenMPVariantName(false),
+ IsDeprecatedMacro(false), IsRestrictExpansion(false), IsFinal(false),
+ IsKeywordInCpp(false) {}
public:
IdentifierInfo(const IdentifierInfo &) = delete;
@@ -528,6 +533,18 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
RecomputeNeedsHandleIdentifier();
}
+ /// Determine whether this is the contextual keyword \c module.
+ bool isModulesDeclaration() const { return IsModulesDecl; }
+
+ /// Set whether this identifier is the contextual keyword \c module.
+ void setModulesDeclaration(bool I) {
+ IsModulesDecl = I;
+ if (I)
+ NeedsHandleIdentifier = true;
+ else
+ RecomputeNeedsHandleIdentifier();
+ }
+
/// Determine whether this is the mangled name of an OpenMP variant.
bool isMangledOpenMPVariantName() const { return IsMangledOpenMPVariantName; }
@@ -745,10 +762,11 @@ class IdentifierTable {
// contents.
II->Entry = &Entry;
- // If this is the 'import' contextual keyword, mark it as such.
+ // If this is the 'import' or 'module' contextual keyword, mark it as such.
if (Name == "import")
II->setModulesImport(true);
-
+ else if (Name == "module")
+ II->setModulesDeclaration(true);
return *II;
}
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 94e72fea56a68..7750c84dbef78 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -133,6 +133,9 @@ PPKEYWORD(pragma)
// C23 & C++26 #embed
PPKEYWORD(embed)
+// C++20 Module Directive
+PPKEYWORD(module)
+
// GNU Extensions.
PPKEYWORD(import)
PPKEYWORD(include_next)
@@ -1023,6 +1026,9 @@ ANNOTATION(module_include)
ANNOTATION(module_begin)
ANNOTATION(module_end)
+// Annotations for C++, Clang and Objective-C named modules.
+ANNOTATION(module_name)
+
// Annotation for a header_name token that has been looked up and transformed
// into the name of a header unit.
ANNOTATION(header_unit)
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index 0ae490f0e8073..112d3b00160fd 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -863,7 +863,7 @@ class CompilerInstance : public ModuleLoader {
/// load it.
ModuleLoadResult findOrCompileModuleAndReadAST(StringRef ModuleName,
SourceLocation ImportLoc,
- SourceLocation ModuleNameLoc,
+ SourceRange ModuleNameRange,
bool IsInclusionDirective);
/// Creates a \c CompilerInstance for compiling a module.
diff --git a/clang/include/clang/Lex/CodeCompletionHandler.h b/clang/include/clang/Lex/CodeCompletionHandler.h
index bd3e05a36bb33..2ef29743415ae 100644
--- a/clang/include/clang/Lex/CodeCompletionHandler.h
+++ b/clang/include/clang/Lex/CodeCompletionHandler.h
@@ -13,12 +13,15 @@
#ifndef LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
#define LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
+#include "clang/Basic/IdentifierTable.h"
+#include "clang/Basic/SourceLocation.h"
#include "llvm/ADT/StringRef.h"
namespace clang {
class IdentifierInfo;
class MacroInfo;
+using ModuleIdPath = ArrayRef<IdentifierLoc>;
/// Callback handler that receives notifications when performing code
/// completion within the preprocessor.
@@ -70,6 +73,11 @@ class CodeCompletionHandler {
/// file where we expect natural language, e.g., a comment, string, or
/// \#error directive.
virtual void CodeCompleteNaturalLanguage() { }
+
+ /// Callback invoked when performing code completion inside the module name
+ /// part of an import directive.
+ virtual void CodeCompleteModuleImport(SourceLocation ImportLoc,
+ ModuleIdPath Path) {}
};
}
diff --git a/clang/include/clang/Lex/Lexer.h b/clang/include/clang/Lex/Lexer.h
index bb65ae010cffa..a595cda1eaa77 100644
--- a/clang/include/clang/Lex/Lexer.h
+++ b/clang/include/clang/Lex/Lexer.h
@@ -124,7 +124,7 @@ class Lexer : public PreprocessorLexer {
//===--------------------------------------------------------------------===//
// Context that changes as the file is lexed.
// NOTE: any state that mutates when in raw mode must have save/restore code
- // in Lexer::isNextPPTokenLParen.
+ // in Lexer::peekNextPPToken.
// BufferPtr - Current pointer into the buffer. This is the next character
// to be lexed.
@@ -642,10 +642,10 @@ class Lexer : public PreprocessorLexer {
BufferPtr = TokEnd;
}
- /// isNextPPTokenLParen - Return 1 if the next unexpanded token will return a
- /// tok::l_paren token, 0 if it is something else and 2 if there are no more
- /// tokens in the buffer controlled by this lexer.
- unsigned isNextPPTokenLParen();
+ /// peekNextPPToken - Return std::nullopt if there are no more tokens in the
+ /// buffer controlled by this lexer, otherwise return the next unexpanded
+ /// token.
+ std::optional<Token> peekNextPPToken();
//===--------------------------------------------------------------------===//
// Lexer character reading interfaces.
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index f2dfd3a349b8b..79a75a116c418 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -48,6 +48,7 @@
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/Registry.h"
+#include "llvm/Support/TrailingObjects.h"
#include <cassert>
#include <cstddef>
#include <cstdint>
@@ -82,6 +83,7 @@ class PreprocessorLexer;
class PreprocessorOptions;
class ScratchBuffer;
class TargetInfo;
+class ModuleNameLoc;
namespace Builtin {
class Context;
@@ -332,8 +334,9 @@ class Preprocessor {
/// lexed, if any.
SourceLocation ModuleImportLoc;
- /// The import path for named module that we're currently processing.
- SmallVector<IdentifierLoc, 2> NamedModuleImportPath;
+ /// The source location of the \c module contextual keyword we just
+ /// lexed, if any.
+ SourceLocation ModuleDeclLoc;
llvm::DenseMap<FileID, SmallVector<const char *>> CheckPoints;
unsigned CheckPointCounter = 0;
@@ -344,6 +347,21 @@ class Preprocessor {
/// Whether the last token we lexed was an '@'.
bool LastTokenWasAt = false;
+ /// Whether we're importing a standard C++20 named Modules.
+ bool ImportingCXXNamedModules = false;
+
+ /// Whether we're declaring a standard C++20 named Modules.
+ bool DeclaringCXXNamedModules = false;
+
+ struct ExportContextualKeywordInfo {
+ Token ExportTok;
+ bool TokAtPhysicalStartOfLine;
+ };
+
+ /// Whether the last token we lexed was an 'export' keyword.
+ std::optional<ExportContextualKeywordInfo> LastTokenWasExportKeyword =
+ std::nullopt;
+
/// A position within a C++20 import-seq.
class StdCXXImportSeq {
public:
@@ -547,12 +565,7 @@ class Preprocessor {
reset();
}
- void handleIdentifier(IdentifierInfo *Identifier) {
- if (isModuleCandidate() && Identifier)
- Name += Identifier->getName().str();
- else if (!isNamedModule())
- reset();
- }
+ void handleModuleName(ModuleNameLoc *Path);
void handleColon() {
if (isModuleCandidate())
@@ -561,13 +574,6 @@ class Preprocessor {
reset();
}
- void handlePeriod() {
- if (isModuleCandidate())
- Name += ".";
- else if (!isNamedModule())
- reset();
- }
-
void handleSemi() {
if (!Name.empty() && isModuleCandidate()) {
if (State == InterfaceCandidate)
@@ -622,10 +628,6 @@ class Preprocessor {
ModuleDeclSeq ModuleDeclState;
- /// Whether the module import expects an identifier next. Otherwise,
- /// it expects a '.' or ';'.
- bool ModuleImportExpectsIdentifier = false;
-
/// The identifier and source location of the currently-active
/// \#pragma clang arc_cf_code_audited begin.
IdentifierLoc PragmaARCCFCodeAuditedInfo;
@@ -1759,6 +1761,19 @@ class Preprocessor {
/// Lex the parameters for an #embed directive, returns nullopt on error.
std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
bool ForHasEmbed);
+ bool LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
+ SmallVectorImpl<IdentifierLoc> &Path,
+ bool AllowMacroExpansion = true);
+ void HandleCXXImportDirective(Token Import);
+ void HandleCXXModuleDirective(Token Module);
+
+ /// Callback invoked when the lexer sees one of export, import or module token
+ /// at the start of a line.
+ ///
+ /// This consumes the import, module directive, modifies the
+ /// lexer/preprocessor state, and advances the lexer(s) so that the next token
+ /// read is the correct one.
+ bool HandleModuleContextualKeyword(Token &Result, bool TokAtPhysicalStartOfLine);
bool LexAfterModuleImport(Token &Result);
void CollectPpImportSuffix(SmallVectorImpl<Token> &Toks);
@@ -2282,7 +2297,9 @@ class Preprocessor {
/// Determine whether the next preprocessor token to be
/// lexed is a '('. If so, consume the token and return true, if not, this
/// method should have no observable side-effect on the lexed tokens.
- bool isNextPPTokenLParen();
+ bool isNextPPTokenLParen() {
+ return peekNextPPToken().value_or(Token{}).is(tok::l_paren);
+ }
private:
/// Identifiers used for SEH handling in Borland. These are only
@@ -2342,7 +2359,7 @@ class Preprocessor {
///
/// \return The location of the end of the directive (the terminating
/// newline).
- SourceLocation CheckEndOfDirective(const char *DirType,
+ SourceLocation CheckEndOfDirective(StringRef DirType,
bool EnableMacros = false);
/// Read and discard all tokens remaining on the current line until
@@ -2424,11 +2441,12 @@ class Preprocessor {
}
/// If we're importing a standard C++20 Named Modules.
- bool isInImportingCXXNamedModules() const {
- // NamedModuleImportPath will be non-empty only if we're importing
- // Standard C++ named modules.
- return !NamedModuleImportPath.empty() && getLangOpts().CPlusPlusModules &&
- !IsAtImport;
+ bool isImportingCXXNamedModules() const {
+ return getLangOpts().CPlusPlusModules && ImportingCXXNamedModules;
+ }
+
+ bool isDeclaringCXXNamedModules() const {
+ return getLangOpts().CPlusPlusModules && DeclaringCXXNamedModules;
}
/// Allocate a new MacroInfo object with the provided SourceLocation.
@@ -2661,6 +2679,10 @@ class Preprocessor {
void removeCachedMacroExpandedTokensOfLastLexer();
+ /// Peek the next token. If so, return the token, if not, this
+ /// method should have no observable side-effect on the lexed tokens.
+ std::optional<Token> peekNextPPToken();
+
/// After reading "MACRO(", this method is invoked to read all of the formal
/// arguments specified for the macro invocation. Returns null on error.
MacroArgs *ReadMacroCallArgumentList(Token &MacroName, MacroInfo *MI,
@@ -3078,6 +3100,53 @@ struct EmbedAnnotationData {
StringRef FileName;
};
+/// Represents module name annotation data.
+///
+/// module-name:
+/// module-name-qualifier[opt] identifier
+///
+/// partition-name: [C++20]
+/// : module-name-qualifier[opt] identifier
+///
+/// module-name-qualifier
+/// module-name-qualifier[opt] identifier .
+class ModuleNameLoc final
+ : llvm::TrailingObjects<ModuleNameLoc, IdentifierLoc> {
+ friend TrailingObjects;
+ unsigned NumIdentifierLocs;
+
+ unsigned numTrailingObjects(OverloadToken<IdentifierLoc>) const {
+ return getNumIdentifierLocs();
+ }
+
+ ModuleNameLoc(ModuleIdPath Path) : NumIdentifierLocs(Path.size()) {
+ (void)llvm::copy(Path, getTrailingObjects<IdentifierLoc>());
+ }
+
+public:
+ static std::string stringFromModuleIdPath(ModuleIdPath Path);
+ static ModuleNameLoc *Create(Preprocessor &PP, ModuleIdPath Path);
+ static Token CreateAnnotToken(Preprocessor &PP, ModuleIdPath Path);
+ unsigned getNumIdentifierLocs() const { return NumIdentifierLocs; }
+ ModuleIdPath getModuleIdPath() const {
+ return {getTrailingObjects<IdentifierLoc>(), getNumIdentifierLocs()};
+ }
+
+ SourceLocation getBeginLoc() const {
+ return getModuleIdPath().front().getLoc();
+ }
+ SourceLocation getEndLoc() const {
+ auto &Last = getModuleIdPath().back();
+ return Last.getLoc().getLocWithOffset(
+ Last.getIdentifierInfo()->getLength());
+ }
+ SourceRange getRange() const { return {getBeginLoc(), getEndLoc()}; }
+
+ std::string str() const;
+ void print(llvm::raw_ostream &OS) const;
+ void dump() const { print(llvm::errs()); }
+};
+
/// Registry of pragma handlers added by plugins
using PragmaHandlerRegistry = llvm::Registry<PragmaHandler>;
diff --git a/clang/include/clang/Lex/Token.h b/clang/include/clang/Lex/Token.h
index 4f29fb7d11415..8e81207ddf8d7 100644
--- a/clang/include/clang/Lex/Token.h
+++ b/clang/include/clang/Lex/Token.h
@@ -231,6 +231,9 @@ class Token {
PtrData = const_cast<char*>(Ptr);
}
+ template <class T> T getAnnotationValueAs() const {
+ return static_cast<T>(getAnnotationValue());
+ }
void *getAnnotationValue() const {
assert(isAnnotation() && "Used AnnotVal on non-annotation token");
return PtrData;
@@ -289,6 +292,10 @@ class Token {
/// Return the ObjC keyword kind.
tok::ObjCKeywordKind getObjCKeywordID() const;
+ /// Return true if we have an C++20 Modules contextual keyword(export, import
+ /// or module).
+ bool isModuleContextualKeyword(bool AllowExport = true) const;
+
bool isSimpleTypeSpecifier(const LangOptions &LangOpts) const;
/// Return true if this token has trigraphs or escaped newlines in it.
diff --git a/clang/include/clang/Lex/TokenLexer.h b/clang/include/clang/Lex/TokenLexer.h
index 4d229ae610674..777b4e6266c71 100644
--- a/clang/include/clang/Lex/TokenLexer.h
+++ b/clang/include/clang/Lex/TokenLexer.h
@@ -139,10 +139,9 @@ class TokenLexer {
void Init(const Token *TokArray, unsigned NumToks, bool DisableMacroExpansion,
bool OwnsTokens, bool IsReinject);
- /// If the next token lexed will pop this macro off the
- /// expansion stack, return 2. If the next unexpanded token is a '(', return
- /// 1, otherwise return 0.
- unsigned isNextTokenLParen() const;
+ /// If the next token lexed will pop this macro off the expansion stack,
+ /// return std::nullopt, otherwise return the next unexpanded token.
+ std::optional<Token> peekNextPPToken() const;
/// Lex and return a token from this macro stream.
bool Lex(Token &Tok);
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index c4bef4729fd36..a59a99bbac7c6 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -1079,6 +1079,8 @@ class Parser : public CodeCompletionHandler {
unsigned ArgumentIndex) override;
...
[truncated]
|
|
@llvm/pr-subscribers-clang Author: None (yronglin) ChangesImplement P1857R3 Modules Dependency Discovery.
At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff:
Otherwise the token is treated as an identifier. Additionally:
Need add more test Patch is 134.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/107168.diff 43 Files Affected:
diff --git a/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp b/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
index d872020c2d8a3..22a3eb97f938b 100644
--- a/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
+++ b/clang/examples/AnnotateFunctions/AnnotateFunctions.cpp
@@ -65,7 +65,7 @@ class PragmaAnnotateHandler : public PragmaHandler {
Token Tok;
PP.LexUnexpandedToken(Tok);
if (Tok.isNot(tok::eod))
- PP.Diag(Tok, diag::ext_pp_extra_tokens_at_eol) << "pragma";
+ PP.Diag(Tok, diag::ext_pp_extra_tokens_at_eol) << "#pragma";
if (HandledDecl) {
DiagnosticsEngine &D = PP.getDiagnostics();
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index 723f5d48b4f5f..f975a63b369b5 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -466,6 +466,8 @@ def err_pp_embed_device_file : Error<
def ext_pp_extra_tokens_at_eol : ExtWarn<
"extra tokens at end of #%0 directive">, InGroup<ExtraTokens>;
+def ext_pp_extra_tokens_at_module_directive_eol : ExtWarn<
+ "extra tokens at end of '%0' directive">, InGroup<ExtraTokens>;
def ext_pp_comma_expr : Extension<"comma operator in operand of #if">;
def ext_pp_bad_vaargs_use : Extension<
@@ -496,7 +498,7 @@ def warn_cxx98_compat_variadic_macro : Warning<
def ext_named_variadic_macro : Extension<
"named variadic macros are a GNU extension">, InGroup<VariadicMacros>;
def err_embedded_directive : Error<
- "embedding a #%0 directive within macro arguments is not supported">;
+ "embedding a %select{#|C++ }0%1 directive within macro arguments is not supported">;
def ext_embedded_directive : Extension<
"embedding a directive within macro arguments has undefined behavior">,
InGroup<DiagGroup<"embedded-directive">>;
@@ -983,6 +985,18 @@ def warn_module_conflict : Warning<
InGroup<ModuleConflict>;
// C++20 modules
+def err_pp_expected_module_name_or_header_name : Error<
+ "expected module name or header name">;
+def err_pp_expected_semi_after_module_or_import : Error<
+ "'%select{module|import}0' directive must end with a ';' on the same line">;
+def err_module_decl_in_header : Error<
+ "module declaration must not come from an #include directive">;
+def err_pp_cond_span_module_decl : Error<
+ "preprocessor conditionals shall not span a module declaration">;
+def err_pp_module_expected_ident : Error<
+ "expected a module name after '%select{module|import}0'">;
+def err_pp_unsupported_module_partition : Error<
+ "module partitions are only supported for C++20 onwards">;
def err_header_import_semi_in_macro : Error<
"semicolon terminating header import declaration cannot be produced "
"by a macro">;
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 3aa36ad59d0b9..c06e2f090b429 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1760,8 +1760,8 @@ def ext_bit_int : Extension<
} // end of Parse Issue category.
let CategoryName = "Modules Issue" in {
-def err_unexpected_module_decl : Error<
- "module declaration can only appear at the top level">;
+def err_unexpected_module_import_decl : Error<
+ "%select{module|import}0 declaration can only appear at the top level">;
def err_module_expected_ident : Error<
"expected a module name after '%select{module|import}0'">;
def err_attribute_not_module_attr : Error<
@@ -1782,8 +1782,6 @@ def err_module_fragment_exported : Error<
def err_private_module_fragment_expected_semi : Error<
"expected ';' after private module fragment declaration">;
def err_missing_before_module_end : Error<"expected %0 at end of module">;
-def err_unsupported_module_partition : Error<
- "module partitions are only supported for C++20 onwards">;
def err_import_not_allowed_here : Error<
"imports must immediately follow the module declaration">;
def err_partition_import_outside_module : Error<
diff --git a/clang/include/clang/Basic/IdentifierTable.h b/clang/include/clang/Basic/IdentifierTable.h
index 54540193cfcc0..add6c6ac629a1 100644
--- a/clang/include/clang/Basic/IdentifierTable.h
+++ b/clang/include/clang/Basic/IdentifierTable.h
@@ -179,6 +179,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
LLVM_PREFERRED_TYPE(bool)
unsigned IsModulesImport : 1;
+ // True if this is the 'module' contextual keyword.
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned IsModulesDecl : 1;
+
// True if this is a mangled OpenMP variant name.
LLVM_PREFERRED_TYPE(bool)
unsigned IsMangledOpenMPVariantName : 1;
@@ -215,8 +219,9 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
IsCPPOperatorKeyword(false), NeedsHandleIdentifier(false),
IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
- IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
- IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}
+ IsModulesDecl(false), IsMangledOpenMPVariantName(false),
+ IsDeprecatedMacro(false), IsRestrictExpansion(false), IsFinal(false),
+ IsKeywordInCpp(false) {}
public:
IdentifierInfo(const IdentifierInfo &) = delete;
@@ -528,6 +533,18 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
RecomputeNeedsHandleIdentifier();
}
+ /// Determine whether this is the contextual keyword \c module.
+ bool isModulesDeclaration() const { return IsModulesDecl; }
+
+ /// Set whether this identifier is the contextual keyword \c module.
+ void setModulesDeclaration(bool I) {
+ IsModulesDecl = I;
+ if (I)
+ NeedsHandleIdentifier = true;
+ else
+ RecomputeNeedsHandleIdentifier();
+ }
+
/// Determine whether this is the mangled name of an OpenMP variant.
bool isMangledOpenMPVariantName() const { return IsMangledOpenMPVariantName; }
@@ -745,10 +762,11 @@ class IdentifierTable {
// contents.
II->Entry = &Entry;
- // If this is the 'import' contextual keyword, mark it as such.
+ // If this is the 'import' or 'module' contextual keyword, mark it as such.
if (Name == "import")
II->setModulesImport(true);
-
+ else if (Name == "module")
+ II->setModulesDeclaration(true);
return *II;
}
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 94e72fea56a68..7750c84dbef78 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -133,6 +133,9 @@ PPKEYWORD(pragma)
// C23 & C++26 #embed
PPKEYWORD(embed)
+// C++20 Module Directive
+PPKEYWORD(module)
+
// GNU Extensions.
PPKEYWORD(import)
PPKEYWORD(include_next)
@@ -1023,6 +1026,9 @@ ANNOTATION(module_include)
ANNOTATION(module_begin)
ANNOTATION(module_end)
+// Annotations for C++, Clang and Objective-C named modules.
+ANNOTATION(module_name)
+
// Annotation for a header_name token that has been looked up and transformed
// into the name of a header unit.
ANNOTATION(header_unit)
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index 0ae490f0e8073..112d3b00160fd 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -863,7 +863,7 @@ class CompilerInstance : public ModuleLoader {
/// load it.
ModuleLoadResult findOrCompileModuleAndReadAST(StringRef ModuleName,
SourceLocation ImportLoc,
- SourceLocation ModuleNameLoc,
+ SourceRange ModuleNameRange,
bool IsInclusionDirective);
/// Creates a \c CompilerInstance for compiling a module.
diff --git a/clang/include/clang/Lex/CodeCompletionHandler.h b/clang/include/clang/Lex/CodeCompletionHandler.h
index bd3e05a36bb33..2ef29743415ae 100644
--- a/clang/include/clang/Lex/CodeCompletionHandler.h
+++ b/clang/include/clang/Lex/CodeCompletionHandler.h
@@ -13,12 +13,15 @@
#ifndef LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
#define LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
+#include "clang/Basic/IdentifierTable.h"
+#include "clang/Basic/SourceLocation.h"
#include "llvm/ADT/StringRef.h"
namespace clang {
class IdentifierInfo;
class MacroInfo;
+using ModuleIdPath = ArrayRef<IdentifierLoc>;
/// Callback handler that receives notifications when performing code
/// completion within the preprocessor.
@@ -70,6 +73,11 @@ class CodeCompletionHandler {
/// file where we expect natural language, e.g., a comment, string, or
/// \#error directive.
virtual void CodeCompleteNaturalLanguage() { }
+
+ /// Callback invoked when performing code completion inside the module name
+ /// part of an import directive.
+ virtual void CodeCompleteModuleImport(SourceLocation ImportLoc,
+ ModuleIdPath Path) {}
};
}
diff --git a/clang/include/clang/Lex/Lexer.h b/clang/include/clang/Lex/Lexer.h
index bb65ae010cffa..a595cda1eaa77 100644
--- a/clang/include/clang/Lex/Lexer.h
+++ b/clang/include/clang/Lex/Lexer.h
@@ -124,7 +124,7 @@ class Lexer : public PreprocessorLexer {
//===--------------------------------------------------------------------===//
// Context that changes as the file is lexed.
// NOTE: any state that mutates when in raw mode must have save/restore code
- // in Lexer::isNextPPTokenLParen.
+ // in Lexer::peekNextPPToken.
// BufferPtr - Current pointer into the buffer. This is the next character
// to be lexed.
@@ -642,10 +642,10 @@ class Lexer : public PreprocessorLexer {
BufferPtr = TokEnd;
}
- /// isNextPPTokenLParen - Return 1 if the next unexpanded token will return a
- /// tok::l_paren token, 0 if it is something else and 2 if there are no more
- /// tokens in the buffer controlled by this lexer.
- unsigned isNextPPTokenLParen();
+ /// peekNextPPToken - Return std::nullopt if there are no more tokens in the
+ /// buffer controlled by this lexer, otherwise return the next unexpanded
+ /// token.
+ std::optional<Token> peekNextPPToken();
//===--------------------------------------------------------------------===//
// Lexer character reading interfaces.
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index f2dfd3a349b8b..79a75a116c418 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -48,6 +48,7 @@
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/Registry.h"
+#include "llvm/Support/TrailingObjects.h"
#include <cassert>
#include <cstddef>
#include <cstdint>
@@ -82,6 +83,7 @@ class PreprocessorLexer;
class PreprocessorOptions;
class ScratchBuffer;
class TargetInfo;
+class ModuleNameLoc;
namespace Builtin {
class Context;
@@ -332,8 +334,9 @@ class Preprocessor {
/// lexed, if any.
SourceLocation ModuleImportLoc;
- /// The import path for named module that we're currently processing.
- SmallVector<IdentifierLoc, 2> NamedModuleImportPath;
+ /// The source location of the \c module contextual keyword we just
+ /// lexed, if any.
+ SourceLocation ModuleDeclLoc;
llvm::DenseMap<FileID, SmallVector<const char *>> CheckPoints;
unsigned CheckPointCounter = 0;
@@ -344,6 +347,21 @@ class Preprocessor {
/// Whether the last token we lexed was an '@'.
bool LastTokenWasAt = false;
+ /// Whether we're importing a standard C++20 named Modules.
+ bool ImportingCXXNamedModules = false;
+
+ /// Whether we're declaring a standard C++20 named Modules.
+ bool DeclaringCXXNamedModules = false;
+
+ struct ExportContextualKeywordInfo {
+ Token ExportTok;
+ bool TokAtPhysicalStartOfLine;
+ };
+
+ /// Whether the last token we lexed was an 'export' keyword.
+ std::optional<ExportContextualKeywordInfo> LastTokenWasExportKeyword =
+ std::nullopt;
+
/// A position within a C++20 import-seq.
class StdCXXImportSeq {
public:
@@ -547,12 +565,7 @@ class Preprocessor {
reset();
}
- void handleIdentifier(IdentifierInfo *Identifier) {
- if (isModuleCandidate() && Identifier)
- Name += Identifier->getName().str();
- else if (!isNamedModule())
- reset();
- }
+ void handleModuleName(ModuleNameLoc *Path);
void handleColon() {
if (isModuleCandidate())
@@ -561,13 +574,6 @@ class Preprocessor {
reset();
}
- void handlePeriod() {
- if (isModuleCandidate())
- Name += ".";
- else if (!isNamedModule())
- reset();
- }
-
void handleSemi() {
if (!Name.empty() && isModuleCandidate()) {
if (State == InterfaceCandidate)
@@ -622,10 +628,6 @@ class Preprocessor {
ModuleDeclSeq ModuleDeclState;
- /// Whether the module import expects an identifier next. Otherwise,
- /// it expects a '.' or ';'.
- bool ModuleImportExpectsIdentifier = false;
-
/// The identifier and source location of the currently-active
/// \#pragma clang arc_cf_code_audited begin.
IdentifierLoc PragmaARCCFCodeAuditedInfo;
@@ -1759,6 +1761,19 @@ class Preprocessor {
/// Lex the parameters for an #embed directive, returns nullopt on error.
std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
bool ForHasEmbed);
+ bool LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
+ SmallVectorImpl<IdentifierLoc> &Path,
+ bool AllowMacroExpansion = true);
+ void HandleCXXImportDirective(Token Import);
+ void HandleCXXModuleDirective(Token Module);
+
+ /// Callback invoked when the lexer sees one of export, import or module token
+ /// at the start of a line.
+ ///
+ /// This consumes the import, module directive, modifies the
+ /// lexer/preprocessor state, and advances the lexer(s) so that the next token
+ /// read is the correct one.
+ bool HandleModuleContextualKeyword(Token &Result, bool TokAtPhysicalStartOfLine);
bool LexAfterModuleImport(Token &Result);
void CollectPpImportSuffix(SmallVectorImpl<Token> &Toks);
@@ -2282,7 +2297,9 @@ class Preprocessor {
/// Determine whether the next preprocessor token to be
/// lexed is a '('. If so, consume the token and return true, if not, this
/// method should have no observable side-effect on the lexed tokens.
- bool isNextPPTokenLParen();
+ bool isNextPPTokenLParen() {
+ return peekNextPPToken().value_or(Token{}).is(tok::l_paren);
+ }
private:
/// Identifiers used for SEH handling in Borland. These are only
@@ -2342,7 +2359,7 @@ class Preprocessor {
///
/// \return The location of the end of the directive (the terminating
/// newline).
- SourceLocation CheckEndOfDirective(const char *DirType,
+ SourceLocation CheckEndOfDirective(StringRef DirType,
bool EnableMacros = false);
/// Read and discard all tokens remaining on the current line until
@@ -2424,11 +2441,12 @@ class Preprocessor {
}
/// If we're importing a standard C++20 Named Modules.
- bool isInImportingCXXNamedModules() const {
- // NamedModuleImportPath will be non-empty only if we're importing
- // Standard C++ named modules.
- return !NamedModuleImportPath.empty() && getLangOpts().CPlusPlusModules &&
- !IsAtImport;
+ bool isImportingCXXNamedModules() const {
+ return getLangOpts().CPlusPlusModules && ImportingCXXNamedModules;
+ }
+
+ bool isDeclaringCXXNamedModules() const {
+ return getLangOpts().CPlusPlusModules && DeclaringCXXNamedModules;
}
/// Allocate a new MacroInfo object with the provided SourceLocation.
@@ -2661,6 +2679,10 @@ class Preprocessor {
void removeCachedMacroExpandedTokensOfLastLexer();
+ /// Peek the next token. If so, return the token, if not, this
+ /// method should have no observable side-effect on the lexed tokens.
+ std::optional<Token> peekNextPPToken();
+
/// After reading "MACRO(", this method is invoked to read all of the formal
/// arguments specified for the macro invocation. Returns null on error.
MacroArgs *ReadMacroCallArgumentList(Token &MacroName, MacroInfo *MI,
@@ -3078,6 +3100,53 @@ struct EmbedAnnotationData {
StringRef FileName;
};
+/// Represents module name annotation data.
+///
+/// module-name:
+/// module-name-qualifier[opt] identifier
+///
+/// partition-name: [C++20]
+/// : module-name-qualifier[opt] identifier
+///
+/// module-name-qualifier
+/// module-name-qualifier[opt] identifier .
+class ModuleNameLoc final
+ : llvm::TrailingObjects<ModuleNameLoc, IdentifierLoc> {
+ friend TrailingObjects;
+ unsigned NumIdentifierLocs;
+
+ unsigned numTrailingObjects(OverloadToken<IdentifierLoc>) const {
+ return getNumIdentifierLocs();
+ }
+
+ ModuleNameLoc(ModuleIdPath Path) : NumIdentifierLocs(Path.size()) {
+ (void)llvm::copy(Path, getTrailingObjects<IdentifierLoc>());
+ }
+
+public:
+ static std::string stringFromModuleIdPath(ModuleIdPath Path);
+ static ModuleNameLoc *Create(Preprocessor &PP, ModuleIdPath Path);
+ static Token CreateAnnotToken(Preprocessor &PP, ModuleIdPath Path);
+ unsigned getNumIdentifierLocs() const { return NumIdentifierLocs; }
+ ModuleIdPath getModuleIdPath() const {
+ return {getTrailingObjects<IdentifierLoc>(), getNumIdentifierLocs()};
+ }
+
+ SourceLocation getBeginLoc() const {
+ return getModuleIdPath().front().getLoc();
+ }
+ SourceLocation getEndLoc() const {
+ auto &Last = getModuleIdPath().back();
+ return Last.getLoc().getLocWithOffset(
+ Last.getIdentifierInfo()->getLength());
+ }
+ SourceRange getRange() const { return {getBeginLoc(), getEndLoc()}; }
+
+ std::string str() const;
+ void print(llvm::raw_ostream &OS) const;
+ void dump() const { print(llvm::errs()); }
+};
+
/// Registry of pragma handlers added by plugins
using PragmaHandlerRegistry = llvm::Registry<PragmaHandler>;
diff --git a/clang/include/clang/Lex/Token.h b/clang/include/clang/Lex/Token.h
index 4f29fb7d11415..8e81207ddf8d7 100644
--- a/clang/include/clang/Lex/Token.h
+++ b/clang/include/clang/Lex/Token.h
@@ -231,6 +231,9 @@ class Token {
PtrData = const_cast<char*>(Ptr);
}
+ template <class T> T getAnnotationValueAs() const {
+ return static_cast<T>(getAnnotationValue());
+ }
void *getAnnotationValue() const {
assert(isAnnotation() && "Used AnnotVal on non-annotation token");
return PtrData;
@@ -289,6 +292,10 @@ class Token {
/// Return the ObjC keyword kind.
tok::ObjCKeywordKind getObjCKeywordID() const;
+ /// Return true if we have an C++20 Modules contextual keyword(export, import
+ /// or module).
+ bool isModuleContextualKeyword(bool AllowExport = true) const;
+
bool isSimpleTypeSpecifier(const LangOptions &LangOpts) const;
/// Return true if this token has trigraphs or escaped newlines in it.
diff --git a/clang/include/clang/Lex/TokenLexer.h b/clang/include/clang/Lex/TokenLexer.h
index 4d229ae610674..777b4e6266c71 100644
--- a/clang/include/clang/Lex/TokenLexer.h
+++ b/clang/include/clang/Lex/TokenLexer.h
@@ -139,10 +139,9 @@ class TokenLexer {
void Init(const Token *TokArray, unsigned NumToks, bool DisableMacroExpansion,
bool OwnsTokens, bool IsReinject);
- /// If the next token lexed will pop this macro off the
- /// expansion stack, return 2. If the next unexpanded token is a '(', return
- /// 1, otherwise return 0.
- unsigned isNextTokenLParen() const;
+ /// If the next token lexed will pop this macro off the expansion stack,
+ /// return std::nullopt, otherwise return the next unexpanded token.
+ std::optional<Token> peekNextPPToken() const;
/// Lex and return a token from this macro stream.
bool Lex(Token &Tok);
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index c4bef4729fd36..a59a99bbac7c6 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -1079,6 +1079,8 @@ class Parser : public CodeCompletionHandler {
unsigned ArgumentIndex) override;
...
[truncated]
|
Signed-off-by: yronglin <yronglin777@gmail.com>
d300a2b to
04ddbf6
Compare
Yes, I'm working on a fix PR. The '.' has an unexpected #define str(s) # s
#define xstr(s) str(s)
#define INCFILE(n) vers ## n
include xstr(INCFILE(2).h) |
|
Created a new PR(#173052) to fix this. |
|
Not entirely sure if this is the same issue, but we're seeing a crash on our Mac builders with this patch. Error Message: Crash reproducer: https://storage.cloud.google.com/fuchsia-artifacts/builds/8695039956773730673/emutls-ada33d.tar.gz |
|
@ilovepi can you verify of the fix PR fixes your issue? |
It would take me some time to access an appropriate piece of Mac hardware. I can look at launching a custom CI job but that isn't always simple. If the PR fixes the other failure, then I'd say land it and our CI will pick it up relatively soon (sooner than I can get access to such a machine). I'd also hope the linked crash reproducer would fail the same way, even on non-mac hardware. |
|
My Mac bot is also hitting a crash and might be the same issue as reported by @ilovepi. |
…#107168) This PR implement the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3). [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1). [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html). At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff: - After skipping horizontal whitespace are - at the start of a logical line, or - preceded by an export at the start of the logical line. - Are followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. Additionally: - The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include. - The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion. - A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.) - Preprocessor conditionals shall not span a module declaration. After this patch, we handle C++ module-import and module-declaration as a real pp-directive in preprocessor. Additionally, we refactor module name lexing, remove the complex state machine and read full module name during module/import directive handling. Possibly we can introduce a tok::annot_module_name token in the future, avoid duplicatly parsing module name in both preprocessor and parser, but it's makes error recovery much diffcult(eg. import a; import b; in same line). This patch also introduce 2 new keyword `__preprocessed_module` and `__preprocessed_import`. These 2 keyword was generated during `-E` mode. This is useful to avoid confusion with `module` and `import` keyword in preprocessed output: ```cpp export module m; struct import {}; #define EMPTY EMPTY import foo; ``` Fixes llvm#54047 --------- Signed-off-by: yronglin <yronglin777@gmail.com> Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
|
It looks like this change has significant compile time impact (+0.5% at Is that expected? |
|
I managed to get ahold of an arm64 macbook. With this patch, it the build crashes building libunwind with the new compiler. If I revert this patch, the build completes just fine. I think this needs to be reverted, until you can investigate the cause of the failure. Maybe some Apple folks have an idea of whats going wrong. I'll try to see if I can use lldb to point you in the right direction, but our kernel CI is on fire and I'm investigating other potential miscompiles. My CMake invocation was: |
|
Seems like LangOpts is invalid on Mac builds? I don't understand why, but that seems to be the source of the error. |
|
Given the issues with this patch, unless some fixes are coming in soon, I also feel we need to revert this. |
|
Hi, +1 to revert this PR. Could you please take a look about the buildbot failures? https://lab.llvm.org/buildbot/#/builders/94/builds/13727 |
…ry" (#173118) Reverts #107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. see #107168 (comment)
…ncy Discovery" (#173118) Reverts llvm/llvm-project#107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. see llvm/llvm-project#107168 (comment)
|
Hi thanks for report this, do you know how to reproduce this sanitize build in local? |
I suspect the compilation time might increase because we've added more code to Lexer, which could potentially impact the critical path. Thank you for reporting this issue; I will continue to monitor it and try to reduce the compilation time. |
|
I know we don't have any official policy about this: but I'd like to point out that it's the holiday season for many people. Landing such a big patch that could have some teething problems (which I'm not criticizing, that's normal part of development) today, on a Friday, where a lot of folks will not be returning until the new year, isn't ideal. At Apple, and I'd guess in most other organizations, there'll be a skeleton crew trying to keep things ticking along over the break and we'd appreciate some stability while there are few eyes on quality. Just m2c. |
…#107168) This PR implement the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3). [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1). [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html). At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff: - After skipping horizontal whitespace are - at the start of a logical line, or - preceded by an export at the start of the logical line. - Are followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. Additionally: - The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include. - The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion. - A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.) - Preprocessor conditionals shall not span a module declaration. After this patch, we handle C++ module-import and module-declaration as a real pp-directive in preprocessor. Additionally, we refactor module name lexing, remove the complex state machine and read full module name during module/import directive handling. Possibly we can introduce a tok::annot_module_name token in the future, avoid duplicatly parsing module name in both preprocessor and parser, but it's makes error recovery much diffcult(eg. import a; import b; in same line). This patch also introduce 2 new keyword `__preprocessed_module` and `__preprocessed_import`. These 2 keyword was generated during `-E` mode. This is useful to avoid confusion with `module` and `import` keyword in preprocessed output: ```cpp export module m; struct import {}; #define EMPTY EMPTY import foo; ``` Fixes llvm#54047 --------- Signed-off-by: yronglin <yronglin777@gmail.com> Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
…ry" (llvm#173118) Reverts llvm#107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. see llvm#107168 (comment)
…ency Discovery" (#173130) This PR reapply llvm/llvm-project#107168. --------- Signed-off-by: Wang, Yihan <yronglin777@gmail.com> Signed-off-by: yronglin <yronglin777@gmail.com>
…#107168) This PR implement the following papers: [P1857R3 Modules Dependency Discovery](https://wg21.link/p1857r3). [P3034R1 Module Declarations Shouldn’t be Macros](https://wg21.link/P3034R1). [CWG2947](https://cplusplus.github.io/CWG/issues/2947.html). At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff: - After skipping horizontal whitespace are - at the start of a logical line, or - preceded by an export at the start of the logical line. - Are followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. Additionally: - The entire import or module directive (including the closing ;) must be on a single logical line and for module must not come from an #include. - The expansion of macros must not result in an import or module directive introducer that was not there prior to macro expansion. - A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.) - Preprocessor conditionals shall not span a module declaration. After this patch, we handle C++ module-import and module-declaration as a real pp-directive in preprocessor. Additionally, we refactor module name lexing, remove the complex state machine and read full module name during module/import directive handling. Possibly we can introduce a tok::annot_module_name token in the future, avoid duplicatly parsing module name in both preprocessor and parser, but it's makes error recovery much diffcult(eg. import a; import b; in same line). This patch also introduce 2 new keyword `__preprocessed_module` and `__preprocessed_import`. These 2 keyword was generated during `-E` mode. This is useful to avoid confusion with `module` and `import` keyword in preprocessed output: ```cpp export module m; struct import {}; #define EMPTY EMPTY import foo; ``` Fixes llvm#54047 --------- Signed-off-by: yronglin <yronglin777@gmail.com> Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
…ry" (llvm#173118) Reverts llvm#107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. see llvm#107168 (comment)
…ery" (llvm#173130) This PR reapply llvm#107168. --------- Signed-off-by: Wang, Yihan <yronglin777@gmail.com> Signed-off-by: yronglin <yronglin777@gmail.com>
…ery" (#173130) This PR reapply llvm/llvm-project#107168. --------- Signed-off-by: Wang, Yihan <yronglin777@gmail.com> Signed-off-by: yronglin <yronglin777@gmail.com>
…ry" (#173118) Reverts llvm/llvm-project#107168 This patch broke on bots: - https://lab.llvm.org/buildbot/#/builders/190/builds/33105 - https://lab.llvm.org/buildbot/#/builders/94/builds/13727 - https://lab.llvm.org/buildbot/#/builders/169/builds/18192 and on mac-aarch64 builds. see llvm/llvm-project#107168 (comment) (cherry picked from commit 2b8b305)
…ery" (#173130) This PR reapply llvm/llvm-project#107168. --------- Signed-off-by: Wang, Yihan <yronglin777@gmail.com> Signed-off-by: yronglin <yronglin777@gmail.com> (cherry picked from commit 0d1c396)
This PR implement the following papers:
P1857R3 Modules Dependency Discovery.
P3034R1 Module Declarations Shouldn’t be Macros.
CWG2947.
At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff:
Otherwise the token is treated as an identifier.
Additionally:
After this patch, we handle C++ module-import and module-declaration as a real pp-directive in preprocessor. Additionally, we refactor module name lexing, remove the complex state machine and read full module name during module/import directive handling. Possibly we can introduce a tok::annot_module_name token in the future, avoid duplicatly parsing module name in both preprocessor and parser, but it's makes error recovery much diffcult(eg. import a; import b; in same line).
This patch also introduce 2 new keyword
__preprocessed_moduleand__preprocessed_import. These 2 keyword was generated during-Emode. This is useful to avoid confusion withmoduleandimportkeyword in preprocessed output:Fixes #54047