[clang] Optimize Lexer hot path to reduce compile time#177153
@llvm/pr-subscribers-clang
Author: None (yronglin)
Changes
This patch fixes the compile-time regression introduced in #173789. It addresses the regression by adding an early-exit check to the hot Lexer path.
Full diff: https://github.com/llvm/llvm-project/pull/177153.diff
2 Files Affected:
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 5adc45a19ca79..f286e0d8bb348 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -1871,6 +1871,18 @@ class Preprocessor {
/// read is the correct one.
bool HandleModuleContextualKeyword(Token &Result,
bool TokAtPhysicalStartOfLine);
+ /// Quick check whether current token at physical start of line or previous
+ /// export tok was at physical start of line. This is used as an early-exit
+ /// optimization in the hot Lexer::Lex path.
+ //
+ // Returns true if the current token could potentially be a module directive
+ // introducer.
+ bool isModuleDirectiveIntroducerAtPhysicalStartOfLine(
+ bool TokAtPhysicalStartOfLine) {
+ return TokAtPhysicalStartOfLine ||
+ (LastTokenWasExportKeyword.isValid() &&
+ LastTokenWasExportKeyword.isAtPhysicalStartOfLine());
+ }
/// Get the start location of the first pp-token in main file.
SourceLocation getMainFileFirstPPTokenLoc() const {
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 2c4ba70551fab..c10ca8925586e 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -4058,10 +4058,14 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
// so it's safe to access member variables after this call returns.
bool returnedToken = LexIdentifierContinue(Result, CurPtr);
- if (returnedToken && !LexingRawMode && !Is_PragmaLexer &&
- !ParsingPreprocessorDirective && LangOpts.CPlusPlusModules &&
- Result.isModuleContextualKeyword() &&
- PP->HandleModuleContextualKeyword(Result, TokAtPhysicalStartOfLine))
+ if (LLVM_UNLIKELY(returnedToken && !LexingRawMode && !Is_PragmaLexer &&
+ !ParsingPreprocessorDirective &&
+ LangOpts.CPlusPlusModules &&
+ PP->isModuleDirectiveIntroducerAtPhysicalStartOfLine(
+ TokAtPhysicalStartOfLine) &&
+ Result.isModuleContextualKeyword() &&
+ PP->HandleModuleContextualKeyword(
+ Result, TokAtPhysicalStartOfLine)))
goto HandleDirective;
return returnedToken;
}
@@ -4637,8 +4641,12 @@ bool Lexer::LexDependencyDirectiveToken(Token &Result) {
Result.setRawIdentifierData(TokPtr);
if (!isLexingRawMode()) {
const IdentifierInfo *II = PP->LookUpIdentifierInfo(Result);
- if (LangOpts.CPlusPlusModules && Result.isModuleContextualKeyword() &&
- PP->HandleModuleContextualKeyword(Result, Result.isAtStartOfLine())) {
+ if (LLVM_UNLIKELY(LangOpts.CPlusPlusModules &&
+ PP->isModuleDirectiveIntroducerAtPhysicalStartOfLine(
+ Result.isAtStartOfLine()) &&
+ Result.isModuleContextualKeyword() &&
+ PP->HandleModuleContextualKeyword(
+ Result, Result.isAtStartOfLine()))) {
PP->HandleDirective(Result);
return false;
}
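The ordering idea in the diff above (a cheap position test placed ahead of the costlier keyword handling inside one short-circuit condition) can be sketched in isolation. This is a minimal toy, not Clang code: `SKETCH_UNLIKELY`, `ToyToken`, `ToyPreprocessor`, and `countDirectives` are all hypothetical names, and the "expensive" handler is only a counter-instrumented stand-in.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a branch-prediction hint; not LLVM's macro.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(static_cast<bool>(x), false)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Toy token: just its spelling and whether it starts a physical line.
struct ToyToken {
  std::string Text;
  bool AtPhysicalStartOfLine;
};

struct ToyPreprocessor {
  bool PrevExportAtLineStart = false; // toy analogue of "previous export" state
  int ExpensiveCalls = 0;             // counts how often the slow path runs

  // Cheap gate: two boolean reads, no lookups.
  bool couldIntroduceModuleDirective(const ToyToken &Tok) const {
    return Tok.AtPhysicalStartOfLine || PrevExportAtLineStart;
  }

  // Stand-in for the costlier keyword handling.
  bool handleModuleKeyword(const ToyToken &Tok) {
    ++ExpensiveCalls;
    return Tok.Text == "import" || Tok.Text == "module" ||
           Tok.Text == "export";
  }
};

// Counts tokens treated as directive introducers. Because && short-circuits,
// the slow handler runs only for tokens that pass the cheap gate.
int countDirectives(ToyPreprocessor &PP, const std::vector<ToyToken> &Toks) {
  int Directives = 0;
  for (const ToyToken &Tok : Toks) {
    if (SKETCH_UNLIKELY(PP.couldIntroduceModuleDirective(Tok) &&
                        PP.handleModuleKeyword(Tok)))
      ++Directives;
    PP.PrevExportAtLineStart =
        Tok.Text == "export" && Tok.AtPhysicalStartOfLine;
  }
  return Directives;
}
```

In this toy, a token such as `module` appearing mid-line never reaches the slow handler, which is the effect the patch aims for on ordinary identifiers.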
It doesn't seem to make any difference: https://llvm-compile-time-tracker.com/compare.php?from=b9760dc6fbd65a29fb6773539f19b7c964da3e85&to=b577b0ceeb7e301c3e926f4d1976cf13af4bbcdc&stat=instructions:u
Many thanks! Let me try another approach.
…ontextual keyword in HandleIdentifier Signed-off-by: yronglin <yronglin777@gmail.com>
528d0a6 to 0e6ee2c
✅ With the latest revision this PR passed the C/C++ code formatter.
@nikic Could you help verify whether this PR reduces compile time? Many thanks!
This doesn't recover the original regression, but is an improvement.
cor3ntin left a comment
Independently of the small performance improvement, I think the code is much cleaner that way. Thanks!
Signed-off-by: yronglin <yronglin777@gmail.com>
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed.
Thanks for the review! I'll land the patch first and look for further optimizations in future patches.
This patch fixes the compile-time regression introduced in llvm#173789.
- Introduce a `TokenFlag::PhysicalStartOfLine` flag to replace `IsAtPhysicalStartOfLine` in a bunch of `Lexer` member functions, and remove the `ExportContextualKeywordInfo` struct.
- Handle the `import`, `module` and `export` keywords in `HandleIdentifier` instead of in a `Lexer` hot path.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
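The first change in the landed patch, storing "at physical start of line" as a token flag instead of threading a `bool` through lexer functions, can be illustrated with a small sketch. Everything here is assumed for illustration: `ToyTokenFlag`, its enumerator values, `FlaggedToken`, and the two `isCandidate*` helpers are hypothetical and do not reproduce Clang's `Token` class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration; not Clang's actual enumerators.
enum ToyTokenFlag : std::uint16_t {
  ToyStartOfLine = 0x01,
  ToyPhysicalStartOfLine = 0x02,
};

// Toy token that carries its own position state as a bitmask.
class FlaggedToken {
  std::uint16_t Flags = 0;

public:
  void setFlag(ToyTokenFlag F) { Flags |= F; }
  void clearFlag(ToyTokenFlag F) {
    Flags &= static_cast<std::uint16_t>(~F);
  }
  bool hasFlag(ToyTokenFlag F) const { return (Flags & F) != 0; }
};

// Before: every caller must pass the position along as an extra argument.
bool isCandidateOld(const FlaggedToken &, bool AtPhysicalStartOfLine) {
  return AtPhysicalStartOfLine;
}

// After: the token answers the question itself, so later stages can check
// it without the lexer forwarding a separate parameter.
bool isCandidateNew(const FlaggedToken &Tok) {
  return Tok.hasFlag(ToyPhysicalStartOfLine);
}
```

One plausible benefit of this shape is that a later stage, such as identifier handling, can consult the flag on the token it already has, which fits the second bullet's move of keyword handling out of the lexer's hot path.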