@vitalybuka vitalybuka commented Oct 22, 2025

This commit optimizes SpecialCaseList by using a RadixTree to filter
glob patterns based on their prefixes. When matching a query, the
RadixTree quickly identifies all glob patterns whose prefixes match
the query's prefix. This significantly reduces the number of glob
patterns that need to be fully evaluated, leading to performance
improvements, especially when dealing with a large number of patterns.
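
The filtering idea can be sketched as follows. This is a minimal illustration, not the LLVM implementation: it assumes a plain character trie instead of a compressed radix tree, a toy matcher that understands only `*` and `?`, and hypothetical names (`ToyGlob`, `PrefixIndex`, `candidates`, `matchingLines`) that do not appear in the patch.

```cpp
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Toy matcher (assumption): supports only '*' (any run) and '?' (one char).
inline bool globMatch(std::string_view P, std::string_view S) {
  if (P.empty())
    return S.empty();
  if (P[0] == '*')
    return globMatch(P.substr(1), S) ||
           (!S.empty() && globMatch(P, S.substr(1)));
  return !S.empty() && (P[0] == '?' || P[0] == S[0]) &&
         globMatch(P.substr(1), S.substr(1));
}

// Hypothetical stand-in for a rule: a pattern plus its line number.
struct ToyGlob {
  std::string Pattern;
  unsigned LineNo;

  // Literal text before the first metacharacter (whole pattern if none).
  std::string_view prefix() const {
    std::string_view P(Pattern);
    return P.substr(0, P.find_first_of("*?"));
  }
};

// Plain trie keyed by the literal prefix of each glob.
class PrefixIndex {
  struct Node {
    std::map<char, std::unique_ptr<Node>> Children;
    std::vector<const ToyGlob *> Globs; // globs whose prefix ends here
  };
  Node Root;

public:
  // The caller must keep G alive and at a stable address.
  void insert(const ToyGlob &G) {
    Node *N = &Root;
    for (char C : G.prefix()) {
      std::unique_ptr<Node> &Child = N->Children[C];
      if (!Child)
        Child = std::make_unique<Node>();
      N = Child.get();
    }
    N->Globs.push_back(&G);
  }

  // Returns every glob whose literal prefix is a prefix of Query.
  std::vector<const ToyGlob *> candidates(std::string_view Query) const {
    const Node *N = &Root;
    std::vector<const ToyGlob *> Out(N->Globs.begin(), N->Globs.end());
    for (char C : Query) {
      auto It = N->Children.find(C);
      if (It == N->Children.end())
        break;
      N = It->second.get();
      Out.insert(Out.end(), N->Globs.begin(), N->Globs.end());
    }
    return Out;
  }
};

// Runs the full matcher only on the candidates selected by prefix.
inline std::vector<unsigned> matchingLines(const PrefixIndex &Index,
                                           std::string_view Query) {
  std::vector<unsigned> Lines;
  for (const ToyGlob *G : Index.candidates(Query))
    if (globMatch(G->Pattern, Query))
      Lines.push_back(G->LineNo);
  return Lines;
}
```

With the patterns `src/*`, `src/lib/*.cpp`, `test/*` and `*.h`, a query for `src/lib/a.cpp` walks the trie once and never evaluates `test/*`. Patterns with an empty literal prefix, such as `*.h`, sit at the root and are always candidates.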

According to SpecialCaseListBM:

Lookup benchmarks (significant improvements):

OVERALL_GEOMEAN                       -0.8177

Lookup like prefix* benchmarks (huge improvements):

OVERALL_GEOMEAN                       -0.9819

https://gist.github.com/vitalybuka/824884bcbc1713e815068c279159dafe
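
To read the two geomean figures as speedups, assume they are relative time differences of the form (new - old) / old. That is the convention of Google Benchmark's compare.py tool, but the description does not say which tool produced them, so treat this as an assumption. Under that reading:

```math
\Delta = \frac{t_{\text{new}} - t_{\text{old}}}{t_{\text{old}}}
\quad\Longrightarrow\quad
\text{speedup} = \frac{t_{\text{old}}}{t_{\text{new}}} = \frac{1}{1 + \Delta}

\Delta = -0.8177:\quad \frac{1}{1 - 0.8177} = \frac{1}{0.1823} \approx 5.5

\Delta = -0.9819:\quad \frac{1}{1 - 0.9819} = \frac{1}{0.0181} \approx 55
```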

Created using spr 1.3.7
llvmbot commented Oct 22, 2025

@llvm/pr-subscribers-llvm-support

Author: Vitaly Buka (vitalybuka)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/164531.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Support/SpecialCaseList.h (+7)
  • (modified) llvm/lib/Support/SpecialCaseList.cpp (+17-3)
diff --git a/llvm/include/llvm/Support/SpecialCaseList.h b/llvm/include/llvm/Support/SpecialCaseList.h
index ead765562504d..16f309329a0b5 100644
--- a/llvm/include/llvm/Support/SpecialCaseList.h
+++ b/llvm/include/llvm/Support/SpecialCaseList.h
@@ -13,10 +13,13 @@
 #define LLVM_SUPPORT_SPECIALCASELIST_H
 
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringMap.h"
+#include "llvm/ADT/iterator_range.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/GlobPattern.h"
+#include "llvm/Support/RadixTree.h"
 #include "llvm/Support/Regex.h"
 #include <memory>
 #include <string>
@@ -162,6 +165,10 @@ class SpecialCaseList {
     };
 
     std::vector<GlobMatcher::Glob> Globs;
+
+    RadixTree<iterator_range<StringRef::const_iterator>,
+              SmallVector<const GlobMatcher::Glob *, 1>>
+        PrefixToGlob;
   };
 
   /// Represents a set of patterns and their line numbers
diff --git a/llvm/lib/Support/SpecialCaseList.cpp b/llvm/lib/Support/SpecialCaseList.cpp
index f74e52a3a7fa9..2a86cc37b6000 100644
--- a/llvm/lib/Support/SpecialCaseList.cpp
+++ b/llvm/lib/Support/SpecialCaseList.cpp
@@ -89,14 +89,28 @@ void SpecialCaseList::GlobMatcher::preprocess(bool BySize) {
       return A.Name.size() < B.Name.size();
     });
   }
+
+  for (auto &G : Globs) {
+    StringRef Prefix = G.Pattern.prefix();
+
+    auto &V = PrefixToGlob.emplace(Prefix).first->second;
+    V.emplace_back(&G);
+  }
 }
 
 void SpecialCaseList::GlobMatcher::match(
     StringRef Query,
     llvm::function_ref<void(StringRef Rule, unsigned LineNo)> Cb) const {
-  for (const auto &G : reverse(Globs))
-    if (G.Pattern.match(Query))
-      return Cb(G.Name, G.LineNo);
+  if (!PrefixToGlob.empty()) {
+    for (const auto &[_, V] : PrefixToGlob.find_prefixes(Query)) {
+      for (const auto *G : reverse(V)) {
+        if (G->Pattern.match(Query)) {
+          Cb(G->Name, G->LineNo);
+          break;
+        }
+      }
+    }
+  }
 }
 
 SpecialCaseList::Matcher::Matcher(bool UseGlobs, bool RemoveDotSlash)

-      return Cb(G.Name, G.LineNo);
+  if (!PrefixToGlob.empty()) {
+    for (const auto &[_, V] : PrefixToGlob.find_prefixes(Query)) {
+      for (const auto *G : reverse(V)) {

please add a comment to explain the reverse
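
For context on this request: in the code being replaced, the loop walks `Globs` in reverse and returns on the first hit, so the last matching entry in the vector is the one reported. The sketch below illustrates that "last match wins" pattern in isolation. It assumes rules are stored in insertion order, and the names `Rule` and `lastMatchingLine` are hypothetical, not taken from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical rule record: a name and the line it was declared on.
struct Rule {
  std::string Name;
  unsigned LineNo;
};

// Scans from the back so that, among all rules accepted by Matches,
// the one stored last is reported. Returns 0 when nothing matches.
template <typename Pred>
unsigned lastMatchingLine(const std::vector<Rule> &Rules, Pred Matches) {
  for (auto It = Rules.rbegin(); It != Rules.rend(); ++It)
    if (Matches(*It))
      return It->LineNo;
  return 0;
}
```

A comment of the kind requested could state this precedence rule directly above the `reverse(V)` loop.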

vitalybuka added a commit that referenced this pull request Oct 25, 2025
This commit introduces a RadixTree implementation to LLVM.

A RadixTree, being a trie, is very efficient at searching for prefixes.

A radix tree is a more efficient implementation of a trie.

The tree will be used to optimize Glob matching in SpecialCaseList:
* #164531 
* #164543 
* #164545

---------

Co-authored-by: Kazu Hirata <[email protected]>
Co-authored-by: Copilot <[email protected]>
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 25, 2025
This commit introduces a RadixTree implementation to LLVM.

A RadixTree, being a trie, is very efficient at searching for prefixes.

A radix tree is a more efficient implementation of a trie.

The tree will be used to optimize Glob matching in SpecialCaseList:
* llvm/llvm-project#164531
* llvm/llvm-project#164543
* llvm/llvm-project#164545

---------

Co-authored-by: Kazu Hirata <[email protected]>
Co-authored-by: Copilot <[email protected]>