Description
👓 What did you see?
On my project with about 150 stepdefs and about 400 test scenarios, the IntelliJ profiler says the `CucumberExpression.<init>` method takes 25.9% of the total CPU time. This is because the method is called for all step defs and for all test scenarios. I think the performance could be better.
✅ What did you expect to see?
I expect `CucumberExpression.<init>` to avoid unnecessary processing (contributes to #2035).
I understand that `cucumber-java8` can introduce dynamic behavior which requires parsing the expressions for each test scenario. However, I think we can safely cache everything that is constant and does not depend on `cucumber-java8`. I identified the following performance improvement points in `CucumberExpression`:
- `TreeRegexp` creation: in the `CucumberExpression` constructor, this object serves to get some "metadata" about the regular expression itself (i.e. not depending on context). Thus, two identical regular expressions will lead to the same `TreeRegexp`, so the creation is cacheable. The original code:
  `this.treeRegexp = new TreeRegexp(pattern);`
  could be replaced by (`treeRegexps` is a static `Map<String, TreeRegexp>`):
  `this.treeRegexp = treeRegexps.computeIfAbsent(pattern, TreeRegexp::new);`
- Calls to `escapeRegex` in the `rewriteToRegex` method are done on the `Node.text()` content: two identical `Node.text()` values will lead to the same escaped result, independently of the context. Thus, the result of `escapeRegex` is cacheable. The original code:
  `return escapeRegex(node.text());`
  can be replaced by (`escapedTexts` is a static `Map<String, String>`):
  `return escapedTexts.computeIfAbsent(node.text(), CucumberExpression::escapeRegex);`
These two optimization points lead to four combinations to be benchmarked (the original version is `createExpression0`). The benchmark consists of creating five different expressions 400 times:
Benchmark | cached calls to escapeRegex | cached TreeRegexp creation | ops/s |
---|---|---|---|
CucumberExpressionBenchmark.createExpression0 | no | no | 153,024 ± 13,800 |
CucumberExpressionBenchmark.createExpression1 | yes | no | 181,960 ± 12,133 |
CucumberExpressionBenchmark.createExpression2 | no | yes | 186,236 ± 11,232 |
CucumberExpressionBenchmark.createExpression3 | yes | yes | 219,890 ± 12,365 |
Caching the `TreeRegexp` creation leads to a 22% performance improvement, and using both methods leads to a 44% performance improvement.
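As a quick check, the percentages follow from the ops/s column of the table, taking `createExpression0` as the baseline (mean values only; the error margins are ignored here):

```latex
\begin{aligned}
\text{cached escapeRegex only:} \quad & \frac{181{,}960}{153{,}024} \approx 1.19 \quad (+19\%) \\
\text{cached TreeRegexp only:} \quad & \frac{186{,}236}{153{,}024} \approx 1.22 \quad (+22\%) \\
\text{both caches:} \quad & \frac{219{,}890}{153{,}024} \approx 1.44 \quad (+44\%)
\end{aligned}
```

Note that the reported margins for `createExpression1` and `createExpression2` overlap, so the two single-cache variants cannot be ranked against each other from these numbers alone.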
On a real project with about 150 stepdefs and 400 test scenarios, an IntelliJ Profiler run takes about 7700 ms and says that `CucumberExpression.<init>` is:
- 25.9% of the total CPU time with the original version (1994 ms)
- 15.7% of the total CPU time with both optimizations enabled (1209 ms, i.e. a 785 ms improvement on total time, or about 10%)
I suggest using the `createExpression3` variant, and I would be happy to propose a PR.
📦 Which tool/library version are you using?
Cucumber 7.10.1
🔬 How could we reproduce it?
The benchmark with the four variants is in cucumberexpressions.zip
Steps to reproduce the behavior:
- Create a Maven project with the following dependencies:

      <dependency>
        <groupId>io.cucumber</groupId>
        <artifactId>cucumber-java</artifactId>
        <version>${cucumber.version}</version>
        <scope>test</scope>
      </dependency>
      <dependency>
        <groupId>io.cucumber</groupId>
        <artifactId>cucumber-junit-platform-engine</artifactId>
        <version>${cucumber.version}</version>
        <scope>test</scope>
      </dependency>
      <dependency>
        <groupId>io.cucumber</groupId>
        <artifactId>cucumber-picocontainer</artifactId>
        <version>${cucumber.version}</version>
        <scope>test</scope>
      </dependency>
      <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>1.36</version>
        <scope>test</scope>
      </dependency>

- Run the benchmark