
Investigate string allocations and array generation across gherkin flavours #351

Open
2 of 22 tasks
luke-hill opened this issue Jan 7, 2025 · 1 comment
Labels
🏦 debt Tech debt

Comments

@luke-hill
Contributor

luke-hill commented Jan 7, 2025

🤔 What's the problem you've observed?

Some work was done in .NET here and here. This work should be mimicked (or, at a minimum, each language should be qualified and checked off after confirming that an equivalent change is not applicable).

✨ Do you have a proposal for making it better?

Check equivalent work to PR #336 -> Improved parsing time

  • Java
  • JavaScript
  • Ruby
  • Go
  • Python
  • C
  • Objective-C
  • Perl
  • PHP
  • Dart
  • C++

Check equivalent work to PR #344 -> Avoid allocation and improve parsing time

  • Java
  • JavaScript
  • Ruby
  • Go
  • Python
  • C
  • Objective-C
  • Perl
  • PHP
  • Dart
  • C++

📚 Any additional context?

This may not be relevant for some languages.

@luke-hill luke-hill added the 🏦 debt Tech debt label Jan 7, 2025
@jkronegg
Contributor

Java

Reading very_long.feature on OpenJDK 21 with a JMH micro-benchmark gives the following result (the parser receives the file content as a String):

Benchmark                  Mode  Cnt     Score     Error  Units
MyClassBenchmark.original  avgt   25  6601.674 ± 433.332  us/op

When reading very_long.feature 1000 times, IntelliJ's profiler gives the following flame graph:

[Image: flame graph of 1000 parses of very_long.feature]

Most of the time is spent on String trimming (about 50% of the total duration). I already worked on this in the past (#84), but there is still room for improvement (I have some ideas on how to improve it 😉). Otherwise there is no noticeable performance hot spot.

On one of my real-life projects, with about 100 rules and 1000 test scenarios, Parser.parse() takes 340 ms (of which about 100 ms is String trimming, i.e. only about 30% of the parsing duration).

I'll create an issue on that point.
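To illustrate the kind of improvement meant here, below is a minimal sketch of an allocation-avoiding trim (the class and method names are illustrative, not from the Gherkin codebase): when a line has no surrounding whitespace, which is common in feature files, the original String instance is returned and no new object is allocated.

```java
public class TrimSketch {
    // Trim leading/trailing whitespace without allocating when unnecessary.
    static String fastTrim(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && Character.isWhitespace(s.charAt(start))) start++;
        while (end > start && Character.isWhitespace(s.charAt(end - 1))) end--;
        // Return the original instance when nothing needs trimming: no allocation.
        return (start == 0 && end == s.length()) ? s : s.substring(start, end);
    }

    public static void main(String[] args) {
        String already = "Given a step";
        // Same reference: no new String was created.
        System.out.println(TrimSketch.fastTrim(already) == already); // true
        System.out.println(TrimSketch.fastTrim("  padded  "));       // padded
    }
}
```

Whether this helps in practice depends on how often lines actually carry surrounding whitespace; the profiler output above suggests the trim path is hot enough to be worth measuring.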

JMH benchmark code:

import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.Test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;

public class MyClassBenchmark {
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    public GherkinDocument original(MyClassPlan plan) {
        return plan.parser.parse(plan.featureContent, plan.matcher, "very_long.feature");
    }

    // Not a JMH benchmark: a plain JUnit entry point used to drive the profiler.
    @Test
    void test_for_profiler() {
        MyClassPlan plan = new MyClassPlan();
        for (int i = 0; i < 1000; i++) {
            plan.parser.parse(plan.featureContent, plan.matcher, "very_long.feature");
        }
    }
}

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Gherkin parser classes come from the io.cucumber:gherkin artifact;
// their imports are omitted for brevity.
@State(Scope.Benchmark)
public class MyClassPlan {
    TokenMatcher matcher = new TokenMatcher("en");
    IdGenerator idGenerator = new IncrementingIdGenerator();
    Path path = Paths.get("../testdata/good/very_long.feature");
    Parser<GherkinDocument> parser = new Parser<>(new GherkinDocumentBuilder(idGenerator, path.toString()));
    String featureContent;
    {
        try {
            featureContent = new String(Files.readAllBytes(path));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
