Skip to content

ES|QL - Add parsing, preanalysis and analysis timing information to profile#139540

Merged
carlosdelest merged 24 commits intoelastic:mainfrom
carlosdelest:enhancement/esql-analysis-preanalysis-profiling
Dec 18, 2025
Merged

ES|QL - Add parsing, preanalysis and analysis timing information to profile#139540
carlosdelest merged 24 commits intoelastic:mainfrom
carlosdelest:enhancement/esql-analysis-preanalysis-profiling

Conversation

@carlosdelest
Copy link
Member

@carlosdelest carlosdelest commented Dec 15, 2025

Adds parsing, preanalysis and analysis timing information to the profile output:

{
    "profile": {
        "query": {
            "start_millis": 1765995832363,
            "stop_millis": 1765995832370,
            "took_millis": 7,
            "took_nanos": 7435667
        },
        "planning": {
            "start_millis": 1765995832363,
            "stop_millis": 1765995832366,
            "took_millis": 3,
            "took_nanos": 2865708
        },
        "parsing": {
            "start_millis": 1765995832363,
            "stop_millis": 1765995832364,
            "took_millis": 1,
            "took_nanos": 750125
        },
        "preanalysis": {
            "start_millis": 1765995832364,
            "stop_millis": 1765995832364,
            "took_millis": 0,
            "took_nanos": 31791
        },
        "dependency_resolution": {
            "start_millis": 1765995832364,
            "stop_millis": 1765995832365,
            "took_millis": 1,
            "took_nanos": 1165791
        },
        "analysis": {
            "start_millis": 1765995832365,
            "stop_millis": 1765995832365,
            "took_millis": 0,
            "took_nanos": 403458
        }
}

dependency_resolution include a number of operations related to dependency resolution:

  • Retrieving field caps information
  • Resolve enrich policies
  • Resolve inference IDs

This helps understanding the amount of time ES|QL spends doing non planning operations, including specifically field_caps checks.

@carlosdelest carlosdelest added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.3.0 labels Dec 15, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @carlosdelest, I've created a changelog YAML for you.

@carlosdelest carlosdelest changed the title ES|QL - Add analysis and preanalysis timing information to profile ES|QL - Add parsing, preanalysis and analysis timing information to profile Dec 16, 2025
@carlosdelest carlosdelest marked this pull request as ready for review December 16, 2025 14:49
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@carlosdelest carlosdelest requested review from a team and idegtiarenko December 16, 2025 14:50
assert ThreadPool.assertCurrentThreadPool(ThreadPool.Names.SEARCH);

executionInfo.markBeginPreAnalysis();
PreAnalyzer.PreAnalysis preAnalysis = preAnalyzer.preAnalyze(parsed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to suggest we split preanalysis into preanalysis as above and dependency-resolution (as FC calls) as it becoming a significant dependency for us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

private transient TimeSpan.Builder preAnalysisTimeSpanBuilder; // Builder for preAnalysisTimeSpan
private transient TimeSpan analysisTimeSpan; // time elapsed for plan analysis
private transient TimeSpan.Builder analysisTimeSpanBuilder; // Builder for analysisTimeSpan
private TimeValue overallTook;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EsqlExecutionInfo has grown to almost 800 lines and has multiple responsibilities now (such as ccs info tracking, profiling, status reportin, isPartial, isStopped). Currently we are thinking about a ways to manage that.

I wonder if that would make sense to group all transient timers and define them in a new private final transient PlanningProfile field? This would allow to move some of this class complexity elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - I've created a PlanningProfile class and moved planning related profiling there.


public TimeSpan parsingTimeSpan() {
return parsingTimeSpan;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the markBegin** and markEnd** do the same things: verify the state and perform the action.
I wonder if we could introduce a component with Builder and Span that verifies those state transition.
I imagine something like:

class TimeSpanRecorder {
    private TimeSpan.Builder builder;
    private TimeSpan span;

    public void start() {/*assert  is called first time and start a builder*/}
    public void stop() {/*assert is called first time and create a span*/}
    public TimeSpan get() {..}   
}

All usages could be simplified to:

executionInfo.planningProfile().analysis().start();
...
executionInfo.planningProfile().analysis().stop();

parsingMarker = new TimeSpanMarker(PARSING, false, parsing);
preAnalysisMarker = new TimeSpanMarker(PRE_ANALYSIS, false, preAnalysis);
dependencyResolutionMarker = new TimeSpanMarker(DEPENDENCY_RESOLUTION, true, dependencyResolution);
analysisMarker = new TimeSpanMarker(ANALYSIS, true, analysis);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe dependencyResolutionMarker and analysisMarker are having allowMultipleCalls=true because of the possible analysis retry. I wonder if in this case we should sum the the duration instead of discarding the first attempt? The query is getting longer because of extra attempts.

Possibly we want to record the fact of retrying and total FieldCaps call count as well, but I believe this is out of scope of this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe dependencyResolutionMarker and analysisMarker are having allowMultipleCalls=true because of the possible analysis retry.

Correct

I wonder if in this case we should sum the the duration instead of discarding the first attempt?

We're doing that - in case start() is invoked again when multiple invocations are allowed, we keep the original starting time.

We could count etc but let's get this change in and we can refine later if needed - at the very least we will get a longer time in dependency resolution / analysis that will hint about having multiple retriees.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing that - in case start() is invoked again when multiple invocations are allowed, we keep the original starting time.

I am not sure this is correct:

maker1.start();
maker1.stop();
...
maker2.start();
maker2.stop();
...
maker1.start();
maker1.stop();

This way the maker1 would keep the entire duration from first start to the last stop call, not the combined duration of first and second intervals.

This PR is a big improvement on its own, I do not mind changing repeated start/stop behavior later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point - I was thinking on total duration, not in summing up the individual steps.

Yep, let's refine that behavior later 👍

Copy link
Contributor

@idegtiarenko idegtiarenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one suggestion regarding summing duration in case of retries.

Overall, looks good, thanks for the contribution!

…ysis-preanalysis-profiling

# Conflicts:
#	server/src/main/resources/transport/upper_bounds/9.3.csv
@carlosdelest carlosdelest merged commit 23cb7c4 into elastic:main Dec 18, 2025
35 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Dec 19, 2025
* upstream/main: (253 commits)
  Adds ST_SIMPLIFY geo spatial function (elastic#136309)
  Take control of max clause count verification in Lucene searcher (elastic#139752)
  [ML] Unmute Inference Test (elastic#139765)
  Parameterize the vector operation benchmark tests (elastic#139735)
  Fix node reduction pushdown tests for release tests (elastic#139548)
  Fix flakiness in TSDataGenerationHelper (elastic#139759)
  CPS: Copy existing resolved index expressions when constructing a new `SearchRequest` from an existing one (elastic#139596)
  Add release notes for v9.1.9 release (elastic#139674)
  Add lucene query for wildcards on high cardinality keyword fields. (elastic#139746)
  Suppress Tika entitlement warnings from AWT (elastic#139711)
  Check field storage when synthetic source is enabled, in tests (elastic#139715)
  Refactor VectorSimilarityType to know about its corresponding Function (elastic#139678)
  Merge fixes from patch branch back into main (elastic#139721)
  Define native bulk operations for vector square distance (elastic#139198)
  Use LongUpDownCounter for Linked Project Error Metrics (elastic#139657)
  ESQL: Add javadoc that explains version-aware planning (elastic#139706)
  Add helper to pick node for reindex relocation (elastic#139081)
  Fix auth serialization randomized version test (elastic#139182)
  ES|QL - Add parsing, preanalysis and analysis timing information to profile (elastic#139540)
  Mute org.elasticsearch.persistent.ClusterPersistentTasksCustomMetadataTests testMinVersionSerialization elastic#139741
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants