perf(native): Avoid `LIKE` rewrites for prefix/suffix patterns in native execution by pramodsatya · Pull Request #27363 · prestodb/presto

pramodsatya · 2026-03-18T04:58:30Z

Description

This change conditionally disables Presto's coordinator-side LIKE pattern rewrites when native-execution-enabled=true. Velox's OptimizedLike implementation provides superior performance for simple LIKE patterns compared to Presto's rewrites that decompose LIKE into SUBSTR/STRPOS calls.

Motivation and Context

Presto currently optimizes SQL LIKE patterns at the coordinator level:

'foo%' → SUBSTR(x, 1, len) = 'foo'
'%foo' → SUBSTR(x, -len) = 'foo'
'%foo%' → STRPOS(x, 'foo') != 0
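
The three rewrites can be checked against a reference LIKE matcher. The sketch below is illustrative Python, not Presto code: `like`, `substr`, and `strpos` are hypothetical stand-ins that mimic the 1-based, codepoint-indexed behavior of the SQL functions, and the literal is assumed to be non-empty with no `_` wildcard or escape character.

```python
import re
from typing import Optional


def like(value: str, pattern: str) -> bool:
    """Reference LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def substr(value: str, start: int, length: Optional[int] = None) -> str:
    """SQL-style SUBSTR: 1-based start; a negative start counts from the end."""
    n = len(value)
    if start == 0 or abs(start) > n:
        return ""  # out-of-range start is modeled as an empty result
    begin = start - 1 if start > 0 else n + start
    end = n if length is None else min(n, begin + length)
    return value[begin:end]


def strpos(value: str, needle: str) -> int:
    """SQL-style STRPOS: 1-based index of the first match, 0 if absent."""
    return value.find(needle) + 1


def prefix_rewrite(value: str, literal: str) -> bool:    # 'foo%'
    return substr(value, 1, len(literal)) == literal


def suffix_rewrite(value: str, literal: str) -> bool:    # '%foo'
    return substr(value, -len(literal)) == literal


def contains_rewrite(value: str, literal: str) -> bool:  # '%foo%'
    return strpos(value, literal) != 0
```

Under these assumptions each rewrite returns the same boolean as `like` with the corresponding pattern, which is the equivalence the coordinator relies on.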

When native execution is enabled, Velox's engine natively evaluates LIKE expressions using the OptimizedLike template class, which provides fast-path implementations:

Prefix/suffix matching: Direct memcmp on byte ranges (no character indexing overhead)
Substring matching: std::string_view::find for O(n) scanning

In contrast, Presto's SUBSTR/STRPOS rewrites require:

Character-position counting via UTF-8 codepoint iteration
Byte-range lookups with explicit index conversion
Generalized string function evaluation overhead

For constant patterns, Velox's optimized paths perform better than the rewritten expressions. Leaving LIKE unrewritten lets Velox select these fast paths directly instead of evaluating generic SUBSTR/STRPOS calls.
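
The cost difference described above can be modeled outside either engine. The following Python sketch is not Velox or Presto code: `prefix_match_bytes` and `suffix_match_bytes` (hypothetical names) compare raw UTF-8 bytes the way a `memcmp`-style fast path would, while `suffix_match_codepoints` first walks the whole string to locate codepoint boundaries before it can find the suffix. For valid UTF-8 input the two approaches should agree, because a byte-wise match of a complete UTF-8 literal always starts on a codepoint boundary.

```python
from typing import List


def prefix_match_bytes(value: bytes, literal: bytes) -> bool:
    """Compare the first len(literal) bytes, as a memcmp-style fast path would."""
    return value[: len(literal)] == literal


def suffix_match_bytes(value: bytes, literal: bytes) -> bool:
    """Compare the last len(literal) bytes; work depends only on the literal."""
    return len(value) >= len(literal) and value[len(value) - len(literal):] == literal


def codepoint_offsets(value: bytes) -> List[int]:
    """Byte offset of every codepoint start; this scans the whole string."""
    # A byte starts a codepoint unless it is a continuation byte (0b10xxxxxx).
    return [i for i, b in enumerate(value) if (b & 0xC0) != 0x80]


def suffix_match_codepoints(value: bytes, literal: bytes) -> bool:
    """Model of SUBSTR(x, -len) = literal with codepoint-indexed positions."""
    if not literal:
        raise ValueError("this sketch only handles non-empty literals")
    starts = codepoint_offsets(value)          # O(len(value)) walk
    wanted = len(codepoint_offsets(literal))   # suffix length in codepoints
    if wanted > len(starts):
        return False
    return value[starts[-wanted]:] == literal
```

The byte-level suffix check touches only `len(literal)` bytes, whereas the codepoint version scans every byte of `value` first; this is the asymmetry the description attributes to the SUBSTR rewrite.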

Impact

Coordinators without native execution enabled (default): LIKE rewrites continue as before
No semantic changes: Native LIKE evaluation is equivalent to the coordinator-side rewrites

Test Plan

Validated with the existing TestRowExpressionTranslator test suite.

Release Notes

== NO RELEASE NOTE ==

Summary by Sourcery

Guard LIKE prefix/suffix rewrite logic in the SQL-to-row-expression translator with the native execution enablement flag to preserve expected behavior under native execution.

Enhancements:

Extend SQL-to-row-expression translation APIs and visitor to be aware of native execution enablement.
Skip LIKE prefix/suffix optimization rewrites when native execution is enabled while preserving existing behavior otherwise.

Summary by Sourcery

Gate coordinator-side LIKE prefix/suffix pattern rewrites on the native execution flag so that native execution uses Velox’s built-in LIKE evaluation.

Enhancements:

Extend SQL-to-row-expression translation APIs and visitor to propagate whether native execution is enabled.
Skip LIKE prefix/suffix optimization rewrites in the LIKE predicate visitor when native execution is enabled, retaining existing behavior otherwise.

sourcery-ai · 2026-03-18T04:58:38Z

Reviewer's Guide

Makes SQL-to-row-expression translation aware of native execution mode and disables LIKE prefix/suffix rewrites when native execution is enabled so that Velox can apply its own optimized LIKE evaluation.

Sequence diagram for LIKE translation with nativeExecutionEnabled flag

sequenceDiagram
    actor Client
    participant Coordinator
    participant SqlToRowExpressionTranslator
    participant Visitor
    participant VeloxEngine

    Client->>Coordinator: Submit query with LIKE predicate
    Coordinator->>Coordinator: Determine nativeExecutionEnabled
    Coordinator->>SqlToRowExpressionTranslator: translate(expression, types, layout, functionAndTypeManager, session, context)
    SqlToRowExpressionTranslator->>SqlToRowExpressionTranslator: translate(..., user, transactionId, sqlFunctionProperties, sessionFunctions, context, nativeExecutionEnabled)
    SqlToRowExpressionTranslator->>Visitor: new Visitor(..., nativeExecutionEnabled)
    SqlToRowExpressionTranslator->>Visitor: process(expression, context)
    Visitor->>Visitor: visitLikePredicate(node, context)
    alt nativeExecutionEnabled == false
        Visitor->>Visitor: generateLikePrefixOrSuffixMatch(value, pattern)
        Visitor-->>SqlToRowExpressionTranslator: rewritten SUBSTR/STRPOS RowExpression
    else nativeExecutionEnabled == true
        Visitor->>Visitor: skip generateLikePrefixOrSuffixMatch
        Visitor-->>SqlToRowExpressionTranslator: LIKE RowExpression without rewrite
        SqlToRowExpressionTranslator-->>VeloxEngine: pass LIKE RowExpression
        VeloxEngine-->>Coordinator: evaluate LIKE via OptimizedLike
    end
    Coordinator-->>Client: Return query results

Class diagram for SqlToRowExpressionTranslator nativeExecutionEnabled wiring

classDiagram
    class SqlToRowExpressionTranslator {
        +RowExpression translate(Expression expression, Map~NodeRef_Expression_, Type~ types, Map~VariableReferenceExpression, Integer~ layout, FunctionAndTypeManager functionAndTypeManager, Session session, Context context)
        +RowExpression translate(Expression expression, Map~NodeRef_Expression_, Type~ types, Map~VariableReferenceExpression, Integer~ layout, FunctionAndTypeManager functionAndTypeManager, Optional~String~ user, Optional~TransactionId~ transactionId, SqlFunctionProperties sqlFunctionProperties, Map~SqlFunctionId, SqlInvokedFunction~ sessionFunctions, Context context)
        -RowExpression translate(Expression expression, Map~NodeRef_Expression_, Type~ types, Map~VariableReferenceExpression, Integer~ layout, FunctionAndTypeManager functionAndTypeManager, Optional~String~ user, Optional~TransactionId~ transactionId, SqlFunctionProperties sqlFunctionProperties, Map~SqlFunctionId, SqlInvokedFunction~ sessionFunctions, Context context, boolean nativeExecutionEnabled)
    }

    class Visitor {
        -Map~NodeRef_Expression_, Type~ types
        -Map~VariableReferenceExpression, Integer~ layout
        -FunctionAndTypeResolver functionAndTypeResolver
        -Optional~String~ user
        -Optional~TransactionId~ transactionId
        -SqlFunctionProperties sqlFunctionProperties
        -Map~SqlFunctionId, SqlInvokedFunction~ sessionFunctions
        -FunctionResolution functionResolution
        -boolean nativeExecutionEnabled

        +Visitor(Map~NodeRef_Expression_, Type~ types, Map~VariableReferenceExpression, Integer~ layout, FunctionAndTypeManager functionAndTypeManager, Optional~String~ user, Optional~TransactionId~ transactionId, SqlFunctionProperties sqlFunctionProperties, Map~SqlFunctionId, SqlInvokedFunction~ sessionFunctions, boolean nativeExecutionEnabled)
        +RowExpression visitLikePredicate(LikePredicate node, Context context)
        -RowExpression generateLikePrefixOrSuffixMatch(RowExpression value, RowExpression pattern)
    }

    SqlToRowExpressionTranslator ..> Visitor : creates

File-Level Changes

| Change | Details | Files |
|---|---|---|
| Thread native-execution enablement through SqlToRowExpressionTranslator and its Visitor so translation can behave differently under native execution. | Extend the public translate(...) overload that takes a Session to pass a native-execution-enabled flag derived from the session. Introduce a new private translate(...) overload that accepts a nativeExecutionEnabled boolean and delegate the existing public non-Session translate(...) method to it with a default of false. Update the Visitor constructor to accept and store a nativeExecutionEnabled flag for use during expression translation. | `presto-main-base/src/main/java/com/facebook/presto/sql/relational/SqlToRowExpressionTranslator.java` |
| Guard LIKE prefix/suffix rewrite logic so it is only applied when native execution is disabled. | Extend Visitor state with a nativeExecutionEnabled field used during LIKE predicate translation. Wrap the generateLikePrefixOrSuffixMatch(...) call in visitLikePredicate with a check that only executes the rewrite when native execution is not enabled. Preserve existing behavior for non-native execution by leaving the LIKE rewrite path unchanged when nativeExecutionEnabled is false. | `presto-main-base/src/main/java/com/facebook/presto/sql/relational/SqlToRowExpressionTranslator.java` |
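
The guard described in the table can be pictured with a small model. This Python sketch is not the Java code from the PR: `Visitor`, `visit_like_predicate`, and the tuple-based expression encoding are simplified stand-ins, and only the role of the `native_execution_enabled` flag mirrors the change. It covers the prefix and suffix cases only.

```python
from dataclasses import dataclass
from typing import Optional


def generate_like_prefix_or_suffix_match(column: str, pattern: str) -> Optional[tuple]:
    """Return a SUBSTR-based rewrite for 'lit%' or '%lit', else None."""
    if len(pattern) < 2 or "_" in pattern or pattern.count("%") != 1:
        return None
    if pattern.endswith("%"):
        literal = pattern[:-1]
        return ("EQUAL", ("SUBSTR", column, 1, len(literal)), literal)
    if pattern.startswith("%"):
        literal = pattern[1:]
        return ("EQUAL", ("SUBSTR", column, -len(literal)), literal)
    return None


@dataclass
class Visitor:
    native_execution_enabled: bool

    def visit_like_predicate(self, column: str, pattern: str) -> tuple:
        # The rewrite is attempted only when native execution is disabled.
        if not self.native_execution_enabled:
            rewritten = generate_like_prefix_or_suffix_match(column, pattern)
            if rewritten is not None:
                return rewritten
        # Native execution, or a pattern with no rewrite: keep LIKE as-is so
        # the execution engine can choose its own evaluation strategy.
        return ("LIKE", column, pattern)
```

With the flag off, `Visitor(False).visit_like_predicate("x", "foo%")` yields the SUBSTR comparison; with the flag on, the same call returns the untouched LIKE node.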

sourcery-ai

Hey - I've left some high level feedback:

Consider avoiding the extra boolean parameter on the private translate/Visitor constructor by threading native-execution awareness through Context or a small options object, which will scale better if additional execution-mode flags are added later.
It would be helpful to document in the visitLikePredicate logic (e.g., a short comment) that disabling the prefix/suffix rewrite under native execution intentionally defers to Velox's OptimizedLike implementation, so future maintainers understand why this early-return is gated.
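
One way to read the first suggestion is sketched below. The names `TranslationOptions` and `translate` are hypothetical, and the example is Python rather than the Java of the PR; it only illustrates how a small immutable options object could carry execution-mode flags without widening every signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationOptions:
    """Hypothetical bundle of execution-mode flags; more fields can be added later."""
    native_execution_enabled: bool = False


def translate(expression: str, options: TranslationOptions = TranslationOptions()) -> str:
    """Toy translator: reports which path a LIKE predicate would take."""
    is_like = " LIKE " in expression.upper()
    if is_like and not options.native_execution_enabled:
        return "rewrite"
    return "keep"
```

Callers that do not care about execution mode keep calling `translate(expr)`, and a future flag would become one more field on `TranslationOptions` instead of another positional boolean.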

pramodsatya · 2026-03-19T19:26:10Z

@majetideepak, sharing results from a standalone microbenchmark comparing Velox's native LIKE evaluation (OptimizedLike in Re2Functions.cpp) against Presto's coordinator-side LIKE rewrites (SUBSTR/STRPOS decompositions). It was built and run in Release mode on an Apple M1, with 10K rows × 50 iterations.

Results

LikeVsRewrite Microbenchmark
Vector size: 10000, Iterations: 50

======================================================================
  PREFIX ASCII:  like(x, 'hello_world%') vs substr(x,1,11)='hello_world'
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (memcmp)                          9.75        1.00x
  substr_eq (Presto rewrite)                   28.95        0.34x

======================================================================
  PREFIX UTF-8:  like(x, 'élève%') vs substr(x,1,5)='élève'
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (memcmp)                         42.85        1.00x
  substr_eq (Presto rewrite)                   87.11        0.49x

======================================================================
  SUFFIX ASCII:  like(x, '%hello_world') vs substr(x,-11)='hello_world'
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (memcmp)                          7.18        1.00x
  substr_eq (Presto rewrite)                56449.70        0.00x

======================================================================
  SUFFIX UTF-8:  like(x, '%élève') vs substr(x,-5)='élève'
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (memcmp)                          6.40        1.00x
  substr_eq (Presto rewrite)                35907.14        0.00x

======================================================================
  SUBSTRING ASCII:  like(x, '%hello_world%') vs strpos(x,'hello_world')>0
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (sv::find)                     2999.87        1.00x
  strpos_gt0 (Presto rewrite)                1354.90        2.21x

======================================================================
  SUBSTRING UTF-8:  like(x, '%élève%') vs strpos(x,'élève')>0
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (sv::find)                     1732.29        1.00x
  strpos_gt0 (Presto rewrite)                1494.59        1.16x

======================================================================
  EXACT ASCII:  like(x, 'hello_world') vs x='hello_world'
======================================================================
  Expression                                    ns/row      relative
----------------------------------------------------------------------
  like_native (memcmp)                          2.99        1.00x
  equality (std::string==)                      6.93        0.43x

======================================================================
  relative < 1.0  →  like_native is faster
  relative > 1.0  →  Presto rewrite is faster
======================================================================

Analysis

| Pattern | Velox approach | Presto rewrite | Velox (ns/row) | Presto (ns/row) | Speedup (Presto ÷ Velox) | Why |
|---|---|---|---|---|---|---|
| `'prefix%'` ASCII | `memcmp` first N bytes | `SUBSTR(x,1,len)=needle` | 9.75 | 28.95 | 3.0x | Velox does a fixed `memcmp`; Presto walks N UTF-8 codepoints to find the byte offset |
| `'prefix%'` UTF-8 | `memcmp` first N bytes | `SUBSTR(x,1,len)=needle` | 42.85 | 87.11 | 2.0x | Same as above; larger strings reduce the relative gap |
| `'%suffix'` ASCII | `memcmp` last N bytes | `SUBSTR(x,-len)=needle` | 7.18 | 56,449.70 | 7,862x | Velox is O(needle_len); Presto walks the entire string counting codepoints to find the total length |
| `'%suffix'` UTF-8 | `memcmp` last N bytes | `SUBSTR(x,-len)=needle` | 6.40 | 35,907.14 | 5,610x | Same as above: Presto's full-string UTF-8 length scan dominates |
| `'%substr%'` ASCII | `string_view::find` | `STRPOS(x,needle)>0` | 2,999.87 | 1,354.90 | 0.45x | Both do byte-level substring search; the rewrite measured about 2.2x faster in this run |
| `'%substr%'` UTF-8 | `string_view::find` | `STRPOS(x,needle)>0` | 1,732.29 | 1,494.59 | 0.86x | Same byte-level search; the gap is small enough to be run-to-run noise |
| `'exact'` | `size()` + `memcmp` | `std::string==` | 2.99 | 6.93 | 2.3x | Velox avoids string construction overhead |

Suffix matching is the dominant win: Velox's O(needle_len) memcmp versus Presto's O(string_len) UTF-8 codepoint walk yields an improvement of more than three orders of magnitude. Prefix and exact patterns show solid 2–3x wins. Substring matching uses the same underlying byte-level search in both paths, and native LIKE shows no gain there: the STRPOS rewrite measured faster in this run (2.21x for ASCII, 1.16x for UTF-8). Prefix, suffix, and exact patterns show no regressions.
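
The Speedup column can be recomputed from the two ns/row columns. A minimal Python sketch, using the figures reported in the results and defining speedup as rewrite time divided by native time, so values below 1 mean the rewrite was faster in that run:

```python
# (pattern class, like_native ns/row, Presto rewrite ns/row), copied from the
# benchmark output in this comment.
RESULTS = [
    ("prefix ASCII", 9.75, 28.95),
    ("prefix UTF-8", 42.85, 87.11),
    ("suffix ASCII", 7.18, 56449.70),
    ("suffix UTF-8", 6.40, 35907.14),
    ("substring ASCII", 2999.87, 1354.90),
    ("substring UTF-8", 1732.29, 1494.59),
    ("exact ASCII", 2.99, 6.93),
]


def speedup(native_ns: float, rewrite_ns: float) -> float:
    """How many times faster native LIKE is than the rewrite (below 1 = slower)."""
    return rewrite_ns / native_ns


for name, native_ns, rewrite_ns in RESULTS:
    print(f"{name:16s} {speedup(native_ns, rewrite_ns):10.2f}x")
```

The benchmark's own `relative` column is the reciprocal of this ratio, since it divides the native time by the time of each row.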

aditi-pandit

Thanks @pramodsatya. Looks good.

perf(native): Avoid LIKE rewrites for prefix/suffix patterns in nat…

68a5374

…ive execution

prestodb-ci added the from:IBM PR from IBM label Mar 18, 2026

pramodsatya marked this pull request as ready for review March 18, 2026 20:16

pramodsatya requested review from a team, feilong-liu and jaystarshot as code owners March 18, 2026 20:16

prestodb-ci requested review from a team, Dilli-Babu-Godari and Joe-Abraham and removed request for a team March 18, 2026 20:16

pramodsatya requested review from aditi-pandit, majetideepak and tdcmeehan and removed request for Dilli-Babu-Godari and Joe-Abraham March 18, 2026 20:16

sourcery-ai bot reviewed Mar 18, 2026


aditi-pandit approved these changes Mar 25, 2026


tdcmeehan approved these changes Mar 25, 2026


pramodsatya merged commit e2d8bc9 into prestodb:master Mar 26, 2026
155 of 160 checks passed

pramodsatya deleted the like_cpp branch March 26, 2026 00:18

This was referenced Mar 31, 2026

docs: Add release notes for 0.297 unix280/presto#51

Closed

docs: Add release notes for 0.297 unix280/presto#52

Open

prestodb-ci mentioned this pull request Apr 1, 2026

docs: Add release notes for 0.297 #27484

Open

15 tasks
