docs: Add to Presto C++ limitations doc#27120

Merged
steveburnett merged 1 commit into prestodb:master from steveburnett:steveburnett-prestocpp-query-workarounds
Feb 17, 2026

@steveburnett steveburnett commented Feb 10, 2026

Description

I edited a draft document at the request of @amitkdutta and @kgpai to help prepare that document for publication as the blog post Presto vs Prestissimo – Known differences and workarounds.

I thought the content in the blog post was valuable and should be added to the Presto documentation. In this PR I have revised the blog post to follow the format and style of the Presto documentation.

Following feedback I incorporated the content of the blog post into presto_cpp/limitations.rst.

Motivation and Context

Improves the Presto documentation of Presto C++ by making readers aware of limitations when running Presto queries in Presto C++ and by advising how to rewrite those queries so that they run successfully.

Impact

Documentation.

Test Plan

Local doc builds.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add documentation for Presto queries to run in Presto C++ to :doc:`/presto_cpp/limitations`.

Summary by Sourcery

Documentation:

  • Introduce a Presto C++ queries documentation page covering known behavioral differences and recommended query workarounds.

@steveburnett steveburnett self-assigned this Feb 10, 2026
@steveburnett steveburnett requested review from a team and elharo as code owners February 10, 2026 17:55
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 10, 2026
@github-project-automation github-project-automation bot moved this to 🆕 Unprioritized in Presto Documentation Feb 10, 2026
@prestodb-ci prestodb-ci requested review from a team, ShahimSharafudeen and namya28 and removed request for a team February 10, 2026 17:55

sourcery-ai bot commented Feb 10, 2026

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Adds a new Presto C++ documentation page documenting known query differences and workarounds between legacy Presto and Presto C++, and wires it into the existing Presto C++ docs hierarchy.

File-Level Changes

Change: Add a dedicated Presto C++ queries documentation page describing known differences and workarounds versus Presto.
Details:
  • Create a new Sphinx documentation page under the Presto C++ docs section focused on query behavior and compatibility
  • Adapt content from the “Presto vs Prestissimo – Known differences and workarounds” blog post into documentation style, including structure, headings, and examples
  • Organize the page with a table of contents suitable for navigation within the Presto C++ documentation section
Files: presto-docs/src/main/sphinx/presto_cpp/queries.rst

Change: Integrate the new queries page into the Presto C++ documentation navigation.
Details:
  • Update the main Presto C++ Sphinx index to include and cross‑link the new queries documentation page
  • Ensure the new page participates correctly in the sidebar/table-of-contents navigation for Presto C++ docs
Files: presto-docs/src/main/sphinx/presto-cpp.rst

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 5 issues

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-docs/src/main/sphinx/presto_cpp/queries.rst:61` </location>
<code_context>
+  For example, in the above case ``event_time`` is used for comparison throughout the lambda. 
+  If we rewrote the expression as following, where ``x`` and ``y`` have different fields, it will fail: 
+  ``(x, y) -> if (x.event_time < y.event_start_time, -1, if (x.event_time > y.event_start_time, 1, 0))``
+* Any additional nesting other from the two ``if`` uses shown above will fail.
+
+``Array_sort`` can support any transformation lambda that returns a comparable type. This example is not supported:
</code_context>

<issue_to_address>
**issue (typo):** Fix wording in "other from" phrase.

"Other from" is ungrammatical here. Consider rephrasing to: `Any additional nesting other than the two ``if`` uses shown above will fail.`

```suggestion
* Any additional nesting other than the two ``if`` uses shown above will fail.
```
</issue_to_address>
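
For readers unfamiliar with the comparator shape quoted in Comment 1, the following is a minimal Python sketch, not Presto code. It mirrors the two nested conditionals of the quoted lambda, comparing the same field (`event_time`) on both sides, and shows that a key-based sort on that field gives the same order. The sample rows and the `name` field are hypothetical; only `event_time` comes from the quoted text.

```python
from functools import cmp_to_key

# Hypothetical rows standing in for array elements; only the field name
# "event_time" is taken from the quoted documentation.
events = [
    {"name": "b", "event_time": 30},
    {"name": "a", "event_time": 10},
    {"name": "c", "event_time": 20},
]


def compare_event_time(x, y):
    """Return -1, 1, or 0 using two conditionals on the same field,
    the same shape as the lambda quoted in the review comment."""
    if x["event_time"] < y["event_time"]:
        return -1
    return 1 if x["event_time"] > y["event_time"] else 0


# Comparator form: the sort calls compare_event_time on pairs of rows.
by_comparator = sorted(events, key=cmp_to_key(compare_event_time))

# Key form: the sort extracts one comparable value per row.
by_key = sorted(events, key=lambda e: e["event_time"])

names_by_comparator = [e["name"] for e in by_comparator]
names_by_key = [e["name"] for e in by_key]
```

Both forms order the rows by ascending `event_time`; the key form avoids writing the conditional logic by hand.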

### Comment 2
<location> `presto-docs/src/main/sphinx/presto_cpp/queries.rst:137-140` </location>
<code_context>
+
+Canonicalization
+----------------
+* **Issue**: Presto ``json_extract`` `can return json that is not canonicalized <https://github.com/prestodb/presto/issues/24563#issue-2852506643>`_.
+* **Workaround**: ``json_extract`` has been rewritten in Prestissimo to always return canonical JSON.
+
</code_context>

<issue_to_address>
**nitpick (typo):** Capitalize JSON consistently.

In the bullet point, change "json" to "JSON" in the phrase "can return json that is not canonicalized".

```suggestion
Canonicalization
----------------
* **Issue**: Presto ``json_extract`` `can return JSON that is not canonicalized <https://github.com/prestodb/presto/issues/24563#issue-2852506643>`_.
* **Workaround**: ``json_extract`` has been rewritten in Prestissimo to always return canonical JSON.
```
</issue_to_address>
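
As a rough illustration of what canonical JSON usually means, here is a minimal Python sketch. It assumes canonicalization means sorted object keys and no insignificant whitespace; the exact rules that Presto C++ applies are not stated in this thread, and the helper name `canonicalize` is hypothetical.

```python
import json


def canonicalize(text: str) -> str:
    """Re-serialize JSON with sorted object keys and no extra whitespace.

    Assumption: this is one common definition of "canonical" JSON, used
    here only for illustration.
    """
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


# Same logical document, written with unsorted keys and irregular spacing.
raw = '{ "b": 1,  "a": [ 2, 3 ] }'
canonical = canonicalize(raw)
```

Two inputs that differ only in key order or spacing produce the same canonical string, which is why results can differ textually between an engine that canonicalizes and one that does not.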

### Comment 3
<location> `presto-docs/src/main/sphinx/presto_cpp/queries.rst:150` </location>
<code_context>
+* Presto C++ uses `RE2 <https://github.com/google/re2>`_, a widely adopted modern regular expression parsing library. 
+  RE2 provides most of the functionality of PCRE using a C++ interface very close to 
+  that of PCRE, which guarantees `linear time execution and a fixed stack footprint <https://swtch.com/~rsc/regexp/regexp3.html#caveats>`_. 
+* Presto Java uses `JONI <https://github.com/jruby/joni>`_ which is a deprecated port of ONI. 
+
+While both frameworks support almost all regular expression syntaxes, RE2 differs from JONI and PCRE in 
</code_context>

<issue_to_address>
**issue (typo):** Correct the name of the regex engine JONI is based on.

"Deprecated port of ONI" appears to be a typo. JONI is a port of Oniguruma (often abbreviated ONIG). Please update the text to refer to Oniguruma/ONIG instead of ONI.

```suggestion
* Presto Java uses `JONI <https://github.com/jruby/joni>`_ which is a deprecated port of Oniguruma (ONIG). 
```
</issue_to_address>

### Comment 4
<location> `presto-docs/src/main/sphinx/presto_cpp/queries.rst:221` </location>
<code_context>
+URL Functions
+=============
+
+Presto and Prestissimo implement different URL functions specs which can lead to 
+some URL function mismatches. Prestissimo implements `RFC-3986 <https://datatracker.ietf.org/doc/html/rfc3986>`_ whereas Presto 
+implements `RFC-2396 <https://datatracker.ietf.org/doc/html/rfc2396>`_. This can lead to subtle differences as presented in 
</code_context>

<issue_to_address>
**suggestion (typo):** Tighten phrasing of "URL functions specs".

"URL functions specs" reads a bit awkwardly. Consider "URL function specs" or "URL function specifications" instead.

```suggestion
Presto and Prestissimo implement different URL function specifications which can lead to 
```
</issue_to_address>

### Comment 5
<location> `presto-docs/src/main/sphinx/presto_cpp/queries.rst:289-291` </location>
<code_context>
+
+In Presto, the result of CAST(TIMESTAMP AS TIME) or CAST(TIMESTAMP AS TIME WITH TIME ZONE) would change based on the 
+session property ``legacy_timestamp`` (true by default) when applied to the user's time zone. 
+In Presto C++ for TIME/TIME WITH TIMEZONE the behavior will be equivalent to the property being false.
+
+Note: ``TIMESTAMP`` behavior in Presto and Presto C++ is unchanged.
</code_context>

<issue_to_address>
**issue (typo):** Align TIME WITH TIME ZONE spelling with the rest of the document.

Here this is written as "TIME WITH TIMEZONE"; elsewhere you use "TIME WITH TIME ZONE". Please update this occurrence to "TIME/TIME WITH TIME ZONE" for consistency.

```suggestion
In Presto, the result of CAST(TIMESTAMP AS TIME) or CAST(TIMESTAMP AS TIME WITH TIME ZONE) would change based on the 
session property ``legacy_timestamp`` (true by default) when applied to the user's time zone. 
In Presto C++ for TIME/TIME WITH TIME ZONE the behavior will be equivalent to the property being false.
```
</issue_to_address>


steveburnett commented Feb 10, 2026

Okay, this is interesting. The release note entry appears to be correct, but the CI check fails with the following error:

2026-02-10T17:56:52.281Z	ERROR	main	com.facebook.presto.release.tasks.GenerateReleaseNotesTask	

Bad release notes for PR #0: expect section header, found [Summary by Sourcery]

It appears that the Sourcery edit of the PR description is being parsed by the release-note CI test?

This doesn't block merge but I wanted to mention this as I hadn't seen this before in other PRs. If there's anything I can do to fix this - other than simply delete the Sourcery summary manually - let me know.

Update: CI check now passes after I pushed an update. Don't understand why it failed the first time.
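
One way the reported error could arise is sketched below in Python. This is a hypothetical illustration, not the actual `GenerateReleaseNotesTask` logic: it assumes the checker reads everything after the `== RELEASE NOTES ==` marker to the end of the PR description, and that it accepts only blank lines, bullet items starting with `* `, and section headers ending in `Changes`. The function name and these rules are assumptions.

```python
from typing import Optional

MARKER = "== RELEASE NOTES =="


def first_unexpected_line(description: str) -> Optional[str]:
    """Return the first line after the marker that is neither blank,
    a bullet item, nor a section header (assumed to end in 'Changes')."""
    _, found, tail = description.partition(MARKER)
    if not found:
        return None  # no release notes block to check
    for raw in tail.splitlines():
        line = raw.strip()
        if not line or line.startswith("* ") or line.endswith("Changes"):
            continue
        return line
    return None


# A PR description where a bot appended a summary after the release notes.
description = "\n".join([
    "== RELEASE NOTES ==",
    "",
    "General Changes",
    "* Add documentation for Presto queries to run in Presto C++.",
    "",
    "Summary by Sourcery",
])
offending = first_unexpected_line(description)

# The same description without the appended summary passes the check.
clean = description.rsplit("\n", 2)[0]
offending_clean = first_unexpected_line(clean)
```

Under these assumptions, any text that a bot appends after the release notes block is treated as part of it, which would match the `found [Summary by Sourcery]` message.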

@steveburnett steveburnett force-pushed the steveburnett-prestocpp-query-workarounds branch from c225b0f to 967bdc8 Compare February 11, 2026 17:39
@steveburnett steveburnett moved this from 🆕 Unprioritized to 👀 Review in Presto Documentation Feb 11, 2026
@steveburnett

Addressed Sourcery suggested edits and did another beginning-to-end edit for consistency, phrasing, and structure.

amitkdutta previously approved these changes Feb 13, 2026

@amitkdutta amitkdutta left a comment

Looks good. Thanks @steveburnett


@tdcmeehan tdcmeehan left a comment

A lot of these sound more like limitations. What do you think about merging this doc with the limitations doc?

@steveburnett

> A lot of these sound more like limitations. What do you think about merging this doc with the limitations doc?

@tdcmeehan Okay, I'll work on rewriting this. Do you want me to merge the entire queries page into limitations.rst?

@tdcmeehan

@steveburnett yes. I think we can just add these details to that page, since it's all about the differences between C++ and Java clusters which we won't fix (at least any time soon).

@steveburnett

> @steveburnett yes. I think we can just add these details to that page, since it's all about the differences between C++ and Java clusters which we won't fix (at least any time soon).

Will do!

@steveburnett steveburnett force-pushed the steveburnett-prestocpp-query-workarounds branch from 967bdc8 to e8db284 Compare February 17, 2026 19:31
@steveburnett steveburnett changed the title from "docs: Add Presto C++ query workarounds doc" to "docs: Add to Presto C++ limitations doc" Feb 17, 2026
@steveburnett steveburnett force-pushed the steveburnett-prestocpp-query-workarounds branch from e8db284 to 84bd688 Compare February 17, 2026 19:46
Co-authored-by: Amit Dutta <amit.kolorob@gmail.com>
Co-authored-by: Krishna Pai <kgpai@meta.com>
@steveburnett steveburnett force-pushed the steveburnett-prestocpp-query-workarounds branch from 84bd688 to c0c1d1a Compare February 17, 2026 19:52

@tdcmeehan tdcmeehan left a comment

This is a great addition. Thank you!

@github-project-automation github-project-automation bot moved this from 👀 Review to ✅ Done in Presto Documentation Feb 17, 2026
@steveburnett steveburnett merged commit 69b959d into prestodb:master Feb 17, 2026
78 of 79 checks passed
@steveburnett steveburnett deleted the steveburnett-prestocpp-query-workarounds branch February 17, 2026 23:22

Labels

docs from:IBM PR from IBM

Projects

Status: ✅ Done

4 participants