feat: Add timestamp() KQL filter pushdown; sync DeprecatedDateString rename with clp-s upstream.#148
Conversation
…catedDateString rename with CLP-S upstream. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughMaps two CLP node types (DeprecatedDateString and Timestamp) to TIMESTAMP, renames DateString → DeprecatedDateString, adds TIMESTAMP literal formatting in KQL translation, and extends tests with timestamp push-down and timezone-aware test sessions. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java (1)
777-798:⚠️ Potential issue | 🟠 MajorTimestamp formatting is still skipped for
IN (...)literals.
formatLiteral(...)added at Line 902 is not used inhandleIn; at Line 797 non-string literals are appended raw. This leaves timestampINpushdown inconsistent with the updated comparison/BETWEEN behaviour.🔧 Proposed fix
@@ private ClpExpression handleIn(SpecialFormExpression node) { @@ if (literal.getType() instanceof VarcharType) { queryBuilder.append("\"").append(literalString).append("\""); } else { - queryBuilder.append(literalString); + queryBuilder.append(formatLiteral(literal.getType(), literalString)); } queryBuilder.append(" OR "); }Also applies to: 902-908
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java` around lines 777 - 798, The handleIn(SpecialFormExpression) implementation currently appends non-string literals raw, skipping timestamp formatting; update handleIn to call the shared formatLiteral(...) helper (used elsewhere) when building each literal in the IN list so timestamps and other special types are formatted consistently; ensure you still wrap Varchar values in quotes (or let formatLiteral handle quoting if it does) and replace the existing getLiteralString()/raw append logic in handleIn with a call to formatLiteral(literal) when constructing queryBuilder entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java`:
- Around line 902-908: The code formats only TIMESTAMP in formatLiteral(Type
literalType, String literalString) but marks TIMESTAMP_MICROSECONDS as
pushdown-compatible elsewhere, causing inconsistent output; update formatLiteral
to treat TIMESTAMP_MICROSECONDS the same as TIMESTAMP (wrap with
format("timestamp(\"%s\", \"\\L\")")), or alternatively remove
TIMESTAMP_MICROSECONDS from the pushdown set where pushdown eligibility is
declared, and add/adjust unit tests to cover TIMESTAMP_MICROSECONDS pushdown and
literal formatting to prevent regressions.
In
`@presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.java`:
- Around line 275-298: Add two assertions in
TestClpFilterToKql.testTimestampPushDown using testPushDown to cover the IN and
!= code paths: for IN call testPushDown(sessionHolder, "clpTimestamp IN
(TIMESTAMP '2023-01-01 00:00:00.000', TIMESTAMP '2023-01-02 00:00:00.000')",
"<expected-kql-for-IN>", null) where the expected KQL uses the same timestamp
literal formatting as other assertions (timestamp("1672531200000", "\\L") and
timestamp("1672617600000", "\\L") combined in the IN list); and for != call
testPushDown(sessionHolder, "clpTimestamp != TIMESTAMP '2023-01-01
00:00:00.000'", "<expected-kql-for-not-equals>", null) using the same
timestamp("1672531200000", "\\L") representation. Ensure the new assertions are
added inside testTimestampPushDown alongside the existing testPushDown calls so
the IN and != comparison builders are exercised.
---
Outside diff comments:
In
`@presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java`:
- Around line 777-798: The handleIn(SpecialFormExpression) implementation
currently appends non-string literals raw, skipping timestamp formatting; update
handleIn to call the shared formatLiteral(...) helper (used elsewhere) when
building each literal in the IN list so timestamps and other special types are
formatted consistently; ensure you still wrap Varchar values in quotes (or let
formatLiteral handle quoting if it does) and replace the existing
getLiteralString()/raw append logic in handleIn with a call to
formatLiteral(literal) when constructing queryBuilder entries.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (5)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTree.javapresto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTreeNodeType.javapresto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.javapresto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.javapresto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java
...o-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java
Show resolved
Hide resolved
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.java
Show resolved
Hide resolved
…PrestoNativeClpGeneralQueries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onstructor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java`:
- Around line 146-149: The SessionHolder constructors set connectorSession =
SESSION which uses a default timezone; change both constructors to set
connectorSession = session.toConnectorSession() so the connectorSession's
timezone matches the built session (use Session.toConnectorSession() to
propagate timezone via SqlFunctionProperties); update any references in the
SessionHolder class (constructors and getConnectorSession()) accordingly.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java
| connectorSession = SESSION; | ||
| session = TestingSession.testSessionBuilder(createTestingSessionPropertyManager(new SystemSessionProperties().getSessionProperties())) | ||
| .setTimeZoneKey(timeZoneKey) | ||
| .build(); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's find and read the file to understand the context
fd -e java "TestClpQueryBase.java" -x cat -n {}Repository: y-scope/presto
Length of output: 1694
🌐 Web query:
Presto Session toConnectorSession method API
💡 Result:
In PrestoDB (server-side, presto-main), com.facebook.presto.Session exposes two overloads that convert an engine Session into the SPI-facing ConnectorSession used by connectors:
public ConnectorSession toConnectorSession()public ConnectorSession toConnectorSession(ConnectorId connectorId)(use this when you need connector-specific session properties for a particular connector/catalog) [1]
ConnectorSession is the session object passed into connector SPI entry points (for example, many methods on ConnectorMetadata take a ConnectorSession session parameter) [2].
Sources:
[1] Presto Session Javadoc (presto-main) https://javadoc.io/static/com.facebook.presto/presto-main/0.256/com/facebook/presto/Session.html
[2] Presto SPI ConnectorMetadata Javadoc (example of ConnectorSession usage) https://javadoc.io/static/com.facebook.presto/presto-spi/0.236/com/facebook/presto/spi/connector/ConnectorMetadata.html
🌐 Web query:
Presto ConnectorSession timezone properties
💡 Result:
In Presto connector code, the session time zone is not a generic getProperty("...") value. It’s a first-class attribute on ConnectorSession:
- Read it with
ConnectorSession#getTimeZoneKey()(returns aTimeZoneKey). [1]
How it gets set (what drives getTimeZoneKey()):
- Presto client protocol: clients can send
X-Presto-Time-Zone; if they don’t, Presto uses the coordinator/engine default time zone. [2] - JDBC driver: you can set the JDBC parameter
timeZoneId(and historically viaPrestoConnection.setTimeZoneId(String)). [3][4]
Related SQL-side visibility:
current_timezone()returns the current session time zone (IANA name or fixed offset). [5]
Sources:
[1] https://javadoc.io/static/com.facebook.presto/presto-main/0.285.1/com/facebook/presto/testing/TestingConnectorSession.html
[2] https://prestodb.github.io/docs/current/develop/client-protocol.html
[3] https://prestodb.io/docs/current/clients/java.html
[4] https://prestodb.github.io/docs/current/release/release-0.66.html
[5] https://prestodb.github.io/docs/current/functions/datetime.html
🌐 Web query:
Presto TestingConnectorSession.SESSION default timezone
💡 Result:
TestingConnectorSession.SESSION is just a static convenience ConnectorSession instance; the Javadoc does not document what time zone it uses by default (it only exposes getTimeZoneKey()). [1]
So the only reliable ways to know the “default timezone” are to inspect the actual TestingConnectorSession implementation for your Presto version or check it at runtime:
TimeZoneKey tz = TestingConnectorSession.SESSION.getTimeZoneKey();More generally (outside this test helper), Presto’s session time zone typically comes from the client, and if not available, from the coordinator/server. [2]
Sources:
[1] Presto TestingConnectorSession Javadoc (shows SESSION and getTimeZoneKey()) https://javadoc.io/static/com.facebook.presto/presto-main/0.285.1/com/facebook/presto/testing/TestingConnectorSession.html
[2] Presto docs (session time zone behavior) https://prestodb.github.io/docs/current/release/release-0.66.html
🌐 Web query:
Presto Session setTimeZoneKey toConnectorSession propagation
💡 Result:
In Presto, Session.SessionBuilder#setTimeZoneKey(...) sets the timeZoneKey field that becomes part of the immutable Session object when you call build(). The Session then propagates that value to ConnectorSession via toConnectorSession(...) like this:
- Builder stores TZ:
SessionBuilder#setTimeZoneKey(TimeZoneKey timeZoneKey)assignsthis.timeZoneKey = timeZoneKey. [1] - Session stores TZ:
Sessionkeeps it asprivate final TimeZoneKey timeZoneKey;and exposesgetTimeZoneKey(). [1] - Session bakes TZ into
SqlFunctionProperties:Session#getSqlFunctionProperties()buildsSqlFunctionPropertieswith.setTimeZoneKey(timeZoneKey). [1] toConnectorSession()wraps the Session:Session#toConnectorSession()returnsnew FullConnectorSession(this, ...). [1]FullConnectorSessioncaptures the Session’s function properties: its constructor setsthis.sqlFunctionProperties = session.getSqlFunctionProperties();and returns them fromgetSqlFunctionProperties(). [2]
So the effective propagation path is:
SessionBuilder.setTimeZoneKey → Session.timeZoneKey → Session.getSqlFunctionProperties().timeZoneKey → FullConnectorSession.getSqlFunctionProperties().timeZoneKey → connector code reads it from ConnectorSession (typically via getSqlFunctionProperties()). [1][2]
Sources
[1] Presto Session.java (0.256) showing setTimeZoneKey, getSqlFunctionProperties().setTimeZoneKey(...), and toConnectorSession() creating FullConnectorSession. (raw.githubusercontent.com)
[2] Presto FullConnectorSession.java (0.256) showing it stores session.getSqlFunctionProperties() and exposes it via getSqlFunctionProperties(). (raw.githubusercontent.com)
Citations:
- 1: https://raw.githubusercontent.com/prestodb/presto/0.256/presto-main/src/main/java/com/facebook/presto/Session.java
- 2: https://raw.githubusercontent.com/prestodb/presto/0.256/presto-main/src/main/java/com/facebook/presto/FullConnectorSession.java
Align connectorSession timezone with session timezone in SessionHolder constructors.
The connectorSession is set to the static TestingConnectorSession.SESSION (which has a default timezone), while session is built with an explicit setTimeZoneKey(). When callers invoke getConnectorSession(), they receive a connector session with a misaligned timezone, causing non-deterministic behavior in timezone-sensitive assertions.
Replace connectorSession = SESSION; with connectorSession = session.toConnectorSession(); in both constructors (lines 133–135 and 146–149). The Session.toConnectorSession() method properly propagates the timezone through SqlFunctionProperties to the returned FullConnectorSession.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java`
around lines 146 - 149, The SessionHolder constructors set connectorSession =
SESSION which uses a default timezone; change both constructors to set
connectorSession = session.toConnectorSession() so the connectorSession's
timezone matches the built session (use Session.toConnectorSession() to
propagate timezone via SqlFunctionProperties); update any references in the
SessionHolder class (constructors and getConnectorSession()) accordingly.
gibber9809
left a comment
There was a problem hiding this comment.
LGTM. For the PR title how about:
feat: Add `timestamp()` KQL filter pushdown; sync `DeprecatedDateString` rename with clp-s upstream.
timestamp() KQL filter pushdown; sync DeprecatedDateString rename with clp-s upstream.
…stamp-kql-pushdown
… support Includes Velox PR y-scope#54 which adds searching and marshalling for the new CLP-S Timestamp column type (byte 14), enabling e2e timestamp pushdown. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dea6fee
into
y-scope:release-0.297-edge-10-clp-connector
Description
Summary
Motivation/Background
CLP-S introduced a new schema node type Timestamp to store timestamps in the archive, replacing the older DateString format for better precision handling. The older type has since been renamed DeprecatedDateString in CLP-S source.
CLP-S's KQL engine requires timestamp literals to be expressed as timestamp("epochMs", "\L"), where \L signals epoch-millisecond precision. Previously, the connector pushed down timestamp filters using raw epoch-ms integers (e.g., ts > 1672531200000), which is not valid KQL for the new Timestamp-typed columns.
Note, the timestamp("epochMs", "\L") covers kql push down for both the old DeprecatedDateString and the new Timestamp type. Such backward compatibility is considered at CLP-S, so Presto does not need to address such problem.
Checklist
breaking change.
Validation performed
Timestampcolumn type to replaceDateStringcolumn type; Bump the archive version to0.5.0. clp#1788The result is column: timestamp type: timestamp
The push down kql shown in the log is:
KQL query: timestamp >= timestamp("1679892099863", "\L") AND timestamp <= timestamp("1679892099880", "\L")The push down kql shown in the log is
KQL query: timestamp >= timestamp("1679892099863", "\L")Summary by CodeRabbit
New Features
Bug Fixes
Tests