feat: Add timestamp() KQL filter pushdown; sync DeprecatedDateString rename with clp-s upstream.#148

Merged
20001020ycx merged 5 commits into y-scope:release-0.297-edge-10-clp-connector from 20001020ycx:feat/clp-timestamp-kql-pushdown
Mar 3, 2026

Conversation

@20001020ycx
@20001020ycx 20001020ycx commented Feb 25, 2026

Description

Summary

  • Update timestamp KQL filter pushdown to conform to the new API required by CLP-S: wrap the epoch-millisecond constant that Presto provides for the type TimestampType.TIMESTAMP in timestamp("epochMs", "\L").
  • Add Timestamp (byte 14) to the CLP-S schema node type enum and map it to TimestampType.TIMESTAMP.
  • Rename DateString → DeprecatedDateString in the Java connector enum to match the CLP-S upstream rename.
  • Upgrade velox submodule to y-scope/velox@739a3aa.
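
The type-mapping change in the bullets above can be pictured with a small, self-contained sketch. It is not the connector's actual enum: the name `NodeTypeSketch`, the `fromByte` and `prestoTypeName` helpers, and the byte value used for `DeprecatedDateString` are assumptions made for illustration. Only the value 14 for `Timestamp` and the mapping of both types to a Presto timestamp come from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, trimmed-down stand-in for the connector's schema node type enum. */
enum NodeTypeSketch
{
    // Placeholder byte value: the PR description does not state it.
    DeprecatedDateString((byte) 8, "timestamp"),
    // Byte 14 is the value given in the summary.
    Timestamp((byte) 14, "timestamp");

    private static final Map<Byte, NodeTypeSketch> BY_CODE = new HashMap<>();

    static {
        for (NodeTypeSketch type : values()) {
            BY_CODE.put(type.code, type);
        }
    }

    private final byte code;
    private final String prestoTypeName;

    NodeTypeSketch(byte code, String prestoTypeName)
    {
        this.code = code;
        this.prestoTypeName = prestoTypeName;
    }

    /** Looks up a node type by its serialized byte, failing loudly on unknown values. */
    static NodeTypeSketch fromByte(byte code)
    {
        NodeTypeSketch type = BY_CODE.get(code);
        if (type == null) {
            throw new IllegalArgumentException("Unknown node type byte: " + code);
        }
        return type;
    }

    /** Name of the Presto type this node type is exposed as in the sketch. */
    String prestoTypeName()
    {
        return prestoTypeName;
    }
}
```

In this sketch both the legacy and the new node type resolve to the same Presto type name, which mirrors the stated intent that both map to TimestampType.TIMESTAMP.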

Motivation/Background

CLP-S introduced a new schema node type Timestamp to store timestamps in the archive, replacing the older DateString format for better precision handling. The older type has since been renamed DeprecatedDateString in CLP-S source.

CLP-S's KQL engine requires timestamp literals to be expressed as timestamp("epochMs", "\L"), where \L signals epoch-millisecond precision. Previously, the connector pushed down timestamp filters using raw epoch-ms integers (e.g., ts > 1672531200000), which is not valid KQL for the new Timestamp-typed columns.

Note that timestamp("epochMs", "\L") covers KQL pushdown for both the old DeprecatedDateString type and the new Timestamp type. CLP-S handles this backward compatibility itself, so Presto does not need to address it.
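
As a rough illustration of the literal format described above, the following sketch builds the wrapped literal and the comparison and BETWEEN forms from plain epoch-millisecond values. The class and method names (`KqlTimestampLiteralSketch`, `timestampLiteral`, `comparison`, `between`) are hypothetical and do not depend on Presto; the connector's own helper is `formatLiteral(Type, String)`, whose body is not reproduced here.

```java
/** Minimal sketch of the timestamp("<epochMs>", "\L") literal format; not the connector's code. */
final class KqlTimestampLiteralSketch
{
    private KqlTimestampLiteralSketch() {}

    /** Wraps an epoch-millisecond value as timestamp("<epochMs>", "\L"). */
    static String timestampLiteral(long epochMs)
    {
        // "\\L" in Java source is the two characters \L in the resulting string.
        return String.format("timestamp(\"%d\", \"\\L\")", epochMs);
    }

    /** Builds "<column> <operator> timestamp(...)" for a single comparison. */
    static String comparison(String column, String operator, long epochMs)
    {
        return column + " " + operator + " " + timestampLiteral(epochMs);
    }

    /** Expresses an inclusive range as two comparisons joined by AND. */
    static String between(String column, long lowerEpochMs, long upperEpochMs)
    {
        return comparison(column, ">=", lowerEpochMs) + " AND " + comparison(column, "<=", upperEpochMs);
    }
}
```

Calling `KqlTimestampLiteralSketch.comparison("ts", ">", 1672531200000L)` yields `ts > timestamp("1672531200000", "\L")`, in contrast to the raw `ts > 1672531200000` form that was pushed down previously.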

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

show columns in clp.default.default

The result shows column: timestamp, type: timestamp

SELECT CLP_GET_JSON_STRING() from clp.default.default  
WHERE "timestamp" BETWEEN TIMESTAMP '2023-03-27 00:41:39.863'
                  AND TIMESTAMP '2023-03-27 00:41:39.880' 
limit 100

The pushed-down KQL shown in the log is: KQL query: timestamp >= timestamp("1679892099863", "\L") AND timestamp <= timestamp("1679892099880", "\L")

 SELECT CLP_GET_JSON_STRING()
  FROM clp.default.default
  WHERE "timestamp" >= TIMESTAMP '2023-03-27 00:41:39.863'
  LIMIT 100

The pushed-down KQL shown in the log is: KQL query: timestamp >= timestamp("1679892099863", "\L")
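
The epoch-millisecond values in these logs depend on the session time zone used to interpret the SQL literal. The sketch below uses only `java.time` to show the conversion. The logged value 1679892099863 is four hours later than the UTC reading of '2023-03-27 00:41:39.863', which is consistent with a session zone at UTC-4; the description does not state the zone, so that offset is an assumption, as are the class and method names.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Sketch: converting a SQL TIMESTAMP literal to epoch milliseconds in a given zone. */
final class EpochMsSketch
{
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS");

    private EpochMsSketch() {}

    /** Interprets the wall-clock literal in sessionZone and returns epoch milliseconds. */
    static long toEpochMs(String sqlTimestamp, ZoneId sessionZone)
    {
        return LocalDateTime.parse(sqlTimestamp, FORMAT)
                .atZone(sessionZone)
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args)
    {
        // Assumed session offset of UTC-4 reproduces the value seen in the log.
        System.out.println(toEpochMs("2023-03-27 00:41:39.863", ZoneOffset.ofHours(-4)));
        // The same literal read as UTC gives a value 14,400,000 ms smaller.
        System.out.println(toEpochMs("2023-03-27 00:41:39.863", ZoneOffset.UTC));
    }
}
```

This dependence on the zone is why assertions on exact epoch-ms values need a session with a fixed time zone.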

Summary by CodeRabbit

  • New Features

    • Improved TIMESTAMP handling and literal formatting for comparisons and BETWEEN/range translations to improve pushdown accuracy.
  • Bug Fixes

    • Updated handling of legacy date-string types so they map consistently to TIMESTAMP.
  • Tests

    • Added timestamp push-down tests (including BETWEEN) and deterministic timezone-aware test setup for reliable epoch-ms assertions.

…catedDateString rename with CLP-S upstream.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Maps two CLP node types (DeprecatedDateString and Timestamp) to TIMESTAMP, renames DateString → DeprecatedDateString, adds TIMESTAMP literal formatting in KQL translation, and extends tests with timestamp push-down and timezone-aware test sessions.

Changes

Cohort: Type System Updates
File(s): presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTreeNodeType.java, presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTree.java
Summary: Renames enum constant DateString → DeprecatedDateString, adds Timestamp enum constant, and maps DeprecatedDateString and Timestamp to TimestampType.TIMESTAMP.

Cohort: Filter → KQL Conversion
File(s): presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java
Summary: Adds formatLiteral(Type, String) helper to wrap TIMESTAMP literals (e.g., timestamp("...", "\L")) and applies it to BETWEEN bounds and non-string literal comparisons when generating KQL/CLP expressions.

Cohort: Unit / Integration Tests
File(s): presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.java, presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java, presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeClpGeneralQueries.java
Summary: Adds testTimestampPushDown() and a clpTimestamp column handle, introduces SessionHolder(TimeZoneKey) constructor for timezone-fixed sessions, and updates tests to use DeprecatedDateString where DateString was used.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name: Docstring Coverage
Status: ⚠️ Warning
Explanation: Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.
Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name: Title check
Status: ✅ Passed
Explanation: The PR title accurately and concisely captures the main changes: adding timestamp KQL filter pushdown and syncing the DeprecatedDateString rename with upstream CLP-S.

Check name: Description check
Status: ✅ Passed
Explanation: The PR description provides comprehensive details covering all template requirements including summary, motivation, impact, test plan, and contributor checklist items.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java (1)

777-798: ⚠️ Potential issue | 🟠 Major

Timestamp formatting is still skipped for IN (...) literals.

formatLiteral(...) added at Line 902 is not used in handleIn; at Line 797 non-string literals are appended raw. This leaves timestamp IN pushdown inconsistent with the updated comparison/BETWEEN behaviour.

🔧 Proposed fix
@@
     private ClpExpression handleIn(SpecialFormExpression node)
     {
@@
             if (literal.getType() instanceof VarcharType) {
                 queryBuilder.append("\"").append(literalString).append("\"");
             }
             else {
-                queryBuilder.append(literalString);
+                queryBuilder.append(formatLiteral(literal.getType(), literalString));
             }
             queryBuilder.append(" OR ");
         }

Also applies to: 902-908
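
To make the concern concrete, here is a self-contained sketch of expanding a timestamp IN list into OR-joined terms where every literal gets the same wrapping as comparisons. It is not the connector's `handleIn`: the class name, the `column: value` term shape, and the surrounding parentheses are assumptions for illustration. The source only shows that literals are appended and joined with " OR ".

```java
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: an IN list over timestamps rendered as OR-joined, wrapped literals. */
final class KqlInListSketch
{
    private KqlInListSketch() {}

    /** Same wrapping as used for comparisons: timestamp("<epochMs>", "\L"). */
    static String timestampLiteral(long epochMs)
    {
        return String.format("timestamp(\"%d\", \"\\L\")", epochMs);
    }

    /** Renders each value as "<column>: <literal>" (assumed shape) and joins with OR. */
    static String timestampIn(String column, List<Long> epochMsValues)
    {
        return epochMsValues.stream()
                .map(ms -> column + ": " + timestampLiteral(ms))
                .collect(Collectors.joining(" OR ", "(", ")"));
    }
}
```

The point of the sketch is only that the IN path and the comparison path share one literal formatter, so a raw epoch-ms integer cannot leak into the generated query from one of them.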

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java`
around lines 777 - 798, The handleIn(SpecialFormExpression) implementation
currently appends non-string literals raw, skipping timestamp formatting; update
handleIn to call the shared formatLiteral(...) helper (used elsewhere) when
building each literal in the IN list so timestamps and other special types are
formatted consistently; ensure you still wrap Varchar values in quotes (or let
formatLiteral handle quoting if it does) and replace the existing
getLiteralString()/raw append logic in handleIn with a call to
formatLiteral(literal) when constructing queryBuilder entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java`:
- Around line 902-908: The code formats only TIMESTAMP in formatLiteral(Type
literalType, String literalString) but marks TIMESTAMP_MICROSECONDS as
pushdown-compatible elsewhere, causing inconsistent output; update formatLiteral
to treat TIMESTAMP_MICROSECONDS the same as TIMESTAMP (wrap with
format("timestamp(\"%s\", \"\\L\")")), or alternatively remove
TIMESTAMP_MICROSECONDS from the pushdown set where pushdown eligibility is
declared, and add/adjust unit tests to cover TIMESTAMP_MICROSECONDS pushdown and
literal formatting to prevent regressions.

In
`@presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.java`:
- Around line 275-298: Add two assertions in
TestClpFilterToKql.testTimestampPushDown using testPushDown to cover the IN and
!= code paths: for IN call testPushDown(sessionHolder, "clpTimestamp IN
(TIMESTAMP '2023-01-01 00:00:00.000', TIMESTAMP '2023-01-02 00:00:00.000')",
"<expected-kql-for-IN>", null) where the expected KQL uses the same timestamp
literal formatting as other assertions (timestamp("1672531200000", "\\L") and
timestamp("1672617600000", "\\L") combined in the IN list); and for != call
testPushDown(sessionHolder, "clpTimestamp != TIMESTAMP '2023-01-01
00:00:00.000'", "<expected-kql-for-not-equals>", null) using the same
timestamp("1672531200000", "\\L") representation. Ensure the new assertions are
added inside testTimestampPushDown alongside the existing testPushDown calls so
the IN and != comparison builders are exercised.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3543aa9 and 4aab91f.

📒 Files selected for processing (5)
  • presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTree.java
  • presto-clp/src/main/java/com/facebook/presto/plugin/clp/metadata/ClpSchemaTreeNodeType.java
  • presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpFilterToKql.java
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java

…PrestoNativeClpGeneralQueries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onstructor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java`:
- Around line 146-149: The SessionHolder constructors set connectorSession =
SESSION which uses a default timezone; change both constructors to set
connectorSession = session.toConnectorSession() so the connectorSession's
timezone matches the built session (use Session.toConnectorSession() to
propagate timezone via SqlFunctionProperties); update any references in the
SessionHolder class (constructors and getConnectorSession()) accordingly.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d6a12e and 8e9b126.

📒 Files selected for processing (1)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpQueryBase.java

Comment on lines +146 to +149
connectorSession = SESSION;
session = TestingSession.testSessionBuilder(createTestingSessionPropertyManager(new SystemSessionProperties().getSessionProperties()))
.setTimeZoneKey(timeZoneKey)
.build();
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's find and read the file to understand the context
fd -e java "TestClpQueryBase.java" -x cat -n {}

Repository: y-scope/presto

Length of output: 1694


🌐 Web query:

Presto Session toConnectorSession method API

💡 Result:

In PrestoDB (server-side, presto-main), com.facebook.presto.Session exposes two overloads that convert an engine Session into the SPI-facing ConnectorSession used by connectors:

  • public ConnectorSession toConnectorSession()
  • public ConnectorSession toConnectorSession(ConnectorId connectorId) (use this when you need connector-specific session properties for a particular connector/catalog) [1]

ConnectorSession is the session object passed into connector SPI entry points (for example, many methods on ConnectorMetadata take a ConnectorSession session parameter) [2].

Sources:
[1] Presto Session Javadoc (presto-main) https://javadoc.io/static/com.facebook.presto/presto-main/0.256/com/facebook/presto/Session.html
[2] Presto SPI ConnectorMetadata Javadoc (example of ConnectorSession usage) https://javadoc.io/static/com.facebook.presto/presto-spi/0.236/com/facebook/presto/spi/connector/ConnectorMetadata.html


🌐 Web query:

Presto ConnectorSession timezone properties

💡 Result:

In Presto connector code, the session time zone is not a generic getProperty("...") value. It’s a first-class attribute on ConnectorSession:

  • Read it with ConnectorSession#getTimeZoneKey() (returns a TimeZoneKey). [1]

How it gets set (what drives getTimeZoneKey()):

  • Presto client protocol: clients can send X-Presto-Time-Zone; if they don’t, Presto uses the coordinator/engine default time zone. [2]
  • JDBC driver: you can set the JDBC parameter timeZoneId (and historically via PrestoConnection.setTimeZoneId(String)). [3][4]

Related SQL-side visibility:

  • current_timezone() returns the current session time zone (IANA name or fixed offset). [5]

Sources:
[1] https://javadoc.io/static/com.facebook.presto/presto-main/0.285.1/com/facebook/presto/testing/TestingConnectorSession.html
[2] https://prestodb.github.io/docs/current/develop/client-protocol.html
[3] https://prestodb.io/docs/current/clients/java.html
[4] https://prestodb.github.io/docs/current/release/release-0.66.html
[5] https://prestodb.github.io/docs/current/functions/datetime.html


🌐 Web query:

Presto TestingConnectorSession.SESSION default timezone

💡 Result:

TestingConnectorSession.SESSION is just a static convenience ConnectorSession instance; the Javadoc does not document what time zone it uses by default (it only exposes getTimeZoneKey()). [1]

So the only reliable ways to know the “default timezone” are to inspect the actual TestingConnectorSession implementation for your Presto version or check it at runtime:

TimeZoneKey tz = TestingConnectorSession.SESSION.getTimeZoneKey();

More generally (outside this test helper), Presto’s session time zone typically comes from the client, and if not available, from the coordinator/server. [2]

Sources:
[1] Presto TestingConnectorSession Javadoc (shows SESSION and getTimeZoneKey()) https://javadoc.io/static/com.facebook.presto/presto-main/0.285.1/com/facebook/presto/testing/TestingConnectorSession.html
[2] Presto docs (session time zone behavior) https://prestodb.github.io/docs/current/release/release-0.66.html


🌐 Web query:

Presto Session setTimeZoneKey toConnectorSession propagation

💡 Result:

In Presto, Session.SessionBuilder#setTimeZoneKey(...) sets the timeZoneKey field that becomes part of the immutable Session object when you call build(). The Session then propagates that value to ConnectorSession via toConnectorSession(...) like this:

  1. Builder stores TZ: SessionBuilder#setTimeZoneKey(TimeZoneKey timeZoneKey) assigns this.timeZoneKey = timeZoneKey. [1]
  2. Session stores TZ: Session keeps it as private final TimeZoneKey timeZoneKey; and exposes getTimeZoneKey(). [1]
  3. Session bakes TZ into SqlFunctionProperties: Session#getSqlFunctionProperties() builds SqlFunctionProperties with .setTimeZoneKey(timeZoneKey). [1]
  4. toConnectorSession() wraps the Session: Session#toConnectorSession() returns new FullConnectorSession(this, ...). [1]
  5. FullConnectorSession captures the Session’s function properties: its constructor sets this.sqlFunctionProperties = session.getSqlFunctionProperties(); and returns them from getSqlFunctionProperties(). [2]

So the effective propagation path is:

SessionBuilder.setTimeZoneKey → Session.timeZoneKey → Session.getSqlFunctionProperties().timeZoneKey → FullConnectorSession.getSqlFunctionProperties().timeZoneKey → connector code reads it from ConnectorSession (typically via getSqlFunctionProperties()). [1][2]

Sources
[1] Presto Session.java (0.256) showing setTimeZoneKey, getSqlFunctionProperties().setTimeZoneKey(...), and toConnectorSession() creating FullConnectorSession. (raw.githubusercontent.com)
[2] Presto FullConnectorSession.java (0.256) showing it stores session.getSqlFunctionProperties() and exposes it via getSqlFunctionProperties(). (raw.githubusercontent.com)


Align connectorSession timezone with session timezone in SessionHolder constructors.

The connectorSession is set to the static TestingConnectorSession.SESSION (which has a default timezone), while session is built with an explicit setTimeZoneKey(). When callers invoke getConnectorSession(), they receive a connector session with a misaligned timezone, causing non-deterministic behavior in timezone-sensitive assertions.

Replace connectorSession = SESSION; with connectorSession = session.toConnectorSession(); in both constructors (lines 133–135 and 146–149). The Session.toConnectorSession() method properly propagates the timezone through SqlFunctionProperties to the returned FullConnectorSession.
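
The misalignment can be shown without any Presto classes. In the sketch below, `EngineSession`, `ConnectorSessionView`, `DEFAULT_SESSION`, and `SessionHolderSketch` are hypothetical stand-ins for `Session`, `ConnectorSession`, the static `SESSION`, and `SessionHolder`; the UTC default zone is also an assumption, since the default zone of the real static session is not documented in the material above.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Sketch of the two wiring options for a test session holder; all types are stand-ins. */
final class SessionHolderSketch
{
    /** Stand-in for a connector-facing session that exposes a time zone. */
    static final class ConnectorSessionView
    {
        final ZoneId zone;

        ConnectorSessionView(ZoneId zone)
        {
            this.zone = zone;
        }
    }

    /** Stand-in for an engine session built with an explicit time zone. */
    static final class EngineSession
    {
        final ZoneId zone;

        EngineSession(ZoneId zone)
        {
            this.zone = zone;
        }

        /** Mirrors the idea of deriving the connector session from the session itself. */
        ConnectorSessionView toConnectorSession()
        {
            return new ConnectorSessionView(zone);
        }
    }

    /** Stand-in for a shared static session with a fixed default zone (assumed UTC here). */
    static final ConnectorSessionView DEFAULT_SESSION = new ConnectorSessionView(ZoneOffset.UTC);

    final EngineSession session;
    final ConnectorSessionView connectorSession;

    SessionHolderSketch(ZoneId zone, boolean deriveFromSession)
    {
        session = new EngineSession(zone);
        // false reproduces the flagged pattern; true is the suggested alignment.
        connectorSession = deriveFromSession ? session.toConnectorSession() : DEFAULT_SESSION;
    }

    boolean zonesAligned()
    {
        return session.zone.equals(connectorSession.zone);
    }
}
```

With `deriveFromSession` set to false, a holder built for any non-default zone reports `zonesAligned()` as false, which is the situation the review comment describes.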

@gibber9809 gibber9809 left a comment

LGTM. For the PR title how about:

feat: Add `timestamp()` KQL filter pushdown; sync `DeprecatedDateString` rename with clp-s upstream.

@20001020ycx 20001020ycx changed the title from "feat: Add Timestamp KQL filter pushdown and sync DeprecatedDateString rename with CLP-S upstream." to "feat: Add timestamp() KQL filter pushdown; sync DeprecatedDateString rename with clp-s upstream." Feb 26, 2026
@20001020ycx 20001020ycx changed the base branch from release-0.293-clp-connector to release-0.297-edge-10-clp-connector March 2, 2026 22:27
20001020ycx and others added 2 commits March 2, 2026 17:27
… support

Includes Velox PR y-scope#54 which adds searching and marshalling for the new
CLP-S Timestamp column type (byte 14), enabling e2e timestamp pushdown.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@20001020ycx 20001020ycx merged commit dea6fee into y-scope:release-0.297-edge-10-clp-connector Mar 3, 2026
10 checks passed
@20001020ycx 20001020ycx deleted the feat/clp-timestamp-kql-pushdown branch March 3, 2026 15:01