feat(plugin-native-sidecar): Pass down session properties in NativeExpressionOptimizer#27304
Conversation
Reviewer's GuidePropagates Presto session system properties through the native sidecar expression optimizer to the native server, refactors query-config construction to work from raw session properties, and updates the native optimizer HTTP contract and tests accordingly. Sequence diagram for session properties propagation in native expression optimizationsequenceDiagram
actor QueryClient
participant NativeSidecarExpressionInterpreter
participant PrestoServer
participant PrestoToVeloxQueryConfig
participant VeloxQueryCtx
QueryClient->>NativeSidecarExpressionInterpreter: optimizeBatch(session, expressions, level)
NativeSidecarExpressionInterpreter->>NativeSidecarExpressionInterpreter: getSidecarRequest(session, level, resolvedExpressions)
NativeSidecarExpressionInterpreter->>ConnectorSession: getSystemProperties()
ConnectorSession-->>NativeSidecarExpressionInterpreter: Map<String,String> systemProperties
NativeSidecarExpressionInterpreter->>NativeSidecarExpressionInterpreter: create ExpressionOptimizationRequest(expressions, systemProperties)
NativeSidecarExpressionInterpreter->>PrestoServer: HTTP POST /v1/expressions/optimized
PrestoServer->>PrestoServer: parse JSON body to ExpressionOptimizationRequest
PrestoServer->>PrestoServer: validate sessionProperties and expressions
PrestoServer->>PrestoToVeloxQueryConfig: toVeloxConfigsFromSessionProperties(sessionProperties)
PrestoToVeloxQueryConfig-->>PrestoServer: Map<String,String> veloxConfigs
PrestoServer->>PrestoServer: add timezone and adjustTimestampToTimezone configs
PrestoServer->>VeloxQueryCtx: create(QueryConfig(veloxConfigs))
VeloxQueryCtx-->>PrestoServer: QueryCtx
PrestoServer->>PrestoServer: optimize expressions using native engine
PrestoServer-->>NativeSidecarExpressionInterpreter: JSON RowExpressionOptimizationResult list
NativeSidecarExpressionInterpreter-->>QueryClient: Map<RowExpression, RowExpression> optimizedExpressions
Class diagram for ExpressionOptimizationRequest and session property propagationclassDiagram
class ConnectorSession {
<<interface>>
+String getQueryId()
+String getUser()
+String getSource()
+String getCatalog()
+String getSchema()
+String getRemoteUserAddress()
+String getUserAgent()
+String getClientInfo()
+String getClientTags()
+String getTimeZoneKey()
+Locale getLocale()
+long getStartTime()
+boolean isClientTransactionSupport()
+String getTraceToken()
+Optional getIdentity()
+Optional getConnectorId()
+Map~String,String~ getSystemProperties()
}
class FullConnectorSession {
-Session session
-Identity identity
+FullConnectorSession(Session session, Identity identity)
+ConnectorSession forCatalog(String catalog)
+ConnectorSession forSchema(String schema)
+ConnectorSession forSessionProperty(String name, String value)
+Map~String,String~ getSystemProperties()
}
class NativeSidecarExpressionInterpreter {
-NodeManager nodeManager
-HttpClient httpClient
-JsonCodec~ExpressionOptimizationRequest~ expressionOptimizationRequestCodec
-JsonCodec~List~RowExpressionOptimizationResult~~ rowExpressionOptimizationResultJsonCodec
+NativeSidecarExpressionInterpreter(HttpClient httpClient, NodeManager nodeManager, JsonCodec~List~RowExpressionOptimizationResult~~ rowExpressionOptimizationResultJsonCodec, JsonCodec~ExpressionOptimizationRequest~ expressionOptimizationRequestCodec)
+Map~RowExpression,RowExpression~ optimizeBatch(ConnectorSession session, Map~RowExpression,RowExpression~ expressions, Level level)
+List~RowExpressionOptimizationResult~ optimize(ConnectorSession session, List~RowExpression~ expressions, Level level)
-Request getSidecarRequest(ConnectorSession session, Level level, List~RowExpression~ resolvedExpressions)
}
class ExpressionOptimizationRequest {
-List~RowExpression~ expressions
-Map~String,String~ sessionProperties
+ExpressionOptimizationRequest(List~RowExpression~ expressions, Map~String,String~ sessionProperties)
+List~RowExpression~ getExpressions()
+Map~String,String~ getSessionProperties()
}
class PrestoToVeloxQueryConfig {
+static unordered_map~String,String~ toVeloxConfigsFromSessionProperties(map~String,String~ sessionProperties)
+static velox::core::QueryConfig toVeloxConfigs(protocol::SessionRepresentation session)
+static velox::core::QueryConfig toVeloxConfigs(protocol::SessionRepresentation session, map~String,String~ extraCredentials)
}
ConnectorSession <|.. FullConnectorSession
FullConnectorSession --> Session : wraps
NativeSidecarExpressionInterpreter ..> ExpressionOptimizationRequest : creates and serializes
ExpressionOptimizationRequest o--> RowExpression : contains
ExpressionOptimizationRequest o--> "1" Map~String,String~ : sessionProperties
PrestoToVeloxQueryConfig ..> ExpressionOptimizationRequest : consumes sessionProperties
FullConnectorSession ..> PrestoToVeloxQueryConfig : provides systemProperties via getSystemProperties()
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
7837074 to
9b09f99
Compare
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
PrestoServer::getOptimizedExpressions, the JSONsessionPropertiesmap is blindly converted tostd::map<std::string, std::string>; consider validating that all values are strings (or coercing them explicitly) to avoid surprising runtime failures if a client sends non-string session property values. - You now require
sessionPropertiesin the optimizer request body and construct theQueryCtxbefore validating theexpressionsfield; if you expect non-upgraded clients or malformed payloads, you might want to validate both keys up front before creating theQueryCtxto fail faster and avoid unnecessary work.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `PrestoServer::getOptimizedExpressions`, the JSON `sessionProperties` map is blindly converted to `std::map<std::string, std::string>`; consider validating that all values are strings (or coercing them explicitly) to avoid surprising runtime failures if a client sends non-string session property values.
- You now require `sessionProperties` in the optimizer request body and construct the `QueryCtx` before validating the `expressions` field; if you expect non-upgraded clients or malformed payloads, you might want to validate both keys up front before creating the `QueryCtx` to fail faster and avoid unnecessary work.
## Individual Comments
### Comment 1
<location path="presto-native-execution/presto_cpp/main/PrestoToVeloxQueryConfig.cpp" line_range="77" />
<code_context>
+ // Construct query tracing regex and pass to Velox config.
+ // It replaces the given native_query_trace_task_reg_exp if also set.
+ if (traceFragmentId.has_value() || traceShardId.has_value()) {
+ queryConfigs.emplace(
+ velox::core::QueryConfig::kQueryTraceTaskRegExp,
</code_context>
<issue_to_address>
**issue (bug_risk):** Using emplace for kQueryTraceTaskRegExp prevents overriding an existing config value
Because `emplace` is a no-op when the key already exists, this won’t override a previously set `kQueryTraceTaskRegExp`, which contradicts the comment about replacing `native_query_trace_task_reg_exp`. Consider using `queryConfigs[velox::core::QueryConfig::kQueryTraceTaskRegExp] = ...` or `insert_or_assign` so the session-derived regex reliably overrides any existing value.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Changing the native optimizer HTTP endpoint from accepting a bare JSON array to a JSON object with
expressionsandsessionPropertieswill break any existing callers; consider keeping backward compatibility by accepting both formats for a deprecation period (e.g., handle the old array-only body ifexpressionsis missing). - The construction of Velox QueryConfig is now split between
toVeloxConfigsFromSessionPropertiesand manual additions inPrestoServer(timezone and adjust-timestamp flags); consider folding the timezone/adjustment handling into a single helper to keep all query-config derivation from session/sessionProperties centralized. - The native sidecar now relies on
ConnectorSession.getSystemProperties()returning real values, but the SPI default returns an empty map; consider either tightening the type used inNativeSidecarExpressionInterpreter(e.g., requiringFullConnectorSession) or clearly documenting/enforcing that connectors using native optimization must overridegetSystemPropertiesto avoid silent misconfiguration.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Changing the native optimizer HTTP endpoint from accepting a bare JSON array to a JSON object with `expressions` and `sessionProperties` will break any existing callers; consider keeping backward compatibility by accepting both formats for a deprecation period (e.g., handle the old array-only body if `expressions` is missing).
- The construction of Velox QueryConfig is now split between `toVeloxConfigsFromSessionProperties` and manual additions in `PrestoServer` (timezone and adjust-timestamp flags); consider folding the timezone/adjustment handling into a single helper to keep all query-config derivation from session/sessionProperties centralized.
- The native sidecar now relies on `ConnectorSession.getSystemProperties()` returning real values, but the SPI default returns an empty map; consider either tightening the type used in `NativeSidecarExpressionInterpreter` (e.g., requiring `FullConnectorSession`) or clearly documenting/enforcing that connectors using native optimization must override `getSystemProperties` to avoid silent misconfiguration.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
pramodsatya
left a comment
There was a problem hiding this comment.
Thanks @pdabre12, LGTM overall % one comment.
| auto queryConfig = velox::core::QueryConfig{std::move(config)}; | ||
|
|
||
| json input = json::parse(util::extractMessageBody(body)); | ||
| VELOX_USER_CHECK( |
There was a problem hiding this comment.
Could you translate input json to ExpressionOptimizationRequest via presto_protocol ? That way all these checks wont be necessary and protocol can ensure serde happens in the right format.
| } | ||
|
|
||
| @DataProvider(name = "fieldNamesInJsonCastEnabled") | ||
| private Object[][] fieldNamesInJsonCastEnabled() |
There was a problem hiding this comment.
nit: data provider can be reverted and these params can be maintained in a list in the test?
| } | ||
|
|
||
| @Test(dataProvider = "fieldNamesInJsonCastEnabled") | ||
| public void testSessionPropertiesPropagatingThroughOptimizer(String enabled, @Language("SQL") String expected) |
There was a problem hiding this comment.
nit: Would it be better to move this to TestNativeExpressionOptimizer ?
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pdabre12 for this code. Just have some nits.
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.h
Show resolved
Hide resolved
| } | ||
|
|
||
| @Test(dataProvider = "fieldNamesInJsonCastEnabled") | ||
| public void testSessionPropertiesPropagatingThroughOptimizer(String enabled, @Language("SQL") String expected) |
There was a problem hiding this comment.
Nit : Rename testSessionProperties
There was a problem hiding this comment.
Renamed and moved to TestNativeExpressionOptimizer, also keeping an e2e test in TestNativeSidecarPlugin.
4a8edba to
4cc691c
Compare
|
@aditi-pandit @pramodsatya Can you please have another look? I addressed the comments. |
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pdabre12. Just have a minor comment on the test. Code looks good otherwise.
...gin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionOptimizer.java
Show resolved
Hide resolved
4cc691c to
f5e9ee3
Compare
…pressionOptimizer
f5e9ee3 to
85b234f
Compare
Description
Currently, the
NativeExpressionOptimizerdoes not pass down system session properties, which may result in incorrect query results in a Prestissimo cluster. The optimizer should honor session properties defined at the session level. This change adds support for passing down system session properties inNativeExpressionOptimizer.Motivation and Context
Fixes: #27313
Impact
No user impact
Test Plan
CI, Unit tests
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
Summary by Sourcery
Propagate session properties from Presto to the native sidecar expression optimizer so that native optimization can honor query/session configuration.
New Features:
Bug Fixes:
Enhancements:
Summary by Sourcery
Propagate Presto session properties to the native sidecar expression optimizer so native optimization respects session-level configuration.
New Features:
Bug Fixes:
Enhancements:
Tests: