feat(native): Add support for NativeFunctionHandle parsing#26948
feat(native): Add support for NativeFunctionHandle parsing#26948pdabre12 merged 6 commits intoprestodb:masterfrom
Conversation
Reviewer's GuideThis PR standardizes native execution to use NativeFunctionHandle instead of SqlFunctionHandle for native functions across Java and C++ components, adds full JSON/protocol support for NativeFunctionHandle, updates expression and query plan translators to construct and consume these handles, and introduces tests (Java and C++) plus wiring for the native expression optimizer. Sequence diagram for constructing and using NativeFunctionHandle in native executionsequenceDiagram
actor VeloxEngine
participant VeloxToPrestoExprConverter
participant Util_getFunctionHandle
participant ProtocolJson
participant PrestoCoordinator
participant NativeFunctionNamespaceManager
VeloxEngine->>VeloxToPrestoExprConverter: getCallExpression(veloxExpr)
VeloxToPrestoExprConverter->>VeloxToPrestoExprConverter: build Signature from veloxExpr
VeloxToPrestoExprConverter->>Util_getFunctionHandle: getFunctionHandle(name, signature)
Util_getFunctionHandle->>Util_getFunctionHandle: parts = getFunctionNameParts(name)
alt builtin_function
Util_getFunctionHandle->>Util_getFunctionHandle: create BuiltInFunctionHandle
Util_getFunctionHandle->>Util_getFunctionHandle: handle._type = $static
Util_getFunctionHandle->>Util_getFunctionHandle: handle.signature = signature
else native_function
Util_getFunctionHandle->>Util_getFunctionHandle: create NativeFunctionHandle
Util_getFunctionHandle->>Util_getFunctionHandle: handle._type = native
Util_getFunctionHandle->>Util_getFunctionHandle: handle.signature = signature
end
Util_getFunctionHandle-->>VeloxToPrestoExprConverter: shared_ptr FunctionHandle
VeloxToPrestoExprConverter->>ProtocolJson: to_json(plan_with_FunctionHandle)
ProtocolJson->>ProtocolJson: to_json(FunctionHandle)
alt type_native
ProtocolJson->>ProtocolJson: static_pointer_cast NativeFunctionHandle
ProtocolJson->>ProtocolJson: to_json(NativeFunctionHandle)
else type_static
ProtocolJson->>ProtocolJson: static_pointer_cast BuiltInFunctionHandle
end
ProtocolJson-->>PrestoCoordinator: HTTP request with FunctionHandle JSON
PrestoCoordinator->>NativeFunctionNamespaceManager: getAggregateFunctionImplementation(functionHandle, typeManager)
NativeFunctionNamespaceManager->>NativeFunctionNamespaceManager: checkCatalog(functionHandle)
NativeFunctionNamespaceManager->>NativeFunctionNamespaceManager: checkArgument(functionHandle instanceof NativeFunctionHandle)
NativeFunctionNamespaceManager->>NativeFunctionNamespaceManager: processNativeFunctionHandle(NativeFunctionHandle, typeManager)
NativeFunctionNamespaceManager-->>PrestoCoordinator: AggregationFunctionImplementation
PrestoCoordinator-->>VeloxEngine: results via native execution path
Sequence diagram for consuming NativeFunctionHandle in PrestoToVelox converterssequenceDiagram
actor PrestoCoordinator
participant ProtocolJson
participant VeloxQueryPlanConverterBase
participant VeloxExprConverter
participant Util_getSignatureFromFunctionHandle
PrestoCoordinator->>ProtocolJson: send plan JSON with FunctionHandle(type=native)
ProtocolJson->>VeloxQueryPlanConverterBase: from_json(PlanNode)
VeloxQueryPlanConverterBase->>Util_getSignatureFromFunctionHandle: getSignatureFromFunctionHandle(functionHandle)
alt builtin_handle
Util_getSignatureFromFunctionHandle->>Util_getSignatureFromFunctionHandle: cast to BuiltInFunctionHandle
Util_getSignatureFromFunctionHandle-->>VeloxQueryPlanConverterBase: &builtin.signature
else native_handle
Util_getSignatureFromFunctionHandle->>Util_getSignatureFromFunctionHandle: cast to NativeFunctionHandle
Util_getSignatureFromFunctionHandle-->>VeloxQueryPlanConverterBase: &native.signature
else unsupported_handle
Util_getSignatureFromFunctionHandle-->>VeloxQueryPlanConverterBase: nullptr
end
VeloxQueryPlanConverterBase->>VeloxQueryPlanConverterBase: build aggregate.rawInputTypes from signature.argumentTypes
VeloxQueryPlanConverterBase-->>VeloxExprConverter: PrestoExpression with NativeFunctionHandle
VeloxExprConverter->>VeloxExprConverter: toVeloxExpr(pexpr)
alt BuiltInFunctionHandle
VeloxExprConverter->>VeloxExprConverter: getFunctionName(sqlFunctionHandle.functionId)
VeloxExprConverter-->>VeloxExprConverter: CallTypedExpr(returnType, args, name)
else NativeFunctionHandle
VeloxExprConverter->>VeloxExprConverter: getFunctionName(signature)
VeloxExprConverter-->>VeloxExprConverter: CallTypedExpr(returnType, args, name)
end
Class diagram for NativeFunctionHandle integration across Java and C++classDiagram
class FunctionHandle
class SqlFunctionHandle
class BuiltInFunctionHandle
class NativeFunctionHandle {
+Signature signature
+NativeFunctionHandle()
}
FunctionHandle <|-- SqlFunctionHandle
FunctionHandle <|-- BuiltInFunctionHandle
FunctionHandle <|-- NativeFunctionHandle
class NativeFunctionHandle_Resolver {
+static NativeFunctionHandle_Resolver INSTANCE
+static NativeFunctionHandle_Resolver getInstance()
+Class getFunctionHandleClass()
+String getId(FunctionHandle functionHandle)
+String getCatalogName(FunctionHandle functionHandle)
}
NativeFunctionHandle_Resolver --> NativeFunctionHandle : resolves
class NativeFunctionNamespaceManager {
+AggregationFunctionImplementation getAggregateFunctionImplementation(FunctionHandle functionHandle, TypeManager typeManager)
+FunctionMetadata fetchFunctionMetadataDirect(SqlFunctionHandle functionHandle)
+FunctionHandle getFunctionHandle(OptionalFunctionNamespaceTransactionHandle transactionHandle, Signature signature)
-AggregationFunctionImplementation processNativeFunctionHandle(NativeFunctionHandle nativeFunctionHandle, TypeManager typeManager)
-SqlInvokedFunction getSqlInvokedFunctionFromSignature(Signature signature)
-FunctionMetadata getMetadataFromNativeFunctionHandle(SqlFunctionHandle functionHandle)
}
class NativeFunctionNamespaceManagerFactory {
+static String NAME
-static NativeFunctionHandle_Resolver HANDLE_RESOLVER
+String getName()
+FunctionNamespaceManager create(FunctionNamespaceManagerContext context)
}
NativeFunctionNamespaceManagerFactory --> NativeFunctionHandle_Resolver : uses HANDLE_RESOLVER
NativeFunctionNamespaceManager ..> NativeFunctionHandle : expects_native
class Cpp_FunctionHandle {
<<abstract>>
+String _type
}
class Cpp_BuiltInFunctionHandle {
+Signature signature
}
class Cpp_NativeFunctionHandle {
+Signature signature
+NativeFunctionHandle() noexcept
}
Cpp_FunctionHandle <|-- Cpp_BuiltInFunctionHandle
Cpp_FunctionHandle <|-- Cpp_NativeFunctionHandle
class PrestoProtocolCoreCpp {
+void to_json(json j, shared_ptr~FunctionHandle~ p)
+void from_json(json j, shared_ptr~FunctionHandle~ p)
+void to_json(json j, NativeFunctionHandle p)
+void from_json(json j, NativeFunctionHandle p)
}
PrestoProtocolCoreCpp ..> Cpp_NativeFunctionHandle : serialize_deserialize
PrestoProtocolCoreCpp ..> Cpp_BuiltInFunctionHandle : serialize_deserialize
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
8d2acba to
d1855e1
Compare
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- In
VeloxToPrestoExpr.cpp::getFunctionHandle,util::getFunctionNameParts(name)is indexed atparts[0]andparts[1]without validating the number of segments, which can lead to undefined behavior for unexpected or malformed function names; consider checking the size before accessing. - Several expectations in
NativeFunctionHandleTestappear inconsistent with the type strings being parsed (e.g., expectingdecimal(p,s)andarray(decimal(10,2))to resolve toBIGINTelement types, oripaddress/ipprefixto map toBIGINT); it would be safer to either use concrete, supported type signatures that match the assertions or adjust the assertions to reflect the actual parsed types.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `VeloxToPrestoExpr.cpp::getFunctionHandle`, `util::getFunctionNameParts(name)` is indexed at `parts[0]` and `parts[1]` without validating the number of segments, which can lead to undefined behavior for unexpected or malformed function names; consider checking the size before accessing.
- Several expectations in `NativeFunctionHandleTest` appear inconsistent with the type strings being parsed (e.g., expecting `decimal(p,s)` and `array(decimal(10,2))` to resolve to `BIGINT` element types, or `ipaddress`/`ipprefix` to map to `BIGINT`); it would be safer to either use concrete, supported type signatures that match the assertions or adjust the assertions to reflect the actual parsed types.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:103-109` </location>
<code_context>
+
+// If the function name prefix starts from "presto.default", then it is a built
+// in function handle. Otherwise, it is a native function handle.
+std::shared_ptr<protocol::FunctionHandle> getFunctionHandle(
+ std::string& name,
+ protocol::Signature& signature) {
+ static constexpr char const* kStatic = "$static";
+ static constexpr char const* kNativeFunctionHandle = "native";
+
+ const auto parts = util::getFunctionNameParts(name);
+ if ((parts[0] == "presto") && (parts[1] == "default")) {
+ auto handle = std::make_shared<protocol::BuiltInFunctionHandle>();
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against unexpected function name formats when indexing `parts[0]` and `parts[1]`.
This assumes `util::getFunctionNameParts(name)` always returns at least two elements and then unconditionally uses `parts[0]` and `parts[1]`, which can cause out-of-range access if the name is missing catalog/schema or the parsing changes. Please check `parts.size()` before indexing and either use a safe default or fail with a clear error when the format is invalid.
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp:210` </location>
<code_context>
+ auto childArrayType =
+ std::dynamic_pointer_cast<const ArrayType>(argRowType->childAt(0));
+ ASSERT_NE(childArrayType, nullptr);
+ EXPECT_EQ(childArrayType->elementType()->kind(), TypeKind::BIGINT);
+
+ // Second child: map(bigint,varchar)
</code_context>
<issue_to_address>
**issue (testing):** The nested complex-type test asserts `BIGINT` for an `array(decimal(10,2))` element, which appears inconsistent with the declared type and may hide regressions.
In `parseNativeFunctionHandleWithNestedComplexTypes`, the JSON declares the first row child as `array(decimal(10,2))`, but the test asserts `TypeKind::BIGINT` for `childArrayType->elementType()->kind()`. This mismatch means the test may still pass even if decimal handling is broken. Update the expectation to the appropriate decimal kind (e.g., `TypeKind::DECIMAL`) and, if relevant, also assert precision and scale.
</issue_to_address>
### Comment 3
<location> `presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp:58-67` </location>
<code_context>
+ "min(38, p1 + 10)");
+
+ // Verify type parsing for return type
+ auto returnType =
+ typeParser_.parse(nativeFunctionHandle->signature.returnType);
+ EXPECT_EQ(returnType->kind(), TypeKind::BIGINT);
+
+ // Verify arg0 type: array(T)
</code_context>
<issue_to_address>
**issue (testing):** The mixed-constraints test expects `BIGINT` for a `decimal(p,s)` return type, which conflicts with the signature and weakens coverage.
In `parseNativeFunctionHandleWithMixedConstraints`, the signature declares `"returnType": "decimal(p,s)"`, but the test asserts `TypeKind::BIGINT` for the parsed return type. If this test is meant to validate decimal return-type parsing, it should assert `TypeKind::DECIMAL` (and ideally verify precision/scale or at least that it’s a decimal type); otherwise the test can pass even when decimal parsing is broken.
</issue_to_address>
### Comment 4
<location> `presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.cpp:460` </location>
<code_context>
{IcebergTableType::PARTITIONS, "PARTITIONS"},
{IcebergTableType::FILES, "FILES"},
{IcebergTableType::REFS, "REFS"},
+ {IcebergTableType::METADATA_LOG_ENTRIES, "METADATA_LOG_ENTRIES"},
{IcebergTableType::PROPERTIES, "PROPERTIES"},
{IcebergTableType::CHANGELOG, "CHANGELOG"},
</code_context>
<issue_to_address>
**issue (review_instructions):** The new IcebergTableType enumerator METADATA_LOG_ENTRIES does not follow the required kPascalCase naming convention for enumerators.
Per the style guide, enumerators should use kPascalCase (e.g., kMetadataLogEntries) instead of UPPER_SNAKE_CASE. Please rename the IcebergTableType value and update its uses accordingly in this and related files.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use kPascalCase for static constants and enumerators.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
The assumed |
d1855e1 to
d1cc1f6
Compare
|
@sourcery-ai review |
|
@tdcmeehan @aditi-pandit Can you please take a look? |
There was a problem hiding this comment.
Hey - I've found 8 issues, and left some high level feedback:
- In VeloxToPrestoExpr::getFunctionHandle, getFunctionNameParts(name) is indexed at parts[0] and parts[1] without checking the vector’s size, which can lead to out-of-bounds access for unexpected function names; consider validating the number of segments before indexing.
- Several expectations in NativeFunctionHandleTest appear inconsistent with the declared type signatures (e.g., expecting BIGINT for array(decimal(10,2))’s element type and for ipprefix/ipaddress-derived types); it would be better to align the asserted TypeKind with the actual logical types implied by the signatures or adjust the signatures to match the intended kinds.
- The helper getFunctionHandle in VeloxToPrestoExpr takes the function name by non-const reference but does not modify it; consider changing the parameter to const std::string& to better reflect its usage and avoid confusion.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In VeloxToPrestoExpr::getFunctionHandle, getFunctionNameParts(name) is indexed at parts[0] and parts[1] without checking the vector’s size, which can lead to out-of-bounds access for unexpected function names; consider validating the number of segments before indexing.
- Several expectations in NativeFunctionHandleTest appear inconsistent with the declared type signatures (e.g., expecting BIGINT for array(decimal(10,2))’s element type and for ipprefix/ipaddress-derived types); it would be better to align the asserted TypeKind with the actual logical types implied by the signatures or adjust the signatures to match the intended kinds.
- The helper getFunctionHandle in VeloxToPrestoExpr takes the function name by non-const reference but does not modify it; consider changing the parameter to const std::string& to better reflect its usage and avoid confusion.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:103-110` </location>
<code_context>
+
+// If the function name prefix starts from "presto.default", then it is a built
+// in function handle. Otherwise, it is a native function handle.
+std::shared_ptr<protocol::FunctionHandle> getFunctionHandle(
+ std::string& name,
+ protocol::Signature& signature) {
+ static constexpr char const* kStatic = "$static";
+ static constexpr char const* kNativeFunctionHandle = "native";
+
+ const auto parts = util::getFunctionNameParts(name);
+ if ((parts[0] == "presto") && (parts[1] == "default")) {
+ auto handle = std::make_shared<protocol::BuiltInFunctionHandle>();
+ handle->_type = kStatic;
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against unexpected function name formats before indexing into `parts`.
This helper assumes `util::getFunctionNameParts(name)` always returns at least 2 elements. If a name lacks the expected `presto.default` prefix, indexing `parts[0]` / `parts[1]` is undefined behavior. Please check `parts.size() >= 2` (and optionally validate non-empty contents) before indexing, and either choose a safe default handle type or fail with a clear error message.
</issue_to_address>
### Comment 2
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/functionNamespace/NativeFunctionNamespaceManager.java:255-258` </location>
<code_context>
}
- private SqlFunction getSqlFunctionFromSignature(Signature signature)
+ private SqlInvokedFunction getSqlInvokedFunctionFromSignature(Signature signature)
{
try {
- return specializedFunctionKeyCache.getUnchecked(signature).getFunction();
+ SqlFunction sqlFunction = specializedFunctionKeyCache.getUnchecked(signature).getFunction();
+ return (SqlInvokedFunction) sqlFunction;
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Unchecked cast to `SqlInvokedFunction` can surface as a `ClassCastException` instead of a PrestoException.
This method now assumes `specializedFunctionKeyCache.getUnchecked(signature).getFunction()` always returns a `SqlInvokedFunction`. If the cache ever holds another `SqlFunction` implementation (e.g., from misconfiguration or future changes), the cast will fail and skip `convertToPrestoException`. Please either check `instanceof SqlInvokedFunction` and throw a descriptive `PrestoException` on mismatch, or constrain the cache so it can only contain `SqlInvokedFunction` values.
</issue_to_address>
### Comment 3
<location> `presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp:205-210` </location>
<code_context>
+ ASSERT_NE(argRowType, nullptr);
+ EXPECT_EQ(argRowType->size(), 2);
+
+ // First child: array(decimal(10,2))
+ EXPECT_EQ(argRowType->childAt(0)->kind(), TypeKind::ARRAY);
+ auto childArrayType =
+ std::dynamic_pointer_cast<const ArrayType>(argRowType->childAt(0));
+ ASSERT_NE(childArrayType, nullptr);
+ EXPECT_EQ(childArrayType->elementType()->kind(), TypeKind::BIGINT);
+
+ // Second child: map(bigint,varchar)
</code_context>
<issue_to_address>
**issue (testing):** Test expectations for the `array(decimal(10,2))` field are inconsistent with the documented type.
The JSON/comment describe the first child as `array(decimal(10,2))`, but the test asserts `TypeKind::BIGINT` for the element type:
```cpp
EXPECT_EQ(childArrayType->elementType()->kind(), TypeKind::BIGINT);
```
This mismatch can hide type regressions. Either change the assertion to expect DECIMAL (and optionally verify precision/scale), or, if `decimal(10,2)` is intentionally represented as BIGINT, update the comment and JSON fixture to match the actual type so the test remains accurate and self-descriptive.
</issue_to_address>
### Comment 4
<location> `presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp:301-304` </location>
<code_context>
+ nativeFunctionHandle->signature.longVariableConstraints[0].expression,
+ "min(38, p1 + 10)");
+
+ // Verify type parsing for return type
+ auto returnType =
+ typeParser_.parse(nativeFunctionHandle->signature.returnType);
+ EXPECT_EQ(returnType->kind(), TypeKind::BIGINT);
+
+ // Verify arg0 type: array(T)
</code_context>
<issue_to_address>
**suggestion (testing):** The test for `returnType` of `decimal(p,s)` is likely asserting the wrong Velox type kind.
Here the signature uses:
```json
"returnType": "decimal(p,s)"
```
but the test asserts:
```cpp
auto returnType =
typeParser_.parse(nativeFunctionHandle->signature.returnType);
EXPECT_EQ(returnType->kind(), TypeKind::BIGINT);
```
That doesn’t match a decimal return type and risks either future breakage or encoding an implementation detail.
If `TypeParser` cannot handle symbolic `decimal(p,s)`, this test should avoid parsing it and instead:
- Assert `signature.returnType` equals the expected string; and
- Keep the focus on `typeVariableConstraints` / `longVariableConstraints`.
If `TypeParser` can parse it, then the expectation should be updated to `TypeKind::DECIMAL` rather than `BIGINT`.
```suggestion
// Verify return type encoding remains the expected symbolic decimal form.
// We intentionally do not parse symbolic "decimal(p,s)" here to avoid
// depending on implementation details of TypeParser; this test is focused
// on the signature shape and variable constraints.
EXPECT_EQ(nativeFunctionHandle->signature.returnType, "decimal(p,s)");
```
</issue_to_address>
### Comment 5
<location> `presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp:26-35` </location>
<code_context>
+ TypeParser typeParser_;
+};
+
+TEST_F(NativeFunctionHandleTest, parseNativeFunctionHandle) {
+ try {
+ const std::string str = R"JSON(
+ {
+ "@type": "native",
+ "signature": {
+ "name": "native.default.test",
+ "kind": "SCALAR",
+ "typeVariableConstraints": [],
+ "longVariableConstraints": [],
+ "returnType": "integer",
+ "argumentTypes": ["integer", "integer"],
+ "variableArity": true
+ }
+ }
+ )JSON";
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding explicit `to_json`/`from_json` round-trip coverage for `NativeFunctionHandle` and `FunctionHandle` pointers.
The existing test only covers `from_json` for `NativeFunctionHandle` via parsing a hard-coded JSON string. It doesn’t exercise:
* `to_json(const NativeFunctionHandle&)` itself; or
* The `to_json`/`from_json` overloads for `std::shared_ptr<FunctionHandle>` that special-case `"native"`.
Please add tests that:
1) Create a `NativeFunctionHandle` in C++, serialize with `json j = handle;` and assert `j["@type"] == "native"` and the `signature` fields.
2) Create a `std::shared_ptr<FunctionHandle>` pointing to a `NativeFunctionHandle`, serialize and deserialize, and verify `dynamic_pointer_cast<NativeFunctionHandle>` succeeds and the signature round-trips.
Suggested implementation:
```cpp
EXPECT_EQ(nativeFunctionHandle->signature.returnType, "integer");
}
TEST_F(NativeFunctionHandleTest, nativeFunctionHandleToJsonRoundTrip) {
protocol::FunctionSignature signature{
"native.default.test",
protocol::FunctionKind::SCALAR,
/*typeVariableConstraints*/ {},
/*longVariableConstraints*/ {},
/*returnType*/ "integer",
/*argumentTypes*/ {"integer", "integer"},
/*variableArity*/ true};
protocol::NativeFunctionHandle handle{signature};
// Serialize NativeFunctionHandle to json.
json j = handle;
ASSERT_TRUE(j.contains("@type"));
EXPECT_EQ(j["@type"], "native");
ASSERT_TRUE(j.contains("signature"));
const auto& sig = j["signature"];
EXPECT_EQ(sig.at("name"), "native.default.test");
EXPECT_EQ(sig.at("kind"), "SCALAR");
EXPECT_EQ(sig.at("returnType"), "integer");
EXPECT_EQ(
sig.at("argumentTypes"),
json::array({"integer", "integer"}));
EXPECT_TRUE(sig.at("variableArity"));
// Round-trip back to NativeFunctionHandle and verify the signature.
protocol::NativeFunctionHandle roundTripped = j;
EXPECT_EQ(roundTripped.signature.name, handle.signature.name);
EXPECT_EQ(roundTripped.signature.kind, handle.signature.kind);
EXPECT_EQ(roundTripped.signature.returnType, handle.signature.returnType);
EXPECT_EQ(
roundTripped.signature.argumentTypes,
handle.signature.argumentTypes);
EXPECT_EQ(
roundTripped.signature.variableArity,
handle.signature.variableArity);
}
TEST_F(
NativeFunctionHandleTest,
functionHandleSharedPtrNativeHandleToJsonRoundTrip) {
protocol::FunctionSignature signature{
"native.default.test",
protocol::FunctionKind::SCALAR,
/*typeVariableConstraints*/ {},
/*longVariableConstraints*/ {},
/*returnType*/ "integer",
/*argumentTypes*/ {"integer", "integer"},
/*variableArity*/ true};
auto nativeHandle =
std::make_shared<protocol::NativeFunctionHandle>(signature);
// Upcast to std::shared_ptr<FunctionHandle>.
std::shared_ptr<protocol::FunctionHandle> handle =
std::static_pointer_cast<protocol::FunctionHandle>(nativeHandle);
// Serialize shared_ptr<FunctionHandle> to json, exercising the "native" case.
json j = handle;
ASSERT_TRUE(j.contains("@type"));
EXPECT_EQ(j["@type"], "native");
// Round-trip back through shared_ptr<FunctionHandle>.
std::shared_ptr<protocol::FunctionHandle> roundTrippedHandle = j;
ASSERT_NE(roundTrippedHandle, nullptr);
auto roundTrippedNativeHandle =
std::dynamic_pointer_cast<protocol::NativeFunctionHandle>(
roundTrippedHandle);
ASSERT_NE(roundTrippedNativeHandle, nullptr);
// Verify that the signature also round-trips.
EXPECT_EQ(
roundTrippedNativeHandle->signature.name,
nativeHandle->signature.name);
EXPECT_EQ(
roundTrippedNativeHandle->signature.kind,
nativeHandle->signature.kind);
EXPECT_EQ(
roundTrippedNativeHandle->signature.returnType,
nativeHandle->signature.returnType);
EXPECT_EQ(
roundTrippedNativeHandle->signature.argumentTypes,
nativeHandle->signature.argumentTypes);
EXPECT_EQ(
roundTrippedNativeHandle->signature.variableArity,
nativeHandle->signature.variableArity);
}
```
I only see part of the file, so you may need to adjust a few details:
1. Ensure `protocol::FunctionSignature` and `protocol::NativeFunctionHandle` support the aggregate initialization used above. If the field order or constructor signature differs, switch to explicit field assignment:
```cpp
protocol::FunctionSignature signature;
signature.name = "native.default.test";
signature.kind = protocol::FunctionKind::SCALAR;
signature.returnType = "integer";
signature.argumentTypes = {"integer", "integer"};
signature.variableArity = true;
// ... set any additional required fields ...
```
and similarly construct `NativeFunctionHandle` appropriately.
2. If `json` is not already a visible alias for `nlohmann::json` in this file, add:
```cpp
using json = nlohmann::json;
```
near the top with the other `using` declarations.
3. If the existing `parseNativeFunctionHandle` test already has additional expectations and a closing brace after the `EXPECT_EQ(returnType, "integer");` line, move the two new tests so that they are placed after the closing brace of that test, rather than in the middle of it.
</issue_to_address>
### Comment 6
<location> `presto-native-execution/presto_cpp/presto_protocol/core/presto_protocol_core.cpp:7561` </location>
<code_context>
}
} // namespace facebook::presto::protocol
namespace facebook::presto::protocol {
+NativeFunctionHandle::NativeFunctionHandle() noexcept {
+ _type = "native";
+}
</code_context>
<issue_to_address>
**issue (review_instructions):** Member `_type` used in this constructor does not follow the camelCase_ convention for private/protected members (should be e.g. `type_`).
In the body of this constructor you assign to `_type`, which violates the convention that private/protected members use camelCase_ (e.g., `type_`). If possible, the member should be renamed in the class definition and all uses updated to match the guideline.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use camelCase_ for private and protected members variables.
</details>
</issue_to_address>
### Comment 7
<location> `presto-native-execution/presto_cpp/presto_protocol/core/presto_protocol_core.cpp:7578` </location>
<code_context>
+}
+
+void from_json(const json& j, NativeFunctionHandle& p) {
+ p._type = j["@type"];
+ from_json_key(
+ j,
</code_context>
<issue_to_address>
**nitpick (review_instructions):** Use of member `_type` here does not follow the camelCase_ naming convention for private/protected members.
This line assigns to `p._type`, which does not conform to the camelCase_ convention for private/protected members. The member should ideally be named something like `type_` to match the project’s style guidelines.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use camelCase_ for private and protected members variables.
</details>
</issue_to_address>
### Comment 8
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:111` </location>
<code_context>
+
+ const auto parts = util::getFunctionNameParts(name);
+ if ((parts[0] == "presto") && (parts[1] == "default")) {
+ auto handle = std::make_shared<protocol::BuiltInFunctionHandle>();
+ handle->_type = kStatic;
+ handle->signature = signature;
</code_context>
<issue_to_address>
**nitpick (review_instructions):** The subsequent assignments to `handle->_type` in this helper violate the camelCase_ convention for private/protected members.
Inside this helper, `handle->_type = kStatic;` and `handle->_type = kNativeFunctionHandle;` use the `_type` member, which doesn’t follow the camelCase_ convention for private/protected members. The underlying member should be renamed (e.g., `type_`) and all uses updated to conform to the guideline.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use camelCase_ for private and protected members variables.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp
Show resolved
Hide resolved
d1cc1f6 to
f4adf7c
Compare
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pdabre12 for these changes.
| subclasses: | ||
| - { name: BuiltInFunctionHandle, key: $static } | ||
| - { name: SqlFunctionHandle, key: native } | ||
| - { name: NativeFunctionHandle, key: native } |
There was a problem hiding this comment.
Adding this means introducing a dependency on presto-native-sidecar-plugin in presto-native-execution ? I don't see that dependency already, so curious about it.
There was a problem hiding this comment.
We do not introduce a dependency, we just need to give the file path of the Java class in the .yml files and generate the new protocol changes.
Eg:
Files from presto-native-sidecar-plugin have been used before: https://github.com/prestodb/presto/blob/master/presto-native-execution/presto_cpp/presto_protocol/core/presto_protocol_core.yml#L355
There was a problem hiding this comment.
This is interesting... Typically adding a pom dependency means the dependent module is compiled before. In our case too we want NativeFunctionHandle to compile correctly before doing this protocol generation. But right now we only care about file access in the code-base.
Since this is done for other files as well I'll let it be for for this PR. But lets check what is the precedent on this.
There was a problem hiding this comment.
I don’t think we add a POM dependency. Although adding a dependency typically ensures that the dependent module is compiled first, that isn’t required in this case.
The presto-protocol module reads Java source files directly. Using java-struct-to-json.py, those Java sources are converted to JSON, and then from JSON to C++. Because this workflow operates on source files rather than compiled classes, there’s no need for the module defining NativeFunctionHandle to be compiled beforehand.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/tests/NativeFunctionHandleTest.cpp
Outdated
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pdabre12. Just one small comment.
| subclasses: | ||
| - { name: BuiltInFunctionHandle, key: $static } | ||
| - { name: SqlFunctionHandle, key: native } | ||
| - { name: NativeFunctionHandle, key: native } |
There was a problem hiding this comment.
This is interesting... Typically adding a pom dependency means the dependent module is compiled before. In our case too we want NativeFunctionHandle to compile correctly before doing this protocol generation. But right now we only care about file access in the code-base.
Since this is done for other files as well I'll let it be for for this PR. But lets check what is the precedent on this.
...native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java
Show resolved
Hide resolved
|
@aditi-pandit Can you please have another look? The CI is now green. |
3a3013d to
dc491d1
Compare
|
@tdcmeehan Can you please stamp? |
Description
Add parsing logic for
NativeFunctionHandle. All functions underNativeFunctionNamespaceManagernow return and parseNativeFunctionHandleinstead ofSqlFunctionHandle.Motivation and Context
This change is required for
VeloxToPrestoExprwhen the native expression optimizer is enabled. During translation from a Velox row expression back to a Presto row expression, we must construct a function handle that is supported byNativeFunctionNamespaceManagerwhen the underlying function is native.To avoid parsing different function handle types, all functions under
NativeFunctionNamespaceManagernow consistently return and parseNativeFunctionHandle.Issues discovered by : #26903
Impact
There is no direct user-facing impact. This change only affects internal function handle construction and parsing.
Test Plan
CI, Unit tests.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.