feat(native): Add support for custom execution point in Rest functions#26636
feat(native): Add support for custom execution point in Rest functions#26636aditi-pandit merged 2 commits intoprestodb:masterfrom
Conversation
0c78658 to
901b73b
Compare
9bcded2 to
01eccf6
Compare
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In PrestoRestFunctionRegistration::registerFunction the comment about a read-only check "no lock needed for initial check" is now misleading since the registrationMutex_ is taken there; either remove the comment or change the locking pattern to match it.
- PrestoRestFunctionRegistration::urlEncode is now a private static helper that appears to be unused; consider removing it (and any related code) to avoid dead code.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In PrestoRestFunctionRegistration::registerFunction the comment about a read-only check "no lock needed for initial check" is now misleading since the registrationMutex_ is taken there; either remove the comment or change the locking pattern to match it.
- PrestoRestFunctionRegistration::urlEncode is now a private static helper that appears to be unused; consider removing it (and any related code) to avoid dead code.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:109-113` </location>
<code_context>
// Check if already registered (read-only, no lock needed for initial check)
{
- std::lock_guard<std::mutex> lock(registrationMutex);
- auto it = registeredFunctionHandles.find(functionId);
- if (it != registeredFunctionHandles.end() &&
+ std::lock_guard<std::mutex> lock(registrationMutex_);
+ auto it = registeredFunctionHandles_.find(functionId);
+ if (it != registeredFunctionHandles_.end() &&
it->second == serializedFunctionHandle) {
return;
</code_context>
<issue_to_address>
**suggestion:** Clarify locking strategy and reduce redundant mutex acquisitions in registerFunction.
We now take registrationMutex_ even for the initial read-only lookup, despite the comment saying no lock is needed. registerFunction also acquires registrationMutex_ three times (duplicate check, client lookup/insert, registration update), which increases contention and makes the comment inaccurate. Please either: (a) keep the initial duplicate check lock-free (or use a shared lock) and then take a single exclusive lock for mutations, or (b) take registrationMutex_ once for the whole critical section and update both maps under that lock, and update/remove the stale comment accordingly.
Suggested implementation:
```cpp
// Register function and associated client under a single lock to avoid
// redundant mutex acquisitions and ensure consistent updates to both
// registeredFunctionHandles_ and the remote client map.
{
std::lock_guard<std::mutex> lock(registrationMutex_);
auto it = registeredFunctionHandles_.find(functionId);
if (it != registeredFunctionHandles_.end() &&
it->second == serializedFunctionHandle) {
return;
}
// Get or create shared RestRemoteClient for this server URL
functions::rest::RestRemoteClientPtr remoteClient;
```
To fully implement the suggested locking strategy (option b: take registrationMutex_ once for the whole critical section), you should:
1. Merge the subsequent scopes that currently re-acquire registrationMutex_ into this single critical section:
- Locate any later blocks in registerFunction (or the containing function) that look like:
{
std::lock_guard<std::mutex> lock(registrationMutex_);
// logic to get/insert RestRemoteClient for remoteFunctionServerRestURL
// logic to update registeredFunctionHandles_ with serializedFunctionHandle
}
- Remove the extra std::lock_guard<std::mutex> lock(registrationMutex_); in those blocks.
- Move the body of those blocks inside the current single locked scope, directly after the "Get or create shared RestRemoteClient for this server URL" comment and remoteClient declaration introduced in the replacement block.
2. Ensure remoteClient is initialized and used entirely within the same locked scope:
- Right after the new remoteClient declaration, add logic like:
- Lookup in your map from URL to RestRemoteClientPtr.
- Create and insert a new client if none exists.
- Assign the found/created client to remoteClient.
- Keep any code that uses remoteClient (e.g., for registration or bookkeeping) inside this same { std::lock_guard<std::mutex> lock(registrationMutex_); ... } scope.
3. Update the final registration of the function handle under the same lock:
- Ensure that the line that updates registeredFunctionHandles_[functionId] = serializedFunctionHandle; (or equivalent) is moved into this single locked scope (if it is not already).
- Remove any separate blocks that update registeredFunctionHandles_ under a different acquisition of registrationMutex_.
4. After these changes:
- registrationMutex_ should be acquired exactly once in this function for the entire registration path.
- The comment at the top of the block now correctly describes the locking strategy, and the old "read-only, no lock needed for initial check" comment is fully removed.
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:25` </location>
<code_context>
+namespace facebook::presto::functions::remote::rest::test {
+namespace {
+
+class PrestoRestFunctionRegistrationTest : public ::testing::Test {
+ protected:
+ void SetUp() override {
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests that exercise registerFunction, not only getRemoteFunctionServerUrl
These tests cover executionEndpoint shapes for getRemoteFunctionServerUrl, but they never invoke registerFunction. Since the goal is to verify that registration respects a custom executionEndpoint, please add at least one test that:
- creates one RestFunctionHandle with executionEndpoint set and one without,
- calls PrestoRestFunctionRegistration::getInstance().registerFunction(...) for both,
- and asserts that the resulting serialized handle or RestRemoteClient/metadata uses the expected URL in each case.
If the internal state isn’t observable, consider a minimal test-only accessor or a friend test to inspect registeredFunctionHandles_ / restClients_ and confirm they are keyed by executionEndpoint vs the default URL.
Suggested implementation:
```cpp
class PrestoRestFunctionRegistrationTest : public ::testing::Test {
protected:
void SetUp() override {
auto systemConfig = SystemConfig::instance();
std::unordered_map<std::string, std::string> config{
{std::string(SystemConfig::kRemoteFunctionServerSerde),
std::string("presto_page")},
{std::string(SystemConfig::kRemoteFunctionServerRestURL),
std::string("http://localhost:8080")}};
systemConfig->initialize(
std::make_unique<velox::config::ConfigBase>(std::move(config), true));
}
};
TEST_F(
PrestoRestFunctionRegistrationTest,
RegisterFunctionUsesExecutionEndpointWhenPresent) {
using facebook::presto::functions::remote::rest::
PrestoRestFunctionRegistration;
using facebook::presto::functions::remote::rest::RestFunctionHandle;
auto& registration = PrestoRestFunctionRegistration::getInstance();
// Handle with a custom execution endpoint.
RestFunctionHandle handleWithEndpoint;
handleWithEndpoint.executionEndpoint = "http://localhost:9090/custom";
// Handle without a custom execution endpoint.
RestFunctionHandle handleWithoutEndpoint;
handleWithoutEndpoint.executionEndpoint = "";
const std::string functionWithEndpointName = "test_function_with_endpoint";
const std::string functionWithoutEndpointName = "test_function_without_endpoint";
// Register both functions; this exercises the registration logic, not just
// URL computation.
registration.registerFunction(functionWithEndpointName, handleWithEndpoint);
registration.registerFunction(functionWithoutEndpointName, handleWithoutEndpoint);
// Validate that the internal state is keyed by the effective execution
// endpoint URL: the custom URL for the first handle, and the default
// configured URL for the second handle.
const auto& registeredHandles =
registration.registeredFunctionHandlesForTest();
const auto& restClients = registration.restClientsForTest();
// Verify that the function with a custom execution endpoint is registered
// under that custom URL.
ASSERT_NE(
registeredHandles.find(functionWithEndpointName),
registeredHandles.end());
EXPECT_EQ(
registeredHandles.at(functionWithEndpointName).executionEndpoint,
"http://localhost:9090/custom");
ASSERT_NE(
restClients.find("http://localhost:9090/custom"), restClients.end());
// Verify that the function without a custom execution endpoint is
// registered under the default URL from SystemConfig.
ASSERT_NE(
registeredHandles.find(functionWithoutEndpointName),
registeredHandles.end());
auto systemConfig = SystemConfig::instance();
const auto defaultUrl =
systemConfig->remoteFunctionServerRestUrl().value_or(
"http://localhost:8080");
EXPECT_TRUE(
registeredHandles.at(functionWithoutEndpointName).executionEndpoint.empty());
ASSERT_NE(restClients.find(defaultUrl), restClients.end());
}
TEST_F(
PrestoRestFunctionRegistrationTest,
RegisterFunctionSerializedHandleUsesExpectedUrl) {
using facebook::presto::functions::remote::rest::
PrestoRestFunctionRegistration;
using facebook::presto::functions::remote::rest::RestFunctionHandle;
auto& registration = PrestoRestFunctionRegistration::getInstance();
RestFunctionHandle handleWithEndpoint;
handleWithEndpoint.executionEndpoint = "http://localhost:9090/custom2";
RestFunctionHandle handleWithoutEndpoint;
handleWithoutEndpoint.executionEndpoint = "";
const std::string functionWithEndpointName = "test_function_with_endpoint_serialized";
const std::string functionWithoutEndpointName =
"test_function_without_endpoint_serialized";
registration.registerFunction(functionWithEndpointName, handleWithEndpoint);
registration.registerFunction(functionWithoutEndpointName, handleWithoutEndpoint);
// Assuming PrestoRestFunctionRegistration exposes a way to obtain the
// serialized handle metadata for testing.
const auto serializedWithEndpoint =
registration.serializedHandleForTest(functionWithEndpointName);
const auto serializedWithoutEndpoint =
registration.serializedHandleForTest(functionWithoutEndpointName);
// The serialized handle for the function with an execution endpoint should
// embed the custom URL.
EXPECT_NE(
serializedWithEndpoint.find("http://localhost:9090/custom2"),
std::string::npos);
// The serialized handle for the function without an execution endpoint
// should embed the default URL.
auto systemConfig = SystemConfig::instance();
const auto defaultUrl =
systemConfig->remoteFunctionServerRestUrl().value_or(
"http://localhost:8080");
EXPECT_NE(
serializedWithoutEndpoint.find(defaultUrl), std::string::npos);
}
```
To make these tests compile and behave as intended, you will need to:
1) Adjust the RestFunctionHandle usage to match the real type:
- If RestFunctionHandle is not default-constructible or does not have a public member `executionEndpoint`, replace the field access with the appropriate constructor or setter.
- If the execution endpoint is represented differently (e.g., as an std::optional<std::string> or via an accessor), update the test code accordingly:
- For example, use `handleWithEndpoint.setExecutionEndpoint("...")` or `handleWithEndpoint.executionEndpoint = std::optional<std::string>{"..."}`.
2) Expose test-only access to the internal registration state:
- In `PrestoRestFunctionRegistration` (header and implementation), add const accessors or friend test declarations, for example:
- `const std::unordered_map<std::string, RestFunctionHandle>& registeredFunctionHandlesForTest() const;`
- `const std::unordered_map<std::string, std::shared_ptr<RestRemoteClient>>& restClientsForTest() const;`
- Implement these accessors to simply return references to the internal `registeredFunctionHandles_` and `restClients_` containers.
- Alternatively, declare `friend class ::facebook_presto_functions_remote_rest_test_PrestoRestFunctionRegistrationTest_RegisterFunctionUsesExecutionEndpointWhenPresent_Test;`
and similar for the second test, if you prefer direct access instead of accessors.
3) Provide a way to obtain a serialized handle for testing:
- If `PrestoRestFunctionRegistration` does not have `serializedHandleForTest`, add a test-only helper such as:
- `std::string serializedHandleForTest(const std::string& functionName) const;`
- Implement it by looking up the registered handle and using the existing serialization utilities (whatever is used in production to serialize a remote function handle) to produce a string representation that includes the effective URL.
4) Verify and adjust registerFunction’s signature:
- If the actual signature of `registerFunction` differs (e.g., it takes additional parameters like function name, argument types, return type, or metadata), update the test calls accordingly.
- Ensure that both calls exercise the path where `registerFunction` resolves the execution endpoint (using the custom endpoint vs the default URL), so that the tests genuinely validate registration behavior.
5) If the internal container keys differ:
- If `restClients_` is keyed by something other than URL (e.g., a struct containing URL and serde), adjust the `EXPECT_*` checks to look up using the correct key or to inspect the stored client’s configuration to verify the URL.
With these adjustments in the main code and type-specific fixes in the tests, you will have two tests that:
- Construct handles with and without `executionEndpoint`.
- Invoke `PrestoRestFunctionRegistration::getInstance().registerFunction(...)` for both.
- Assert that the resulting internal registration and serialized metadata use the expected URL in each case.
</issue_to_address>
### Comment 3
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:51-60` </location>
<code_context>
+ EXPECT_EQ(result, "http://localhost:8080");
+}
+
+TEST_F(
+ PrestoRestFunctionRegistrationTest,
+ getRemoteFunctionServerUrlWithoutExecutionEndpoint) {
+ // Test when executionEndpoint is not provided (nullopt)
+ const std::string jsonStr = R"JSON(
</code_context>
<issue_to_address>
**suggestion (testing):** Consider explicitly asserting the absence of executionEndpoint when it is omitted
In getRemoteFunctionServerUrlWithoutExecutionEndpoint, you confirm the URL falls back to the default but don’t check executionEndpoint itself. Please also assert that handle->executionEndpoint is std::nullopt (or otherwise unset) to verify that a missing field deserializes as absent, not as an empty string, and to guard against future JSON deserialization regressions.
Suggested implementation:
```cpp
// Ensure that a missing executionEndpoint field is deserialized as absent,
// not as an empty string or a default value.
EXPECT_FALSE(restFunctionHandle.executionEndpoint.has_value());
EXPECT_EQ(result, "http://localhost:8080");
}
```
- This change assumes that in `getRemoteFunctionServerUrlWithoutExecutionEndpoint` you deserialize into a variable named `restFunctionHandle` of type `protocol::RestFunctionHandle` and then call `getRemoteFunctionServerUrl(restFunctionHandle)`.
- If the variable has a different name (for example, `handle` or `functionHandle`), update the new assertion to match:
- `EXPECT_FALSE(handle->executionEndpoint.has_value());` if it is a pointer/`shared_ptr`
- or `EXPECT_FALSE(handle.executionEndpoint.has_value());` if it is a value.
- Ensure that `executionEndpoint` is a `std::optional<...>`-like field so that `.has_value()` is available; if not, adapt the assertion to the appropriate API (e.g., `EXPECT_EQ(std::nullopt, restFunctionHandle.executionEndpoint);` if that’s the existing style in this file).
</issue_to_address>
### Comment 4
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:30` </location>
<code_context>
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+ : kRemoteFunctionServerRestURL_(
+ SystemConfig::instance()->remoteFunctionServerRestURL()) {}
+
</code_context>
<issue_to_address>
**issue (review_instructions):** kRemoteFunctionServerRestURL_ appears to be a (likely private) member variable but uses kPascalCase_ instead of camelCase_, which violates the member naming convention.
Based on its use in the constructor’s initializer list, kRemoteFunctionServerRestURL_ looks like a non-static member field. Per the guidelines, private/protected member variables should be camelCase_ (e.g., remoteFunctionServerRestUrl_), while the kPascalCase prefix is reserved for static constants and enumerators. If this is indeed a member (not a static constant), please rename it to follow the camelCase_ convention.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
camelCase_ for private and protected members variables.
</details>
</issue_to_address>
### Comment 5
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:29` </location>
<code_context>
-// @throws VeloxException if the configured format is unknown.
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+ : kRemoteFunctionServerRestURL_(
+ SystemConfig::instance()->remoteFunctionServerRestURL()) {}
</code_context>
<issue_to_address>
**issue (review_instructions):** kRemoteFunctionServerRestURL_ appears to be a non-static member but is named like a static const (kPascalCase), which conflicts with the camelCase_ convention for member variables.
In the constructor’s member initializer list you’re initializing kRemoteFunctionServerRestURL_. Based on the naming, this looks like a regular (non-static) member, but per the style guide private/protected members should use camelCase_, and kPascalCase is reserved for static constants/enumerators. Consider renaming this member to something like remoteFunctionServerRestUrl_ (and, if it’s intended to be a static const, moving it and naming it kRemoteFunctionServerRestUrl instead).
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
camelCase_ for private and protected members variables. Use kPascalCase for static constants and enumerators.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
...tive-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp
Show resolved
Hide resolved
Reviewer's GuideRefactors REST remote function registration into a singleton manager that can resolve per-function execution endpoints, updates native expression conversion to use it, aligns namespaces for remote REST components under facebook::presto::functions::remote::rest, and adds targeted unit tests for resolving the remote function server URL from RestFunctionHandle (including custom executionEndpoint). Sequence diagram for REST remote function registration with custom executionEndpointsequenceDiagram
actor PrestoCoordinator
participant VeloxExprConverter
participant PrestoRestFunctionRegistration
participant RestRemoteClient
participant RemoteFunctionServer
PrestoCoordinator->>VeloxExprConverter: toVeloxExpr(pexpr)
VeloxExprConverter->>VeloxExprConverter: detect RestFunctionHandle
VeloxExprConverter->>PrestoRestFunctionRegistration: getInstance()
VeloxExprConverter->>PrestoRestFunctionRegistration: registerFunction(restFunctionHandle)
activate PrestoRestFunctionRegistration
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getRemoteFunctionServerUrl(restFunctionHandle)
alt executionEndpoint present and non empty
PrestoRestFunctionRegistration-->>PrestoRestFunctionRegistration: use restFunctionHandle.executionEndpoint
else default endpoint
PrestoRestFunctionRegistration-->>PrestoRestFunctionRegistration: use kRemoteFunctionServerRestURL_
end
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: serialize RestFunctionHandle with url
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: check registeredFunctionHandles_
alt already registered with same handle
PrestoRestFunctionRegistration-->>VeloxExprConverter: return
else new or changed registration
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: get or create RestRemoteClient for url
PrestoRestFunctionRegistration->>RestRemoteClient: constructor(url) [if new]
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: buildVeloxSignatureFromPrestoSignature
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getSerdeFormat
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getFunctionName(functionId)
PrestoRestFunctionRegistration->>RemoteFunctionServer: registerVeloxRemoteFunction(..., RestRemoteClient)
PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: update registeredFunctionHandles_
PrestoRestFunctionRegistration-->>VeloxExprConverter: return
end
deactivate PrestoRestFunctionRegistration
VeloxExprConverter-->>PrestoCoordinator: CallTypedExpr with registered function name
Flow diagram for resolving remote function server URL from RestFunctionHandleflowchart TD
A[Start registration
RestFunctionHandle input] --> B{executionEndpoint present?}
B -->|no| C[Use kRemoteFunctionServerRestURL_
from SystemConfig]
B -->|yes| D{executionEndpoint empty?}
D -->|yes| C
D -->|no| E[Use executionEndpoint
from RestFunctionHandle]
C --> F[Set url field in serialized
RestFunctionHandle to default URL]
E --> F
F --> G[Lookup or create RestRemoteClient
for resolved URL]
G --> H[Register Velox remote function
with RestRemoteClient]
H --> I[Update registeredFunctionHandles_]
I --> J[End]
File-Level Changes
Assessment against linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- PrestoRestFunctionRegistration caches kRemoteFunctionServerRestURL_ at singleton construction time, so if SystemConfig is reinitialized or the remoteFunctionServerRestURL changes at runtime the registration logic will continue using the old value; consider either documenting this snapshot behavior or resolving the URL from SystemConfig on each call instead.
- registerFunction currently takes multiple sequential locks on registrationMutex_ for checking existing registrations, obtaining a client, and updating the map; you could simplify and reduce contention by holding a single lock for all three operations or by clearly separating read-only and mutation paths (e.g., using a shared mutex).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- PrestoRestFunctionRegistration caches kRemoteFunctionServerRestURL_ at singleton construction time, so if SystemConfig is reinitialized or the remoteFunctionServerRestURL changes at runtime the registration logic will continue using the old value; consider either documenting this snapshot behavior or resolving the URL from SystemConfig on each call instead.
- registerFunction currently takes multiple sequential locks on registrationMutex_ for checking existing registrations, obtaining a client, and updating the map; you could simplify and reduce contention by holding a single lock for all three operations or by clearly separating read-only and mutation paths (e.g., using a shared mutex).
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:109` </location>
<code_context>
// Check if already registered (read-only, no lock needed for initial check)
{
- std::lock_guard<std::mutex> lock(registrationMutex);
</code_context>
<issue_to_address>
**nitpick:** The comment about not needing a lock no longer matches the code behavior.
This comment no longer matches the code: we now take `registrationMutex_` for this check to safely read the shared map. Please update or remove the comment so it reflects the current locking behavior.
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:29` </location>
<code_context>
-// @throws VeloxException if the configured format is unknown.
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+ : kRemoteFunctionServerRestURL_(
+ SystemConfig::instance()->remoteFunctionServerRestURL()) {}
</code_context>
<issue_to_address>
**suggestion (review_instructions):** The member name kRemoteFunctionServerRestURL_ doesn’t follow the specified naming convention for either member variables (camelCase_) or static constants (kPascalCase).
It looks like kRemoteFunctionServerRestURL_ is a (likely non-static) member variable, but its name mixes the "k" prefix (reserved for static constants) with a trailing underscore and PascalCase. To comply with the style guide, please either:
- rename it to camelCase_ (e.g., remoteFunctionServerRestUrl_) if it is a regular member, or
- if it is meant to be a static constant, use kPascalCase without the trailing underscore (e.g., kRemoteFunctionServerRestUrl) and make sure it is declared static const.
Right now it doesn’t clearly match either of the prescribed patterns.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use camelCase_ for private and protected members variables. Use kPascalCase for static constants and enumerators.
</details>
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp
Outdated
Show resolved
Hide resolved
|
@aditi-pandit, Can you please have a look? |
|
|
||
| } // namespace facebook::presto::functions::remote::rest | ||
|
|
||
| int main(int argc, char** argv) { |
There was a problem hiding this comment.
Don't think this code is needed.
Why are you making this an executable ? Can be avoided.
| } | ||
| )JSON"; | ||
|
|
||
| auto handle = parseRestFunctionHandle(jsonStr); |
There was a problem hiding this comment.
Add code to validate the function name and signature as well.
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham
Code looks good. Just have few review comments on the tests.
…ote functions and improve test coverage
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- The
PrestoRestFunctionRegistrationsingleton uses a single mutex to guard both the function-handle map and the client map; if this starts being used heavily, consider splitting the locks (or using finer-grained synchronization) to reduce contention aroundregisterFunction. - In
PrestoRestFunctionRegistrationTest::SetUp, you rely on configuringSystemConfigbeforegetInstance()is called; it may be safer to document or enforce thatPrestoRestFunctionRegistrationis only constructed after config initialization to avoid subtle order-of-initialization issues in other call sites.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `PrestoRestFunctionRegistration` singleton uses a single mutex to guard both the function-handle map and the client map; if this starts being used heavily, consider splitting the locks (or using finer-grained synchronization) to reduce contention around `registerFunction`.
- In `PrestoRestFunctionRegistrationTest::SetUp`, you rely on configuring `SystemConfig` before `getInstance()` is called; it may be safer to document or enforce that `PrestoRestFunctionRegistration` is only constructed after config initialization to avoid subtle order-of-initialization issues in other call sites.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:44-53` </location>
<code_context>
+ EXPECT_EQ(httpsResult, "https://secure-server:8443");
+}
+
+TEST_F(
+ PrestoRestFunctionRegistrationTest,
+ getRemoteFunctionServerUrlWithComplexUrls) {
+ // Test with URLs containing paths and query parameters
+ const std::string jsonStr = R"JSON(
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests that exercise `registerFunction` behavior, not only URL selection
The current tests thoroughly cover `getRemoteFunctionServerUrl`, but the new logic inside `PrestoRestFunctionRegistration::registerFunction` remains untested:
- Asserting the `url` field in the serialized handle matches `executionEndpoint` when set, and the default URL when unset/empty.
- Verifying the `RestRemoteClient` cache is keyed by URL (same client reused for the same endpoint, new clients for different endpoints).
- Confirming `registeredFunctionHandles_` prevents re-registration for identical handles but allows it when the handle changes.
Consider adding tests that call `registerFunction` with varying `RestFunctionHandle` instances and, via a test double or instrumentation on `registerVeloxRemoteFunction` / `RestRemoteClient`, validate the chosen URL, client reuse, and de-duplication behavior end-to-end.
Suggested implementation:
```cpp
}
};
TEST_F(
PrestoRestFunctionRegistrationTest,
registerFunctionUsesExecutionEndpointWhenProvided) {
// Handle with explicit executionEndpoint.
const std::string handleWithEndpointJson = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.testFunction;BIGINT",
"version": "v1",
"signature": {
"name": "testFunction",
"kind": "SCALAR",
"returnType": "bigint",
"argumentTypes": ["bigint"],
"typeVariableConstraints": [],
"variableArity" : false,
"literalParameters" : []
},
"remoteSource" : {
"@type" : "PrestoThriftServiceSource",
"httpClientConfig" : {
"useHttps" : true
},
"serverUrl" : "http://default-server:8080",
"executionEndpoint" : "https://exec-endpoint:8443"
}
}
)JSON";
// Handle without executionEndpoint (uses default URL).
const std::string handleWithoutEndpointJson = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.testFunction;BIGINT",
"version": "v1",
"signature": {
"name": "testFunction",
"kind": "SCALAR",
"returnType": "bigint",
"argumentTypes": ["bigint"],
"typeVariableConstraints": [],
"variableArity" : false,
"literalParameters" : []
},
"remoteSource" : {
"@type" : "PrestoThriftServiceSource",
"httpClientConfig" : {
"useHttps" : true
},
"serverUrl" : "https://default-server:8443"
}
}
)JSON";
auto handleWithEndpoint = parseRestFunctionHandle(handleWithEndpointJson);
auto handleWithoutEndpoint = parseRestFunctionHandle(handleWithoutEndpointJson);
// First registration should serialize the handle using executionEndpoint.
registration_.registerFunction(*handleWithEndpoint);
const auto& lastHandleWithEndpoint = lastRegisteredHandleForTesting();
EXPECT_EQ(
lastHandleWithEndpoint.url,
"https://exec-endpoint:8443");
// Second registration should use the default URL when executionEndpoint is
// not set.
registration_.registerFunction(*handleWithoutEndpoint);
const auto& lastHandleWithoutEndpoint = lastRegisteredHandleForTesting();
EXPECT_EQ(
lastHandleWithoutEndpoint.url,
"https://default-server:8443");
}
TEST_F(
PrestoRestFunctionRegistrationTest,
registerFunctionClientCachingAndDeduplication) {
const std::string baseHandleJson = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.testFunction;BIGINT",
"version": "v1",
"signature": {
"name": "testFunction",
"kind": "SCALAR",
"returnType": "bigint",
"argumentTypes": ["bigint"],
"typeVariableConstraints": [],
"variableArity" : false,
"literalParameters" : []
},
"remoteSource" : {
"@type" : "PrestoThriftServiceSource",
"httpClientConfig" : {
"useHttps" : true
},
"serverUrl" : "https://server-a:8443"
}
}
)JSON";
const std::string differentUrlHandleJson = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.testFunction;BIGINT",
"version": "v1",
"signature": {
"name": "testFunction",
"kind": "SCALAR",
"returnType": "bigint",
"argumentTypes": ["bigint"],
"typeVariableConstraints": [],
"variableArity" : false,
"literalParameters" : []
},
"remoteSource" : {
"@type" : "PrestoThriftServiceSource",
"httpClientConfig" : {
"useHttps" : true
},
"serverUrl" : "https://server-b:8443"
}
}
)JSON";
auto baseHandle = parseRestFunctionHandle(baseHandleJson);
auto differentUrlHandle = parseRestFunctionHandle(differentUrlHandleJson);
// First registration for server-a should create a new client and handle.
auto initialHandlesSize = registeredFunctionHandlesSizeForTesting();
registration_.registerFunction(*baseHandle);
EXPECT_EQ(
registeredFunctionHandlesSizeForTesting(),
initialHandlesSize + 1);
auto clientA1 = restClientForUrlForTesting("https://server-a:8443");
ASSERT_NE(clientA1, nullptr);
// Registering an identical handle should not register again and should reuse
// the same client.
registration_.registerFunction(*baseHandle);
EXPECT_EQ(
registeredFunctionHandlesSizeForTesting(),
initialHandlesSize + 1);
auto clientA2 = restClientForUrlForTesting("https://server-a:8443");
ASSERT_NE(clientA2, nullptr);
EXPECT_EQ(clientA1.get(), clientA2.get());
// Registering a handle pointing to a different URL should create a new
// client and a new registered handle entry.
registration_.registerFunction(*differentUrlHandle);
EXPECT_EQ(
registeredFunctionHandlesSizeForTesting(),
initialHandlesSize + 2);
auto clientB = restClientForUrlForTesting("https://server-b:8443");
ASSERT_NE(clientB, nullptr);
EXPECT_NE(clientA1.get(), clientB.get());
}
TEST_F(
PrestoRestFunctionRegistrationTest,
getRemoteFunctionServerUrlWithExecutionEndpoint) {
```
To make these tests compile and correctly exercise `registerFunction`, the following changes will be required elsewhere in the test fixture / production code:
1. In `PrestoRestFunctionRegistrationTest` (test fixture class in this file):
- Ensure there is a member under test named `registration_` of type `PrestoRestFunctionRegistration` (or adjust the test to the actual member name/type).
2. In `PrestoRestFunctionRegistration` (or via a small wrapper used only in tests), add testing accessors that expose the necessary state:
- `const SerializedRestFunctionHandle& lastRegisteredHandleForTesting() const;`
- Returns the last handle object passed into `registerVeloxRemoteFunction` (or the equivalent serialized representation).
- The type must provide a `std::string url` field that the tests assert on.
- `size_t registeredFunctionHandlesSizeForTesting() const;`
- Returns `registeredFunctionHandles_.size()` to verify de-duplication.
- `std::shared_ptr<velox::exec::remote::RestRemoteClient> restClientForUrlForTesting(const std::string& url) const;`
- Looks up the client in the internal cache keyed by URL and returns it (or `nullptr` if missing).
3. If you prefer not to add methods directly on `PrestoRestFunctionRegistration`, you can:
- Expose the above helpers as methods on `PrestoRestFunctionRegistrationTest` and implement them using `friend class PrestoRestFunctionRegistrationTest;` inside `PrestoRestFunctionRegistration` to access private members.
4. If the actual URL field name or type used in the serialized handle differs from `url`, adjust the expectations in `registerFunctionUsesExecutionEndpointWhenProvided` accordingly.
5. If `registerFunction` has a different signature (e.g., takes a `std::shared_ptr<protocol::RestFunctionHandle>`), update the calls in the tests to match the real signature.
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:30` </location>
<code_context>
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+ : kRemoteFunctionServerRestURL_(
+ SystemConfig::instance()->remoteFunctionServerRestURL()) {}
+
</code_context>
<issue_to_address>
**issue (review_instructions):** The member name `kRemoteFunctionServerRestURL_` violates the camelCase_ rule for private members and reserves the `k` prefix for static constants/enumerators.
Private/protected member variables should be named in `camelCase_`, and `kPascalCase` is reserved for static constants/enumerators. Here `kRemoteFunctionServerRestURL_` appears to be an instance member, so something like `remoteFunctionServerRestUrl_` (or similar camelCase_) would align with the naming guidelines.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`
**Instructions:**
Use camelCase_ for private and protected members variables.
Use kPascalCase for static constants and enumerators.
</details>
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Thanks @aditi-pandit, I have update the test case, can you please have a look? |
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham
Description
Add support for the native workers to route the custom
executionEndpointif present.Motivation and Context
Fixes
Impact
Allow native workers to send to execution servers provided by the Rest function Handle.
Test Plan
Unit tests included.
Contributor checklist
Release Notes