Skip to content

feat(native): Add support for custom execution point in Rest functions#26636

Merged
aditi-pandit merged 2 commits intoprestodb:masterfrom
Joe-Abraham:exec
Dec 16, 2025
Merged

feat(native): Add support for custom execution point in Rest functions#26636
aditi-pandit merged 2 commits intoprestodb:masterfrom
Joe-Abraham:exec

Conversation

@Joe-Abraham
Copy link
Contributor

@Joe-Abraham Joe-Abraham commented Nov 17, 2025

Description

Add support for the native workers to route the custom executionEndpoint if present.

Motivation and Context

Fixes

Impact

Allow native workers to send to execution servers provided by the Rest function Handle.

Test Plan

Unit tests included.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== NO RELEASE NOTE ==

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Nov 17, 2025
@Joe-Abraham Joe-Abraham force-pushed the exec branch 3 times, most recently from 0c78658 to 901b73b Compare November 24, 2025 05:07
@Joe-Abraham Joe-Abraham changed the title [WIP] feat(native): Add support for custom execution point feat(native): Add support for custom execution point Nov 24, 2025
@Joe-Abraham Joe-Abraham force-pushed the exec branch 2 times, most recently from 9bcded2 to 01eccf6 Compare November 24, 2025 09:00
@Joe-Abraham Joe-Abraham marked this pull request as ready for review November 24, 2025 09:04
@Joe-Abraham Joe-Abraham requested review from a team as code owners November 24, 2025 09:04
@prestodb-ci prestodb-ci requested review from a team, agrawalreetika and libianoss and removed request for a team November 24, 2025 09:04
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • In PrestoRestFunctionRegistration::registerFunction the comment about a read-only check "no lock needed for initial check" is now misleading since the registrationMutex_ is taken there; either remove the comment or change the locking pattern to match it.
  • PrestoRestFunctionRegistration::urlEncode is now a private static helper that appears to be unused; consider removing it (and any related code) to avoid dead code.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In PrestoRestFunctionRegistration::registerFunction the comment about a read-only check "no lock needed for initial check" is now misleading since the registrationMutex_ is taken there; either remove the comment or change the locking pattern to match it.
- PrestoRestFunctionRegistration::urlEncode is now a private static helper that appears to be unused; consider removing it (and any related code) to avoid dead code.

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:109-113` </location>
<code_context>

   // Check if already registered (read-only, no lock needed for initial check)
   {
-    std::lock_guard<std::mutex> lock(registrationMutex);
-    auto it = registeredFunctionHandles.find(functionId);
-    if (it != registeredFunctionHandles.end() &&
+    std::lock_guard<std::mutex> lock(registrationMutex_);
+    auto it = registeredFunctionHandles_.find(functionId);
+    if (it != registeredFunctionHandles_.end() &&
         it->second == serializedFunctionHandle) {
       return;
</code_context>

<issue_to_address>
**suggestion:** Clarify locking strategy and reduce redundant mutex acquisitions in registerFunction.

We now take registrationMutex_ even for the initial read-only lookup, despite the comment saying no lock is needed. registerFunction also acquires registrationMutex_ three times (duplicate check, client lookup/insert, registration update), which increases contention and makes the comment inaccurate. Please either: (a) keep the initial duplicate check lock-free (or use a shared lock) and then take a single exclusive lock for mutations, or (b) take registrationMutex_ once for the whole critical section and update both maps under that lock, and update/remove the stale comment accordingly.

Suggested implementation:

```cpp
  // Register function and associated client under a single lock to avoid
  // redundant mutex acquisitions and ensure consistent updates to both
  // registeredFunctionHandles_ and the remote client map.
  {
    std::lock_guard<std::mutex> lock(registrationMutex_);
    auto it = registeredFunctionHandles_.find(functionId);
    if (it != registeredFunctionHandles_.end() &&
        it->second == serializedFunctionHandle) {
      return;
    }

    // Get or create shared RestRemoteClient for this server URL
    functions::rest::RestRemoteClientPtr remoteClient;

```

To fully implement the suggested locking strategy (option b: take registrationMutex_ once for the whole critical section), you should:

1. Merge the subsequent scopes that currently re-acquire registrationMutex_ into this single critical section:
   - Locate any later blocks in registerFunction (or the containing function) that look like:
     {
       std::lock_guard<std::mutex> lock(registrationMutex_);
       // logic to get/insert RestRemoteClient for remoteFunctionServerRestURL
       // logic to update registeredFunctionHandles_ with serializedFunctionHandle
     }
   - Remove the extra std::lock_guard<std::mutex> lock(registrationMutex_); in those blocks.
   - Move the body of those blocks inside the current single locked scope, directly after the "Get or create shared RestRemoteClient for this server URL" comment and remoteClient declaration introduced in the replacement block.

2. Ensure remoteClient is initialized and used entirely within the same locked scope:
   - Right after the new remoteClient declaration, add logic like:
     - Lookup in your map from URL to RestRemoteClientPtr.
     - Create and insert a new client if none exists.
     - Assign the found/created client to remoteClient.
   - Keep any code that uses remoteClient (e.g., for registration or bookkeeping) inside this same { std::lock_guard<std::mutex> lock(registrationMutex_); ... } scope.

3. Update the final registration of the function handle under the same lock:
   - Ensure that the line that updates registeredFunctionHandles_[functionId] = serializedFunctionHandle; (or equivalent) is moved into this single locked scope (if it is not already).
   - Remove any separate blocks that update registeredFunctionHandles_ under a different acquisition of registrationMutex_.

4. After these changes:
   - registrationMutex_ should be acquired exactly once in this function for the entire registration path.
   - The comment at the top of the block now correctly describes the locking strategy, and the old "read-only, no lock needed for initial check" comment is fully removed.
</issue_to_address>

### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:25` </location>
<code_context>
+namespace facebook::presto::functions::remote::rest::test {
+namespace {
+
+class PrestoRestFunctionRegistrationTest : public ::testing::Test {
+ protected:
+  void SetUp() override {
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests that exercise registerFunction, not only getRemoteFunctionServerUrl

These tests cover executionEndpoint shapes for getRemoteFunctionServerUrl, but they never invoke registerFunction. Since the goal is to verify that registration respects a custom executionEndpoint, please add at least one test that:
- creates one RestFunctionHandle with executionEndpoint set and one without,
- calls PrestoRestFunctionRegistration::getInstance().registerFunction(...) for both,
- and asserts that the resulting serialized handle or RestRemoteClient/metadata uses the expected URL in each case.
If the internal state isn’t observable, consider a minimal test-only accessor or a friend test to inspect registeredFunctionHandles_ / restClients_ and confirm they are keyed by executionEndpoint vs the default URL.

Suggested implementation:

```cpp
class PrestoRestFunctionRegistrationTest : public ::testing::Test {
 protected:
  void SetUp() override {
    auto systemConfig = SystemConfig::instance();
    std::unordered_map<std::string, std::string> config{
        {std::string(SystemConfig::kRemoteFunctionServerSerde),
         std::string("presto_page")},
        {std::string(SystemConfig::kRemoteFunctionServerRestURL),
         std::string("http://localhost:8080")}};
    systemConfig->initialize(
        std::make_unique<velox::config::ConfigBase>(std::move(config), true));
  }
};

TEST_F(
    PrestoRestFunctionRegistrationTest,
    RegisterFunctionUsesExecutionEndpointWhenPresent) {
  using facebook::presto::functions::remote::rest::
      PrestoRestFunctionRegistration;
  using facebook::presto::functions::remote::rest::RestFunctionHandle;

  auto& registration = PrestoRestFunctionRegistration::getInstance();

  // Handle with a custom execution endpoint.
  RestFunctionHandle handleWithEndpoint;
  handleWithEndpoint.executionEndpoint = "http://localhost:9090/custom";

  // Handle without a custom execution endpoint.
  RestFunctionHandle handleWithoutEndpoint;
  handleWithoutEndpoint.executionEndpoint = "";

  const std::string functionWithEndpointName = "test_function_with_endpoint";
  const std::string functionWithoutEndpointName = "test_function_without_endpoint";

  // Register both functions; this exercises the registration logic, not just
  // URL computation.
  registration.registerFunction(functionWithEndpointName, handleWithEndpoint);
  registration.registerFunction(functionWithoutEndpointName, handleWithoutEndpoint);

  // Validate that the internal state is keyed by the effective execution
  // endpoint URL: the custom URL for the first handle, and the default
  // configured URL for the second handle.
  const auto& registeredHandles =
      registration.registeredFunctionHandlesForTest();
  const auto& restClients = registration.restClientsForTest();

  // Verify that the function with a custom execution endpoint is registered
  // under that custom URL.
  ASSERT_NE(
      registeredHandles.find(functionWithEndpointName),
      registeredHandles.end());
  EXPECT_EQ(
      registeredHandles.at(functionWithEndpointName).executionEndpoint,
      "http://localhost:9090/custom");

  ASSERT_NE(
      restClients.find("http://localhost:9090/custom"), restClients.end());

  // Verify that the function without a custom execution endpoint is
  // registered under the default URL from SystemConfig.
  ASSERT_NE(
      registeredHandles.find(functionWithoutEndpointName),
      registeredHandles.end());

  auto systemConfig = SystemConfig::instance();
  const auto defaultUrl =
      systemConfig->remoteFunctionServerRestUrl().value_or(
          "http://localhost:8080");

  EXPECT_TRUE(
      registeredHandles.at(functionWithoutEndpointName).executionEndpoint.empty());
  ASSERT_NE(restClients.find(defaultUrl), restClients.end());
}

TEST_F(
    PrestoRestFunctionRegistrationTest,
    RegisterFunctionSerializedHandleUsesExpectedUrl) {
  using facebook::presto::functions::remote::rest::
      PrestoRestFunctionRegistration;
  using facebook::presto::functions::remote::rest::RestFunctionHandle;

  auto& registration = PrestoRestFunctionRegistration::getInstance();

  RestFunctionHandle handleWithEndpoint;
  handleWithEndpoint.executionEndpoint = "http://localhost:9090/custom2";

  RestFunctionHandle handleWithoutEndpoint;
  handleWithoutEndpoint.executionEndpoint = "";

  const std::string functionWithEndpointName = "test_function_with_endpoint_serialized";
  const std::string functionWithoutEndpointName =
      "test_function_without_endpoint_serialized";

  registration.registerFunction(functionWithEndpointName, handleWithEndpoint);
  registration.registerFunction(functionWithoutEndpointName, handleWithoutEndpoint);

  // Assuming PrestoRestFunctionRegistration exposes a way to obtain the
  // serialized handle metadata for testing.
  const auto serializedWithEndpoint =
      registration.serializedHandleForTest(functionWithEndpointName);
  const auto serializedWithoutEndpoint =
      registration.serializedHandleForTest(functionWithoutEndpointName);

  // The serialized handle for the function with an execution endpoint should
  // embed the custom URL.
  EXPECT_NE(
      serializedWithEndpoint.find("http://localhost:9090/custom2"),
      std::string::npos);

  // The serialized handle for the function without an execution endpoint
  // should embed the default URL.
  auto systemConfig = SystemConfig::instance();
  const auto defaultUrl =
      systemConfig->remoteFunctionServerRestUrl().value_or(
          "http://localhost:8080");

  EXPECT_NE(
      serializedWithoutEndpoint.find(defaultUrl), std::string::npos);
}

```

To make these tests compile and behave as intended, you will need to:

1) Adjust the RestFunctionHandle usage to match the real type:
   - If RestFunctionHandle is not default-constructible or does not have a public member `executionEndpoint`, replace the field access with the appropriate constructor or setter.
   - If the execution endpoint is represented differently (e.g., as an std::optional<std::string> or via an accessor), update the test code accordingly:
     - For example, use `handleWithEndpoint.setExecutionEndpoint("...")` or `handleWithEndpoint.executionEndpoint = std::optional<std::string>{"..."}`.

2) Expose test-only access to the internal registration state:
   - In `PrestoRestFunctionRegistration` (header and implementation), add const accessors or friend test declarations, for example:
     - `const std::unordered_map<std::string, RestFunctionHandle>& registeredFunctionHandlesForTest() const;`
     - `const std::unordered_map<std::string, std::shared_ptr<RestRemoteClient>>& restClientsForTest() const;`
   - Implement these accessors to simply return references to the internal `registeredFunctionHandles_` and `restClients_` containers.
   - Alternatively, declare `friend class ::facebook_presto_functions_remote_rest_test_PrestoRestFunctionRegistrationTest_RegisterFunctionUsesExecutionEndpointWhenPresent_Test;`
     and similar for the second test, if you prefer direct access instead of accessors.

3) Provide a way to obtain a serialized handle for testing:
   - If `PrestoRestFunctionRegistration` does not have `serializedHandleForTest`, add a test-only helper such as:
     - `std::string serializedHandleForTest(const std::string& functionName) const;`
   - Implement it by looking up the registered handle and using the existing serialization utilities (whatever is used in production to serialize a remote function handle) to produce a string representation that includes the effective URL.

4) Verify and adjust registerFunction’s signature:
   - If the actual signature of `registerFunction` differs (e.g., it takes additional parameters like function name, argument types, return type, or metadata), update the test calls accordingly.
   - Ensure that both calls exercise the path where `registerFunction` resolves the execution endpoint (using the custom endpoint vs the default URL), so that the tests genuinely validate registration behavior.

5) If the internal container keys differ:
   - If `restClients_` is keyed by something other than URL (e.g., a struct containing URL and serde), adjust the `EXPECT_*` checks to look up using the correct key or to inspect the stored client’s configuration to verify the URL.

With these adjustments in the main code and type-specific fixes in the tests, you will have two tests that:
- Construct handles with and without `executionEndpoint`.
- Invoke `PrestoRestFunctionRegistration::getInstance().registerFunction(...)` for both.
- Assert that the resulting internal registration and serialized metadata use the expected URL in each case.
</issue_to_address>

### Comment 3
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:51-60` </location>
<code_context>
+  EXPECT_EQ(result, "http://localhost:8080");
+}
+
+TEST_F(
+    PrestoRestFunctionRegistrationTest,
+    getRemoteFunctionServerUrlWithoutExecutionEndpoint) {
+  // Test when executionEndpoint is not provided (nullopt)
+  const std::string jsonStr = R"JSON(
</code_context>

<issue_to_address>
**suggestion (testing):** Consider explicitly asserting the absence of executionEndpoint when it is omitted

In getRemoteFunctionServerUrlWithoutExecutionEndpoint, you confirm the URL falls back to the default but don’t check executionEndpoint itself. Please also assert that handle->executionEndpoint is std::nullopt (or otherwise unset) to verify that a missing field deserializes as absent, not as an empty string, and to guard against future JSON deserialization regressions.

Suggested implementation:

```cpp
  // Ensure that a missing executionEndpoint field is deserialized as absent,
  // not as an empty string or a default value.
  EXPECT_FALSE(restFunctionHandle.executionEndpoint.has_value());

  EXPECT_EQ(result, "http://localhost:8080");
}

```

- This change assumes that in `getRemoteFunctionServerUrlWithoutExecutionEndpoint` you deserialize into a variable named `restFunctionHandle` of type `protocol::RestFunctionHandle` and then call `getRemoteFunctionServerUrl(restFunctionHandle)`.
- If the variable has a different name (for example, `handle` or `functionHandle`), update the new assertion to match:
  - `EXPECT_FALSE(handle->executionEndpoint.has_value());` if it is a pointer/`shared_ptr`
  - or `EXPECT_FALSE(handle.executionEndpoint.has_value());` if it is a value.
- Ensure that `executionEndpoint` is a `std::optional<...>`-like field so that `.has_value()` is available; if not, adapt the assertion to the appropriate API (e.g., `EXPECT_EQ(std::nullopt, restFunctionHandle.executionEndpoint);` if that’s the existing style in this file).
</issue_to_address>

### Comment 4
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:30` </location>
<code_context>
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+    : kRemoteFunctionServerRestURL_(
+          SystemConfig::instance()->remoteFunctionServerRestURL()) {}
+
</code_context>

<issue_to_address>
**issue (review_instructions):** kRemoteFunctionServerRestURL_ appears to be a (likely private) member variable but uses kPascalCase_ instead of camelCase_, which violates the member naming convention.

Based on its use in the constructor’s initializer list, kRemoteFunctionServerRestURL_ looks like a non-static member field. Per the guidelines, private/protected member variables should be camelCase_ (e.g., remoteFunctionServerRestUrl_), while the kPascalCase prefix is reserved for static constants and enumerators. If this is indeed a member (not a static constant), please rename it to follow the camelCase_ convention.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`

**Instructions:**
camelCase_ for private and protected members variables.

</details>
</issue_to_address>

### Comment 5
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:29` </location>
<code_context>
-// @throws VeloxException if the configured format is unknown.
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+    : kRemoteFunctionServerRestURL_(
+          SystemConfig::instance()->remoteFunctionServerRestURL()) {}
</code_context>

<issue_to_address>
**issue (review_instructions):** kRemoteFunctionServerRestURL_ appears to be a non-static member but is named like a static const (kPascalCase), which conflicts with the camelCase_ convention for member variables.

In the constructor’s member initializer list you’re initializing kRemoteFunctionServerRestURL_. Based on the naming, this looks like a regular (non-static) member, but per the style guide private/protected members should use camelCase_, and kPascalCase is reserved for static constants/enumerators. Consider renaming this member to something like remoteFunctionServerRestUrl_ (and, if it’s intended to be a static const, moving it and naming it kRemoteFunctionServerRestUrl instead).

<details>
<summary>Review instructions:</summary>

**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`

**Instructions:**
camelCase_ for private and protected members variables. Use kPascalCase for static constants and enumerators.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Joe-Abraham Joe-Abraham marked this pull request as draft November 24, 2025 09:06
@prestodb prestodb deleted a comment from sourcery-ai bot Nov 24, 2025
@Joe-Abraham Joe-Abraham marked this pull request as ready for review November 24, 2025 11:20
@prestodb-ci prestodb-ci requested a review from a team November 24, 2025 11:20
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Nov 24, 2025

Reviewer's Guide

Refactors REST remote function registration into a singleton manager that can resolve per-function execution endpoints, updates native expression conversion to use it, aligns namespaces for remote REST components under facebook::presto::functions::remote::rest, and adds targeted unit tests for resolving the remote function server URL from RestFunctionHandle (including custom executionEndpoint).

Sequence diagram for REST remote function registration with custom executionEndpoint

sequenceDiagram
  actor PrestoCoordinator
  participant VeloxExprConverter
  participant PrestoRestFunctionRegistration
  participant RestRemoteClient
  participant RemoteFunctionServer

  PrestoCoordinator->>VeloxExprConverter: toVeloxExpr(pexpr)
  VeloxExprConverter->>VeloxExprConverter: detect RestFunctionHandle
  VeloxExprConverter->>PrestoRestFunctionRegistration: getInstance()
  VeloxExprConverter->>PrestoRestFunctionRegistration: registerFunction(restFunctionHandle)

  activate PrestoRestFunctionRegistration
  PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getRemoteFunctionServerUrl(restFunctionHandle)
  alt executionEndpoint present and non empty
    PrestoRestFunctionRegistration-->>PrestoRestFunctionRegistration: use restFunctionHandle.executionEndpoint
  else default endpoint
    PrestoRestFunctionRegistration-->>PrestoRestFunctionRegistration: use kRemoteFunctionServerRestURL_
  end

  PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: serialize RestFunctionHandle with url
  PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: check registeredFunctionHandles_
  alt already registered with same handle
    PrestoRestFunctionRegistration-->>VeloxExprConverter: return
  else new or changed registration
    PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: get or create RestRemoteClient for url
    PrestoRestFunctionRegistration->>RestRemoteClient: constructor(url) [if new]
    PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: buildVeloxSignatureFromPrestoSignature
    PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getSerdeFormat
    PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: getFunctionName(functionId)
    PrestoRestFunctionRegistration->>RemoteFunctionServer: registerVeloxRemoteFunction(..., RestRemoteClient)
    PrestoRestFunctionRegistration->>PrestoRestFunctionRegistration: update registeredFunctionHandles_
    PrestoRestFunctionRegistration-->>VeloxExprConverter: return
  end
  deactivate PrestoRestFunctionRegistration

  VeloxExprConverter-->>PrestoCoordinator: CallTypedExpr with registered function name
Loading

Flow diagram for resolving remote function server URL from RestFunctionHandle

flowchart TD
  A[Start registration
RestFunctionHandle input] --> B{executionEndpoint present?}
  B -->|no| C[Use kRemoteFunctionServerRestURL_
from SystemConfig]
  B -->|yes| D{executionEndpoint empty?}
  D -->|yes| C
  D -->|no| E[Use executionEndpoint
from RestFunctionHandle]

  C --> F[Set url field in serialized
RestFunctionHandle to default URL]
  E --> F
  F --> G[Lookup or create RestRemoteClient
for resolved URL]
  G --> H[Register Velox remote function
with RestRemoteClient]
  H --> I[Update registeredFunctionHandles_]
  I --> J[End]
Loading

File-Level Changes

Change Details Files
Introduce PrestoRestFunctionRegistration singleton that manages REST remote function registration and supports per-handle executionEndpoint overrides.
  • Replace the free function registration API with a PrestoRestFunctionRegistration class exposing getInstance() and registerFunction().
  • Move getSerdeFormat, urlEncode, getFunctionName, and signature-building helpers into the class as static/private methods.
  • Add internal mutex and maps to cache serialized function handles and RestRemoteClient instances, and store the default remote function server base URL from SystemConfig.
  • Implement getRemoteFunctionServerUrl to prefer a non-empty executionEndpoint from RestFunctionHandle, falling back to the configured default URL, and use it when serializing the handle and choosing/creating the RestRemoteClient.
presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.h
presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp
Wire the new registration manager into the Presto-to-Velox expression conversion path.
  • Update VeloxExprConverter to call PrestoRestFunctionRegistration::getInstance().registerFunction() instead of the old registerRestRemoteFunction free function when handling RestFunctionHandle expressions.
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Normalize namespaces for REST remote function components under facebook::presto::functions::remote::rest.
  • Change namespaces for RestRemoteFunction, RestRemoteClient, tests, and test server helpers/handlers from facebook::presto::functions::rest (and sub-namespaces) to facebook::presto::functions::remote::rest (and ::test).
  • Adjust usages and type aliases (e.g., RestRemoteClientPtr, VeloxRemoteFunctionMetadata) and registration calls to match the new namespace layout.
presto-native-execution/presto_cpp/main/functions/remote/RestRemoteFunction.cpp
presto-native-execution/presto_cpp/main/functions/remote/RestRemoteFunction.h
presto-native-execution/presto_cpp/main/functions/remote/client/RestRemoteClient.cpp
presto-native-execution/presto_cpp/main/functions/remote/client/RestRemoteClient.h
presto-native-execution/presto_cpp/main/functions/remote/tests/RemoteFunctionRestTest.cpp
presto-native-execution/presto_cpp/main/functions/remote/tests/server/RemoteFunctionRestHandler.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/RemoteFunctionRestService.cpp
presto-native-execution/presto_cpp/main/functions/remote/tests/server/RemoteFunctionRestService.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/RestFunctionRegistry.cpp
presto-native-execution/presto_cpp/main/functions/remote/tests/server/RestFunctionRegistry.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/examples/RemoteDoubleDivHandler.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/examples/RemoteFibonacciHandler.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/examples/RemoteInverseCdfHandler.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/examples/RemoteRemoveCharHandler.h
presto-native-execution/presto_cpp/main/functions/remote/tests/server/examples/RemoteStrLenHandler.h
Add unit tests for PrestoRestFunctionRegistration URL resolution logic and wire them into the CMake test build.
  • Create PrestoRestFunctionRegistrationTest fixture that initializes SystemConfig and parses RestFunctionHandle JSON, and add multiple tests for getRemoteFunctionServerUrl covering executionEndpoint present, empty, absent, multiple distinct URLs, protocol differences, and complex URL forms.
  • Register the new GTest binary (presto_rest_function_registration_test) in CMake and link it against existing remote function/remote protocol libraries.
presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp
presto-native-execution/presto_cpp/main/functions/remote/tests/CMakeLists.txt

Assessment against linked issues

Issue Objective Addressed Explanation
#25331 Extend the REST function metadata/handle schema (e.g., JsonBasedUdfFunctionMetadata / RestFunctionHandle in presto-openapi, namespace manager, reference REST function server) with an optional executionEndpoint field so metadata servers can specify a different execution server URL per function. The PR only modifies the native C++ worker side: registration logic, client usage, namespaces, and tests. There are no changes to the REST function API schema, presto-openapi definitions, namespace manager, or reference REST function server to add or surface an executionEndpoint field in the metadata.
#25331 Update the C++ worker / native execution path so that when a RestFunctionHandle contains an executionEndpoint, the worker routes function execution to that URL; if it is absent or empty, the worker falls back to the configured default remote function server URL, preserving existing behavior.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • PrestoRestFunctionRegistration caches kRemoteFunctionServerRestURL_ at singleton construction time, so if SystemConfig is reinitialized or the remoteFunctionServerRestURL changes at runtime the registration logic will continue using the old value; consider either documenting this snapshot behavior or resolving the URL from SystemConfig on each call instead.
  • registerFunction currently takes multiple sequential locks on registrationMutex_ for checking existing registrations, obtaining a client, and updating the map; you could simplify and reduce contention by holding a single lock for all three operations or by clearly separating read-only and mutation paths (e.g., using a shared mutex).
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- PrestoRestFunctionRegistration caches kRemoteFunctionServerRestURL_ at singleton construction time, so if SystemConfig is reinitialized or the remoteFunctionServerRestURL changes at runtime the registration logic will continue using the old value; consider either documenting this snapshot behavior or resolving the URL from SystemConfig on each call instead.
- registerFunction currently takes multiple sequential locks on registrationMutex_ for checking existing registrations, obtaining a client, and updating the map; you could simplify and reduce contention by holding a single lock for all three operations or by clearly separating read-only and mutation paths (e.g., using a shared mutex).

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:109` </location>
<code_context>

   // Check if already registered (read-only, no lock needed for initial check)
   {
-    std::lock_guard<std::mutex> lock(registrationMutex);
</code_context>

<issue_to_address>
**nitpick:** The comment about not needing a lock no longer matches the code behavior.

This comment no longer matches the code: we now take `registrationMutex_` for this check to safely read the shared map. Please update or remove the comment so it reflects the current locking behavior.
</issue_to_address>

### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:29` </location>
<code_context>
-// @throws VeloxException if the configured format is unknown.
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+    : kRemoteFunctionServerRestURL_(
+          SystemConfig::instance()->remoteFunctionServerRestURL()) {}
</code_context>

<issue_to_address>
**suggestion (review_instructions):** The member name kRemoteFunctionServerRestURL_ doesn’t follow the specified naming convention for either member variables (camelCase_) or static constants (kPascalCase).

It looks like kRemoteFunctionServerRestURL_ is a (likely non-static) member variable, but its name mixes the "k" prefix (reserved for static constants) with a trailing underscore and PascalCase. To comply with the style guide, please either:
- rename it to camelCase_ (e.g., remoteFunctionServerRestUrl_) if it is a regular member, or
- if it is meant to be a static constant, use kPascalCase without the trailing underscore (e.g., kRemoteFunctionServerRestUrl) and make sure it is declared static const.

Right now it doesn’t clearly match either of the prescribed patterns.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`

**Instructions:**
Use camelCase_ for private and protected members variables. Use kPascalCase for static constants and enumerators.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Joe-Abraham
Copy link
Contributor Author

@aditi-pandit, Can you please have a look?


} // namespace facebook::presto::functions::remote::rest

int main(int argc, char** argv) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think this code is needed.

Why are you making this an executable ? Can be avoided.

}
)JSON";

auto handle = parseRestFunctionHandle(jsonStr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add code to validate the function name and signature as well.

Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Joe-Abraham

Code looks good. Just have few review comments on the tests.

@Joe-Abraham
Copy link
Contributor Author

@sourcery-ai review

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • The PrestoRestFunctionRegistration singleton uses a single mutex to guard both the function-handle map and the client map; if this starts being used heavily, consider splitting the locks (or using finer-grained synchronization) to reduce contention around registerFunction.
  • In PrestoRestFunctionRegistrationTest::SetUp, you rely on configuring SystemConfig before getInstance() is called; it may be safer to document or enforce that PrestoRestFunctionRegistration is only constructed after config initialization to avoid subtle order-of-initialization issues in other call sites.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `PrestoRestFunctionRegistration` singleton uses a single mutex to guard both the function-handle map and the client map; if this starts being used heavily, consider splitting the locks (or using finer-grained synchronization) to reduce contention around `registerFunction`.
- In `PrestoRestFunctionRegistrationTest::SetUp`, you rely on configuring `SystemConfig` before `getInstance()` is called; it may be safer to document or enforce that `PrestoRestFunctionRegistration` is only constructed after config initialization to avoid subtle order-of-initialization issues in other call sites.

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/functions/remote/tests/PrestoRestFunctionRegistrationTest.cpp:44-53` </location>
<code_context>
+  EXPECT_EQ(httpsResult, "https://secure-server:8443");
+}
+
+TEST_F(
+    PrestoRestFunctionRegistrationTest,
+    getRemoteFunctionServerUrlWithComplexUrls) {
+  // Test with URLs containing paths and query parameters
+  const std::string jsonStr = R"JSON(
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests that exercise `registerFunction` behavior, not only URL selection

The current tests thoroughly cover `getRemoteFunctionServerUrl`, but the new logic inside `PrestoRestFunctionRegistration::registerFunction` remains untested:

- Asserting the `url` field in the serialized handle matches `executionEndpoint` when set, and the default URL when unset/empty.
- Verifying the `RestRemoteClient` cache is keyed by URL (same client reused for the same endpoint, new clients for different endpoints).
- Confirming `registeredFunctionHandles_` prevents re-registration for identical handles but allows it when the handle changes.

Consider adding tests that call `registerFunction` with varying `RestFunctionHandle` instances and, via a test double or instrumentation on `registerVeloxRemoteFunction` / `RestRemoteClient`, validate the chosen URL, client reuse, and de-duplication behavior end-to-end.

Suggested implementation:

```cpp
  }
};

TEST_F(
    PrestoRestFunctionRegistrationTest,
    registerFunctionUsesExecutionEndpointWhenProvided) {
  // Handle with explicit executionEndpoint.
  const std::string handleWithEndpointJson = R"JSON(
    {
      "@type": "RestFunctionHandle",
      "functionId": "remote.testSchema.testFunction;BIGINT",
      "version": "v1",
      "signature": {
        "name": "testFunction",
        "kind": "SCALAR",
        "returnType": "bigint",
        "argumentTypes": ["bigint"],
        "typeVariableConstraints": [],
        "variableArity" : false,
        "literalParameters" : []
      },
      "remoteSource" : {
        "@type" : "PrestoThriftServiceSource",
        "httpClientConfig" : {
          "useHttps" : true
        },
        "serverUrl" : "http://default-server:8080",
        "executionEndpoint" : "https://exec-endpoint:8443"
      }
    }
  )JSON";

  // Handle without executionEndpoint (uses default URL).
  const std::string handleWithoutEndpointJson = R"JSON(
    {
      "@type": "RestFunctionHandle",
      "functionId": "remote.testSchema.testFunction;BIGINT",
      "version": "v1",
      "signature": {
        "name": "testFunction",
        "kind": "SCALAR",
        "returnType": "bigint",
        "argumentTypes": ["bigint"],
        "typeVariableConstraints": [],
        "variableArity" : false,
        "literalParameters" : []
      },
      "remoteSource" : {
        "@type" : "PrestoThriftServiceSource",
        "httpClientConfig" : {
          "useHttps" : true
        },
        "serverUrl" : "https://default-server:8443"
      }
    }
  )JSON";

  auto handleWithEndpoint = parseRestFunctionHandle(handleWithEndpointJson);
  auto handleWithoutEndpoint = parseRestFunctionHandle(handleWithoutEndpointJson);

  // First registration should serialize the handle using executionEndpoint.
  registration_.registerFunction(*handleWithEndpoint);
  const auto& lastHandleWithEndpoint = lastRegisteredHandleForTesting();
  EXPECT_EQ(
      lastHandleWithEndpoint.url,
      "https://exec-endpoint:8443");

  // Second registration should use the default URL when executionEndpoint is
  // not set.
  registration_.registerFunction(*handleWithoutEndpoint);
  const auto& lastHandleWithoutEndpoint = lastRegisteredHandleForTesting();
  EXPECT_EQ(
      lastHandleWithoutEndpoint.url,
      "https://default-server:8443");
}

TEST_F(
    PrestoRestFunctionRegistrationTest,
    registerFunctionClientCachingAndDeduplication) {
  const std::string baseHandleJson = R"JSON(
    {
      "@type": "RestFunctionHandle",
      "functionId": "remote.testSchema.testFunction;BIGINT",
      "version": "v1",
      "signature": {
        "name": "testFunction",
        "kind": "SCALAR",
        "returnType": "bigint",
        "argumentTypes": ["bigint"],
        "typeVariableConstraints": [],
        "variableArity" : false,
        "literalParameters" : []
      },
      "remoteSource" : {
        "@type" : "PrestoThriftServiceSource",
        "httpClientConfig" : {
          "useHttps" : true
        },
        "serverUrl" : "https://server-a:8443"
      }
    }
  )JSON";

  const std::string differentUrlHandleJson = R"JSON(
    {
      "@type": "RestFunctionHandle",
      "functionId": "remote.testSchema.testFunction;BIGINT",
      "version": "v1",
      "signature": {
        "name": "testFunction",
        "kind": "SCALAR",
        "returnType": "bigint",
        "argumentTypes": ["bigint"],
        "typeVariableConstraints": [],
        "variableArity" : false,
        "literalParameters" : []
      },
      "remoteSource" : {
        "@type" : "PrestoThriftServiceSource",
        "httpClientConfig" : {
          "useHttps" : true
        },
        "serverUrl" : "https://server-b:8443"
      }
    }
  )JSON";

  auto baseHandle = parseRestFunctionHandle(baseHandleJson);
  auto differentUrlHandle = parseRestFunctionHandle(differentUrlHandleJson);

  // First registration for server-a should create a new client and handle.
  auto initialHandlesSize = registeredFunctionHandlesSizeForTesting();
  registration_.registerFunction(*baseHandle);
  EXPECT_EQ(
      registeredFunctionHandlesSizeForTesting(),
      initialHandlesSize + 1);
  auto clientA1 = restClientForUrlForTesting("https://server-a:8443");
  ASSERT_NE(clientA1, nullptr);

  // Registering an identical handle should not register again and should reuse
  // the same client.
  registration_.registerFunction(*baseHandle);
  EXPECT_EQ(
      registeredFunctionHandlesSizeForTesting(),
      initialHandlesSize + 1);
  auto clientA2 = restClientForUrlForTesting("https://server-a:8443");
  ASSERT_NE(clientA2, nullptr);
  EXPECT_EQ(clientA1.get(), clientA2.get());

  // Registering a handle pointing to a different URL should create a new
  // client and a new registered handle entry.
  registration_.registerFunction(*differentUrlHandle);
  EXPECT_EQ(
      registeredFunctionHandlesSizeForTesting(),
      initialHandlesSize + 2);
  auto clientB = restClientForUrlForTesting("https://server-b:8443");
  ASSERT_NE(clientB, nullptr);
  EXPECT_NE(clientA1.get(), clientB.get());
}

TEST_F(
    PrestoRestFunctionRegistrationTest,
    getRemoteFunctionServerUrlWithExecutionEndpoint) {

```

To make these tests compile and correctly exercise `registerFunction`, the following changes will be required elsewhere in the test fixture / production code:

1. In `PrestoRestFunctionRegistrationTest` (test fixture class in this file):
   - Ensure there is a member under test named `registration_` of type `PrestoRestFunctionRegistration` (or adjust the test to the actual member name/type).
2. In `PrestoRestFunctionRegistration` (or via a small wrapper used only in tests), add testing accessors that expose the necessary state:
   - `const SerializedRestFunctionHandle& lastRegisteredHandleForTesting() const;`
     - Returns the last handle object passed into `registerVeloxRemoteFunction` (or the equivalent serialized representation).
     - The type must provide a `std::string url` field that the tests assert on.
   - `size_t registeredFunctionHandlesSizeForTesting() const;`
     - Returns `registeredFunctionHandles_.size()` to verify de-duplication.
   - `std::shared_ptr<velox::exec::remote::RestRemoteClient> restClientForUrlForTesting(const std::string& url) const;`
     - Looks up the client in the internal cache keyed by URL and returns it (or `nullptr` if missing).
3. If you prefer not to add methods directly on `PrestoRestFunctionRegistration`, you can:
   - Expose the above helpers as methods on `PrestoRestFunctionRegistrationTest` and implement them using `friend class PrestoRestFunctionRegistrationTest;` inside `PrestoRestFunctionRegistration` to access private members.
4. If the actual URL field name or type used in the serialized handle differs from `url`, adjust the expectations in `registerFunctionUsesExecutionEndpointWhenProvided` accordingly.
5. If `registerFunction` has a different signature (e.g., takes a `std::shared_ptr<protocol::RestFunctionHandle>`), update the calls in the tests to match the real signature.
</issue_to_address>

### Comment 2
<location> `presto-native-execution/presto_cpp/main/functions/remote/PrestoRestFunctionRegistration.cpp:30` </location>
<code_context>
-PageFormat getSerdeFormat() {
+
+PrestoRestFunctionRegistration::PrestoRestFunctionRegistration()
+    : kRemoteFunctionServerRestURL_(
+          SystemConfig::instance()->remoteFunctionServerRestURL()) {}
+
</code_context>

<issue_to_address>
**issue (review_instructions):** The member name `kRemoteFunctionServerRestURL_` violates the camelCase_ rule for private members and reserves the `k` prefix for static constants/enumerators.

Private/protected member variables should be named in `camelCase_`, and `kPascalCase` is reserved for static constants/enumerators. Here `kRemoteFunctionServerRestURL_` appears to be an instance member, so something like `remoteFunctionServerRestUrl_` (or similar camelCase_) would align with the naming guidelines.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `presto-native-execution/**/*.hpp,presto-native-execution/**/*.hpp,presto-native-execution/**/*.cpp`

**Instructions:**
Use camelCase_ for private and protected members variables.
Use kPascalCase for static constants and enumerators.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Joe-Abraham
Copy link
Contributor Author

Thanks @aditi-pandit, I have update the test case, can you please have a look?

Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Joe-Abraham

@aditi-pandit aditi-pandit changed the title feat(native): Add support for custom execution point feat(native): Add support for custom execution point in Rest functions Dec 16, 2025
@aditi-pandit aditi-pandit merged commit c919610 into prestodb:master Dec 16, 2025
82 of 83 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for separate metadata and execution servers in REST function API

3 participants