test: Refactor DeltaQueryRunner and add NativeDeltaQueryRunner support#26815
test: Refactor DeltaQueryRunner and add NativeDeltaQueryRunner support#26815mohsaka merged 1 commit intoprestodb:masterfrom
Conversation
Reviewer's GuideRefactors Delta connector tests to share a reusable DeltaQueryRunner, and introduces a native-execution-aware DeltaQueryRunner builder wired into PrestoNativeQueryRunnerUtils, plus required test dependencies for running Delta tests against both Java and native engines. Class diagram for refactored DeltaQueryRunner and native integrationclassDiagram
class AbstractDeltaDistributedQueryTestBase {
-QueryRunner queryRunner
+void createQueryRunner()
+void tearDown()
}
class DeltaQueryRunner {
+QueryRunner createDeltaQueryRunner()
+QueryRunner createDeltaQueryRunnerWithFiles(String catalogName, String schemaName)
+QueryRunner createDeltaQueryRunnerWithNativeSupport()
}
class NativeDeltaQueryRunner {
+QueryRunner createNativeDeltaQueryRunner()
}
class PrestoNativeQueryRunnerUtils {
+QueryRunner createDeltaNativeQueryRunner()
}
AbstractDeltaDistributedQueryTestBase --> DeltaQueryRunner : uses
PrestoNativeQueryRunnerUtils --> NativeDeltaQueryRunner : uses
NativeDeltaQueryRunner --> DeltaQueryRunner : delegates_configuration
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In
DeltaQueryRunnerBuilderthe fieldsdataDirectory,catalogType,extraConnectorProperties, andDELTA_DEFAULT_STORAGE_FORMATare never used; either wire them into the builder logic or remove them to keep the delta runner configuration minimal and clearer. - The comment above
useExternalWorkerLauncherinDeltaQueryRunnerBuilderstill refers to "the native iceberg query runner" and should be updated to accurately describe the delta-specific behavior. - There is a typo in
DeltaQueryRunnerBuilder: the fieldcaseSensitiveParitionsand its uses should be renamed tocaseSensitivePartitionsfor consistency and readability.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `DeltaQueryRunnerBuilder` the fields `dataDirectory`, `catalogType`, `extraConnectorProperties`, and `DELTA_DEFAULT_STORAGE_FORMAT` are never used; either wire them into the builder logic or remove them to keep the delta runner configuration minimal and clearer.
- The comment above `useExternalWorkerLauncher` in `DeltaQueryRunnerBuilder` still refers to "the native iceberg query runner" and should be updated to accurately describe the delta-specific behavior.
- There is a typo in `DeltaQueryRunnerBuilder`: the field `caseSensitiveParitions` and its uses should be renamed to `caseSensitivePartitions` for consistency and readability.
## Individual Comments
### Comment 1
<location> `presto-delta/src/test/java/com/facebook/presto/delta/DeltaQueryRunner.java:76-78` </location>
<code_context>
+ return this;
+ }
+
+ public Builder setExtraProperties(Map<String, String> extraProperties)
+ {
+ this.extraProperties = extraProperties;
+ return this;
+ }
</code_context>
<issue_to_address>
**nitpick (bug_risk):** Defensive copy of extraProperties would make the shared test query runner more robust
This stores the `extraProperties` map reference directly, so later mutations of the original map could affect other tests using the shared runner. To keep tests isolated, make a defensive copy here (e.g., `this.extraProperties = new HashMap<>(extraProperties);` or `ImmutableMap.copyOf(extraProperties)`).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-delta/src/test/java/com/facebook/presto/delta/DeltaQueryRunner.java
Outdated
Show resolved
Hide resolved
9bc84a3 to
6d743b4
Compare
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @mohsaka for this refactoring. Have few comments.
presto-delta/src/test/java/com/facebook/presto/delta/DeltaQueryRunner.java
Show resolved
Hide resolved
...e-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| public static DeltaQueryRunnerBuilder nativeDeltaQueryRunnerBuilder() |
There was a problem hiding this comment.
This use of DeltaQueryRunnerBuilder in both places is hard to read. Does the nested DeltaQueryRunnerBuilder really need to be a class ? Can these be just functions instead ?
There was a problem hiding this comment.
Currently following what both Hive and Iceberg have through
public static IcebergQueryRunnerBuilder nativeIcebergQueryRunnerBuilder()
{
return new IcebergQueryRunnerBuilder(QueryRunnerType.NATIVE);
}
public static IcebergQueryRunnerBuilder javaIcebergQueryRunnerBuilder()
{
return new IcebergQueryRunnerBuilder(QueryRunnerType.JAVA);
}
and
public static HiveQueryRunnerBuilder nativeHiveQueryRunnerBuilder()
{
return new HiveQueryRunnerBuilder(QueryRunnerType.NATIVE);
}
public static HiveQueryRunnerBuilder javaHiveQueryRunnerBuilder()
{
return new HiveQueryRunnerBuilder(QueryRunnerType.JAVA);
}
These are functions and can be called via the Utility class ex.
QueryRunner queryRunner = PrestoNativeQueryRunnerUtils.nativeDeltaQueryRunnerBuilder().build();
There was a problem hiding this comment.
So what was hard for me to get my head around was that there is DeltaQueryRunner and DeltaQueryRunner.Builder and then there is DeltaQueryRunnerBuilder.build as well. And DeltaQueryRunnerBuilder.build returns a DeltaQueryRunner.
You are right that this degree of indirection is there for Hive and Iceberg as well. Though would it be possible to eliminate it ?
There was a problem hiding this comment.
We could eliminate it, but I believe the reason we have it is for two things. Maybe I can add a comment explaining this.
- We have a DeltaQueryRunner. This comes with a builder DeltaQueryRunner builder. This is used for Java test cases.
- We have a NativeDeltaQueryRunner builder. This one is used for native tests and can create both Java and Native query runners so that they can do comparisons.
The NativeDeltaQueryRunner has a builder that calls DeltaQueryRunner builder. We mainly don't want to have native code in the java code. So thats why theres this separation of classes.
If we want to have one, we can replace the DeltaQueryRunner with one that does native and java. But then we will be going through native code on the java code path. When it should only be the other way around. We would also have to move over all of the native utilities for the setup into the presto side somewhere.
I think the reason it is not called NativeDeltaQueryRunnerBuilder is because it can make both native and Java. Also its already under presto-native-execution.
There was a problem hiding this comment.
Added comment explaining this in more detail.
There was a problem hiding this comment.
Thanks for explaining. I see the distinction now... We can leave the original DeltaQueryRunner as be.
e6de878 to
c2c9dab
Compare
8d6329e to
c78cb7e
Compare
| } | ||
| } | ||
|
|
||
| public static DeltaQueryRunnerBuilder nativeDeltaQueryRunnerBuilder() |
There was a problem hiding this comment.
Thanks for explaining. I see the distinction now... We can leave the original DeltaQueryRunner as be.
42a7092 to
7e0d554
Compare
tdcmeehan
left a comment
There was a problem hiding this comment.
Just some nits, LGTM, thanks for this refactoring
| @@ -39,6 +42,22 @@ jobs: | |||
| permissions: | |||
| contents: read | |||
| steps: | |||
| # We cannot use the github action to free disk space from the runner | |||
| # because we are in the container and not on the runner anymore. | |||
| - name: Free Disk Space | |||
| run: | | |||
| # Re-used from free-disk-space github action. | |||
| getAvailableSpace() { echo $(df -a $1 | awk 'NR > 1 {avail+=$4} END {print avail}'); } | |||
| # Show before | |||
| echo "Original available disk space: " $(getAvailableSpace) | |||
| # Remove DotNet. | |||
| rm -rf /host_usr/share/dotnet || true | |||
| # Remove android | |||
| rm -rf /host_usr/local/lib/android || true | |||
| # Remove CodeQL | |||
| rm -rf /host_opt/hostedtoolcache/CodeQL || true | |||
| # Show after | |||
| echo "New available disk space: " $(getAvailableSpace) | |||
There was a problem hiding this comment.
Can you please put this in a separate PR?
There was a problem hiding this comment.
Removed as it's no longer necessary per #26876
|
|
||
| public Builder setExternalWorkerLauncher(Optional<BiFunction<Integer, URI, Process>> externalWorkerLauncher) | ||
| { | ||
| this.externalWorkerLauncher = externalWorkerLauncher; |
There was a problem hiding this comment.
| this.externalWorkerLauncher = externalWorkerLauncher; | |
| this.externalWorkerLauncher = requireNonNull(externalWorkerLauncher, "externalWorkerLauncher is null"); |
Please do this for the other methods too, and consider the Sourcery feedback to make a defensive copy for extraProperties.
| import static java.util.Locale.US; | ||
| import static java.util.Objects.requireNonNull; | ||
|
|
||
| public class DeltaQueryRunner |
There was a problem hiding this comment.
Perhaps let's add a setupLogging method as IcebergQueryRunner does, so we don't spam the logs.
c6bd48b to
5cb0b9d
Compare
Co-authored-by: unidevel <unidevel@hotmail.com>
5cb0b9d to
7e89f82
Compare
|
@tdcmeehan Thanks for the review! I have addressed the comments. |
prestodb#26815) ## Description <!---Describe your changes in detail--> To prepare for delta prestissimo support and following the organization of hive and iceberg, separate out a DeltaQueryRunner from existing delta test cases. Also add in delta native query runner execution support. ## Motivation and Context <!---Why is this change required? What problem does it solve?--> <!---If it fixes an open issue, please link to the issue here.--> We will be adding in delta connector support in Prestissimo. This is to run the current Java delta tests against Prestissimo in a future PR. ## Impact <!---Describe any public API or user-facing feature change or any performance impact--> None ## Test Plan <!---Please fill in how you tested your change--> Ran test cases: IncrementalUpdateQueriesTest TestUppercasePartitionColumns TestDeltaScanOptimizations TestDeltaIntegration ## Contributor checklist - [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards). - [ ] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced. - [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality. - [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines). - [ ] Adequate tests were added if applicable. - [ ] CI passed. - [ ] If adding new dependencies, verified they have an [OpenSSF Scorecard](https://securityscorecards.dev/#the-checks) score of 5.0 or higher (or obtained explicit TSC approval for lower scores). ## Release Notes Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below. ``` == NO RELEASE NOTE == ``` Co-authored-by: unidevel <unidevel@hotmail.com>
Description
To prepare for delta prestissimo support and following the organization of hive and iceberg, separate out a DeltaQueryRunner from existing delta test cases. Also add in delta native query runner execution support.
Motivation and Context
We will be adding in delta connector support in Prestissimo. This is to run the current Java delta tests against Prestissimo in a future PR.
Impact
None
Test Plan
Ran test cases:
IncrementalUpdateQueriesTest
TestUppercasePartitionColumns
TestDeltaScanOptimizations
TestDeltaIntegration
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.