@JiaqiWang18 JiaqiWang18 commented Oct 8, 2025

What changes were proposed in this pull request?

  • Create the Spark Connect proto for SDP sinks.
  • Rename the encapsulating DefineDataset to DefineOutput, since a sink isn't a "dataset".

Why are the changes needed?

To be able to issue these requests from client to server.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Proto changes

Was this patch authored or co-authored using generative AI tooling?

No

@JiaqiWang18 JiaqiWang18 force-pushed the SPARK-53850-sdp-sinks-proto branch from 9ace5ed to cc15705 Compare October 8, 2025 23:35
This reverts commit cc15705.
@JiaqiWang18 JiaqiWang18 force-pushed the SPARK-53850-sdp-sinks-proto branch from 8ad194c to f5a5fef Compare October 9, 2025 04:38
@JiaqiWang18 JiaqiWang18 force-pushed the SPARK-53850-sdp-sinks-proto branch from f5a5fef to 2541b01 Compare October 9, 2025 04:40
@JiaqiWang18
Contributor Author

@sryza

}

// Metadata that's only applicable to external sinks.
message ExternalSinkDetails {

Any particular reason we need "external" in the name? Why not just call them "sink details"?

@JiaqiWang18 JiaqiWang18 changed the title [SPARK-53850][SDP] Define proto for Sinks [SPARK-53850][SDP] Define proto for Sinks and Rename DefineDataset to DefineOutput Oct 9, 2025
@JiaqiWang18
Contributor Author

Also had to do some code-related refactoring, since registerDataset(dataset: Output) doesn't really make sense.

@JiaqiWang18 JiaqiWang18 requested a review from sryza October 9, 2025 17:37
@sryza sryza left a comment

One small comment – otherwise LGTM!

// The type of output.
enum OutputType {
  // Safe default value. Should not be used.
  DATASET_TYPE_UNSPECIFIED = 0;

Should this be OUTPUT_TYPE_UNSPECIFIED?

Contributor Author

good catch, fixed
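
For reference, the naming point raised in this thread can be checked mechanically. The following is a minimal Python sketch, assuming the common protobuf convention that an enum's zero value is the enum name in upper snake case followed by `_UNSPECIFIED`. The helpers `expected_unspecified_name` and `check_zero_value` are hypothetical names used for illustration only; they are not part of Spark or of the protobuf tooling.

```python
import re


def expected_unspecified_name(enum_name: str) -> str:
    """Derive the conventional zero-value name for a CamelCase enum name.

    Assumption: the zero value is named <ENUM_NAME_IN_UPPER_SNAKE>_UNSPECIFIED.
    """
    # Insert an underscore before every capital letter except the first one.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", enum_name)
    return f"{snake.upper()}_UNSPECIFIED"


def check_zero_value(enum_name: str, zero_value_name: str) -> bool:
    """Return True if the zero value follows the assumed naming convention."""
    return zero_value_name == expected_unspecified_name(enum_name)


if __name__ == "__main__":
    # The name flagged in the review does not match the renamed enum.
    print(check_zero_value("OutputType", "DATASET_TYPE_UNSPECIFIED"))
    # The name suggested in the review does.
    print(check_zero_value("OutputType", "OUTPUT_TYPE_UNSPECIFIED"))
```

A check like this would only catch leftover names after a rename such as DefineDataset to DefineOutput; it says nothing about whether the remaining enum values are correct.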

@sryza sryza closed this in efcc8f6 Oct 9, 2025
@dongjoon-hyun
Member

+1, LGTM, too.

dongjoon-hyun added a commit to apache/spark-connect-swift that referenced this pull request Oct 27, 2025
…th `4.1.0-preview3` RC1

### What changes were proposed in this pull request?

This PR aims to update Spark Connect-generated Swift source code with Apache Spark `4.1.0-preview3` RC1.

### Why are the changes needed?

There are many changes between Apache Spark 4.1.0-preview2 and preview3.

- apache/spark#52685
- apache/spark#52613
- apache/spark#52553
- apache/spark#52532
- apache/spark#52517
- apache/spark#52514
- apache/spark#52487
- apache/spark#52328
- apache/spark#52200
- apache/spark#52154
- apache/spark#51344

This update picks up the latest bug fixes and the new messages needed to develop against the new features of `4.1.0-preview3`.

```
$ git clone -b v4.1.0-preview3 https://github.com/apache/spark.git
$ cd spark/sql/connect/common/src/main/protobuf/
$ protoc --swift_out=. spark/connect/*.proto
$ protoc --grpc-swift_out=. spark/connect/*.proto

# Remove empty gRPC files
$ cd spark/connect
$ grep 'This file contained no services' * | awk -F: '{print $1}' | xargs rm
```
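
The final cleanup step above (deleting generated files that contain the marker text `This file contained no services`) could also be sketched in Python, for example where `grep`, `awk`, and `xargs` are not available. This is only an illustrative equivalent of the shell pipeline: the function name `remove_empty_grpc_files` is hypothetical, and the sketch assumes the generated files are readable as UTF-8 text and sit directly in the given directory.

```python
from pathlib import Path

# Marker text taken from the grep command in the shell snippet.
MARKER = "This file contained no services"


def remove_empty_grpc_files(directory: Path) -> list:
    """Delete files directly inside `directory` that contain MARKER.

    Returns the names of the removed files, sorted alphabetically.
    Subdirectories are skipped, mirroring a non-recursive `grep ... *`.
    """
    removed = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if MARKER in text:
            path.unlink()
            removed.append(path.name)
    return removed


if __name__ == "__main__":
    # Run from the spark/connect directory, as in the shell version.
    for name in remove_empty_grpc_files(Path(".")):
        print(f"removed {name}")
```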

### Does this PR introduce _any_ user-facing change?

Pass the CIs.

### How was this patch tested?

Pass the CIs. I manually tested with `Apache Spark 4.1.0-preview3` (with the two SDP tests ignored).

```
$ swift test --no-parallel
...
✔ Test run with 203 tests in 21 suites passed after 19.088 seconds.
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #252 from dongjoon-hyun/SPARK-54043.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
… DefineOutput

### What changes were proposed in this pull request?

* Create the Spark Connect proto for SDP sinks.
* Rename the encapsulating `DefineDataset` to `DefineOutput`, since a sink isn't a "dataset".

### Why are the changes needed?

To be able to issue these requests from client to server.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Proto changes

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#52553 from JiaqiWang18/SPARK-53850-sdp-sinks-proto.

Authored-by: Jacky Wang <[email protected]>
Signed-off-by: Sandy Ryza <[email protected]>