
Conversation

@JiaqiWang18 (Contributor) commented Jul 18, 2025

What changes were proposed in this pull request?

Remove the `once` argument from the `@append_flow` decorator for SDP (Spark Declarative Pipelines):

def append_flow(
    *,
    target: str,
    name: Optional[str] = None,
    spark_conf: Optional[Dict[str, str]] = None,
    once: bool = False,                                              # <--- removed
) -> Callable[[QueryFunction], None]:

Also removes once field from the DefineFlow proto.

The SQL API's CreateFlowCommand does not take a `once` argument, so no change is needed there; the value is hardcoded to False (source).
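For illustration, a minimal sketch of how an append flow would be declared after this change, with only `target`, `name`, and `spark_conf` remaining as keyword arguments. The import path and table names are assumptions, not taken from this PR:

```python
from pyspark.sql import SparkSession
from pyspark import pipelines as dp  # assumed import path for the SDP decorators

spark = SparkSession.getActiveSession()  # the pipeline runtime is assumed to provide an active session

# After this change, @append_flow accepts only target, name, and spark_conf;
# there is no longer a once flag to pass.
@dp.append_flow(target="events_cleaned", name="clean_events")
def clean_events():
    # "raw_events" and "events_cleaned" are hypothetical table names.
    return spark.readStream.table("raw_events").where("value IS NOT NULL")
```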

Why are the changes needed?

The Declarative Pipelines append_flow decorator includes a once argument, which, if True, indicates the flow should run only once. (It will be rerun upon a full refresh operation.)

However, the server does not currently implement this behavior. To avoid accidentally releasing APIs that don't actually work, we should remove these arguments for now and add them back when the functionality is supported.

Does this PR introduce any user-facing change?

Yes, an argument was removed from a public API. However, SDP has not been released yet, so user impact should be minimal.
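To make the user-facing impact concrete, here is a small self-contained sketch that uses a local stand-in mirroring the post-change signature; any caller still passing `once` now fails with a TypeError:

```python
from typing import Callable, Dict, Optional

# Local stand-in mirroring the signature after this PR (registration logic elided),
# so the failure mode can be shown without an SDP installation.
def append_flow(
    *,
    target: str,
    name: Optional[str] = None,
    spark_conf: Optional[Dict[str, str]] = None,
) -> Callable:
    def decorator(fn: Callable) -> None:
        return None
    return decorator

try:
    @append_flow(target="events_cleaned", once=True)  # old-style call
    def backfill_events():
        ...
except TypeError as e:
    print(e)  # append_flow() got an unexpected keyword argument 'once'
```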

How was this patch tested?

Modified existing test cases to ensure consistency.

Was this patch authored or co-authored using generative AI tooling?

No

@JiaqiWang18 changed the title from "[SPARK-52714] Remove public APIs for append once flows" to "[SPARK-52851] Remove public APIs for append once flows" on Jul 18, 2025
@JiaqiWang18 force-pushed the SPARK-52851-remove-append-once-arg branch from 9538639 to 4f76e5a on July 18, 2025 18:09
@JiaqiWang18 (Contributor, Author) commented:
@anishm-db @sryza

@JiaqiWang18 changed the title from "[SPARK-52851] Remove public APIs for append once flows" to "[SPARK-52851][SDP] Remove public APIs for append once flows" on Jul 18, 2025
@sryza (Contributor) left a comment:

Thanks for picking this up @JiaqiWang18. Are we also able to remove the property from the DefineFlow proto definition – that's technically a public API as well. Also, is there SQL syntax that supports this?

@JiaqiWang18 (Contributor, Author) replied:
> Thanks for picking this up @JiaqiWang18. Are we also able to remove the property from the DefineFlow proto definition – that's technically a public API as well. Also, is there SQL syntax that supports this?

Yeah, just removed `once` from DefineFlow. The SQL syntax doesn't support using ONCE in the query, per this CreateFlowCommand; it hardcodes the value to false here.

@JiaqiWang18 requested a review from sryza on July 21, 2025 20:02
@sryza (Contributor) left a comment:

LGTM!

@sryza closed this in 4a45054 on Jul 21, 2025
@sryza (Contributor) commented Jul 21, 2025

Merged to master

dongjoon-hyun added a commit to apache/spark-connect-swift that referenced this pull request Oct 1, 2025
…th `4.1.0-preview2`

### What changes were proposed in this pull request?

This PR aims to update Spark Connect-generated Swift source code with Apache Spark `4.1.0-preview2`.

### Why are the changes needed?

There are many changes from Apache Spark 4.1.0, including the following pull requests:

- apache/spark#52342
- apache/spark#52256
- apache/spark#52271
- apache/spark#52242
- apache/spark#51473
- apache/spark#51653
- apache/spark#52072
- apache/spark#51561
- apache/spark#51563
- apache/spark#51489
- apache/spark#51507
- apache/spark#51462
- apache/spark#51464
- apache/spark#51442

This lets us use the latest bug fixes and new messages when developing new features against `4.1.0-preview2`.

```
$ git clone -b v4.1.0-preview2 https://github.com/apache/spark.git
$ cd spark/sql/connect/common/src/main/protobuf/
$ protoc --swift_out=. spark/connect/*.proto
$ protoc --grpc-swift_out=. spark/connect/*.proto

# Remove empty GRPC files
$ cd spark/connect

$ grep 'This file contained no services' *
catalog.grpc.swift:// This file contained no services.
commands.grpc.swift:// This file contained no services.
common.grpc.swift:// This file contained no services.
example_plugins.grpc.swift:// This file contained no services.
expressions.grpc.swift:// This file contained no services.
ml_common.grpc.swift:// This file contained no services.
ml.grpc.swift:// This file contained no services.
pipelines.grpc.swift:// This file contained no services.
relations.grpc.swift:// This file contained no services.
types.grpc.swift:// This file contained no services.

$ rm catalog.grpc.swift commands.grpc.swift common.grpc.swift example_plugins.grpc.swift expressions.grpc.swift ml_common.grpc.swift ml.grpc.swift pipelines.grpc.swift relations.grpc.swift types.grpc.swift
```

### Does this PR introduce _any_ user-facing change?

Pass the CIs.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #250 from dongjoon-hyun/SPARK-53777.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>