-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-40411][SS] Refactor FlatMapGroupsWithStateExec to have a parent trait #37859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @HeartSaVioR and @viirya FYI |
HeartSaVioR
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK in general. Minor/nit comments.
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala
Outdated
Show resolved
Hide resolved
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala
Outdated
Show resolved
Hide resolved
|
|
||
| private val isTimeoutEnabled = timeoutConf != NoTimeout | ||
| private val watermarkPresent = child.output.exists { | ||
| val stateInfo: Option[StatefulOperatorStateInfo] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stateInfo / eventTimeWatermark <= would they work if we change them to protected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember I exposed them for a reason at that time.. let me change this to protected and see if it passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is like this:
[error] /.../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala:51:17: overriding method stateInfo in trait StatefulOperator of type => Option[org.apache.spark.sql.execution.streaming.StatefulOperatorStateInfo];
[error] value stateInfo has weaker access privileges; it should be public
[error] protected val stateInfo: Option[StatefulOperatorStateInfo]
[error] ^
[error] /.../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala:57:17: overriding method eventTimeWatermark in trait WatermarkSupport of type => Option[Long];
[error] value eventTimeWatermark has weaker access privileges; it should be public
[error] protected val eventTimeWatermark: Option[Long]
[error] ^
[error] two errors found
[error] (sql / Compile / compileIncremental) Compilation failed
[error] Total time: 160 s (02:40), completed Sep 13, 2022 1:03:01 PM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK that was from another trait. Thanks for explanation.
7411726 to
04320b8
Compare
|
Let me get this in - I am pretty much confident that this is the right change too :-). Merged to master. |
| import org.apache.spark.util.{CompletionIterator, SerializableConfiguration} | ||
|
|
||
| /** | ||
| * Physical operator for executing `FlatMapGroupsWithState` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice if we could update this doc too.
| private val getValueObj = | ||
| ObjectOperator.deserializeRowToObject(valueDeserializer, dataAttributes) | ||
| private val getOutputRow = ObjectOperator.wrapObjectToRow(outputObjectType) | ||
| abstract class InputProcessor(store: StateStore) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe good to have some doc for this abstract class?
viirya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay just minor comments about doc.
What changes were proposed in this pull request?
This PR proposes to factor the common attributes out from
FlatMapGroupsWithStateExectoFlatMapGroupsWithStateExecBase.Why are the changes needed?
There are a lot of stuff to share if you implement another version of
FlatMapGroupsWithStateExec.Should better factor them out. This is also part of #37285 which demonstrates how the refactored trait is used.
Does this PR introduce any user-facing change?
No, this is refactoring-only.
How was this patch tested?
Existing test cases should cover it.