
Commit a234cc6

lw-lin authored and marmbrus committed
[SPARK-14874][SQL][STREAMING] Remove the obsolete Batch representation
## What changes were proposed in this pull request?

The `Batch` class, which had been used to indicate progress in a stream, was abandoned by [[SPARK-13985][SQL] Deterministic batches with ids](apache@caea152) and then became useless. This patch:

- removes the `Batch` class
- ~~does some related renaming~~ (update: this has been reverted)
- fixes some related comments

## How was this patch tested?

N/A

Author: Liwei Lin <[email protected]>

Closes apache#12638 from lw-lin/remove-batch.
1 parent 7dd01d9 commit a234cc6

File tree: 5 files changed (+4 −30 lines)


sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Batch.scala

Lines changed: 0 additions & 26 deletions
This file was deleted.

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ class FileStreamSource(
   }
 
   /**
-   * Returns the next batch of data that is available after `start`, if any is available.
+   * Returns the data that is between the offsets (`start`, `end`].
   */
  override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
    val startId = start.map(_.asInstanceOf[LongOffset].offset).getOrElse(-1L)

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Sink.scala

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ import org.apache.spark.sql.DataFrame
 trait Sink {
 
   /**
-   * Adds a batch of data to this sink. The data for a given `batchId` is deterministic and if
+   * Adds a batch of data to this sink. The data for a given `batchId` is deterministic and if
   * this method is called more than once with the same batchId (which will happen in the case of
   * failures), then `data` should only be added once.
   */
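The `Sink` doc comment above describes an idempotency contract: `addBatch` may be replayed with the same `batchId` after a failure, and the data must only land once. A minimal sketch of that contract in plain Scala (the `ToySink` name and its in-memory map are hypothetical, not Spark code):

```scala
import scala.collection.mutable

// Hypothetical sketch illustrating the Sink contract: addBatch must be
// idempotent per batchId, so replaying a batch after a failure does not
// duplicate its data.
final class ToySink {
  private val committed = mutable.LinkedHashMap.empty[Long, Seq[String]]

  // Adds `data` for `batchId`; a repeated batchId (e.g. a replay after
  // a failure) is ignored, so each batch's data is added exactly once.
  def addBatch(batchId: Long, data: Seq[String]): Unit =
    if (!committed.contains(batchId)) committed += batchId -> data

  def allData: Seq[String] = committed.values.flatten.toSeq
}

object SinkDemo extends App {
  val sink = new ToySink
  sink.addBatch(0, Seq("a", "b"))
  sink.addBatch(0, Seq("a", "b")) // replay of batch 0 is a no-op
  sink.addBatch(1, Seq("c"))
  println(sink.allData) // List(a, b, c)
}
```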

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ trait Source {
  def getOffset: Option[Offset]
 
   /**
-   * Returns the data that is between the offsets (`start`, `end`]. When `start` is `None` then
+   * Returns the data that is between the offsets (`start`, `end`]. When `start` is `None` then
   * the batch should begin with the first available record. This method must always return the
   * same data for a particular `start` and `end` pair.
   */
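The `Source.getBatch` contract this commit clarifies is a half-open interval: records strictly after `start`, up to and including `end`, with `start = None` meaning "from the first available record". A minimal in-memory sketch of that semantics (the `ToyOffset` and `ToySource` names are hypothetical, not Spark code):

```scala
// Hypothetical sketch illustrating the (`start`, `end`] offset contract
// from Source.getBatch, using a plain in-memory buffer of records.
final case class ToyOffset(offset: Long)

final class ToySource(records: IndexedSeq[String]) {
  // Highest offset currently available, if any (analogous to getOffset).
  def getOffset: Option[ToyOffset] =
    if (records.isEmpty) None else Some(ToyOffset(records.length - 1))

  // Returns the data between the offsets (start, end]: strictly after
  // `start`, up to and including `end`. start = None means "from the
  // first record". Deterministic: the same (start, end) pair always
  // yields the same data.
  def getBatch(start: Option[ToyOffset], end: ToyOffset): Seq[String] = {
    val startId = start.map(_.offset).getOrElse(-1L)
    records.slice((startId + 1).toInt, (end.offset + 1).toInt)
  }
}

object SourceDemo extends App {
  val src = new ToySource(Vector("a", "b", "c", "d"))
  println(src.getBatch(None, ToyOffset(1)))               // Vector(a, b)
  println(src.getBatch(Some(ToyOffset(1)), ToyOffset(3))) // Vector(c, d)
}
```

Note how `startId = -1` when `start` is `None` mirrors the `getOrElse(-1L)` default in `FileStreamSource.getBatch` above: offset −1 is "before the first record", so the batch begins at offset 0.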

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala

Lines changed: 1 addition & 1 deletion
@@ -91,7 +91,7 @@ case class MemoryStream[A : Encoder](id: Int, sqlContext: SQLContext)
  }
 
   /**
-   * Returns the next batch of data that is available after `start`, if any is available.
+   * Returns the data that is between the offsets (`start`, `end`].
   */
  override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
    val startOrdinal =

0 commit comments
