[Bug]: BigtableSource "Desired bundle size 0 bytes must be greater than 0" #28793
Closed
In short, `targetParallelism` > `BigtableSource#getEstimatedSizeBytes`; then `desiredBundleSizeBytes` is set to `0`; which makes `BigtableSource#splitKeyRangeIntoBundleSizedSubranges` angry.

What happened?
Imagine a case where in:

beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
Lines 215 to 217 in 282d027

`targetParallelism` is `32` and `source.getEstimatedSizeBytes(options)` is `10`. Then `bytesPerBundle` will be `0`, so

beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
Line 217 in 282d027

will be called with the values: `source.split(0L, options)`
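The arithmetic behind this can be reproduced in isolation. The sketch below is not the Beam code: the names `BundleSizeDemo` and `bytesPerBundle` are made up for illustration, and it assumes the bundle size comes from plain `long` integer division of the estimated size by the target parallelism, as the description above implies.

```java
class BundleSizeDemo {
    // Hypothetical helper mirroring the division described in this issue.
    // Integer division on long truncates toward zero, so any estimate
    // smaller than the target parallelism yields 0.
    static long bytesPerBundle(long estimatedBytes, int targetParallelism) {
        return estimatedBytes / targetParallelism;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerBundle(10L, 32));   // 0: the failing case from this issue
        System.out.println(bytesPerBundle(32L, 32));   // 1: equal values are still fine
        System.out.println(bytesPerBundle(1024L, 32)); // 32
    }
}
```

Note that the result only drops to `0` when the estimate is strictly smaller than the parallelism; an estimate equal to the parallelism still yields a bundle size of `1`.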
In `OffsetBasedSource#split`, this desired-0-sized split is handled:

beam/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java
Lines 115 to 116 in 282d027

But `BigtableSource#split` does not seem to handle the desired-0-sized split:

beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
Lines 1328 to 1333 in 282d027

so a few frames down the road from `BigtableSource#split` you'll end up violating this `checkArgument` in `BigtableSource#splitKeyRangeIntoBundleSizedSubranges`:

beam/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
Lines 1623 to 1626 in 71c8459
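One way the zero case could be tolerated is to clamp the requested size to at least one byte before the precondition runs. The following is a standalone sketch under that assumption, not a patch to `BigtableIO`: `ZeroBundleGuard`, `checkedBundleSize`, and `clampedBundleSize` are hypothetical names, and the precondition is a plain-Java stand-in for the `checkArgument` whose message appears in the issue title.

```java
class ZeroBundleGuard {
    // Plain-Java stand-in for the precondition that currently fails.
    // The message format is taken from the error quoted in the issue title.
    static long checkedBundleSize(long desiredBundleSizeBytes) {
        if (desiredBundleSizeBytes <= 0) {
            throw new IllegalArgumentException(
                String.format(
                    "Desired bundle size %d bytes must be greater than 0",
                    desiredBundleSizeBytes));
        }
        return desiredBundleSizeBytes;
    }

    // Assumed workaround: never pass a size below 1 byte to the precondition.
    static long clampedBundleSize(long desiredBundleSizeBytes) {
        return checkedBundleSize(Math.max(1L, desiredBundleSizeBytes));
    }

    public static void main(String[] args) {
        System.out.println(clampedBundleSize(0L));  // 1
        System.out.println(clampedBundleSize(64L)); // 64
        try {
            checkedBundleSize(0L); // reproduces the reported failure
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether clamping is the right behavior for `BigtableSource` is a design question for the maintainers; returning the source unsplit when the desired size is `0` would be another option.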
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components