Closed
Labels: bug (Something isn't working)
Description
Describe the bug
When using the native_datafusion scan, we sometimes try to construct a HashJoinExec whose left and right inputs have different numbers of partitions. This issue is currently hidden because we wrap the inputs in a CopyExec, which always reports an output partition count of 1.
Here is the debug logging for one example. The left input has 5 partitions and the right input has 1 partition.
LEFT: FilterExec: c0@0 IS NOT NULL
DataSourceExec: file_groups={5 groups: [[tmp/CometFuzzTestSuite_1761747873664.parquet/part-00002-e5c142ac-d0f8-4eb4-891c-4484865ded05-c000.snappy.parquet:0..133654], [tmp/CometFuzzTestSuite_1761747873664.parquet/part-00003-e5c142ac-d0f8-4eb4-891c-4484865ded05-c000.snappy.parquet:0..133566], [tmp/CometFuzzTestSuite_1761747873664.parquet/part-00000-e5c142ac-d0f8-4eb4-891c-4484865ded05-c000.snappy.parquet:0..133492], [tmp/CometFuzzTestSuite_1761747873664.parquet/part-00004-e5c142ac-d0f8-4eb4-891c-4484865ded05-c000.snappy.parquet:0..133471], [tmp/CometFuzzTestSuite_1761747873664.parquet/part-00001-e5c142ac-d0f8-4eb4-891c-4484865ded05-c000.snappy.parquet:0..132287]]}, projection=[c0], file_type=parquet, predicate=c0@0 IS NOT NULL, pruning_predicate=c0_null_count@1 != row_count@0, required_guarantees=[]
RIGHT: ScanExec: source=[BroadcastQueryStage (unknown), Statistics(sizeInBytes=1384.0 B, rowCount=1.00E+3)], schema=[col_0: Boolean]
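To make the masking effect concrete, here is a minimal, self-contained sketch. It is a toy model only and is not Comet or DataFusion code: `PlanNode`, `Scan`, `CopyWrapper`, and `check_join_inputs` are hypothetical names invented for illustration. The one behaviour taken from the report above is that the wrapper always reports a partition count of 1, and the right input is assumed to be modelled as a single-partition scan.

```rust
/// Toy stand-in for a physical plan node (hypothetical; not DataFusion's `ExecutionPlan`).
trait PlanNode {
    fn partition_count(&self) -> usize;
}

/// Hypothetical scan that reports one partition per file group.
struct Scan {
    file_groups: usize,
}

impl PlanNode for Scan {
    fn partition_count(&self) -> usize {
        self.file_groups
    }
}

/// Hypothetical wrapper mimicking the behaviour described in this issue:
/// it always reports 1 partition, regardless of its input.
struct CopyWrapper<P: PlanNode> {
    input: P,
}

impl<P: PlanNode> CopyWrapper<P> {
    /// Partition count of the wrapped input, which the wrapper itself hides.
    fn input_partition_count(&self) -> usize {
        self.input.partition_count()
    }
}

impl<P: PlanNode> PlanNode for CopyWrapper<P> {
    fn partition_count(&self) -> usize {
        1
    }
}

/// Hypothetical validation: reject a join whose inputs disagree on partition count.
fn check_join_inputs(left: &dyn PlanNode, right: &dyn PlanNode) -> Result<(), String> {
    let l = left.partition_count();
    let r = right.partition_count();
    if l == r {
        Ok(())
    } else {
        Err(format!(
            "join inputs have mismatched partition counts: left={}, right={}",
            l, r
        ))
    }
}
```

With the counts from the log (5 on the left, 1 on the right), `check_join_inputs` on the bare scans returns an error, while the same check on the two inputs wrapped in `CopyWrapper` passes because both sides report 1. `input_partition_count` still returns 5 for the wrapped left input, which is the mismatch that stays hidden.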
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response