Skip to content

Performance Optimization: Propagate Runtime Filters Through Join Equivalence Classes #18857

@SkyFan2002

Description

@SkyFan2002

Summary
When equi-join columns (e.g., t1.c1 = t2.c1 = t3.c1) form an equivalence class, runtime filters should propagate to all members. Currently, they attach only to the probe key, causing unnecessary data scans and compute.

For example:

explain WITH cte1 AS (
  SELECT
    t1.c1
  FROM
    t1,
    t2
  WHERE
    t1.c1 = t2.c1
    AND t1.c2 = t2.c2
  GROUP BY
    t1.c1
)
SELECT
  *
FROM
  cte1
  JOIN t3 ON cte1.c1 = t3.c1
WHERE
  t3.c2 = 1

-[ EXPLAIN ]-----------------------------------
HashJoin
├── output columns: [t1.c1 (#0), t3.c2 (#5), t3.c1 (#4)]
├── join type: INNER
├── build keys: [t3.c1 (#4)]
├── probe keys: [cte1.c1 (#0)]
├── keys is null equal: [false]
├── filters: []
├── build join filters:
│   └── filter id:2, build key:t3.c1 (#4), probe key:cte1.c1 (#0), filter type:inlist,min_max
├── estimated rows: 2.00
├── Filter(Build)
│   ├── output columns: [t3.c1 (#4), t3.c2 (#5)]
│   ├── filters: [is_true(t3.c2 (#5) = 1)]
│   ├── estimated rows: 2.00
│   └── TableScan
│       ├── table: default.default.t3
│       ├── output columns: [c1 (#4), c2 (#5)]
│       ├── read rows: 4
│       ├── read size: < 1 KiB
│       ├── partitions total: 1
│       ├── partitions scanned: 1
│       ├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1, bloom pruning: 1 to 1>]
│       ├── push downs: [filters: [is_true(t3.c2 (#5) = 1)], limit: NONE]
│       └── estimated rows: 4.00
└── AggregateFinal(Probe)
    ├── output columns: [t1.c1 (#0)]
    ├── group by: [c1]
    ├── aggregate functions: []
    ├── estimated rows: 3.00
    └── AggregatePartial
        ├── group by: [c1]
        ├── aggregate functions: []
        ├── estimated rows: 3.00
        └── HashJoin
            ├── output columns: [t1.c1 (#0)]
            ├── join type: INNER
            ├── build keys: [t2.c1 (#2), t2.c2 (#3)]
            ├── probe keys: [t1.c1 (#0), t1.c2 (#1)]
            ├── keys is null equal: [false, false]
            ├── filters: []
            ├── build join filters:
            │   ├── filter id:0, build key:t2.c1 (#2), probe key:t1.c1 (#0), filter type:inlist,min_max
            │   └── filter id:1, build key:t2.c2 (#3), probe key:t1.c2 (#1), filter type:inlist,min_max
            ├── estimated rows: 3.00
            ├── TableScan(Build)
            │   ├── table: default.default.t2
            │   ├── output columns: [c1 (#2), c2 (#3)]
            │   ├── read rows: 4
            │   ├── read size: < 1 KiB
            │   ├── partitions total: 1
            │   ├── partitions scanned: 1
            │   ├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1>]
            │   ├── push downs: [filters: [], limit: NONE]
            │   └── estimated rows: 4.00
            └── TableScan(Probe)
                ├── table: default.default.t1
                ├── output columns: [c1 (#0), c2 (#1)]
                ├── read rows: 5
                ├── read size: < 1 KiB
                ├── partitions total: 1
                ├── partitions scanned: 1
                ├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1>]
                ├── push downs: [filters: [], limit: NONE]
                ├── apply join filters: [#2, #0, #1]
                └── estimated rows: 5.00

65 rows explain in 0.367 sec. Processed 0 rows, 0 B (0 row/s, 0 B/s)

Now: the runtime filter built from t3.c1 is only pushed down to t1.
Optimized: the runtime filter built from t3.c1 will also be pushed down to t2.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions