
perf: Avoid recursive union all by name call for duckdb diagonal concat#3399

Closed
FBruzzesi wants to merge 2 commits into `main` from `perf/duckdb-diagonal-concat`

Conversation

@FBruzzesi
Member

FBruzzesi commented Jan 11, 2026

Description

I ran a benchmark locally with different configurations (times are averaged over 3 runs per configuration), with a 50% column-overlap fraction:

| n_frames | n_rows | n_cols | time[s] (main) | time[s] (branch) |
|---:|---:|---:|---:|---:|
| 5 | 10000 | 20 | 0.0591 | 0.0504 |
| 5 | 10000 | 50 | 0.0627 | 0.0603 |
| 5 | 100000 | 20 | 0.1759 | 0.1709 |
| 5 | 100000 | 50 | 0.4753 | 0.4462 |
| 10 | 10000 | 20 | 0.0914 | 0.0710 |
| 10 | 10000 | 50 | 0.2312 | 0.1790 |
| 10 | 100000 | 20 | 0.5308 | 0.5432 |
| 10 | 100000 | 50 | 1.3510 | 1.4438 |
| 20 | 10000 | 20 | 0.3652 | 0.2255 |
| 20 | 10000 | 50 | 0.9011 | 0.5725 |
| 20 | 100000 | 20 | 1.9238 | 1.7864 |
| 20 | 100000 | 50 | 4.9235 | 4.4886 |
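For context, here is a minimal sketch of what "overlap fraction of 50% of the columns" means for the input schemas. The helper name and column-naming scheme are hypothetical, not the actual benchmark code:

```python
def make_schemas(n_frames: int, n_cols: int, overlap_frac: float = 0.5) -> list[list[str]]:
    """Column names per frame: a shared block (the overlap) plus frame-unique columns."""
    n_shared = int(n_cols * overlap_frac)
    shared = [f"shared_{i}" for i in range(n_shared)]
    return [
        shared + [f"f{k}_col_{i}" for i in range(n_cols - n_shared)]
        for k in range(n_frames)
    ]

schemas = make_schemas(n_frames=5, n_cols=20)
# Each frame has 20 columns; any two frames share the 10 "shared_*" columns.
# A diagonal concat of these frames yields the union of all column names
# (here 10 shared + 5 * 10 unique = 60), filling missing columns with NULLs.
union_of_all = sorted({c for cols in schemas for c in cols})
```

The benchmark then times the diagonal concat itself for each `(n_frames, n_rows, n_cols)` configuration.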

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

FBruzzesi added the `performance` and `duckdb` labels Jan 11, 2026
FBruzzesi changed the title from "perf: Avoid recursive union all by name call" to "perf: Avoid recursive union all by name call for duckdb diagonal concat" Jan 11, 2026
@dangotbanned
Member

dangotbanned commented Jan 12, 2026

Thanks @FBruzzesi

Note

take all of this with a grain of salt, as my extent of DuckDB use has been within narwhals 😅

Benchmark

Double thanks for including a benchmark!

I ran a benchmark locally with different configurations (times are averaged over 3 runs per configuration), with a 50% column-overlap fraction:

| n_frames | n_rows | n_cols | time[s] (main) | time[s] (branch) |
|---:|---:|---:|---:|---:|
| 5 | 10000 | 20 | 0.0591 | 0.0504 |
| ... | ... | ... | ... | ... |
| 20 | 100000 | 50 | 4.9235 | 4.4886 |

I am left with some more questions though, which should be easy enough to get to the bottom of 🤞

Configurations

The most general conclusion I can make is that there looks to be more of a benefit as the size and number of datasets increase.

At the lower end, the speedup is narrow, and weirdly the n_frames=10, n_rows=100_000 rows actually regress.

Would you be able to extend the config to give a broader view?

Current

n_frames = 5, 10, 20
n_rows = 10_000, 100_000
n_cols = 20, 50

Proposed

I would hope when we go below the current n_*, we at-worst get the same performance and not regress.

This also extends the upper bounds for 2/3 parameters, where I'm expecting to continue seeing a perf boost 😎

n_frames = 2, 5, 10, 20, 50
n_rows = 100, 1_000, 10_000, 100_000
n_cols = 5, 10, 20, 50

Presentation

time[s] (main) and time[s] (branch) are useful, but the first thing my brain wanted to do was compare the runtimes.

Could you replace these with diff[s] and speedup?

I'm only suggesting to replace, since the table will be pretty big after the other request 😂

@FBruzzesi
Member Author

FBruzzesi commented Jan 12, 2026

Thanks @dangotbanned

take all of this with a grain of salt, as my extent of DuckDB use has been within narwhals 😅

That makes two of us!

Proposed

I would hope when we go below the current n_*, we at-worst get the same performance and not regress.

This also extends the upper bounds for 2/3 parameters, where I'm expecting to continue seeing a perf boost 😎

n_frames = 2, 5, 10, 20, 50
n_rows = 100, 1_000, 10_000, 100_000
n_cols = 5, 10, 20, 50

I have a machine with 16 GB of RAM and a 500 GB drive; I can barely run the (20, 100_000, 50) config, and 50 dataframes would go OOM.

Presentation

time[s] (main) and time[s] (branch) are useful, but the first thing my brain wanted to do was compare the runtimes.

Could you replace these with diff[s] and speedup?

I'm only suggesting to replace, since the table will be pretty big after the other request 😂

This is quite easy to achieve

@dangotbanned
Member

#3399 (comment)

I have a machine with 16 GB of RAM and a 500 GB drive; I can barely run the (20, 100_000, 50) config, and 50 dataframes would go OOM.

Ah yes, that'll teach me for plucking numbers out of thin air 😂

@FBruzzesi
Member Author

FBruzzesi commented Jan 12, 2026

Caution

Removed the conclusions comment, as it was inaccurate.

@FBruzzesi FBruzzesi marked this pull request as draft January 12, 2026 19:02
@FBruzzesi
Member Author

I am going to close this, as it does not seem to yield a meaningful improvement. We can come back to it either once DuckDB supports union by name in the Python API, or via AlignDiagonal (from #3404) and .union.
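For posterity, a rough sketch of the two query shapes this PR was comparing: a recursive pairwise `UNION ALL BY NAME` (each step wrapping the previous result in another subquery) versus a single flat n-way chain. This is plain string-building for illustration; the actual SQL narwhals generates may differ:

```python
def pairwise_union_by_name(tables: list[str]) -> str:
    # Recursive/pairwise: each step wraps the previous result,
    # producing a deeply nested query for DuckDB to plan.
    query = f"SELECT * FROM {tables[0]}"
    for t in tables[1:]:
        query = f"({query}) UNION ALL BY NAME (SELECT * FROM {t})"
    return query

def flat_union_by_name(tables: list[str]) -> str:
    # Flat: one n-way UNION ALL BY NAME chain in a single statement,
    # with no recursive wrapping.
    return " UNION ALL BY NAME ".join(f"SELECT * FROM {t}" for t in tables)

print(flat_union_by_name(["t0", "t1", "t2"]))
# SELECT * FROM t0 UNION ALL BY NAME SELECT * FROM t1 UNION ALL BY NAME SELECT * FROM t2
```

DuckDB's SQL dialect supports `UNION ALL BY NAME` for aligning relations by column name, but (as noted above) the Python relational API did not expose a union-by-name method at the time, which is why the follow-up options involve either that API landing or the AlignDiagonal + `.union` route.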

@FBruzzesi FBruzzesi closed this Jan 25, 2026
@FBruzzesi FBruzzesi deleted the perf/duckdb-diagonal-concat branch January 25, 2026 11:01