Quick clickbench with a smaller dataset #12455

Rachelint · 2024-09-13T16:15:19Z

Is your feature request related to a problem or challenge?

To measure the performance improvement of #11827, we need some extended queries that combine more complex UDAFs (like median, approx_median) with high-cardinality group by (#12438).

But I found that such queries can't run to completion on my local machine. After debugging, I found this is due to their large intermediate results, which fill memory rapidly, leading to swapping or OOM.

However, when I run them on a subset with only 15% of the whole ClickBench dataset, they finish successfully and reflect the improvement (see #11827 (comment)).

I think we may need a ClickBench variant with a smaller dataset (like tpch 1, tpch 10...) in some situations.

Describe the solution you'd like

Support generating a smaller subset of the whole ClickBench dataset, so that we can run queries on it.
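As a rough illustration of the idea (not how the 15% subset mentioned above was produced, and not an existing benchmark script), a reproducible subset could be drawn by keeping each row independently with a fixed probability. The helper name `sample_rows` and its parameters are hypothetical, and a real implementation would read and write Parquet rather than iterate over in-memory rows.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_rows(rows: Iterable[T], fraction: float, seed: int = 0) -> Iterator[T]:
    """Yield each row independently with probability `fraction` (Bernoulli sampling).

    Hypothetical helper for illustration only; not part of any existing tool.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the same subset is produced every run
    for row in rows:
        # rng.random() is uniform on [0, 1), so this keeps about `fraction` of the rows
        if rng.random() < fraction:
            yield row


if __name__ == "__main__":
    # Stand-in for the real dataset: keep roughly 15% of 100,000 row ids.
    subset = list(sample_rows(range(100_000), fraction=0.15, seed=42))
    print(f"kept {len(subset)} of 100000 rows")
```

Note that row-level sampling like this also lowers the group-by cardinality, so results on the subset only approximate the behaviour on the full dataset.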

Describe alternatives you've considered

No response

Additional context

No response

alamb · 2024-09-13T18:59:06Z

I would like to troll / 🐟 for improvements: instead of making the benchmark easier, let's spend our time reducing the size of the intermediate state for those aggregates :)

Rachelint · 2024-09-14T03:33:47Z

> I would like to troll / 🐟 for improvements: instead of making the benchmark easier, let's spend our time reducing the size of the intermediate state for those aggregates :)

🤔 Makes sense, solving the real problem may be more valuable.

Rachelint added the enhancement New feature or request label Sep 13, 2024
