@dqhl76 (Collaborator) commented Nov 9, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

WIP

part of #18906

Replace the per-block “spill immediately” behavior with a batch/stream policy so that spilled fragments are reasonably sized. This reduces the number of tiny files and the resulting I/O amplification.
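
The batching idea described above can be illustrated with a minimal sketch. It is not the code from this PR: `Block`, `BatchSpiller`, and `threshold_bytes` are hypothetical names, and an in-memory `Vec` stands in for the remote spill files. It only shows the general technique of buffering blocks until a size threshold is reached and then writing them as one fragment.

```rust
/// Hypothetical stand-in for a serialized data block; only its byte size matters here.
struct Block {
    bytes: Vec<u8>,
}

/// Buffers blocks and emits one spill fragment once enough bytes have accumulated.
struct BatchSpiller {
    threshold_bytes: usize,  // assumed tuning knob, not a real Databend setting
    buffered: Vec<Block>,    // blocks waiting to be spilled
    buffered_bytes: usize,   // total size of the buffered blocks
    fragments: Vec<Vec<u8>>, // stands in for files written to remote storage
}

impl BatchSpiller {
    fn new(threshold_bytes: usize) -> Self {
        Self {
            threshold_bytes,
            buffered: Vec::new(),
            buffered_bytes: 0,
            fragments: Vec::new(),
        }
    }

    /// Accept a block; flush only when the buffered size reaches the threshold.
    fn push(&mut self, block: Block) {
        self.buffered_bytes += block.bytes.len();
        self.buffered.push(block);
        if self.buffered_bytes >= self.threshold_bytes {
            self.flush();
        }
    }

    /// Concatenate everything buffered into one fragment and "write" it.
    fn flush(&mut self) {
        if self.buffered.is_empty() {
            return;
        }
        let mut fragment = Vec::with_capacity(self.buffered_bytes);
        for block in self.buffered.drain(..) {
            fragment.extend_from_slice(&block.bytes);
        }
        self.buffered_bytes = 0;
        self.fragments.push(fragment);
    }
}

fn main() {
    let mut spiller = BatchSpiller::new(1024);
    for _ in 0..10 {
        spiller.push(Block { bytes: vec![0u8; 300] });
    }
    spiller.flush(); // write the remainder at end of input
    println!("fragments written: {}", spiller.fragments.len());
}
```

In this toy run, ten 300-byte blocks with a 1024-byte threshold produce three fragments, where a per-block policy would have produced ten.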

uv run compare_query_stats.py 002bb805-0dd8-4bdc-b181-bdda52ff39c2 45450639-3cde-48cf-a519-c617c5dbc34e AggregateFinal
Comparing node 'AggregateFinal':
  Query A: 002bb805-0dd8-4bdc-b181-bdda52ff39c2
  Query B: 45450639-3cde-48cf-a519-c617c5dbc34e
-------------------------------------------------------------------------------
  Metric                Unit                Query A       Query B       Δ (B-A)
-------------------------------------------------------------------------------
  RemoteSpillReadCount  Count                 4,472            86        -4,386
  RemoteSpillReadBytes  Bytes            11,511,760       172,033   -11,339,727
  RemoteSpillReadTime   MillisSeconds         4,012            15        -3,997
-------------------------------------------------------------------------------
uv run compare_query_stats.py 002bb805-0dd8-4bdc-b181-bdda52ff39c2 45450639-3cde-48cf-a519-c617c5dbc34e AggregatePartial
Comparing node 'AggregatePartial':
  Query A: 002bb805-0dd8-4bdc-b181-bdda52ff39c2
  Query B: 45450639-3cde-48cf-a519-c617c5dbc34e
-----------------------------------------------------------------------------------
  Metric                 Unit                 Query A        Query B        Δ (B-A)
-----------------------------------------------------------------------------------
  RemoteSpillWriteCount  Count                     52             86            +34
  RemoteSpillWriteBytes  Bytes             11,511,760         90,575    -11,421,185
  RemoteSpillWriteTime   MillisSeconds            355             44           -311
-----------------------------------------------------------------------------------

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions bot added the pr-refactor label (this PR changes the code base without new features or bugfix) Nov 9, 2025
@dqhl76 force-pushed the partialaggregate-spill branch 2 times, most recently from 359c3c6 to b4dc485, November 9, 2025 12:14
@dqhl76 force-pushed the partialaggregate-spill branch from 67add26 to 5e6d151, November 11, 2025 02:06
@dqhl76 added and removed the ci-cloud label (Build docker image for cloud test) Nov 11, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18943-c2e21fb-1762832908

note: this image tag is only available for internal use.

@dqhl76 added the ci-cloud label (Build docker image for cloud test) Nov 12, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18943-9447b3a-1762948324

note: this image tag is only available for internal use.

@dqhl76 added the ci-benchmark-cloud label (Benchmark: run only cloud tests for tpch/hits) and removed the ci-cloud label (Build docker image for cloud test) Nov 13, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18943-a064856-1763005916

note: this image tag is only available for internal use.

@dqhl76 marked this pull request as ready for review November 13, 2025 09:13
@dqhl76 requested a review from zhang2014 November 13, 2025 09:13
@dqhl76 marked this pull request as draft November 13, 2025 09:14
@dqhl76 marked this pull request as ready for review November 13, 2025 09:16
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@dqhl76 added and removed the ci-benchmark-cloud label (Benchmark: run only cloud tests for tpch/hits) Nov 14, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18943-0667c1f-1763088331

note: this image tag is only available for internal use.

@zhang2014 merged commit 4f03b8d into databendlabs:main Nov 14, 2025
187 of 189 checks passed
@dqhl76 added a commit to dqhl76/databend that referenced this pull request Nov 16, 2025
@dqhl76 deleted the partialaggregate-spill branch November 19, 2025 12:40
Labels

  • ci-benchmark-cloud: Benchmark: run only cloud tests for tpch/hits
  • pr-refactor: this PR changes the code base without new features or bugfix

2 participants