Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(query): Multiple states Aggregate function #17148

Merged
merged 39 commits into from
Jan 20, 2025

Conversation

forsaken628
Copy link
Collaborator

@forsaken628 forsaken628 commented Dec 31, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  • Introduced AggrStateRegistry and get_states_layout to reduce the memory consumption of aggregated states.
  • Changed memory allocation in Payload::append_rows from one-by-one to array.
  • Fix udaf memory leak.

fixes: #16984

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Dec 31, 2024
@forsaken628 forsaken628 added the ci-benchmark Benchmark: run all test label Jan 9, 2025
Copy link
Contributor

github-actions bot commented Jan 9, 2025

Docker Image for PR

  • tag: pr-17148-0a4a840-1736399623

note: this image tag is only available for internal use,
please check the internal doc for more details.

@forsaken628 forsaken628 added the ci-benchmark Benchmark: run all test label Jan 13, 2025
Copy link
Contributor

Docker Image for PR

  • tag: pr-17148-e31eca7-1736788050

note: this image tag is only available for internal use,
please check the internal doc for more details.

@forsaken628
Copy link
Collaborator Author

Benchmark

table

CREATE TABLE t1 (
    category_id INT NOT NULL,
    region_id INT NOT NULL,
    user_id BIGINT NOT NULL,
    product_id INT NOT NULL,
    transaction_date DATETIME NOT NULL,
    transaction_amount DECIMAL(10,2) NOT NULL,
    discount_amount DECIMAL(10,2) NOT NULL,
    is_returned BOOLEAN NOT NULL DEFAULT 0,
    inventory_count INT NOT NULL,
    last_updated TIMESTAMP NOT NULL
);

query

SELECT category_id,     region_id,     user_id,     product_id,     
   COUNT(*) AS transaction_count,    
   SUM(transaction_amount) AS total_transaction_amount,     
   SUM(discount_amount) AS total_discount_amount,     
   MAX(transaction_date) AS latest_transaction_date,     
   MAX(inventory_count) AS max_inventory,
   MAX(transaction_amount) AS max_returned_transaction_amount 
FROM      t1 
GROUP BY category_id, region_id, user_id,  product_id
ignore_result;

states layout

this pr:

StatesLayout { 
  layout: Layout { size: 93, align: 8 },
  // (index,offset)
  states_loc: [[Custom(0, 0)], [Custom(1, 8), Bool(2, 88)], [Custom(3, 24), Bool(4, 89)], [Custom(5, 40), Bool(6, 90)], [Custom(7, 80), Bool(8, 91)], [Custom(9, 56), Bool(10, 92)]]
}

before:

layout: Layout { size: 128, align: 8 }) 
offsets: [0, 8, 32, 56, 80, 96]

memory usage
this pr:
this pr

before:
before

@forsaken628 forsaken628 removed the ci-benchmark Benchmark: run all test label Jan 14, 2025
@forsaken628 forsaken628 marked this pull request as ready for review January 14, 2025 03:20
@forsaken628 forsaken628 requested a review from sundy-li January 14, 2025 03:20
Copy link
Member

@sundy-li sundy-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM. Please fix the comments and conflicts.

@forsaken628 forsaken628 requested a review from sundy-li January 17, 2025 12:07
@forsaken628 forsaken628 added this pull request to the merge queue Jan 17, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 17, 2025
@forsaken628 forsaken628 added this pull request to the merge queue Jan 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 20, 2025
@BohuTANG BohuTANG merged commit 625e31a into databendlabs:main Jan 20, 2025
70 checks passed
@forsaken628 forsaken628 deleted the or_null branch January 22, 2025 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-refactor this PR changes the code base without new features or bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature: improve OrNull aggregate function adaptor
3 participants