feat(meta): Create distributions meta tables #5748

Merged
merged 6 commits into from
Apr 25, 2024

Conversation

evanh
Member

@evanh evanh commented Apr 10, 2024

This creates the local table, distributed table and materialized view for the
meta tables of the distributions cluster.

Depends on #5759
Depends on #5749

@evanh evanh requested a review from a team as a code owner April 10, 2024 16:09

github-actions bot commented Apr 10, 2024

This PR has a migration; here is the generated SQL

-- start migrations

-- forward migration generic_metrics : 0045_distributions_meta_tables
Local op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_distributions/{shard}/default/generic_metric_distributions_meta_local', '{replica}') PRIMARY KEY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) ORDER BY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_dist (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_distributions_meta_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_distributions_meta_mv TO generic_metric_distributions_meta_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS 
                SELECT
                    org_id,
                    project_id,
                    use_case_id,
                    metric_id,
                    tag_key,
                    toMonday(timestamp) as timestamp,
                    retention_days,
                    sumState(count_value) as count
                FROM generic_metric_distributions_raw_local
                ARRAY JOIN tags.key AS tag_key
                WHERE record_meta = 1
                GROUP BY
                    org_id,
                    project_id,
                    use_case_id,
                    metric_id,
                    tag_key,
                    timestamp,
                    retention_days
                ;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_tag_values_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_distributions/{shard}/default/generic_metric_distributions_meta_tag_values_local', '{replica}') PRIMARY KEY (project_id, metric_id, tag_key, tag_value, timestamp) ORDER BY (project_id, metric_id, tag_key, tag_value, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_tag_values_dist (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_distributions_meta_tag_values_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_distributions_meta_tag_values_mv TO generic_metric_distributions_meta_tag_values_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS 
                SELECT
                    project_id,
                    metric_id,
                    tag_key,
                    tag_value,
                    toMonday(timestamp) as timestamp,
                    retention_days,
                    sumState(count_value) as count
                FROM generic_metric_distributions_raw_local
                ARRAY JOIN
                    tags.key AS tag_key, tags.raw_value AS tag_value
                WHERE record_meta = 1
                GROUP BY
                    project_id,
                    metric_id,
                    tag_key,
                    tag_value,
                    timestamp,
                    retention_days
                ;
-- end forward migration generic_metrics : 0045_distributions_meta_tables




-- backward migration generic_metrics : 0045_distributions_meta_tables
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_mv;
Distributed op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_dist;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_local;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_mv;
Distributed op: DROP TABLE IF EXISTS generic_metric_distributions_meta_dist;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_local;
-- end backward migration generic_metrics : 0045_distributions_meta_tables
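For readers less familiar with ClickHouse, here is a rough pure-Python sketch of what the first materialized view (`generic_metric_distributions_meta_mv`) appears to compute: one output row per tag key per week, with `count_value` summed. It is an illustration only, not Snuba code. The row dictionaries, the `tag_keys` field name, and both function names are assumptions made for this example, and it assumes the `GROUP BY timestamp` resolves to the aliased week start.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def to_monday(ts: datetime) -> datetime:
    """Round a timestamp down to midnight on the Monday of its week."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return day - timedelta(days=day.weekday())


def aggregate_meta(rows):
    """Mimic the view: filter, expand tag keys, group by week, sum counts."""
    totals = defaultdict(float)
    for row in rows:
        if row["record_meta"] != 1:  # WHERE record_meta = 1
            continue
        for tag_key in row["tag_keys"]:  # ARRAY JOIN tags.key AS tag_key
            key = (
                row["org_id"],
                row["project_id"],
                row["use_case_id"],
                row["metric_id"],
                tag_key,
                to_monday(row["timestamp"]),
                row["retention_days"],
            )
            totals[key] += row["count_value"]  # sumState(count_value)
    return dict(totals)
```

The real table stores an aggregate state (`AggregateFunction(sum, Float64)`) rather than a plain number, so reads need `sumMerge(count)` to finalize it; the sketch collapses that step into a simple running sum.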

storage_set=self.storage_set_key,
table_name=self.meta_dist_table_name,
engine=table_engines.Distributed(
local_table_name=self.meta_local_table_name, sharding_key=None
Member

dist table name?

Member Author

This setting tells the Distributed engine which local table on the storage nodes to use. The dist table name is set by table_name above.
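To illustrate the reply above, here is a minimal Python sketch of where the two names end up in the generated DDL. `distributed_table_ddl` is a hypothetical helper written for this example, not part of Snuba's migration API; the cluster and database defaults are copied from the generated SQL earlier in this thread.

```python
def distributed_table_ddl(
    dist_table: str,
    local_table: str,
    columns: str,
    cluster: str = "cluster_one_sh",
    database: str = "default",
) -> str:
    """Build a CREATE TABLE statement for a Distributed table.

    dist_table is the table being created; local_table is the per-node
    table that the Distributed engine forwards reads and writes to.
    """
    return (
        f"CREATE TABLE IF NOT EXISTS {dist_table} ({columns}) "
        f"ENGINE Distributed(`{cluster}`, {database}, {local_table});"
    )


print(
    distributed_table_ddl(
        dist_table="generic_metric_distributions_meta_dist",
        local_table="generic_metric_distributions_meta_local",
        columns="org_id UInt64",  # column list shortened for the example
    )
)
```

The dist name appears once, as the table being created, and the local name appears only inside the engine arguments, which matches the `Distributed op` lines in the generated SQL.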

storage_set=self.storage_set_key,
table_name=self.tag_value_dist_table_name,
engine=table_engines.Distributed(
local_table_name=self.tag_value_local_table_name, sharding_key=None
Member

dist table name?

Member Author

See above.


codecov bot commented Apr 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (master@eadc30b).

✅ All tests successful. No failed tests found ☺️

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #5748   +/-   ##
=========================================
  Coverage          ?   92.01%           
=========================================
  Files             ?      877           
  Lines             ?    42330           
  Branches          ?        0           
=========================================
  Hits              ?    38948           
  Misses            ?     3382           
  Partials          ?        0           

Member

@nikhars nikhars left a comment

Approving the changes. Please fix the ordering of the migrations list as you merge these PRs, since there will be conflicts in the group loader definition.
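One way to catch an ordering mistake after resolving such conflicts is a quick check over the migration names. The sketch below assumes the group loader lists migrations by name and that numeric prefixes should strictly increase; `out_of_order` is a hypothetical helper, and every name in the example other than `0045_distributions_meta_tables` is a placeholder.

```python
import re


def out_of_order(migrations):
    """Return names whose numeric prefix does not exceed every earlier one."""
    misplaced = []
    highest = -1
    for name in migrations:
        match = re.match(r"(\d+)_", name)
        if match is None:
            raise ValueError(f"no numeric prefix in {name!r}")
        number = int(match.group(1))
        if number <= highest:
            misplaced.append(name)
        highest = max(highest, number)
    return misplaced


# Placeholder names, apart from the migration added in this PR.
merged_list = [
    "0044_placeholder_a",
    "0046_placeholder_b",
    "0045_distributions_meta_tables",
]
print(out_of_order(merged_list))
```

An empty result means the list is in ascending order; any returned name sits after a migration with an equal or higher number.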

@evanh evanh merged commit 1fdf3af into master Apr 25, 2024
29 checks passed
@evanh evanh deleted the evanh/feat/distributions-meta-tables branch April 25, 2024 13:11
3 participants