feat(meta): Create distributions meta tables #5748
This creates the local table, distributed table and materialized view for the meta tables of the distributions cluster.
This PR has a migration; here is the generated SQL:
-- start migrations
-- forward migration generic_metrics : 0045_distributions_meta_tables
Local op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_distributions/{shard}/default/generic_metric_distributions_meta_local', '{replica}') PRIMARY KEY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) ORDER BY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_dist (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_distributions_meta_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_distributions_meta_mv TO generic_metric_distributions_meta_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
toMonday(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_distributions_raw_local
ARRAY JOIN tags.key AS tag_key
WHERE record_meta = 1
GROUP BY
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
timestamp,
retention_days
;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_tag_values_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_distributions/{shard}/default/generic_metric_distributions_meta_tag_values_local', '{replica}') PRIMARY KEY (project_id, metric_id, tag_key, tag_value, timestamp) ORDER BY (project_id, metric_id, tag_key, tag_value, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_distributions_meta_tag_values_dist (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_distributions_meta_tag_values_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_distributions_meta_tag_values_mv TO generic_metric_distributions_meta_tag_values_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
project_id,
metric_id,
tag_key,
tag_value,
toMonday(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_distributions_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
project_id,
metric_id,
tag_key,
tag_value,
timestamp,
retention_days
;
-- end forward migration generic_metrics : 0045_distributions_meta_tables
-- backward migration generic_metrics : 0045_distributions_meta_tables
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_mv;
Distributed op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_dist;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_tag_values_local;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_mv;
Distributed op: DROP TABLE IF EXISTS generic_metric_distributions_meta_dist;
Local op: DROP TABLE IF EXISTS generic_metric_distributions_meta_local;
-- end backward migration generic_metrics : 0045_distributions_meta_tables
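To make the first materialized view easier to reason about, here is a rough pure-Python sketch of the aggregation it describes. It is not Snuba or ClickHouse code: the row dictionaries, the `tag_keys` field, and the helper names `to_monday` and `aggregate_meta` are assumptions made for illustration, and it assumes the GROUP BY uses the Monday-truncated timestamp.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def to_monday(ts: datetime) -> datetime:
    """Truncate a timestamp to midnight on the Monday of its week."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return day - timedelta(days=day.weekday())


def aggregate_meta(rows):
    """Mimic the meta materialized view over an iterable of raw-row dicts.

    Each row is a hypothetical dict standing in for a row of
    generic_metric_distributions_raw_local.
    """
    totals = defaultdict(float)
    for row in rows:
        if row["record_meta"] != 1:  # WHERE record_meta = 1
            continue
        for tag_key in row["tag_keys"]:  # ARRAY JOIN tags.key AS tag_key
            key = (
                row["org_id"],
                row["project_id"],
                row["use_case_id"],
                row["metric_id"],
                tag_key,
                to_monday(row["timestamp"]),
                row["retention_days"],
            )
            totals[key] += row["count_value"]  # sumState(count_value)
    return dict(totals)
```

Each raw row contributes one output row per tag key, and all rows falling in the same week collapse into a single weekly count per key, which is why the meta table stays small relative to the raw table.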
storage_set=self.storage_set_key,
table_name=self.meta_dist_table_name,
engine=table_engines.Distributed(
    local_table_name=self.meta_local_table_name, sharding_key=None
dist table name?
This setting tells the Distributed engine which local table on each storage node to read from and write to. The dist table name is passed as table_name above.
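A small sketch of how the two names end up in the generated DDL may help here. This is not the Snuba `table_engines` implementation; `distributed_ddl` and its parameters are hypothetical, written only to mirror the shape of the "Distributed op" statements in the generated SQL.

```python
def distributed_ddl(table_name, columns, cluster, database,
                    local_table_name, sharding_key=None):
    """Build a CREATE TABLE statement for a Distributed table.

    table_name is the dist table being created; local_table_name is the
    per-node table the Distributed engine forwards reads and writes to.
    """
    engine_args = [f"`{cluster}`", database, local_table_name]
    if sharding_key is not None:
        engine_args.append(sharding_key)
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} ({columns}) "
        f"ENGINE Distributed({', '.join(engine_args)});"
    )


# The dist name goes in CREATE TABLE, the local name goes in the engine.
ddl = distributed_ddl(
    table_name="generic_metric_distributions_meta_dist",
    columns="org_id UInt64",
    cluster="cluster_one_sh",
    database="default",
    local_table_name="generic_metric_distributions_meta_local",
)
```

So the dist table name appears once, as the table being created, while the local table name appears only inside the engine arguments.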
storage_set=self.storage_set_key,
table_name=self.tag_value_dist_table_name,
engine=table_engines.Distributed(
    local_table_name=self.tag_value_local_table_name, sharding_key=None
dist table name?
See above.
Codecov Report
All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## master #5748 +/- ##
=========================================
Coverage ? 92.01%
=========================================
Files ? 877
Lines ? 42330
Branches ? 0
=========================================
Hits ? 38948
Misses ? 3382
Partials ? 0
snuba/snuba_migrations/generic_metrics/0036_distributions_meta_tables.py
Approving the changes. Please fix the ordering of the migrations list as you merge them, since there will be conflicts in the group loader definition.
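One way to catch ordering mistakes after resolving such conflicts is a quick check over the list of migration names. The sketch below is hypothetical and not part of Snuba: `check_migration_order` is an invented helper, and it assumes migration names start with a four-digit prefix followed by an underscore, as in 0045_distributions_meta_tables.

```python
import re


def check_migration_order(names):
    """Return a list of problems found in an ordered list of migration names."""
    problems = []
    seen = {}       # numeric prefix -> name that last used it
    previous = -1   # highest prefix seen so far
    for name in names:
        match = re.match(r"(\d{4})_", name)
        if match is None:
            problems.append(f"{name}: missing numeric prefix")
            continue
        number = int(match.group(1))
        if number in seen:
            problems.append(f"{name}: prefix reused by {seen[number]}")
        elif number < previous:
            problems.append(f"{name}: listed after a higher-numbered migration")
        seen[number] = name
        previous = max(previous, number)
    return problems
```

An empty result means the prefixes are unique and strictly increasing; any other result names the entry to renumber or move.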
Depends on #5759
Depends on #5749