Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(subscriptions): Allow scheduling watermark mode to be overridden #2791

Merged
merged 3 commits into from
Jun 8, 2022

Conversation

lynnagara
Copy link
Member

This is a temporary change that allows the scheduling watermark mode
to be overridden in the combined scheduler/executor. This is needed for
single tenant as we often have empty partitions there and empty partitions are
incompatible with "global" mode. Once transactions are no longer semantically
partitioned, this will not be necessary and this can be removed.

This is a temporary change that allows the scheduling watermark mode
to be overridden in the combined scheduler/executor. This is needed for
single tenant as we often have empty partitions there and this is incompatible
with "global" mode. Once transactions are no longer semantically partitioned,
this will not be necessary and this can be removed.
@lynnagara lynnagara requested a review from a team as a code owner June 7, 2022 16:57
@lynnagara lynnagara changed the title feat(subscriptions): Allow scheduling watermark mode to be overrideen feat(subscriptions): Allow scheduling watermark mode to be overridden Jun 7, 2022
@onewland
Copy link
Contributor

onewland commented Jun 8, 2022

Can you define what a scheduling watermark mode is?

@@ -70,6 +71,11 @@
help="Skip scheduling if timestamp is beyond this threshold compared to the system time",
)
@click.option("--log-level", help="Logging level to use.")
@click.option(
"--override-partition-mode",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Instead of calling this override-partition-mode, would it make more sense to call it --scheduling-mode. What this means is that if a parameter is provided from the CLI then use that instead of the default configuration. That would make it more explicit what scheduling mode the consumer is running on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, ok

@click.option(
"--override-partition-mode",
type=click.Choice(["partition", "global"]),
help="Skip scheduling if timestamp is beyond this threshold compared to the system time",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

help string has a copy/paste error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@lynnagara
Copy link
Member Author

@nikhars Updated with your feedback

@lynnagara
Copy link
Member Author

Can you define what a scheduling watermark mode is?

@onewland It defines whether scheduling happens either immediately or waits for all partitions. It's described in

"""
The TickBuffer buffers ticks until they are ready to be submitted to
the next processing step.
The behavior of the TickBuffer depends on which of the two scheduler
modes applies.
If the scheduler mode is PARTITION then there is no buffering and a
message is always immediately submitted to the next processing step.
If the scheduler mode is GLOBAL then messages are
buffered until all partitions are at least up to the timestamp of the
tick.
`max_ticks_buffered_per_partition` applies if the scheduler mode is
GLOBAL. Once the maximum ticks is received for that
partition, we start to submit ticks for processing even if that timestamp
is not received for all partitions yet.
This prevents the buffered tick list growing infinitely and scheduling
to grind to a halt if one partition starts falling far behind for some reason.
"""

@codecov-commenter
Copy link

Codecov Report

Merging #2791 (34c4fc4) into master (d96735f) will decrease coverage by 0.10%.
The diff coverage is 60.00%.

@@            Coverage Diff             @@
##           master    #2791      +/-   ##
==========================================
- Coverage   92.87%   92.76%   -0.11%     
==========================================
  Files         610      609       -1     
  Lines       28683    28594      -89     
==========================================
- Hits        26638    26524     -114     
- Misses       2045     2070      +25     
Impacted Files Coverage Δ
snuba/cli/subscriptions_scheduler_executor.py 57.89% <50.00%> (+1.75%) ⬆️
snuba/subscriptions/combined_scheduler_executor.py 78.43% <66.66%> (-0.78%) ⬇️
snuba/settings/settings_distributed.py 0.00% <0.00%> (-100.00%) ⬇️
snuba/settings/settings_test_distributed.py 0.00% <0.00%> (-100.00%) ⬇️
...ts/0010_groupedmessages_onpremise_compatibility.py 95.55% <0.00%> (-4.45%) ⬇️
snuba/optimize.py 87.50% <0.00%> (-3.75%) ⬇️
snuba/migrations/table_engines.py 95.50% <0.00%> (-3.38%) ⬇️
...nsactions_onpremise_fix_orderby_and_partitionby.py 81.33% <0.00%> (-2.67%) ⬇️
tests/migrations/test_runner_individual.py 91.20% <0.00%> (-2.20%) ⬇️
snuba/datasets/entities/clickhouse_upgrade.py 79.23% <0.00%> (-1.54%) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d96735f...34c4fc4. Read the comment docs.

@lynnagara lynnagara enabled auto-merge (squash) June 8, 2022 18:55
@lynnagara lynnagara merged commit 1f77517 into master Jun 8, 2022
@lynnagara lynnagara deleted the override-mode branch June 8, 2022 18:59
JoshFerge pushed a commit that referenced this pull request Jun 13, 2022
…#2791)

This is a temporary change that allows the scheduling watermark mode
to be overridden in the combined scheduler/executor. This is needed for
single tenant as we often have empty partitions there and empty partitions are
incompatible with "global" mode. Once transactions are no longer semantically
partitioned, this will not be necessary and this can be removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants