feat(consumer): Remove parallel collect option #3555
…tistorage consumer
The KafkaConsumerStrategyFactory is slated for deprecation. This change stops using it in the multistorage consumer and removes the need for the generic `ConsumerStrategyFactory`, which takes a MultistorageKafkaPayload instead of a KafkaPayload. Soon we will be adding a new processing step for payload decoding and schema validation; it will sit between the DLQ and the rest of the message processing steps. This change will make it easier to introduce the decoder/validator, since it needs to happen in different places in the main consumer vs. the multistorage consumer.
It is always better to use parallel collect rather than collect in Snuba consumers. It was already the default. Let's simplify and just remove the option.
Codecov Report
Base: 92.23% // Head: 92.22% // Decreases project coverage by -0.01%
Additional details and impacted files
@@ Coverage Diff @@
## master #3555 +/- ##
==========================================
- Coverage 92.23% 92.22% -0.01%
==========================================
Files 724 724
Lines 33789 33782 -7
==========================================
- Hits 31164 31156 -8
- Misses 2625 2626 +1
snuba/consumers/consumer.py
Outdated
CommitOffsets(commit),
self.__max_batch_size,
self.__max_batch_time,
10.0,
nit: the meaning of 10.0 is not super clear in this call. Could we do one of:
- use named arguments rather than positional here
- make a named constant, `DEFAULT_PARALLEL_COLLECT_TIMEOUT_MS = 10.0`
In general I think named over positional is better when the argument count is > 3. It would be easy to transpose `self.__max_batch_time` and `DEFAULT_PARALLEL_COLLECT_TIMEOUT_MS`, right (same type)?
Updated to a named argument. The 10.0 timeout value happens to be the default in Arroyo, so another option would have been to omit it. But it's probably not a bad idea to be explicit about it here.
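For readers following the thread, here is a minimal sketch of the point being discussed: two adjacent float parameters are easy to transpose when passed positionally, and a keyword-only parameter rules that out. Everything in it is hypothetical and for illustration only: `CollectConfig`, `build_collect_config`, and `DEFAULT_WAIT_TIMEOUT` are made-up names, not Snuba or Arroyo APIs, and only the 10.0 value comes from the discussion above.

```python
from dataclasses import dataclass

# Hypothetical constant name; 10.0 is the value discussed in the review thread.
DEFAULT_WAIT_TIMEOUT = 10.0


@dataclass(frozen=True)
class CollectConfig:
    """Illustrative stand-in for a collect step's settings (not Arroyo's API)."""

    max_batch_size: int
    max_batch_time: float
    wait_timeout: float


def build_collect_config(
    max_batch_size: int,
    max_batch_time: float,
    *,  # everything after this marker must be passed by name
    wait_timeout: float = DEFAULT_WAIT_TIMEOUT,
) -> CollectConfig:
    """Build the config; wait_timeout cannot be supplied positionally."""
    return CollectConfig(max_batch_size, max_batch_time, wait_timeout)


# Explicit and implicit forms are equivalent while the default stays 10.0.
explicit = build_collect_config(1000, 2.0, wait_timeout=10.0)
implicit = build_collect_config(1000, 2.0)
```

With the keyword-only marker, a call such as `build_collect_config(1000, 2.0, 10.0)` raises `TypeError` instead of silently accepting a third positional float. Passing the value explicitly, as the PR does, also keeps the call site stable if the library default ever changes.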
It is always better to use parallel collect rather than collect in Snuba consumers. It was already the default. Let's simplify and just remove the option (that was never used anyway).
This change depends on #3547