[CapMan visibility] emit a warning when a query gets throttled by the capacity management system #73442

Merged
merged 1 commit into from
Jun 28, 2024


xurui-c

@xurui-c xurui-c commented Jun 27, 2024

https://getsentry.atlassian.net/browse/SNS-2799

Allocation policies are our mechanism for doing traffic management for Snuba queries. Currently, the result of applying allocation policies in the internal API is simply accept/reject/throttle, and Snuba sends back a payload to Sentry that contains metadata about those policy decisions. When a query is throttled, we want to log a warning to GCP and also emit a warning to Sentry Issues.
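To make the intended behavior concrete, here is a minimal sketch of the check on the Sentry side. It is not the code from this PR: the `policy_decision` payload shape, the function name `warn_if_throttled`, and the logger name are all hypothetical, chosen only for illustration. It uses the standard `logging` module; in a real deployment the warning would presumably also be forwarded to Sentry Issues through whatever SDK integration is configured.

```python
import logging
from typing import Any, Mapping

# Hypothetical logger name, for illustration only.
logger = logging.getLogger("snuba.throttle_sketch")


def warn_if_throttled(response_body: Mapping[str, Any], referrer: str) -> bool:
    """Log a warning when the policy metadata reports a throttled query.

    Returns True if a warning was emitted, False otherwise.
    """
    # Assumed (hypothetical) payload shape:
    #   {"policy_decision": {"status": "throttled", "policy": "SomePolicy"}}
    decision = response_body.get("policy_decision") or {}
    if decision.get("status") != "throttled":
        return False

    # Structured fields travel in `extra` so a log backend can index them.
    logger.warning(
        "Query throttled by allocation policy",
        extra={"referrer": referrer, "policy": decision.get("policy")},
    )
    return True
```

Accepted queries and responses without policy metadata produce no warning, so only the throttle path adds logging overhead.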

@xurui-c xurui-c requested review from a team as code owners June 27, 2024 17:06

sentry-io bot commented Jun 27, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: src/sentry/utils/snuba.py

| Function | Unhandled Issue | Event Count |
| --- | --- | --- |
| _bulk_snuba_query | RateLimitExceeded: Query on could not be run due to allocation policies, info: {'details': {'ReferrerGuardRailPolicy... ... | 181 |
| _bulk_snuba_query | RateLimitExceeded: Query on could not be run due to allocation policies, info: {'details': {'ConcurrentRateLimitAllo... ... | 64 |
| _bulk_snuba_query | QueryMemoryLimitExceeded: DB::Exception: Received from snuba-outcomes-mz-1-3:9000. DB::Exception: Memory limit (for query) ... ... | 49 |


@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 27, 2024

codecov bot commented Jun 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.01%. Comparing base (358f803) to head (40dd852).
Report is 46 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #73442    +/-   ##
========================================
  Coverage   78.01%   78.01%            
========================================
  Files        6634     6637     +3     
  Lines      296698   296856   +158     
  Branches    51095    51120    +25     
========================================
+ Hits       231455   231601   +146     
- Misses      58860    58867     +7     
- Partials     6383     6388     +5     
Files Coverage Δ
src/sentry/utils/snuba.py 89.66% <100.00%> (+0.07%) ⬆️

... and 112 files with indirect coverage changes

@xurui-c xurui-c merged commit ffe6395 into master Jun 28, 2024
50 checks passed
@xurui-c xurui-c deleted the rachel/throttleWarningMetric branch June 28, 2024 21:22
nhsiehgit pushed a commit that referenced this pull request Jul 1, 2024
Snuba throttles on the order of 10k-20k queries. A [previous
PR](#73442) emits a warning to
Sentry and a log to GCP for every throttled query, and thus has the
potential to overwhelm Sentry/GCP. This PR removes the warnings and logs
so that they do not cause an incident.

Co-authored-by: Rachel Chen <[email protected]>
xurui-c added a commit that referenced this pull request Jul 3, 2024
…c) (#73709)

Allocation policies are our mechanism for doing traffic management for
Snuba queries. Currently, the result of applying multiple allocation
policies in the internal API is simply accept/reject/throttle.

In #73442, we emit a Sentry
warning and a GCP log for every query throttled by Snuba's capacity
management system. However, we quickly realized that [a large volume of
queries were
throttled](https://sentry.sentry.io/issues/5550501231/?project=1&query=&referrer=issue-stream&statsPeriod=24h&stream_index=5).

To address this, we sample the throttled queries so that we emit a
Sentry warning and a GCP log for only 1% of all throttled
queries, across all referrers and policies.

Co-authored-by: Rachel Chen <[email protected]>
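
The 1% sampling described in the commit message above could look roughly like the following. This is a sketch under assumptions, not the code from #73709: the constant and function names are hypothetical, and uniform random sampling via `random.random()` is just one straightforward way to reach a 1% rate across all referrers and policies.

```python
import random
from typing import Callable

# 1% sample rate, as stated in the commit message. The name is hypothetical.
THROTTLE_WARNING_SAMPLE_RATE = 0.01


def should_emit_throttle_warning(
    sample_rate: float = THROTTLE_WARNING_SAMPLE_RATE,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether this throttled query should produce a warning and log.

    `rng` must return a float in [0.0, 1.0); it is injectable so the
    decision can be tested deterministically.
    """
    return rng() < sample_rate
```

A caller would guard the warning and log calls with `if should_emit_throttle_warning(): ...`. Because the decision ignores referrer and policy, high-volume referrers will dominate the sampled warnings, which matches the "across all referrers and policies" wording.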
@github-actions github-actions bot locked and limited conversation to collaborators Jul 14, 2024
@xurui-c xurui-c changed the title Emit a warning when a query gets throttled by the capacity management system [CapMan visibility] emit a warning when a query gets throttled by the capacity management system Oct 31, 2024