recreate base task metrics with prometheus #144
Codecov Report
Changes have been made to critical files, which contain lines commonly executed in production.
@@ Coverage Diff @@
## main #144 +/- ##
==========================================
- Coverage 98.37% 98.35% -0.02%
==========================================
Files 374 374
Lines 27929 28048 +119
==========================================
+ Hits 27475 27587 +112
- Misses 454 461 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 1 file with indirect coverage changes
Codecov Report
@@ Coverage Diff @@
## main #144 +/- ##
=======================================
Coverage ? 98.40%
=======================================
Files ? 346
Lines ? 27100
Branches ? 0
=======================================
Hits ? 26669
Misses ? 431
Partials ? 0
Codecov Report
@@ Coverage Diff @@
## main #144 +/- ##
==========================================
- Coverage 98.41% 98.39% -0.03%
==========================================
Files 348 348
Lines 27433 27464 +31
==========================================
+ Hits 26998 27022 +24
- Misses 435 442 +7
)
TASK_TIME_IN_QUEUE = Histogram(
    "worker_tasks_timers_time_in_queue_seconds",
    "Time in {TODO} spent waiting in the queue before being run",
"Time in {TODO}"?...
oh thank you! i am used to a linter that stops me from committing if there are TODO comments that aren't accompanied with a task number haha. i'll see if i can set that up with precommit. this was a stand-in for units, which i know now and will update before merging
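For reference, a minimal sketch of a histogram whose help text states its unit, assuming the unit is seconds as the `_seconds` suffix in the metric name suggests. The metric name `example_task_time_in_queue_seconds`, the `task` label, and the helper `record_queue_wait` are hypothetical and not taken from the PR.

```python
import time

from prometheus_client import CollectorRegistry, Histogram

# A private registry keeps this sketch away from the global default registry.
registry = CollectorRegistry()

# Hypothetical metric: the "_seconds" suffix and the help text both name the unit.
EXAMPLE_TIME_IN_QUEUE = Histogram(
    "example_task_time_in_queue_seconds",
    "Time in seconds spent waiting in the queue before being run",
    ["task"],
    registry=registry,
)


def record_queue_wait(task_name, enqueued_at, started_at):
    """Observe how long one task run waited in the queue, in seconds."""
    wait = max(0.0, started_at - enqueued_at)
    EXAMPLE_TIME_IN_QUEUE.labels(task=task_name).observe(wait)
    return wait


now = time.time()
waited = record_queue_wait("demo", enqueued_at=now - 2.5, started_at=now)
```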
    @property
    def metrics_prefix(self):
        return f"worker.task.{self.name}"

    def __init_subclass__(cls, name=None):
I didn't know this was a thing.... interesting
yeah i haven't gotten into metaclasses yet but i guess this is like metaclasses-lite?
another approach i could maybe have taken (assuming self.__class__ will refer to a subclass and not the base class) is something like:

    from prometheus_client import Counter

    # metric names can't contain spaces; prometheus_client takes `labelnames`
    global_run_counter = Counter('number_of_runs', 'docstring', labelnames=['task'])
    global_retry_counter = Counter('number_of_retries', 'docstring', labelnames=['task'])
    ...

    class PerTaskMetrics:
        def __init__(self, name):
            self.task_run_counter = global_run_counter.labels(task=name)
            self.task_retry_counter = global_retry_counter.labels(task=name)
            ...

    class BaseTask:
        @property
        def metrics(self):
            # lazily initialize the per-task metrics; check the class's own
            # __dict__ so a subclass doesn't pick up a parent's _metrics
            if '_metrics' not in vars(self.__class__):
                self.__class__._metrics = PerTaskMetrics(self.name)
            return self.__class__._metrics

        async def apply_async(self, ...):
            self.metrics.task_run_counter.inc()
            ...

i don't really know what the tradeoffs are, but i have a weak preference for the approach taken in the PR
I agree. the approach in the PR is quite elegant
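For readers who haven't met `__init_subclass__` either, here is a small standard-library sketch of the hook. It is not the PR's implementation: `RUNS` is a plain `collections.Counter` standing in for a labelled Prometheus counter, and `metrics_key`, `UploadTask`, and `NotifyTask` are hypothetical names.

```python
from collections import Counter

# Stand-in for a labelled Prometheus counter; just a tally keyed by prefix.
RUNS = Counter()


class BaseTask:
    name = None

    def __init_subclass__(cls, name=None, **kwargs):
        # Called once per subclass definition, before any instance exists.
        super().__init_subclass__(**kwargs)
        cls.name = name or cls.__name__
        cls.metrics_key = f"worker.task.{cls.name}"
        RUNS[cls.metrics_key] = 0  # register the per-task series up front

    def run(self):
        RUNS[self.metrics_key] += 1


class UploadTask(BaseTask, name="upload"):
    pass


class NotifyTask(BaseTask):
    pass


UploadTask().run()
UploadTask().run()
NotifyTask().run()
```

Because the hook fires at class-definition time, each subclass gets its own entry without a metaclass and without any lazy check at call time.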
depends on codecov/shared#64
by default, worker runs with 2 celery worker processes. we thus have to use prometheus's multiprocess mode to report our metrics. when deploying, we should set the $PROMETHEUS_MULTIPROC_DIR env var to something like /var/tmp/prometheus.
this PR mirrors the statsd base task metrics which include:
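As a rough sketch of the multiprocess setup mentioned in the description, assuming a recent `prometheus_client` that reads the upper-case `PROMETHEUS_MULTIPROC_DIR` variable: the temporary directory, the counter `example_task_runs`, and the helper `collect_all_processes` are placeholders for illustration, not the worker's real configuration.

```python
import os
import tempfile

# Assumption for the demo: a throwaway directory. A deployment would point this
# at a fixed path instead. Set it before importing prometheus_client, because
# the library picks its value backend at import time.
os.environ["PROMETHEUS_MULTIPROC_DIR"] = tempfile.mkdtemp()

from prometheus_client import CollectorRegistry, Counter, generate_latest
from prometheus_client import multiprocess

# Hypothetical counter; each process writes its samples to files in the directory.
EXAMPLE_RUNS = Counter("example_task_runs", "Number of task runs", ["task"])
EXAMPLE_RUNS.labels(task="demo").inc()


def collect_all_processes():
    """Aggregate the samples written by every process into one exposition."""
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return generate_latest(registry)


output = collect_all_processes()
```

The scrape endpoint builds a fresh registry on each request so that it reports the merged view rather than only the metrics of the process that happens to serve the request.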
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.