Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow more control over Prometheus metrics collection #3333

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

mateka
Copy link
Contributor

@mateka mateka commented Jan 13, 2025

Description

I have added possibility to define prometheus metrics labels based on task parameters.

Motivation and Context

I wanted to have more grouping options for prometheus monitoring. We use one scheduler for many graphs which share tasks and we wanted to have possibility to discern different graphs.

Have you tested this? If so, how?

I ran example jobs with this code and it works for me.

@mateka mateka requested review from dlstadther and a team as code owners January 13, 2025 15:46
@mateka mateka force-pushed the more-prometheus-monitoring branch from 6cba1e0 to e385b5c Compare January 13, 2025 19:20
Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. I've left some comments mostly about naming and organization. Please take a look and address them or provide response to the feedback.

Also, please see if there are valuable tests you can add to the existing prometheus test file such that any future contributions avoid causing undesirable changes of behavior.

doc/configuration.rst Outdated Show resolved Hide resolved
doc/configuration.rst Outdated Show resolved Hide resolved
luigi/contrib/prometheus_metric.py Outdated Show resolved Hide resolved
doc/configuration.rst Outdated Show resolved Hide resolved
@mateka mateka force-pushed the more-prometheus-monitoring branch from e385b5c to 4c83e51 Compare January 14, 2025 06:59
@mateka mateka force-pushed the more-prometheus-monitoring branch from 4c83e51 to 05a4bc2 Compare January 14, 2025 07:03
@mateka
Copy link
Contributor Author

mateka commented Jan 14, 2025

Thanks for your contribution. I've left some comments mostly about naming and organization. Please take a look and address them or provide response to the feedback.

Wow. Thank you for quick feedback! I addressed your comments.

Also, please see if there are valuable tests you can add to the existing prometheus test file such that any future contributions avoid causing undesirable changes of behavior.

Yes, after pushing my changes I realized that I had not added unit tests. It took a while, because I was not so sure how to make unit tests code nice and without repetitions. Now all cases should be covered.

I am also thinking about extending scheduler Task class by adding metadata dictionary, which could be somehow updated from user provided context in run method of ContextManagedTaskProcess class. This metadata could be used for even more metrics in custom MetricsCollector classes (for instance to measure used RAM or reporting statistics from cloud or map-reduce frameworks). Do you think this is a good direction? Probably it would be even possible to define additional Prometheus metrics based on above mentioned metadata in configuration without custom Prometheus Metrics Collector provided by users!

@mateka mateka requested a review from dlstadther January 14, 2025 09:51
Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my feedback as well as adding tests!

@dlstadther
Copy link
Collaborator

I am also thinking about extending scheduler Task class by adding metadata dictionary, which could be somehow updated from user provided context in run method of ContextManagedTaskProcess class. This metadata could be used for even more metrics in custom MetricsCollector classes (for instance to measure used RAM or reporting statistics from cloud or map-reduce frameworks). Do you think this is a good direction? Probably it would be even possible to define additional Prometheus metrics based on above mentioned metadata in configuration without custom Prometheus Metrics Collector provided by users!

I don't fully picture this extension of Task. However, my mind hasn't been deep into the development of Luigi internals for a few years. I'm mostly here to provide PR reviews and try to maintain code quality and test coverage. If you feel you would like some feedback on your idea, feel free to post an Issue with your idea and try to elicit feedback from various folks who you have seen are active in this project or who have contributed in the past to related features.

Given that the Luigi project doesn't see the same usage and interaction as it did many years ago (in favor or other tools such as Airflow, Prefect, Dagster, and others), I do have some caution about adding too much complexity to Luigi. I want this project to continue to fulfill the needs of its community, but I also don't want to see it become overly complex to maintain (given the few folks still managing the project).

All that said, I welcome any contribution that fills a need and which will become generally applicable to the community.

@dlstadther dlstadther merged commit 6d12bd1 into spotify:master Jan 15, 2025
42 checks passed
@mateka mateka deleted the more-prometheus-monitoring branch January 15, 2025 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants