-
Notifications
You must be signed in to change notification settings - Fork 7k
[Serve] lazily evaluate autoscaling context #58963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: abrar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors AutoscalingContext to support lazy evaluation of expensive metrics, which is a great optimization for the controller's CPU usage. The implementation is mostly correct, but I've found a critical bug and some inconsistencies in type hints that should be addressed.
Specifically:
- There's a
TypeErrorin thetotal_running_requestsproperty due to calling properties as methods. - The type hints for some
__init__parameters and property setters inAutoscalingContextare inconsistent with their actual usage, which can lead to confusion and issues with static analysis tools.
I've left specific comments with suggestions to fix these issues. Once they are addressed, this PR should be good to go.
Signed-off-by: abrar <[email protected]>
| def total_running_requests(self) -> float: | ||
| # NOTE: for non additive aggregation functions, total_running_requests is not | ||
| # accurate, consider this is a approximation. | ||
| return self.total_num_requests - self.total_queued_requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Null handling missing in total_running_requests property
The total_running_requests property computes total_num_requests - total_queued_requests, but total_queued_requests can be None (as indicated by the Optional type in __init__ at line 65). When total_queued_requests is None, accessing total_running_requests will raise a TypeError attempting to subtract None from a float. The old dataclass implementation explicitly allowed None for total_running_requests, but the new computed property doesn't handle this case.
Signed-off-by: abrar <[email protected]>
Signed-off-by: abrar <[email protected]>
autoscaling context need expensive function evaluation, not all autoscaling policies need the data. Lazily evaluate them to save controller CPU --------- Signed-off-by: abrar <[email protected]>
autoscaling context need expensive function evaluation, not all autoscaling policies need the data. Lazily evaluate them to save controller CPU