@abrarsheikh
Contributor

Building the autoscaling context requires expensive function evaluations, but not all autoscaling policies need the resulting data. Evaluate those values lazily to save controller CPU.
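A minimal sketch of the lazy-evaluation idea described above, not the PR's actual implementation. The class `LazyAutoscalingContext`, the attribute `aggregated_metrics`, and the function `expensive_metrics` are hypothetical names chosen for illustration. The context stores a callable instead of a precomputed value and runs it only when a policy first reads the attribute.

```python
from functools import cached_property
from typing import Callable, Dict, List


class LazyAutoscalingContext:
    """Hypothetical stand-in for an autoscaling context (not Ray Serve's class)."""

    def __init__(self, metrics_fn: Callable[[], Dict[str, float]]) -> None:
        # Store the callable, not its result; nothing is computed here.
        self._metrics_fn = metrics_fn

    @cached_property
    def aggregated_metrics(self) -> Dict[str, float]:
        # Runs the callable on first access only; later reads reuse the result.
        return self._metrics_fn()


calls: List[int] = []


def expensive_metrics() -> Dict[str, float]:
    # Records each invocation so the number of evaluations can be checked.
    calls.append(1)
    return {"replica-1": 3.0, "replica-2": 5.0}


ctx = LazyAutoscalingContext(expensive_metrics)
calls_before_access = len(calls)  # no evaluation has happened yet

total = sum(ctx.aggregated_metrics.values())
total_again = sum(ctx.aggregated_metrics.values())  # served from the cache
```

A policy that never reads `aggregated_metrics` never pays for `expensive_metrics`, which is the CPU saving the description refers to. Caching with `cached_property` is one possible design; the PR may use a different mechanism.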

@abrarsheikh abrarsheikh requested a review from a team as a code owner November 25, 2025 06:13
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Nov 25, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors AutoscalingContext to support lazy evaluation of expensive metrics, which is a great optimization for the controller's CPU usage. The implementation is mostly correct, but I've found a critical bug and some inconsistencies in type hints that should be addressed.

Specifically:

  • There's a TypeError in the total_running_requests property due to calling properties as methods.
  • The type hints for some __init__ parameters and property setters in AutoscalingContext are inconsistent with their actual usage, which can lead to confusion and issues with static analysis tools.

I've left specific comments with suggestions to fix these issues. Once they are addressed, this PR should be good to go.
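The first issue in the review summary, calling a property as if it were a method, can be reproduced in isolation. The sketch below uses a hypothetical `Counts` class; only the property names are taken from the review, and the real `AutoscalingContext` is not shown here.

```python
class Counts:
    """Hypothetical class used only to reproduce the error pattern."""

    def __init__(self, total: float, queued: float) -> None:
        self._total = total
        self._queued = queued

    @property
    def total_num_requests(self) -> float:
        return self._total

    @property
    def total_queued_requests(self) -> float:
        return self._queued


counts = Counts(10.0, 4.0)

# Correct: properties are read like plain attributes.
running = counts.total_num_requests - counts.total_queued_requests

# Incorrect: the property access yields a float, and calling a float fails.
try:
    counts.total_num_requests()
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
```

The property access itself succeeds and returns a float; the `TypeError` comes from the trailing parentheses, which try to call that float.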

Signed-off-by: abrar <[email protected]>
def total_running_requests(self) -> float:
    # NOTE: for non-additive aggregation functions, total_running_requests is
    # not accurate; consider it an approximation.
    return self.total_num_requests - self.total_queued_requests
Bug: Null handling missing in total_running_requests property

The total_running_requests property computes total_num_requests - total_queued_requests, but total_queued_requests can be None (as indicated by the Optional type in __init__ at line 65). When total_queued_requests is None, accessing total_running_requests will raise a TypeError attempting to subtract None from a float. The old dataclass implementation explicitly allowed None for total_running_requests, but the new computed property doesn't handle this case.
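One way to address this comment is to propagate `None` from the property, sketched below. `RequestCounts` is a hypothetical simplification; only the attribute names come from the comment above, and the fix that was actually merged is not shown in this conversation.

```python
from typing import Optional


class RequestCounts:
    """Hypothetical simplification; attribute names follow the review comment."""

    def __init__(
        self,
        total_num_requests: float,
        total_queued_requests: Optional[float] = None,
    ) -> None:
        self.total_num_requests = total_num_requests
        self.total_queued_requests = total_queued_requests

    @property
    def total_running_requests(self) -> Optional[float]:
        # Without queue data the running count is unknown, so return None
        # instead of attempting float - None, which raises TypeError.
        if self.total_queued_requests is None:
            return None
        return self.total_num_requests - self.total_queued_requests


unknown = RequestCounts(10.0).total_running_requests
known = RequestCounts(10.0, 4.0).total_running_requests
```

Returning `None` keeps the behavior the comment attributes to the old dataclass, where `total_running_requests` could be `None`. Treating a missing queue count as zero would be an alternative, but it changes what callers observe, so that choice is left open here.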

@ray-gardener ray-gardener bot added the serve Ray Serve Related Issue label Nov 25, 2025
Signed-off-by: abrar <[email protected]>
Signed-off-by: abrar <[email protected]>
@abrarsheikh abrarsheikh merged commit ec792a4 into master Nov 26, 2025
6 checks passed
@abrarsheikh abrarsheikh deleted the SERVE-1447-abrar-controller_3 branch November 26, 2025 22:51
SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025
Building the autoscaling context requires expensive function evaluations,
but not all autoscaling policies need the resulting data. Evaluate those
values lazily to save controller CPU.

---------

Signed-off-by: abrar <[email protected]>