-
Notifications
You must be signed in to change notification settings - Fork 7k
[core][stats-die/02] kill STATS in object manager component #57974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
f3e327b to
cccbc85
Compare
cccbc85 to
0459b69
Compare
Signed-off-by: Cuong Nguyen <[email protected]>
0459b69 to
04198f7
Compare
jjyao
approved these changes
Oct 22, 2025
landscapepainter
pushed a commit
to landscapepainter/ray
that referenced
this pull request
Nov 17, 2025
…ect#57974) This PR replace STATS with Metric as a way to define metric inside ray (as a unification effort) in all object-manager components. Normally, metrics are defined at the top-level component and passed down to sub-components. However, in this case, because object manager is used as an API across, doing so would feel unnecessarily cumbersome. I decided to define the metrics inline within each client and server class instead. Note that the metric classes (Metric, Gauge, Sum, etc.) are simply wrappers around static OpenCensus/OpenTelemetry entities. **Details** Full context of this refactoring work. - Each component (e.g., gcs, raylet, core_worker, etc.) now has a metrics.h file located in its top-level directory. This file defines all metrics for that component. - In most cases, metrics are defined once in the main entry point of each component (gcs/gcs_server_main.cc for GCS, raylet/main.cc for Raylet, core_worker/core_worker_process.cc for the Core Worker, etc.). These metrics are then passed down to subcomponents via the ray::observability::MetricInterface. - This approach significantly reduces rebuild time when metric infrastructure changes. Previously, a change would trigger a full Ray rebuild; now, only the top-level entry points of each component need rebuilding. - There are a few exceptions where metrics are tracked inside object libraries (e.g., task_specification). In these cases, metrics are defined within the library itself, since there is no corresponding top-level entry point. - Finally, the obsolete metric_defs.h and metric_defs.cc files can now be completely removed. This paves the way for further dead code cleanup in a future PR. Test: - CI Signed-off-by: Cuong Nguyen <[email protected]>
Aydin-ab
pushed a commit
to Aydin-ab/ray-aydin
that referenced
this pull request
Nov 19, 2025
…ect#57974) This PR replace STATS with Metric as a way to define metric inside ray (as a unification effort) in all object-manager components. Normally, metrics are defined at the top-level component and passed down to sub-components. However, in this case, because object manager is used as an API across, doing so would feel unnecessarily cumbersome. I decided to define the metrics inline within each client and server class instead. Note that the metric classes (Metric, Gauge, Sum, etc.) are simply wrappers around static OpenCensus/OpenTelemetry entities. **Details** Full context of this refactoring work. - Each component (e.g., gcs, raylet, core_worker, etc.) now has a metrics.h file located in its top-level directory. This file defines all metrics for that component. - In most cases, metrics are defined once in the main entry point of each component (gcs/gcs_server_main.cc for GCS, raylet/main.cc for Raylet, core_worker/core_worker_process.cc for the Core Worker, etc.). These metrics are then passed down to subcomponents via the ray::observability::MetricInterface. - This approach significantly reduces rebuild time when metric infrastructure changes. Previously, a change would trigger a full Ray rebuild; now, only the top-level entry points of each component need rebuilding. - There are a few exceptions where metrics are tracked inside object libraries (e.g., task_specification). In these cases, metrics are defined within the library itself, since there is no corresponding top-level entry point. - Finally, the obsolete metric_defs.h and metric_defs.cc files can now be completely removed. This paves the way for further dead code cleanup in a future PR. Test: - CI Signed-off-by: Cuong Nguyen <[email protected]> Signed-off-by: Aydin Abiar <[email protected]>
Future-Outlier
pushed a commit
to Future-Outlier/ray
that referenced
this pull request
Dec 7, 2025
…ect#57974) This PR replace STATS with Metric as a way to define metric inside ray (as a unification effort) in all object-manager components. Normally, metrics are defined at the top-level component and passed down to sub-components. However, in this case, because object manager is used as an API across, doing so would feel unnecessarily cumbersome. I decided to define the metrics inline within each client and server class instead. Note that the metric classes (Metric, Gauge, Sum, etc.) are simply wrappers around static OpenCensus/OpenTelemetry entities. **Details** Full context of this refactoring work. - Each component (e.g., gcs, raylet, core_worker, etc.) now has a metrics.h file located in its top-level directory. This file defines all metrics for that component. - In most cases, metrics are defined once in the main entry point of each component (gcs/gcs_server_main.cc for GCS, raylet/main.cc for Raylet, core_worker/core_worker_process.cc for the Core Worker, etc.). These metrics are then passed down to subcomponents via the ray::observability::MetricInterface. - This approach significantly reduces rebuild time when metric infrastructure changes. Previously, a change would trigger a full Ray rebuild; now, only the top-level entry points of each component need rebuilding. - There are a few exceptions where metrics are tracked inside object libraries (e.g., task_specification). In these cases, metrics are defined within the library itself, since there is no corresponding top-level entry point. - Finally, the obsolete metric_defs.h and metric_defs.cc files can now be completely removed. This paves the way for further dead code cleanup in a future PR. Test: - CI Signed-off-by: Cuong Nguyen <[email protected]> Signed-off-by: Future-Outlier <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR replace STATS with Metric as a way to define metric inside ray (as a unification effort) in all object-manager components. Normally, metrics are defined at the top-level component and passed down to sub-components. However, in this case, because object manager is used as an API across, doing so would feel unnecessarily cumbersome. I decided to define the metrics inline within each client and server class instead.
Note that the metric classes (Metric, Gauge, Sum, etc.) are simply wrappers around static OpenCensus/OpenTelemetry entities.
Details
Full context of this refactoring work.
Test: