
[pull] master from ray-project:master #140

Open

wants to merge 6,497 commits into base: master

Conversation

pull[bot]

@pull pull bot commented Jun 29, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

kevin85421 and others added 28 commits March 6, 2025 07:21
…51055)

It's not common to call `unique_ptr::release()` because it can easily
lead to memory leaks. However, `ray_syncer_test.cc` is a special case.

I tried changing `cli_reactor` to a `unique_ptr`, but then the tests fail
with a double free. I used ASAN to confirm:
 
1. `RayClientBidiReactor::OnDone` calls `delete this;`.
2. The `unique_ptr` goes out of scope, freeing the same object again.

<img width="1728" alt="image"
src="https://github.com/user-attachments/assets/5f807a16-2633-4576-b057-d34dd1aaa546"
/>

Signed-off-by: kaihsun <[email protected]>
The macro is never used in our codebase, so delete it.

Signed-off-by: dentiny <[email protected]>
TPU device logs for k8s containers that request `google.com/tpu`
resources are written to the `/tmp/tpu_logs` directory. This PR adds a
symlink to the `/tmp/tpu_logs` directory when the `TPU_WORKER_ID` env
var is set; TPU log files are then added to `monitor_log_paths`. The
logs are then viewable from the Ray Dashboard (see the sketch after the
screenshots below):

Create a file in /tmp/tpu_logs and view symlink:

![command-line-logging](https://github.com/user-attachments/assets/c50915ad-8382-4af7-a398-40d5a249e8c8)

The tpu_logs directory is added to the 'Logs' tab on a TPU Ray worker:

![tpu_logs_dir](https://github.com/user-attachments/assets/394133b0-be70-4b98-9e86-dcad50c1b4fd)

The log file we created is ingested/viewable:

![tpu-device-log-file](https://github.com/user-attachments/assets/c42ab96a-f88b-4959-adf2-8650fd75c773)
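
A minimal sketch of the symlink logic described above, assuming an illustrative helper name and session log path (not the actual Ray implementation):

```python
import os

def create_tpu_log_symlink(session_log_dir: str) -> None:
    """Link /tmp/tpu_logs into the session log dir so the dashboard ingests it."""
    # Only TPU workers set TPU_WORKER_ID, so do nothing on other nodes.
    if os.environ.get("TPU_WORKER_ID") is None:
        return
    tpu_log_dir = "/tmp/tpu_logs"
    link_path = os.path.join(session_log_dir, "tpu_logs")
    if os.path.isdir(tpu_log_dir) and not os.path.islink(link_path):
        os.symlink(tpu_log_dir, link_path)
```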

---------

Signed-off-by: Ryan O'Leary <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
#51113)

This reverts commit e4a448f.


## Why are these changes needed?


#47814 (comment)

UV doesn't seem to carry over the right environment markers in
`requirements_compiled.txt`. We can temporarily revert this until we
find a fix for the issue.

Signed-off-by: Kevin H. Luu <[email protected]>
Fix typos in comments and strings

Signed-off-by: co63oc <[email protected]>
…tutorials (#50240)

Updates docs to use correct normalization values in image datasets.

---------

Signed-off-by: Ricardo Decal <[email protected]>
…a HuggingFace `Dataset` (#50998)

## Why are these changes needed?

`override_num_blocks` is not supported when reading from a HuggingFace
`Dataset` object, i.e., in non-streaming mode. It is, however, supported
in streaming mode. The current error message is incorrect and mixes up
the wording of the two cases.

This is a tiny PR to improve the wording. 
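
A hedged sketch of the two modes, assuming the `datasets` library and the `override_num_blocks` argument on `ray.data.from_huggingface` as described above:

```python
import datasets
import ray

# Streaming mode yields an IterableDataset; override_num_blocks is supported.
stream_ds = datasets.load_dataset(
    "wikitext", "wikitext-2-v1", split="train", streaming=True
)
ray_ds = ray.data.from_huggingface(stream_ds, override_num_blocks=4)

# Non-streaming mode yields a materialized Dataset; passing
# override_num_blocks here raises the error whose message this PR fixes.
plain_ds = datasets.load_dataset("wikitext", "wikitext-2-v1", split="train")
ray_ds = ray.data.from_huggingface(plain_ds)
```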

Signed-off-by: sumanthrh <[email protected]>
Changes the ordering of libraries based on popularity (PyTorch first,
then XGBoost, then the rest).

---------

Signed-off-by: Ricardo Decal <[email protected]>
## Why are these changes needed?

The proxy currently counts HTTP redirects as HTTP errors, which are
emitted to metrics. 3xx responses shouldn't be errors. This PR excludes
3xx responses from the error count and updates the relevant test case.



Signed-off-by: akyang-anyscale <[email protected]>
Currently the V2 Autoscaler formats logs by converting the V2 data
structure `ClusterStatus` to the V1 structures `AutoscalerSummary` and
`LoadMetricsSummary` and then passing them to the legacy
`format_info_string`. It'd be useful for the V2 autoscaler to directly
format `ClusterStatus` to the correct output log format. This PR
refactors `utils.py` to directly format `ClusterStatus`. Additionally,
this PR changes the node reports to output `instance_id` rather than
`ip_address`, since the latter is not necessarily unique for failed
nodes.

## Related issue number

Closes #37856

---------

Signed-off-by: ryanaoleary <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
## Why are these changes needed?

Some docs were failing to index properly due to their extreme length.
I've hidden verbose cell outputs so that they index properly.


Signed-off-by: Ricardo Decal <[email protected]>
Double-checked locking is known to be buggy (checking for null and then
updating the pointer leads to a data race); `std::once_flag` is the
solution.

---------

Signed-off-by: dentiny <[email protected]>
Add various metrics that are captured in the progress bar but are not
captured in the emitted Prometheus metrics.
---------

Signed-off-by: Matthew Owen <[email protected]>
Explicitly tear down the compiled graph and kill the actors rather than relying on GC.

Also, previously in Compiled Graph, an actor was killed only when its actor task did not finish within the timeout. This PR fixes that by always killing the actor when `kill_actors=True` (see the sketch below).
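
A hedged sketch of explicit teardown, assuming the compiled-graph API and the `kill_actors` argument as described in this PR:

```python
import ray
from ray.dag import InputNode

@ray.remote
class EchoActor:
    def echo(self, x):
        return x

actor = EchoActor.remote()
with InputNode() as inp:
    dag = actor.echo.bind(inp)

compiled = dag.experimental_compile()
print(ray.get(compiled.execute(1)))
# Explicitly kill the actors instead of relying on GC to reclaim them.
compiled.teardown(kill_actors=True)
```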

Signed-off-by: Rui Qiao <[email protected]>
Fix the operator ID name format. The current format causes potential
collisions between operators in rare cases.

Example:

```
ds = ray.data.range(100, override_num_blocks=20).limit(11)
for i in range(11):
    ds = ds.limit(1)

ds._set_name("data_head_test")
ds.materialize()
```

You would expect 12 limit operators, but the dashboard only shows 11
because of ID collisions:

<img width="1821" alt="Screenshot 2025-03-05 at 5 34 31 PM"
src="https://github.com/user-attachments/assets/1cdb2a58-eb0c-4c10-bb91-a33f6fa5e946"
/>

Test:
- CI

Signed-off-by: can <[email protected]>
When an asyncio task creates another asyncio task, raising
`AsyncioActorExit` cannot make the caller exit because they are not the
same task. Therefore, this PR makes `exit_actor` request actor exit in
the core worker context, which the core worker checks regularly.
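
A hedged sketch of the scenario this fixes (actor and method names are illustrative):

```python
import asyncio
import ray

@ray.remote
class Worker:
    async def run(self):
        # exit_actor is invoked from a child asyncio task, not the task
        # executing this method; with this fix the exit request is recorded
        # in the core worker context and honored on its next check.
        asyncio.create_task(self.shutdown())
        await asyncio.sleep(10)

    async def shutdown(self):
        ray.actor.exit_actor()
```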

Closes: #49451

---------

Signed-off-by: Chi-Sheng Liu <[email protected]>
Co-authored-by: Edward Oakes <[email protected]>
`logging.warn` is legacy and deprecated; `logging.warning` is the supported spelling.
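
For reference, the one-line replacement:

```python
import logging

logging.warning("message")  # logging.warn is a deprecated alias of this
```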

Signed-off-by: Chi-Sheng Liu <[email protected]>
Resolves #51135

Error message pasted here:
```
[2025-03-06T19:49:02Z] >       assert file_info.filename == str(tpu_log_dir / tpu_device_log_file)
[2025-03-06T19:49:02Z] E       AssertionError: assert 'C:\\Users\\C...pu-device.log' == 'C:\\Users\\C...pu-device.log'
[2025-03-06T19:49:02Z] E         - C:\Users\ContainerAdministrator\AppData\Local\Temp\pytest-of-ContainerAdministrator\pytest-1\test_tpu_logs0\logs\tpu_logs\tpu-device.log
[2025-03-06T19:49:02Z] E         ?                                                                                                                 ^
[2025-03-06T19:49:02Z] E         + C:\Users\ContainerAdministrator\AppData\Local\Temp\pytest-of-ContainerAdministrator\pytest-1\test_tpu_logs0\logs/tpu_logs\tpu-device.log
[2025-03-06T19:49:02Z] E         ?                                     
```
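
The mismatch is a mixed `/` and `\` separator on Windows. A minimal sketch of a separator-agnostic comparison (illustrative, not necessarily the exact fix):

```python
from pathlib import Path

def assert_same_file(actual: str, expected: str) -> None:
    # Path normalizes the mixed "/" and "\" separators on Windows,
    # so the comparison no longer depends on how the path was joined.
    assert Path(actual) == Path(expected)
```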

Signed-off-by: dentiny <[email protected]>
A lot of changes touch coregpubuild, so the multi-GPU tests run more often than they need to. We're moving to running these multi-GPU tests manually.

---------

Signed-off-by: dayshah <[email protected]>
This docs code was not passing the multi-GPU CI step.

---------

Signed-off-by: dayshah <[email protected]>
Somehow a bad import snuck in...

## Related issue number

Closes #49634
Closes #49632
Closes #49638
Closes #49642

---------

Signed-off-by: Edward Oakes <[email protected]>
## Why are these changes needed?

- Make the `num_blocks` argument optional, so there is no need to set
`num_blocks=None` when using `target_num_rows_per_block`.

- Add a type hint for the `None` value.

- Fix formatting in [docs
page](https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.repartition.html)


![image](https://github.com/user-attachments/assets/bfe8a845-3c37-4be6-a2dc-ef78d56c80d4)
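
A hedged usage sketch, assuming the `target_num_rows_per_block` argument described above:

```python
import ray

ds = ray.data.range(1000)
# Before this change: ds.repartition(num_blocks=None, target_num_rows_per_block=100)
# Now num_blocks can simply be omitted:
ds = ds.repartition(target_num_rows_per_block=100)
```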

---------

Signed-off-by: Praveen Gorthy <[email protected]>
Signed-off-by: Praveen <[email protected]>
Co-authored-by: Hao Chen <[email protected]>
Co-authored-by: Alexey Kudinkin <[email protected]>
Update the document to cover the Python standard logging attributes in
log lines.

The PR also fixes all applicable errors/warnings in the doc.

Closes #49502

---------

Signed-off-by: Mengjin Yan <[email protected]>
Co-authored-by: Dhyey Shah <[email protected]>
* Use `psutil.process_iter` to replace `psutil.pids` (see the sketch below).
* Use `proc.info["name"]` instead of `proc.name()`.
* I'm not sure whether `proc.name()` uses the cache set by
`process_iter`, but I'm certain that using `info` is correct, since the
official docs consistently use `proc.info[...]` with `process_iter`.
* I asked a question on giampaolo/psutil#2518,
but I'm not sure how long it will take to get an answer from the
community. For now, I think we can merge this, and I'll update the
psutil usage if the maintainers have any suggestions.
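
A minimal sketch of the pattern, following the psutil docs (the process name here is illustrative):

```python
import psutil

# process_iter prefetches the requested attributes in one pass; each
# value is exposed via proc.info instead of a per-process syscall.
for proc in psutil.process_iter(attrs=["name"]):
    if proc.info["name"] == "raylet":
        print(proc.pid)
```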

---------

Signed-off-by: kaihsun <[email protected]>
kevin85421 and others added 30 commits March 20, 2025 17:27
I tried to reproduce the ASAN errors in `scheduling_queue_test.cc` for
#51516 by running:

```
bazel test --features=asan -c dbg //:scheduling_queue_test  --test_output=all
```

However, I got the following ODR error instead of the actual data race
error.

<img width="1728" alt="image"
src="https://github.com/user-attachments/assets/0d495f26-efaa-4586-a6ec-be1729b185da"
/>

It looks like we have two packages `@com_github_madler_zlib//:zlib` and
`@net_zlib_zlib//:zlib` in our C++ codebase.

```sh
bazel query --noimplicit_deps \
    'allpaths(//:scheduling_queue_test, @net_zlib_zlib//:zlib)'

# Output
//:core_worker_lib
//:scheduling_queue_test
//src/ray/util:pipe_logger
//src/ray/util:stream_redirection_utils
@boost//:iostreams
@net_zlib_zlib//:zlib
Loading: 7 packages loaded
```

My initial thought was to avoid having `scheduling_queue_test` use
`@net_zlib_zlib//:zlib`, so I tried separating CoreWorker into smaller
Bazel targets. However, I found that non-trivial and eventually gave up,
using the following command as a workaround.

```sh
bazel test --features=asan -c dbg //:scheduling_queue_test  --test_output=all --test_env=ASAN_OPTIONS="detect_odr_violation=0"
```

---------

Signed-off-by: kaihsun <[email protected]>
…lugins (#51565)

Fixes #51196

…shboard_[module_name].err (#51545)

Signed-off-by: Chi-Sheng Liu <[email protected]>
…e and supports websocket handler returns normal HTTP response (#51552)

Signed-off-by: Chi-Sheng Liu <[email protected]>
Created by release automation bot.

Update with commit 8ee3f00

Signed-off-by: Lonnie Liu <[email protected]>
Co-authored-by: Lonnie Liu <[email protected]>
This PR is stacked upon #51179,
to make the redirection stream unit-testable.
It is basically a no-op change: it extracts the redirection logic into a
separate file, and leaves the exit hook and global registry where they
are now.

---------

Signed-off-by: dentiny <[email protected]>
…nerated code and `custom_types.py` are inconsistent (#51568)

The validation fails when I update a `.proto` file, compile the Ray
codebase, and run a Ray program. However, the original error message
instructs me to generate the Protobuf code again, which I have already
done. Instead, I need to update `custom_types.py` to fix the issue.

The error message with this PR:

<img width="1132" alt="Screenshot 2025-03-20 at 2 49 18 PM"
src="https://github.com/user-attachments/assets/96ca439d-6e35-49c0-aabf-759ed618cd91"
/>

Signed-off-by: Kai-Hsun Chen <[email protected]>
```
REGRESSION 9.35%: client__get_calls (THROUGHPUT) regresses from 1094.7883444776185 to 992.4456902391204 in microbenchmark.json
REGRESSION 7.87%: tasks_per_second (THROUGHPUT) regresses from 399.43954902981744 to 367.9840802358416 in benchmarks/many_tasks.json
REGRESSION 6.60%: multi_client_put_gigabytes (THROUGHPUT) regresses from 43.246981615749526 to 40.39150444280067 in microbenchmark.json
REGRESSION 5.16%: client__tasks_and_put_batch (THROUGHPUT) regresses from 14341.529664523765 to 13601.436104861408 in microbenchmark.json
REGRESSION 5.03%: 1_1_actor_calls_concurrent (THROUGHPUT) regresses from 5402.532852540871 to 5130.570133178275 in microbenchmark.json
REGRESSION 4.83%: 1_1_actor_calls_async (THROUGHPUT) regresses from 8588.075503140139 to 8173.653446206568 in microbenchmark.json
REGRESSION 4.71%: single_client_tasks_and_get_batch (THROUGHPUT) regresses from 6.116479739439202 to 5.828378076935622 in microbenchmark.json
REGRESSION 4.06%: single_client_get_calls_Plasma_Store (THROUGHPUT) regresses from 10975.200393255369 to 10529.193272608605 in microbenchmark.json
REGRESSION 3.71%: client__tasks_and_get_batch (THROUGHPUT) regresses from 0.9551721070094008 to 0.9197513826205774 in microbenchmark.json
REGRESSION 3.25%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 2024.9514970549762 to 1959.1925407193576 in microbenchmark.json
REGRESSION 2.78%: single_client_put_gigabytes (THROUGHPUT) regresses from 18.30617444315663 to 17.79739662942353 in microbenchmark.json
REGRESSION 1.46%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1057.2932167754398 to 1041.8730021547178 in microbenchmark.json
REGRESSION 1.32%: 1_n_actor_calls_async (THROUGHPUT) regresses from 8168.440029557936 to 8060.698907411474 in microbenchmark.json
REGRESSION 1.19%: single_client_tasks_sync (THROUGHPUT) regresses from 981.51641421362 to 969.8384217890384 in microbenchmark.json
REGRESSION 0.89%: client__1_1_actor_calls_concurrent (THROUGHPUT) regresses from 1056.4662855748954 to 1047.1016344870811 in microbenchmark.json
REGRESSION 0.58%: actors_per_second (THROUGHPUT) regresses from 591.3775923644333 to 587.9457127979538 in benchmarks/many_actors.json
REGRESSION 0.56%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1434.2085547024217 to 1426.2018801386466 in microbenchmark.json
REGRESSION 116.92%: dashboard_p50_latency_ms (LATENCY) regresses from 32.123 to 69.681 in benchmarks/many_actors.json
REGRESSION 59.07%: dashboard_p99_latency_ms (LATENCY) regresses from 589.9 to 938.359 in benchmarks/many_tasks.json
REGRESSION 57.53%: dashboard_p95_latency_ms (LATENCY) regresses from 398.245 to 627.361 in benchmarks/many_tasks.json
REGRESSION 53.36%: dashboard_p50_latency_ms (LATENCY) regresses from 89.962 to 137.963 in benchmarks/many_tasks.json
REGRESSION 37.60%: dashboard_p99_latency_ms (LATENCY) regresses from 3067.405 to 4220.801 in benchmarks/many_actors.json
REGRESSION 12.91%: stage_0_time (LATENCY) regresses from 6.343268156051636 to 7.161974191665649 in stress_tests/stress_test_many_tasks.json
REGRESSION 10.77%: dashboard_p95_latency_ms (LATENCY) regresses from 2575.96 to 2853.454 in benchmarks/many_actors.json
REGRESSION 6.85%: dashboard_p99_latency_ms (LATENCY) regresses from 252.85 to 270.166 in benchmarks/many_pgs.json
REGRESSION 2.52%: 10000_get_time (LATENCY) regresses from 23.620077062999997 to 24.215384834000005 in scalability/single_node.json
REGRESSION 2.22%: avg_iteration_time (LATENCY) regresses from 1.1939783954620362 to 1.220467975139618 in stress_tests/stress_test_dead_actors.json
REGRESSION 1.80%: stage_3_time (LATENCY) regresses from 1829.902144908905 to 1862.925583600998 in stress_tests/stress_test_many_tasks.json
REGRESSION 1.73%: 1000000_queued_time (LATENCY) regresses from 191.976472028 to 195.30269835 in scalability/single_node.json
REGRESSION 1.56%: time_to_broadcast_1073741824_bytes_to_50_nodes (LATENCY) regresses from 17.602684142 to 17.87641767999999 in scalability/object_store.json
REGRESSION 1.12%: 10000_args_time (LATENCY) regresses from 18.656692702999997 to 18.865748501 in scalability/single_node.json
REGRESSION 0.60%: stage_2_avg_iteration_time (LATENCY) regresses from 39.46649179458618 to 39.70143375396729 in stress_tests/stress_test_many_tasks.json
REGRESSION 0.45%: 107374182400_large_object_time (LATENCY) regresses from 29.23165342300001 to 29.36276392100001 in scalability/single_node.json
```

Signed-off-by: Lonnie Liu <[email protected]>
Co-authored-by: Lonnie Liu <[email protected]>
They're owned by the core team.

---------

Signed-off-by: Edward Oakes <[email protected]>
## Why are these changes needed?

It's tricky for users to implement the `preprocess` function when
constructing a Processor, because they may not know what the input
dataset should look like (i.e., the expected schema). This PR proposes a
new API, `log_input_column_names()`, that logs the expected schema.
Example:

```python
import ray
from ray.data.llm import build_llm_processor, vLLMEngineProcessorConfig

processor_config = vLLMEngineProcessorConfig(...)
processor = build_llm_processor(...)
processor.log_input_column_names()
# The first stage of the processor is ChatTemplateStage.
# Required input columns:
#     messages: A list of messages in OpenAI chat format. See https://platform.openai.com/docs/api-reference/chat/create for details.

processor_config = vLLMEngineProcessorConfig(
    apply_chat_template=False,
    tokenize=False,
)
processor = build_llm_processor(...)
processor.log_input_column_names()
# The first stage of the processor is vLLMEngineStage.
# Required input columns:
#    prompt: The text prompt (str).
#    sampling_params: The sampling parameters. See https://docs.vllm.ai/en/latest/api/inference_params.html#sampling-parameters for details.
# Optional input columns:
#    tokenized_prompt: The tokenized prompt. If provided, the prompt will not be tokenized by the vLLM engine.
#    images: The images to generate text from. If provided, the prompt will be a multimodal prompt.
#    model: The model to use for this request. If the model is different from the model set in the stage, then this is a LoRA request.
```


---------

Signed-off-by: Cody Yu <[email protected]>
## Why are these changes needed?

1. Adding more ops to `BlockColumnAccessor`
2. Fixing circular imports in Ray Data
3. Fixing `AggregateFnV2` to be a proper ABC
4. Simplifying the `accumulate_block` op

---------

Signed-off-by: Alexey Kudinkin <[email protected]>
)


## Why are these changes needed?

Fixes #51195

Signed-off-by: liuxsh9 <[email protected]>
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Co-authored-by: Kourosh Hakhamaneshi <[email protected]>
…51563)

## Why are these changes needed?

`use_legacy_format` has been deprecated since Arrow 15.0.0 and [has been
deleted from the
repo](https://github.com/apache/arrow/pull/45742/files).

Given that it defaults to `use_legacy_format=False`, this removes it
from the Ray repo completely.
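
For reference, a minimal Parquet write without the removed flag:

```python
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"x": [1, 2, 3]})
# No use_legacy_format argument: modern Arrow always writes the current format.
pq.write_table(table, "example.parquet")
```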
---------

Signed-off-by: Alexey Kudinkin <[email protected]>

## Why are these changes needed?

Add docs related to generation config.

## Related issue number

Closes
https://anyscale1.atlassian.net/browse/LLM-1786?atlOrigin=eyJpIjoiZDg2MWMxNmU0YTY2NDRhMGJiN2JmNDk0NmNjYjE3OWIiLCJwIjoiaiJ9


---------

Signed-off-by: Gene Su <[email protected]>
The compiled graphs quickstart was misusing `testcode`, which cannot be
composed with `literalinclude`. The included code is already tested, so
there is no concern about missed coverage.

I've also split off the `core` and `tune` doctests into builds tagged
with the appropriate team.

---------

Signed-off-by: Edward Oakes <[email protected]>

## Why are these changes needed?

The Locust request duration is in milliseconds.


Signed-off-by: akyang-anyscale <[email protected]>

## Why are these changes needed?

The request for the image in the resnet50 application could hang
indefinitely. This could block the event loop and make tests flaky. This
PR adds a 5s timeout to the `get` call (see the sketch after the log
excerpt below).

The replica is stuck for some reason; as a result, requests are not
making progress and the client disconnects due to the timeout:
```
2025-03-20, 0:02:18.331 | replica | d4c5cca8-c654-4d45-b212-ae0061194a86 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 60001.2ms
-- | -- | -- | -- | -- | --
I | 2025-03-20, 0:02:19.247 | replica | d486af45-d50e-468f-903f-d4bf723ea143 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.3ms
I | 2025-03-20, 0:02:19.785 | replica | 173f5209-2324-43f4-a59d-d4e5ee974462 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.1ms
I | 2025-03-20, 0:02:22.534 | proxy | edeba18e-7af1-4881-a584-7f528f6d9ed2 | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request edeba18e-7af1-4881-a584-7f528f6d9ed2.
I | 2025-03-20, 0:03:16.520 | replica | 100a8f5d-b8be-4263-ada3-518419eb6673 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59995.2ms
I | 2025-03-20, 0:03:17.747 | replica | d433ecfa-a55b-4a56-bbad-0de5c6f1f6d9 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.2ms
I | 2025-03-20, 0:03:19.081 | replica | ed2a9925-5e9b-474c-a644-ff701c8c7899 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.9ms
I | 2025-03-20, 0:03:19.273 | replica | f67b7e3a-d33c-43e6-88a8-415c9aa6be69 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.4ms
I | 2025-03-20, 0:03:20.918 | replica | 696ca5d9-f9b4-4899-a813-c2640a24c1ff | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.3ms
I | 2025-03-20, 0:03:22.772 | proxy | 5dafad1a-455e-46ab-b584-686fafb7f420 | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request 5dafad1a-455e-46ab-b584-686fafb7f420.
I | 2025-03-20, 0:04:16.590 | replica | 54e4994f-efaa-413e-b45b-11080b60573f | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59996.8ms
I | 2025-03-20, 0:04:18.386 | replica | cc74890f-9369-4664-9dd7-4bbadc979245 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.7ms
I | 2025-03-20, 0:04:20.177 | proxy | 3d09d80c-9180-47cb-b8d2-16e54420447a | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request 3d09d80c-9180-47cb-b8d2-16e54420447a.
I | 2025-03-20, 0:04:20.922 | replica | 769f085e-17d2-4fec-ab78-9051cb2ae1ee | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.1ms
I | 2025-03-20, 0:04:21.137 | replica | a1515707-2daf-4c85-9894-9303351ac64f | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59997.8ms
I | 2025-03-20, 0:04:21.486 | replica | 9db28a80-a45a-4a0a-a209-ccd8f7d4a559 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59998.7ms
I | 2025-03-20, 0:04:29.288 | proxy | 58cb1d1e-fb68-4367-8b71-a704c140f8ab | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request 58cb1d1e-fb68-4367-8b71-a704c140f8ab.
I | 2025-03-20, 0:05:16.701 | replica | 868d7cd3-e205-4a8b-9747-7a36a362be0c | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59997.5ms
I | 2025-03-20, 0:05:20.062 | replica | 413193a9-1162-4dfd-a4bd-714cc3973cfe | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.1ms
I | 2025-03-20, 0:05:21.596 | replica | 7a5109e9-d975-4319-b8b1-412b6b90b6c7 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 60001.3ms
I | 2025-03-20, 0:05:21.946 | replica | 2dd3a7d2-0988-46db-abed-088241b1f065 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.3ms
I | 2025-03-20, 0:05:22.908 | proxy | d95da992-9e5e-40f6-8746-afcf8f3cbc34 | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request d95da992-9e5e-40f6-8746-afcf8f3cbc34.
I | 2025-03-20, 0:05:23.046 | replica | 65460812-87b2-4410-8360-c82521233400 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59997.2ms
I | 2025-03-20, 0:05:24.479 | proxy | a8bc058a-04f6-41c8-9c66-49956f32dad0 | ip-10-0-41-135 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request a8bc058a-04f6-41c8-9c66-49956f32dad0.
I | 2025-03-20, 0:06:16.765 | replica | b34ad934-8662-4e4c-a109-9cb44c636613 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 60000.9ms
I | 2025-03-20, 0:06:17.227 | proxy | a571adce-3fe6-4799-a72c-d2ad11594edc | ip-10-0-41-135 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request a571adce-3fe6-4799-a72c-d2ad11594edc.
I | 2025-03-20, 0:06:20.594 | replica | 3175da08-d8ed-4928-ba62-1a9dc774b690 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.2ms
I | 2025-03-20, 0:06:21.329 | proxy | 2b6cb246-0ff3-4a03-a129-4e5c69e99d6e | ip-10-0-43-65 |   | Replica(id='6v27b6by', deployment='Model', app='default') rejected request because it is at max capacity of 5 ongoing requests. Retrying request 2b6cb246-0ff3-4a03-a129-4e5c69e99d6e.
I | 2025-03-20, 0:06:22.382 | replica | b8adf8db-a21a-490f-98aa-ce4e2eb347fd | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 60000.3ms
I | 2025-03-20, 0:06:22.476 | replica | 03b6fe72-d630-479a-8072-0e1665192db9 | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 60000.7ms
I | 2025-03-20, 0:06:24.476 | replica | e260d657-0975-45dd-8503-98a53d8f201e | ip-10-0-41-135 | 6v27b6by | GET / CANCELLED 59999.4ms
```
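
A minimal sketch of the change (the URL is illustrative; the app fetches its own test image):

```python
import requests

image_url = "https://example.com/dog.jpg"  # illustrative placeholder
# Fail fast instead of hanging indefinitely and blocking the event loop.
resp = requests.get(image_url, timeout=5)
```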


---------

Signed-off-by: akyang-anyscale <[email protected]>
…s to pypi (#51517)

- Add a helper function that inserts a build tag (e.g. `-1`) right after
the Ray version in the wheel name, for cases where the original wheels
uploaded to PyPI/test PyPI are corrupted (see the sketch below).
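
A hedged sketch of such a helper; the function name and exact behavior are assumed from the description above, not taken from the actual script:

```python
def add_build_tag(wheel_name: str, build_tag: str = "1") -> str:
    """ray-2.44.0-cp39-...whl -> ray-2.44.0-1-cp39-...whl"""
    # Wheel filenames follow {dist}-{version}[-{build}]-{python}-{abi}-{platform}.whl,
    # so the build tag slots in right after the version component.
    distribution, version, rest = wheel_name.split("-", 2)
    return f"{distribution}-{version}-{build_tag}-{rest}"

print(add_build_tag("ray-2.44.0-cp39-cp39-manylinux2014_x86_64.whl"))
# ray-2.44.0-1-cp39-cp39-manylinux2014_x86_64.whl
```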

---------

Signed-off-by: kevin <[email protected]>
The correct destination is stderr, not stdout.

- We've documented the stream-to-stderr behavior here:
https://github.com/ray-project/ray/blob/a42e6580a59dff3291a56595a74ff27c04d9e29d/python/ray/_private/services.py#L1142-L1144
- A stream handler is used when no logging filename is specified, and it
streams to stderr by default (see the quick check below):
https://docs.python.org/3/library/logging.handlers.html#logging.StreamHandler
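
A quick check of that default:

```python
import logging
import sys

handler = logging.StreamHandler()    # no stream argument given
assert handler.stream is sys.stderr  # defaults to stderr, not stdout
```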

Signed-off-by: dentiny <[email protected]>
## Why are these changes needed?
Add TorchDataLoader to the Train benchmark.

---------

Signed-off-by: Srinath Krishnamachari <[email protected]>
Just a missing comma and equals sign.

Signed-off-by: Jonathan Dumaine <[email protected]>