@ryanaoleary
Contributor

@ryanaoleary ryanaoleary commented Oct 4, 2025

Why are these changes needed?

With #55207, Ray Train now supports training functions with a JAX backend through the new `JaxTrainer` API. This guide gives a short overview of the API, explains how to configure it for TPUs, and shows how to adapt a JAX script to use Ray Train.

TODO: I will link a longer end-to-end guide covering KubeRay, MaxText, and the `JaxTrainer` on TPUs in GKE.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Ryan O'Leary <[email protected]>

Finish jax guide

Signed-off-by: Ryan O'Leary <[email protected]>
@ryanaoleary ryanaoleary requested review from a team as code owners October 4, 2025 02:03
@ryanaoleary
Contributor Author

cc: @liulehui @matthewdeng @chiayi

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a new documentation page for JaxTrainer, which is a great addition. The guide provides a good overview of the API and how to use it with TPUs. I've made a few suggestions to improve the document's correctness and consistency, mainly around fixing some formatting issues, correcting a version number, and aligning the code examples with the recommended public APIs. Addressing these points will make the guide clearer and more accurate for users.

Signed-off-by: Ryan O'Leary <[email protected]>
cursor[bot]

This comment was marked as outdated.

@ray-gardener ray-gardener bot added the docs, train, and community-contribution labels Oct 4, 2025
Contributor

@liulehui liulehui left a comment

Thank you so much!!! 🚀

ryanaoleary and others added 2 commits October 9, 2025 13:38
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Ryan O'Leary <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
@ryanaoleary ryanaoleary requested a review from liulehui October 9, 2025 21:36
Signed-off-by: Ryan O'Leary <[email protected]>
cursor[bot]

This comment was marked as outdated.


JaxTrainer API
--------------
The `JaxTrainer` is the core component for orchestrating distributed JAX training in Ray Train with TPUs.
Contributor

We should add these to the API references so we can link them here.

Contributor Author

I added it in f634b93. I only linked it in two places, though, rather than every time I used JaxTrainer, since that seemed excessive.

Co-authored-by: matthewdeng <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
ryanaoleary and others added 2 commits October 16, 2025 04:48
Co-authored-by: matthewdeng <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Contributor

@matthewdeng matthewdeng left a comment

Nice!

@matthewdeng matthewdeng enabled auto-merge (squash) October 21, 2025 05:08
@github-actions github-actions bot added the go label Oct 21, 2025
@github-actions github-actions bot disabled auto-merge October 21, 2025 16:59
@matthewdeng matthewdeng merged commit eead991 into ray-project:master Oct 21, 2025
6 checks passed
elliot-barn pushed a commit that referenced this pull request Oct 23, 2025
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025