@kouroshHakha commented Oct 30, 2025

Summary

This PR adds data parallel attention support to the public API, creates comprehensive documentation, and sets up CI testing infrastructure for multi-GPU examples.

Changes

Public API

  • Made DPRankAssigner private (_DPRankAssigner)
  • Added build_dp_deployment() and build_dp_openai_app() to ray.serve.llm public API
  • Exposed DPServer as public API in ray.serve.llm.deployment

Documentation

  • Created new user guide: data-parallel-attention.md covering DP patterns, configuration, and usage
  • Added code examples with CI-tested snippets:
    • dp_basic_example.py - Basic DP deployment with OpenAI ingress
    • dp_pd_example.py - DP + Prefill-decode disaggregation

CI Infrastructure

  • Created multi_gpu/ folder for examples requiring 4+ GPUs
  • Split GPU tests: standard tests on g6-large, multi-GPU tests on gpu-large (4 GPUs)
  • Updated BUILD.bazel to handle multi_gpu tests separately with multi_gpu_4 tag
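
The split described above can be pictured as a tag filter: targets carrying the `multi_gpu_4` tag go to the 4-GPU step and everything else stays in the standard GPU step. The sketch below is only an illustration in plain Python; the target labels and the `split_by_gpu_tag` helper are hypothetical and do not reproduce the actual `BUILD.bazel` or CI logic.

```python
from typing import Dict, List, Set, Tuple

# Tag name taken from the PR description.
MULTI_GPU_TAG = "multi_gpu_4"


def split_by_gpu_tag(targets: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Partition test targets into (standard, multi_gpu) lists by tag."""
    standard: List[str] = []
    multi_gpu: List[str] = []
    for name in sorted(targets):
        if MULTI_GPU_TAG in targets[name]:
            multi_gpu.append(name)
        else:
            standard.append(name)
    return standard, multi_gpu


# Hypothetical target labels, used only for illustration.
example_targets = {
    "//doc:example_single_gpu": {"gpu"},
    "//doc:dp_basic_example": {"gpu", "multi_gpu_4"},
    "//doc:dp_pd_example": {"gpu", "multi_gpu_4"},
}

standard, multi_gpu = split_by_gpu_tag(example_targets)
```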

Testing

  • All code examples are CI-tested and included via literalinclude directives
  • Multi-GPU examples run in separate CI step with proper GPU allocation

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
@kouroshHakha changed the title from "wip" to "[serve][llm] Data Parallel Attention: Public API and Documentation" Oct 30, 2025
@kouroshHakha added the go label (add ONLY when ready to merge, run all tests) Oct 30, 2025
@kouroshHakha marked this pull request as ready for review October 30, 2025 21:22
@kouroshHakha requested review from a team as code owners October 30, 2025 21:22
cursor[bot]

This comment was marked as outdated.

@ray-gardener (bot) added the serve (Ray Serve Related Issue), docs (An issue or change related to documentation), and llm labels Oct 31, 2025
@richardliaw merged commit 2691094 into ray-project:master Nov 8, 2025
6 checks passed
- gpu
instance_type: gpu-large
commands:
- RAYCI_DISABLE_TEST_DB=1 bazel run //ci/ray_ci:test_in_docker -- //doc/... llm

Should we add `//python/ray/llm/...` so that potential multi_gpu_4 targets added there will be picked up too?

Deploy with:
```bash
serve deploy dp_config.yaml
```

Suggested change: replace `serve deploy dp_config.yaml` with `serve run dp_config.yaml`.

Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
ykdojo pushed a commit to ykdojo/ray that referenced this pull request Nov 27, 2025
Labels

docs (An issue or change related to documentation), go (add ONLY when ready to merge, run all tests), llm, serve (Ray Serve Related Issue)

3 participants