🐛 fix tensor parallel #301

Merged
prashantgupta24 merged 4 commits into main from fix-tp
Jul 11, 2025

Conversation

@joerunde
Collaborator

@joerunde joerunde commented Jul 10, 2025

Description

Fixes a bug introduced in #283 where the non-driver workers did not cache the output tokens for the next decode iteration.

This also allows tensor-parallel tests with TP=2 to run on CPU, so that we can catch these bugs in GitHub Actions (GHA) runs.
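
To illustrate the class of bug described above, here is a minimal toy sketch. It is not the plugin's actual code: `ToyWorker`, `run_decode`, and the arithmetic "model" are all hypothetical stand-ins. It assumes each tensor-parallel rank builds its next decode input from its own cached token history, so a rank that skips the caching step feeds stale input into the next iteration and drifts away from the driver.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Hypothetical stand-in for one tensor-parallel rank (not a real vLLM class)."""

    rank: int
    cache_on_all_ranks: bool
    cached_tokens: dict[str, list[int]] = field(default_factory=dict)

    @property
    def is_driver(self) -> bool:
        # Assumption for this sketch: rank 0 is the driver.
        return self.rank == 0

    def prefill(self, req_id: str, prompt: list[int]) -> None:
        # Every rank starts from the same prompt tokens.
        self.cached_tokens[req_id] = list(prompt)

    def decode_step(self, req_id: str) -> int:
        history = self.cached_tokens[req_id]
        # Deterministic toy "model": the next token depends on the full history.
        next_token = (sum(history) + len(history)) % 1000
        # Buggy behaviour: only the driver caches the sampled token.
        # Fixed behaviour: every rank caches it for the next decode iteration.
        if self.is_driver or self.cache_on_all_ranks:
            history.append(next_token)
        return next_token


def run_decode(num_steps: int, cache_on_all_ranks: bool, tp_size: int = 2):
    """Return, per decode step, the token each rank produced."""
    workers = [ToyWorker(rank, cache_on_all_ranks) for rank in range(tp_size)]
    for worker in workers:
        worker.prefill("req-0", [1, 2, 3])
    return [
        tuple(worker.decode_step("req-0") for worker in workers)
        for _ in range(num_steps)
    ]


if __name__ == "__main__":
    print("fixed:", run_decode(3, cache_on_all_ranks=True))
    print("buggy:", run_decode(3, cache_on_all_ranks=False))
```

In this sketch the ranks agree on the first step either way and only diverge from the second decode step onward, which is why a multi-step test with TP=2 is the kind of check that would expose the problem.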

Signed-off-by: Joe Runde <joe@joerun.de>
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

Signed-off-by: Joe Runde <joe@joerun.de>
@prashantgupta24
Collaborator

bot:test
MARKERS="spyre and not quantized and not multi and not embedding"

@prashantgupta24
Collaborator

bot:test
MARKERS="spyre and multi"

Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Collaborator

@prashantgupta24 prashantgupta24 left a comment

LGTM!

@prashantgupta24 prashantgupta24 enabled auto-merge (squash) July 11, 2025 17:19
@github-actions github-actions bot added the ready Runs the full CI test suite. Only add to PRs once ready to merge to limit public GHA usage label Jul 11, 2025
@prashantgupta24 prashantgupta24 disabled auto-merge July 11, 2025 17:30
@prashantgupta24
Collaborator

prashantgupta24 commented Jul 11, 2025

Actually, let me check whether continuous batching (CB) works with TP before merging. Edit: it doesn't seem to work.

@prashantgupta24 prashantgupta24 merged commit e9604ff into main Jul 11, 2025
15 of 19 checks passed
@prashantgupta24 prashantgupta24 deleted the fix-tp branch July 11, 2025 17:47
rafvasq pushed a commit to rafvasq/sendnn-inference that referenced this pull request Mar 11, 2026
Signed-off-by: waleedqk <waleedqk@ibm.com>
Labels

ready Runs the full CI test suite. Only add to PRs once ready to merge to limit public GHA usage

3 participants