Fix VLLM data-parallel #541
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
# note: this has changed on 0.3.3, and it only works now if num_gpus are set.
# but then tensor_parallel breaks
@ray.remote
# Hynek: With the newest vllm, it actually breaks when tensor_parallel_size == 1 and num_gpus not set,
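(For context, a minimal sketch of the interaction these comments describe, assuming the usual pattern of wrapping a vLLM engine in a Ray actor. `ModelWorker` and the wiring are illustrative, not this PR's code.)

```python
import ray
from vllm import LLM

# Without num_gpus here, Ray schedules the actor with no GPU reservation,
# which is the failure mode described in the comment above.
@ray.remote(num_gpus=1)
class ModelWorker:  # hypothetical name
    def __init__(self, model_name: str):
        # tensor_parallel_size=1: the actor itself owns exactly one GPU.
        self.llm = LLM(model=model_name, tensor_parallel_size=1)

    def generate(self, prompts):
        return self.llm.generate(prompts)
```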
Do we need to pin the vllm version?
Ah yes, it should be lower-bounded at v0.7.0.
}
if int(config.data_parallel_size) > 1:
-    self.model_args["worker_use_ray"] = True
+    self.model_args["distributed_executor_backend"] = "ray"
Will this work with both versions of vllm?
It will break for versions less than v0.7.0. A solution would be to have an if/else statement that checks the vllm version and falls back to the old logic (something like the sketch below). Personally I would just lower-bound vllm, since it's going to be a pain to keep track of all the changes over time.
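(A minimal sketch of that if/else fallback, assuming the `packaging` library is available; `ray_backend_args` is a hypothetical helper, not code from this PR.)

```python
from importlib.metadata import version
from packaging.version import Version

def ray_backend_args() -> dict:
    """Pick the Ray backend flag based on the installed vllm version."""
    if Version(version("vllm")) >= Version("0.7.0"):
        return {"distributed_executor_backend": "ray"}  # newer flag
    return {"worker_use_ray": True}  # legacy flag for vllm < 0.7.0
```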
Would work for me
Great fix, LGTM!
Don't merge before fixing the vllm import ^^
Should be good now :)
* make bleurt lazy
* make tokenizer lazy too
* fix_ray
* fix tensor_parallel > 1
* remove debug statements
* bump vllm

Co-authored-by: Hynek Kydlicek <[email protected]>
Fixes #532