chore: add support for v0.18.0 #857
|
👋 Hi! Thank you for contributing to vLLM support on Spyre. We also recommend installing prek and configuring it to check your code before every local commit. |
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
6fb6d7e to 98e417b
Co-authored-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com> Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
77179e2 to 5e3cabf
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
There is now a `user_specified_block_size` variable in CacheConfig, probably introduced to determine whether the user changed the block size. In vllm-spyre's platform.py we're not technically the user, but only 64 is valid on Spyre anyway, and for some reason setting block_size directly no longer works because it's overwritten with the default of 16. Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
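The override described in this commit message can be illustrated with a minimal sketch. `CacheConfigStub`, `apply_default_block_size`, and `pin_spyre_block_size` are hypothetical stand-ins, not vLLM or vllm-spyre code. The only assumption is that a default of 16 replaces the block size unless `user_specified_block_size` is set.

```python
from dataclasses import dataclass

SPYRE_BLOCK_SIZE = 64    # the only valid block size on Spyre, per the commit message
DEFAULT_BLOCK_SIZE = 16  # the default that was overwriting the platform's value


@dataclass
class CacheConfigStub:
    """Hypothetical stand-in for CacheConfig with only the two relevant fields."""
    block_size: int = DEFAULT_BLOCK_SIZE
    user_specified_block_size: bool = False


def apply_default_block_size(cfg: CacheConfigStub) -> CacheConfigStub:
    """Assumed upstream behavior: reset block_size unless it is flagged as user-specified."""
    if not cfg.user_specified_block_size:
        cfg.block_size = DEFAULT_BLOCK_SIZE
    return cfg


def pin_spyre_block_size(cfg: CacheConfigStub) -> CacheConfigStub:
    """Set the Spyre block size and mark it as user-specified so it survives."""
    cfg.block_size = SPYRE_BLOCK_SIZE
    cfg.user_specified_block_size = True
    return cfg


# Setting block_size alone is lost; setting the flag as well keeps it.
unpinned = apply_default_block_size(CacheConfigStub(block_size=SPYRE_BLOCK_SIZE))
pinned = apply_default_block_size(pin_spyre_block_size(CacheConfigStub()))
print(unpinned.block_size, pinned.block_size)  # prints "16 64"
```

In this sketch the platform code acts as if it were the user, which matches the commit's reasoning that only one block size is valid on Spyre.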
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
|
bot:test |
|
bot:test |
In 2.10, empty_cache() is available but throws an error because in vllm-spyre we don't allocate any accelerator. Since in spyre-next we will do so, I think it's better to add a check before calling empty_cache() instead of just replacing the whole thing with a no-op. Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
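A guard of the kind this commit message describes might look like the sketch below. The function name `empty_cache_if_allocated` and both callables are hypothetical and chosen for illustration; the actual check in the PR and the real `empty_cache()` call are not shown in this conversation.

```python
from typing import Callable


def empty_cache_if_allocated(
    accelerator_allocated: Callable[[], bool],
    empty_cache: Callable[[], None],
) -> bool:
    """Call empty_cache() only when an accelerator was allocated.

    Returns True if the cache was emptied, False if the call was skipped.
    """
    if not accelerator_allocated():
        return False
    empty_cache()
    return True


def failing_empty_cache() -> None:
    # Mimics the error described above when no accelerator is allocated.
    raise RuntimeError("no accelerator allocated")


calls = []
# No accelerator: the failing call is never reached.
skipped = empty_cache_if_allocated(lambda: False, failing_empty_cache)
# Accelerator present: the cache is emptied as usual.
emptied = empty_cache_if_allocated(lambda: True, lambda: calls.append("emptied"))
```

Compared with replacing the call by a no-op, the check keeps the real cache cleanup in place for setups that do allocate an accelerator.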
|
bot:test |
|
@rafvasq, @tjohnson31415, except for the readthedocs check, all tests are passing. Since this PR doesn't touch the docs, I think it's not a blocker. |
|
Thanks @maxdebayser, yes the unrelated doc failure is fixed in #860. |
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
bot:test |
|
bot:test |
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Description
2.7.1
`--swap-space` param ([V0 Deprecation] Remove unused swap_space parameter vllm-project/vllm#36216)
`cache_config.user_specified_block_size = True` to avoid block size overrides.
Related Issues
Checklist
bash format.sh)
Signed-off-by: line (DCO compliance)