Enable support for prefill side kv_layout and block_size update #867
xuechendi merged 4 commits into vllm-project:main
enable support for prefill side kv_layout and block_size update

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
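For readers unfamiliar with the two items above, here is a minimal sketch of what a KV layout change and a block size change involve. It is not the code from this PR. It assumes the prefill side stores each KV block as `[tokens][heads][head_dim]` (often called NHD) and must present it as `[heads][tokens][head_dim]` (HND), and that the local block size is an integer multiple of the agreed block size. The function names `nhd_to_hnd` and `split_block_ids` are hypothetical, and nested lists stand in for tensors.

```python
from typing import List

# Nested lists stand in for a tensor: block[token][head] -> head_dim values.
Block = List[List[List[float]]]


def nhd_to_hnd(block: Block) -> Block:
    """Permute one KV block from [tokens][heads][head_dim] to [heads][tokens][head_dim].

    Hypothetical helper: the innermost head_dim lists are reused, not copied.
    """
    num_tokens = len(block)
    num_heads = len(block[0]) if num_tokens else 0
    return [[block[t][h] for t in range(num_tokens)] for h in range(num_heads)]


def split_block_ids(
    block_ids: List[int], local_block_size: int, agreed_block_size: int
) -> List[int]:
    """Map ids of large local blocks onto ids of smaller agreed-size blocks.

    Assumption: local_block_size is an integer multiple of agreed_block_size,
    so local block b covers agreed blocks b*ratio .. b*ratio + ratio - 1.
    """
    if agreed_block_size <= 0 or local_block_size % agreed_block_size != 0:
        raise ValueError("local_block_size must be a multiple of agreed_block_size")
    ratio = local_block_size // agreed_block_size
    return [b * ratio + i for b in block_ids for i in range(ratio)]


if __name__ == "__main__":
    # 2 tokens, 3 heads, head_dim 1; each value encodes (token, head).
    nhd = [[[float(t * 10 + h)] for h in range(3)] for t in range(2)]
    print(nhd_to_hnd(nhd))
    print(split_block_ids([1, 3], local_block_size=4, agreed_block_size=2))
```

On real tensors the permutation would normally be a single transpose of the token and head axes. The explicit loops here only make the index mapping visible.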
@yeonsily, I would suggest creating a new nixl_connector file just for the Gaudi-to-CUDA scenario instead of overriding the current hpu_nixl_connector.py.
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
🚧 CI Blocked
The main CI workflow was not started for the following reason:
Did you miss the switch for the flag check and for which nixl_connector.py to import?
@xuechendi The changes in hpu_nixl_connector are common regardless of whether the run is heterogeneous, and the ones in hetero_hpu_nixl_connector are extras for the heterogeneous case.
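A rough sketch of the kind of flag-gated switch discussed here, for illustration only. The thread does not name the flag, so the environment variable `EXAMPLE_HETERO_NIXL` and the function `select_connector_module` are hypothetical. Only the two module names come from this conversation.

```python
import os
from typing import Mapping, Optional

# Hypothetical flag name; the real flag is not named in this thread.
HETERO_FLAG = "EXAMPLE_HETERO_NIXL"


def select_connector_module(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the connector module name to import, based on a flag check.

    The common connector is the default. The hetero connector, which carries
    the extra heterogeneous-only changes, is chosen only when the flag is set.
    """
    env = os.environ if env is None else env
    enabled = env.get(HETERO_FLAG, "0").strip().lower() in {"1", "true", "yes"}
    return "hetero_hpu_nixl_connector" if enabled else "hpu_nixl_connector"


if __name__ == "__main__":
    print(select_connector_module({}))
    print(select_connector_module({HETERO_FLAG: "1"}))
```

In a real plugin the returned name would be passed to `importlib.import_module` together with the package path, which this thread does not show. Keeping the default on the common connector means nothing changes when the flag is unset.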
I see how to use |
@xuechendi @michalkuligowski Thank you for your review! I see that CI failed, but I don't think it's caused by my change, since my change isn't triggered without the flag. It seems CI is broken right now. Is there any way to re-trigger CI without committing a change?
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
✅ CI Passed
All checks passed successfully against the following vllm commit:
…-project#867)

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
Signed-off-by: Wang, Zheng W <zheng.w.wang@intel.com>
…-project#867)

1. update example to support prefill HND and agreed_block_size
2. enable prefill side kv_layout and block_size update

Port vllm-project/vllm#30448 to vllm-gaudi

---------

Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Yeonsil Yoon <yeon.sil.yoon@intel.com>
Signed-off-by: slokesha <slokeshappa@habana.ai>
Port vllm-project/vllm#30448 to vllm-gaudi