feat: Reduce number of Cuda IPC in Refit #568

Closed
guyueh1 wants to merge 8 commits into yifu/deepseek_ep_main from guyueh/yifu/deepseek_ep_main

@guyueh1 guyueh1 commented Jun 26, 2025

What does this PR do?

Reduces the number of CUDA IPC operations in refit by packing parameters into large tensors and unpacking them in vLLM.
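
The packing idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: it uses standard-library arrays as stand-ins for GPU tensors, assumes flat (1-D) parameters, and leaves out CUDA IPC entirely. The function names pack_params and unpack_params and the parameter keys are hypothetical.

```python
from array import array


def pack_params(params):
    """Pack named flat buffers into one large buffer per element type.

    Returns (packed, index), where index records, for every key,
    its element type, its offset in the large buffer, and its size.
    """
    packed = {}  # typecode -> one large buffer holding all params of that type
    index = []   # (key, typecode, offset, size) for each parameter
    for key, buf in params.items():
        big = packed.setdefault(buf.typecode, array(buf.typecode))
        index.append((key, buf.typecode, len(big), len(buf)))
        big.extend(buf)
    return packed, index


def unpack_params(packed, index):
    """Rebuild the per-key buffers from the large buffers and the index.

    Slicing an array copies; with real tensors one would likely use views.
    """
    return {
        key: packed[typecode][offset:offset + size]
        for key, typecode, offset, size in index
    }


# Hypothetical parameters: two float32 buffers and one float64 buffer.
params = {
    "layer0.weight": array("f", [1.0, 2.0, 3.0]),
    "layer0.bias": array("f", [0.5]),
    "layer1.scale": array("d", [4.0, 5.0]),
}
packed, index = pack_params(params)
restored = unpack_params(packed, index)
```

In this sketch only one large buffer per element type would need to be shared with the receiving side, together with the small index, instead of one buffer per parameter.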

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Signed-off-by: Guyue Huang <guyueh@nvidia.com>

# pack tensors in gathered_hf_params to a big tensor
type_to_packed_big_tensor_size = defaultdict(lambda : 0)
key_to_type_and_offset_and_size_in_big_tensor = []
Contributor

I would suggest running this code through Cursor to simplify the code and variable names and to fix typos.

Contributor Author

Yes, this is still a draft; I will do some cleanup.

guyueh1 added 2 commits June 27, 2025 11:05
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
guyueh1 changed the title from "Reduce number of Cuda IPC in Refit" to "feat: Reduce number of Cuda IPC in Refit" on Jun 27, 2025
guyueh1 requested review from parthchadha and yfw on June 27, 2025 19:51

guyueh1 commented Jul 1, 2025

DO NOT merge. I raised a new PR against main: #589

guyueh1 added 2 commits July 7, 2025 09:52
…EFIT

Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>

guyueh1 commented Jul 9, 2025

Closing because #589 is merged.

guyueh1 closed this on Jul 9, 2025
2 participants