Add large RAM GPU runner for vllm #1596
Conversation
```yaml
feedstocks:
  - vllm
resources:
  - cirun-openstack-gpu-2xlarge
```
I'm fine to start off like this, but if vllm can be compiled with CPU-agents (compiling CUDA doesn't need a GPU), then we should do that.
There'll be one more PR to https://github.com/conda-forge/.cirun, which is also where we can then add further resource policies in the future.
Thanks for creating the cirun PR! Do we need any additional PRs to use the larger CPU runners, like ci_2xlarge?
> [...] https://github.com/conda-forge/.cirun, which is also where we can then add further resource policies in the future
☝️
Check out the pull request I linked. It creates a policy for vllm and adds it to a list of policies that are enabled for cirun-openstack-gpu-2xlarge runners. To be able to use other (e.g. smaller and/or CPU-only) runners, you need to add the vllm policy to the respective list.
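To illustrate the mechanism described above, a resource policy list might look roughly like the following. This is a hypothetical sketch only; the actual file name, keys, and layout in conda-forge/.cirun may differ.

```yaml
# Hypothetical sketch of per-runner policy lists (names are assumptions,
# not the actual contents of conda-forge/.cirun).
policies:
  cirun-openstack-gpu-2xlarge:
    - vllm          # the policy enabled by the linked PR
  cirun-openstack-cpu-2xlarge:
    - vllm          # adding the policy here would enable a CPU-only runner too
```

The point is that each runner type carries its own allow-list, so enabling an additional (e.g. CPU-only) runner for vllm means adding the vllm policy to that runner's list, not opening a new admin-requests PR.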
Ah, so as I understand it, subsequent PRs will be made directly to https://github.com/conda-forge/.cirun and not https://github.com/conda-forge/admin-requests, because the CI policy has now been created?
Yes. Please don't make me repeat myself N times
vllm is a package that uses caching to enable high-throughput LLM inference. Building the CUDA wheel takes a lot of RAM, which causes regular hosts to crash, so I'd like to request a bigger one for building the wheel.

Checklist:
@conda-forge/vllm (I am one of the maintainers)
This can be merged after:
- open-gpu-server users:
  - shermansiu: Add shermansiu to the list of allowed users (Quansight/open-gpu-server#63)
  - maresb: Add @maresb to the list of allowed users (Quansight/open-gpu-server#64)
- vllm feedstock gets created: Add vllm (staged-recipes#28931)