Deepspeed v0.18.5, disable osx-64 builds, add sdvillal as maintainer #109

Merged
weiji14 merged 6 commits into conda-forge:main from sdvillal:deepspeed-v0.18.5 on Jan 31, 2026

Conversation

@sdvillal
Contributor

@sdvillal sdvillal commented Jan 31, 2026

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Closes: #103 #104 #108

We will need to rerun these migrations in order. They should all work once rebased on this PR, now that pytorch 2.10 finally has a complete build matrix (bringing CUDA 13 and python 3.14 support):

See https://conda-forge.org/status/

Notes

As this still segfaults on osx-64, I suggest removing it from the build matrix for the time being, and opening an issue and a linked PR to try to bring it back.

v0.18.5 version brings:

making it possible for building Evoformer attention, a corner case, to play well with the conda ecosystem (see aqlaboratory/openfold-3#34 for an example). We could consider adding the pertinent configuration:
CUTLASS_PATH="DS_IGNORE_CUTLASS_DETECTION"
to the activation of the package - also adding ninja, compilers and cutlass to our runtime dependencies, which is another story. Otherwise we could document this nicer setup in our package itself, maybe pointing to the updated documentation upstream.
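
For illustration only, here is a minimal sketch of what "adding the configuration to the activation of the package" could look like, using conda's standard `etc/conda/activate.d` and `etc/conda/deactivate.d` hooks. The script name `deepspeed-cutlass.sh` and the helper variable `_DS_OLD_CUTLASS_PATH` are hypothetical, and this is not what the feedstock currently ships:

```shell
#!/usr/bin/env bash
set -euo pipefail

# PREFIX is provided by the build tool during a package build; fall back to a
# scratch directory so this sketch can also be run standalone.
PREFIX="${PREFIX:-$(mktemp -d)}"

mkdir -p "$PREFIX/etc/conda/activate.d" "$PREFIX/etc/conda/deactivate.d"

# Hypothetical file name; conda sources the *.sh files found in activate.d
# when the environment is activated.
cat > "$PREFIX/etc/conda/activate.d/deepspeed-cutlass.sh" <<'EOF'
# Remember any value the user already set so deactivation can restore it.
if [ -n "${CUTLASS_PATH+x}" ]; then
    export _DS_OLD_CUTLASS_PATH="$CUTLASS_PATH"
fi
export CUTLASS_PATH="DS_IGNORE_CUTLASS_DETECTION"
EOF

# Matching deactivation hook: restore the previous value, or unset the variable.
cat > "$PREFIX/etc/conda/deactivate.d/deepspeed-cutlass.sh" <<'EOF'
if [ -n "${_DS_OLD_CUTLASS_PATH+x}" ]; then
    export CUTLASS_PATH="$_DS_OLD_CUTLASS_PATH"
    unset _DS_OLD_CUTLASS_PATH
else
    unset CUTLASS_PATH
fi
EOF
```

Whether overriding a user-provided `CUTLASS_PATH` is desirable is an open design question; setting the variable only when it is unset would be a more conservative variant.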

@sdvillal sdvillal changed the title Deepspeed v0.18.5 deepspeed v0.18.5 Jan 31, 2026
@conda-forge-admin
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/recipe.yaml) and found it was in an excellent condition.

@sdvillal sdvillal marked this pull request as draft January 31, 2026 02:56
@sdvillal sdvillal changed the title deepspeed v0.18.5 Deepspeed v0.18.5, disable osx-64 builds Jan 31, 2026
@sdvillal sdvillal changed the title Deepspeed v0.18.5, disable osx-64 builds Deepspeed v0.18.5, disable osx-64 builds, add sdvillal as maintainer Jan 31, 2026
@sdvillal sdvillal marked this pull request as ready for review January 31, 2026 03:24
@sdvillal
Contributor Author

@weiji14 @loadams I suggest merging this. Then I can try to lend a hand with migrations.

Member

@weiji14 weiji14 left a comment

Thanks @sdvillal for offering to help, and agree with disabling osx-64 builds for now. Could you actually just cherry-pick the Pytorch 2.10 + CUDA 13.0 + python 3.14 migrations into this PR (then rerender)? Or did you want to preserve a pytorch 2.9 + CUDA 12.9 build for now?

Member

@weiji14 weiji14 left a comment

Actually, let's just get you in first, and then we can decide on next steps.

making it possible for building Evoformer attention, a corner case, to play well with the conda ecosystem (see aqlaboratory/openfold-3#34 for an example). We could consider adding the pertinent configuration:
CUTLASS_PATH="DS_IGNORE_CUTLASS_DETECTION"
to the activation of the package - also adding ninja, compilers and cutlass to our runtime dependencies, which is another story. Otherwise we could document this nicer setup in our package itself, maybe pointing to the updated documentation upstream.

If you can work on CUTLASS, that would be great! I'm not so sure about adding ninja as a runtime dep, but see the links in #1 and we could reopen and discuss there.

@weiji14 weiji14 merged commit 038e6c6 into conda-forge:main Jan 31, 2026
15 checks passed
@weiji14 weiji14 mentioned this pull request Jan 31, 2026
3 tasks
@sdvillal
Contributor Author

Thanks a lot for the superfast reaction @weiji14 :-)

I believe we should indeed build against all current setups in torch's matrix. But the main reason I did not cherry-pick is that I am not familiar with the consequences. My limited understanding is that the CF bot keeps a record of the migration issues it opens, and that we have "official" mechanisms to rerun migrations. Do you know if cherry-picking here and closing these issues would have worked as well?

We no longer need to worry about #1, because deepspeed no longer depends on the ninja Python bindings. But adding ninja to the runtime dependencies, together with all the extension building dependencies, should be optional. So we could maybe create another output "deepspeed-build" in this recipe that simply adds all these dependencies to our runtime, or hope for the conda ecosystem to support optional/extra dependencies (I am not up to date on the status here). A good middle ground would be to document, somewhere easy to find, how to fully isolate extension building within a conda environment - a sort of best-practices how-to. Another option could be to try to precompile, which sounds daunting to me.

@weiji14
Member

weiji14 commented Jan 31, 2026

But the main reason I did not cherry-pick is that I am not familiar with the consequences. My limited understanding is that the CF bot keeps a record of the migration issues it opens, and that we have "official" mechanisms to rerun migrations. Do you know if cherry-picking here and closing these issues would have worked as well?

Yes, it is ok to cherry-pick multiple migrations into a single PR (see e.g. comment at conda-forge/flash-attn-feedstock#44 (comment)). Once the mega PR is merged, you can close the individual migration PRs, and the CF bot will assume that the migration has gone through, and continue on to downstream dependencies.
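
A rough sketch of how that could be done from a local clone, assuming a git remote that points at the conda-forge feedstock and relying on GitHub exposing each PR under `pull/<number>/head`. The function name `cherry_pick_migrations`, the `migration-<number>` branch names and the PR numbers passed in are all placeholders:

```shell
# Fetch each migration PR from the given remote and cherry-pick its commits
# onto the currently checked-out branch.
# Usage: cherry_pick_migrations <remote> <pr-number> [<pr-number> ...]
cherry_pick_migrations() {
    local remote="$1"
    shift
    local pr
    for pr in "$@"; do
        # GitHub publishes the head of every PR as refs/pull/<number>/head.
        git fetch "$remote" "pull/${pr}/head:migration-${pr}"
        # Pick only the commits on the PR branch that are missing from HEAD.
        git cherry-pick "HEAD..migration-${pr}"
    done
}
```

After something like `cherry_pick_migrations upstream <pr-number> <pr-number>`, the feedstock still needs a rerender (`conda smithy rerender`, or the `@conda-forge-admin, please rerender` phrase). The rerendered CI files from different migration PRs are likely to conflict, so picking only the commit that adds each migration file and rerendering once at the end may be simpler.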

@weiji14
Member

weiji14 commented Jan 31, 2026

We no longer need to worry about #1, because deepspeed no longer depends on the ninja Python bindings. But adding ninja to the runtime dependencies, together with all the extension building dependencies, should be optional. So we could maybe create another output "deepspeed-build" in this recipe that simply adds all these dependencies to our runtime, or hope for the conda ecosystem to support optional/extra dependencies (I am not up to date on the status here). A good middle ground would be to document, somewhere easy to find, how to fully isolate extension building within a conda environment - a sort of best-practices how-to. Another option could be to try to precompile, which sounds daunting to me.

Could you open a separate issue to discuss this? This could be a long conversation :)

Development

Successfully merging this pull request may close these issues.

Auto bot version updates failing with ValueError