feat: move to uv first#3545
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe PR migrates the project from pip-based dependency management to uv-based workflow, consolidating dependency declarations from multiple requirements files into Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
📖 Documentation Preview: https://69e5d728cd92690f35692b20--resonant-treacle-0fd729.netlify.app Deployed on Netlify from commit 0d6e834 |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cicd/Dockerfile-uv.jinja (1)
26-32:⚠️ Potential issue | 🟠 MajorSame nightly build issue as in Dockerfile.jinja.
The sed commands targeting
requirements.txtwill be ineffective if that file is empty or removed. This needs the same fix as the non-uv Dockerfile.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cicd/Dockerfile-uv.jinja` around lines 26 - 32, The sed replacement commands inside the NIGHTLY_BUILD branch can fail if requirements.txt is missing or empty; ensure requirements.txt exists before running the sed commands (e.g., create or touch the file) so the transformations for transformers/peft/accelerate/trl/datasets always run reliably when NIGHTLY_BUILD is "true". Reference the NIGHTLY_BUILD conditional and the sed replacement lines for transformers, peft, accelerate, trl, and datasets to locate where to add the pre-check/creation.cicd/Dockerfile.jinja (1)
27-33:⚠️ Potential issue | 🟠 MajorNightly build sed commands will fail—
requirements.txtdoes not exist.The repository uses
pyproject.tomlanduvfor dependency management; there is norequirements.txtfile. The sed commands at lines 27-33 will fail when executed in the Docker build. Update the nightly build logic to modify the dependency specifications inpyproject.tomlinstead, or inject the development versions through environment variables passed to the pip install step at lines 37-41.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cicd/Dockerfile.jinja` around lines 27 - 33, The nightly build block uses sed to patch requirements.txt which doesn't exist; update the NIGHTLY_BUILD branch to either mutate pyproject.toml or set override env vars used by the later pip install step: replace the sed commands that reference requirements.txt with commands that (a) use a small Python one-liner or sed to replace the package entries (transformers, peft, accelerate, trl, datasets) inside pyproject.toml, or (b) export an env var (e.g. NIGHTLY_DEP_OVERRIDES) containing the git+ URLs and append that var to the pip install invocation in the pip install step so those dev versions are installed; ensure you reference NIGHTLY_BUILD, pyproject.toml, and the pip install step when making the change.
🧹 Nitpick comments (3)
.github/workflows/lint.yml (1)
9-9: Include lockfile changes in PR trigger paths.Line 9 now tracks
pyproject.toml, but dependency-only updates may land via lockfile changes too. Consider addinguv.locktopull_request.pathsso lint always runs on dependency updates.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/lint.yml at line 9, CI workflow pull_request.paths currently includes 'pyproject.toml' but misses the lockfile so dependency-only updates can bypass lint; update the pull_request.paths list in .github/workflows/lint.yml to also include the lockfile (uv.lock) so lint runs on dependency changes, i.e., add 'uv.lock' alongside 'pyproject.toml' under the pull_request.paths entry..github/workflows/tests-nightly.yml (1)
83-90: Consider the risk of--no-depswith nightly packages.Installing nightly HF packages with
--no-depsrelies on the project's pinned dependencies being sufficient. If a@mainversion introduces a new transitive dependency not declared inpyproject.toml, tests could fail with import errors.This is likely intentional for nightly testing (to catch such issues early), but worth documenting or monitoring.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/tests-nightly.yml around lines 83 - 90, The nightly override step ("Override with nightly HF packages") uses pip install with --no-deps which can hide new transitive requirements from `@main` that cause import errors; either remove --no-deps so pip can pull transitive deps when running the pip command that installs the nightly HF packages, or if you intend to keep --no-deps for stricter testing, add a clear comment above the pip install (the line invoking "transformers @ git+https://...@main" etc.) documenting this risk and add a follow-up monitoring action (or a fallback pip install without --no-deps on failure) to catch and surface missing transitive dependencies..github/workflows/tests.yml (1)
111-112: Replace hardcoded package list with dependency group installation.The packages at lines 111-112 and 202-203 duplicate the
devandtestdependency groups frompyproject.toml:141-159. Instead of maintaining this second source of truth, useuv syncto install from the declared groups. This eliminates drift risk and ensures the workflow always matches the project's dependency configuration.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/tests.yml around lines 111 - 112, Replace the hardcoded pip install command (the line invoking "uv pip install black mypy ..." in the workflow) with a call to uv sync that installs the project's declared dependency groups (dev and test) from pyproject.toml; specifically, remove the explicit package list and run uv sync targeting the dev and test groups (e.g., "uv sync" with the appropriate flags to install groups dev and test) so the workflow uses the single source of truth in pyproject.toml.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/multi-gpu-e2e.yml:
- Around line 7-8: Remove the duplicate path entry under on.pull_request.paths:
there are two identical 'pyproject.toml' entries—keep only one. Edit the
workflow (the on.pull_request.paths array) to remove the redundant
'pyproject.toml' line so the array contains a single 'pyproject.toml' entry.
In @.github/workflows/tests.yml:
- Around line 8-10: Update the GitHub Actions workflow file's paths filter to
include the lockfile so dependency-only changes trigger tests: add 'uv.lock' to
the paths array alongside '**.py' and 'pyproject.toml' in the workflow (the two
occurrences shown around the existing paths entries) so lockfile-only updates
won't skip the tests.
In `@docs/installation.qmd`:
- Around line 60-62: Update the default Docker run example in the
docs/installation.qmd so the shown docker run command includes the --ipc=host
flag to avoid shared-memory/DataLoader failures; locate the existing snippet
containing the Docker command (the line starting with docker run --gpus) and add
--ipc=host alongside --rm -it (keeping the advanced example unchanged).
- Around line 142-144: The sentence claiming "Any existing `pip install` command
can be run with `uv pip install`" overstates compatibility; change the copy to
soften the claim (e.g., describe `uv pip` as a drop-in replacement for common
`pip` workflows or for most `pip install` use-cases) and add a brief note or
parenthetical that `uv pip` is not an exact clone of `pip` and that some
options/behaviors (for example `--user`, `--prefix`,
resolver/pre-release/config/env differences) are intentionally different—update
the `uv pip` callout text to reflect this more accurate compatibility statement
and optionally point readers to the official uv documentation for details.
In `@examples/granite4/README.md`:
- Line 18: Clarify that the command "uv pip install --no-build-isolation -e
'.[flash-attn]'" requires an active virtual environment by instructing the
reader to create and activate one first (e.g., python -m venv .venv && source
.venv/bin/activate) or by adding a prior bullet/line in the README code block
that shows creating and activating a venv before running the uv pip install
command.
In `@pyproject.toml`:
- Around line 220-249: In the conflicts array update each conflict entry that
currently uses { package = "axolotl" } to instead list extras only by replacing
that object with { extra = "axolotl" } (i.e., change entries like [{ package =
"axolotl" }, { extra = "vllm" }] to [{ extra = "axolotl" }, { extra = "vllm" }])
so every conflict list contains only { extra = "..."} objects; apply this change
to all entries in the conflicts block.
- Around line 20-23: Update the pinned datasets dependency in pyproject.toml
from "datasets==4.5.0" to "datasets==4.8.3" to match transformers==5.3.0 and
accelerate==1.13.0; after updating the "datasets" line, regenerate any
lock/poetry files or run your dependency install (poetry install / pip-compile /
pip-sync) and run the test suite to verify compatibility with the updated
datasets package.
In `@README.md`:
- Line 90: The README's "PyTorch ≥2.9.0" and pyproject.toml's dependency line
"torch>=2.6.0" disagree; make them consistent by deciding the true minimum
supported version and updating the other file to match (either change README's
"PyTorch ≥2.9.0" to "PyTorch ≥2.6.0" or update pyproject.toml's "torch>=2.6.0"
to "torch>=2.9.0"). Edit the string in the README or the "torch" entry in
pyproject.toml so both files state the same minimum version.
---
Outside diff comments:
In `@cicd/Dockerfile-uv.jinja`:
- Around line 26-32: The sed replacement commands inside the NIGHTLY_BUILD
branch can fail if requirements.txt is missing or empty; ensure requirements.txt
exists before running the sed commands (e.g., create or touch the file) so the
transformations for transformers/peft/accelerate/trl/datasets always run
reliably when NIGHTLY_BUILD is "true". Reference the NIGHTLY_BUILD conditional
and the sed replacement lines for transformers, peft, accelerate, trl, and
datasets to locate where to add the pre-check/creation.
In `@cicd/Dockerfile.jinja`:
- Around line 27-33: The nightly build block uses sed to patch requirements.txt
which doesn't exist; update the NIGHTLY_BUILD branch to either mutate
pyproject.toml or set override env vars used by the later pip install step:
replace the sed commands that reference requirements.txt with commands that (a)
use a small Python one-liner or sed to replace the package entries
(transformers, peft, accelerate, trl, datasets) inside pyproject.toml, or (b)
export an env var (e.g. NIGHTLY_DEP_OVERRIDES) containing the git+ URLs and
append that var to the pip install invocation in the pip install step so those
dev versions are installed; ensure you reference NIGHTLY_BUILD, pyproject.toml,
and the pip install step when making the change.
---
Nitpick comments:
In @.github/workflows/lint.yml:
- Line 9: CI workflow pull_request.paths currently includes 'pyproject.toml' but
misses the lockfile so dependency-only updates can bypass lint; update the
pull_request.paths list in .github/workflows/lint.yml to also include the
lockfile (uv.lock) so lint runs on dependency changes, i.e., add 'uv.lock'
alongside 'pyproject.toml' under the pull_request.paths entry.
In @.github/workflows/tests-nightly.yml:
- Around line 83-90: The nightly override step ("Override with nightly HF
packages") uses pip install with --no-deps which can hide new transitive
requirements from `@main` that cause import errors; either remove --no-deps so pip
can pull transitive deps when running the pip command that installs the nightly
HF packages, or if you intend to keep --no-deps for stricter testing, add a
clear comment above the pip install (the line invoking "transformers @
git+https://...@main" etc.) documenting this risk and add a follow-up monitoring
action (or a fallback pip install without --no-deps on failure) to catch and
surface missing transitive dependencies.
In @.github/workflows/tests.yml:
- Around line 111-112: Replace the hardcoded pip install command (the line
invoking "uv pip install black mypy ..." in the workflow) with a call to uv sync
that installs the project's declared dependency groups (dev and test) from
pyproject.toml; specifically, remove the explicit package list and run uv sync
targeting the dev and test groups (e.g., "uv sync" with the appropriate flags to
install groups dev and test) so the workflow uses the single source of truth in
pyproject.toml.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a3af9e30-0089-46d9-af9c-ab2188fdc9d7
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (39)
.github/CONTRIBUTING.md.github/workflows/lint.yml.github/workflows/multi-gpu-e2e.yml.github/workflows/pypi.yml.github/workflows/tests-nightly.yml.github/workflows/tests.ymlMANIFEST.inREADME.mdcicd/Dockerfile-uv.jinjacicd/Dockerfile.jinjadocs/debugging.qmddocs/docker.qmddocs/installation.qmdexamples/LiquidAI/README.mdexamples/apertus/README.mdexamples/arcee/README.mdexamples/devstral/README.mdexamples/gemma3n/README.mdexamples/gpt-oss/README.mdexamples/granite4/README.mdexamples/hunyuan/README.mdexamples/internvl3_5/README.mdexamples/magistral/README.mdexamples/magistral/vision/README.mdexamples/ministral3/README.mdexamples/ministral3/vision/README.mdexamples/mistral-small/README.mdexamples/mistral4/README.mdexamples/qwen3-next/README.mdexamples/qwen3.5/README.mdexamples/seed-oss/README.mdexamples/smolvlm2/README.mdexamples/voxtral/README.mdpyproject.tomlrequirements-dev.txtrequirements-tests.txtrequirements.txtsetup.pysrc/setuptools_axolotl_dynamic_dependencies.py
💤 Files with no reviewable changes (5)
- requirements-dev.txt
- requirements-tests.txt
- requirements.txt
- src/setuptools_axolotl_dynamic_dependencies.py
- setup.py
| ```{.bash} | ||
| docker run --gpus '"all"' --rm -it axolotlai/axolotl:main-latest | ||
| docker run --gpus '"all"' --rm -it axolotlai/axolotl-uv:main-latest | ||
| ``` |
There was a problem hiding this comment.
Keep --ipc=host in the default Docker run example.
The advanced example still includes it, but the first command users copy no longer does. That makes the default path the one most likely to hit shared-memory/DataLoader failures.
Suggested change
-docker run --gpus '"all"' --rm -it axolotlai/axolotl-uv:main-latest
+docker run --gpus '"all"' --ipc=host --rm -it axolotlai/axolotl-uv:main-latest📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ```{.bash} | |
| docker run --gpus '"all"' --rm -it axolotlai/axolotl:main-latest | |
| docker run --gpus '"all"' --rm -it axolotlai/axolotl-uv:main-latest | |
| ``` |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/installation.qmd` around lines 60 - 62, Update the default Docker run
example in the docs/installation.qmd so the shown docker run command includes
the --ipc=host flag to avoid shared-memory/DataLoader failures; locate the
existing snippet containing the Docker command (the line starting with docker
run --gpus) and add --ipc=host alongside --rm -it (keeping the advanced example
unchanged).
|
|
||
| pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja | ||
| pip3 install --no-build-isolation -e '.[flash-attn]' | ||
| uv pip install --no-build-isolation -e '.[flash-attn]' |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In current uv documentation, does uv pip installrequire an active virtual environment by default, and when should--systembe used for editable installs likeuv pip install -e '.[flash-attn]'?
💡 Result:
Yes, in the current uv documentation, uv pip install requires an active virtual environment by default. Unlike pip, uv pip will error with "No virtual environment found" or prompt to create one if no virtual environment is detected (via VIRTUAL_ENV, .venv in current/parent dirs, etc.). It searches for virtual environments in a specific order and skips if none found without --system. The --system flag should be used for editable installs like uv pip install -e '.[flash-attn]' when you intentionally want to install into the system Python environment, such as in continuous integration (CI), containerized environments, or when modifying non-virtual Python installations (e.g., --python 3.12 on a system interpreter). uv docs warn that --system skips virtual environments and targets system Python (first in PATH), but recommend virtual environments generally. Use with caution as it mutates system Python, which can be problematic on some distros (PEP 668 externally managed).
Citations:
- 1: https://docs.astral.sh/uv/pip/environments/
- 2: https://docs.astral.sh/uv/pip/packages/
- 3: https://stackoverflow.com/questions/78726750/how-can-i-install-packages-using-uv-pip-install-without-creating-a-virtual-env
🏁 Script executed:
cd examples/granite4 && wc -l README.mdRepository: axolotl-ai-cloud/axolotl
Length of output: 80
🏁 Script executed:
cd examples/granite4 && cat -n README.md | head -30Repository: axolotl-ai-cloud/axolotl
Length of output: 1364
🏁 Script executed:
cd examples/granite4 && cat -n README.md | grep -i "venv\|virtual\|environment" -A 2 -B 2Repository: axolotl-ai-cloud/axolotl
Length of output: 50
🏁 Script executed:
cd examples/granite4 && cat -n README.md | tail -35Repository: axolotl-ai-cloud/axolotl
Length of output: 1544
🏁 Script executed:
cd examples/granite4 && git log --oneline -10 -- README.md 2>/dev/null | head -5Repository: axolotl-ai-cloud/axolotl
Length of output: 96
🏁 Script executed:
cd examples/granite4 && git show HEAD:README.md 2>/dev/null | sed -n '13,22p'Repository: axolotl-ai-cloud/axolotl
Length of output: 1325
🏁 Script executed:
# Check if there's a main installation guide referenced that covers venv setup
grep -r "installation.html\|installation guide" README.mdRepository: axolotl-ai-cloud/axolotl
Length of output: 297
Clarify that an active virtual environment is required before running uv pip install.
The uv pip install command on line 18 requires an active virtual environment by default. Without one, users will encounter "No virtual environment found" errors. While the README references the installation guide which presumably covers venv setup, this snippet should explicitly state that users need to create and activate a virtual environment first, or add that step to the code block.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/granite4/README.md` at line 18, Clarify that the command "uv pip
install --no-build-isolation -e '.[flash-attn]'" requires an active virtual
environment by instructing the reader to create and activate one first (e.g.,
python -m venv .venv && source .venv/bin/activate) or by adding a prior
bullet/line in the README code block that shows creating and activating a venv
before running the uv pip install command.
| conflicts = [ | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "vllm" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "flash-attn" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "ring-flash-attn" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "mamba-ssm" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "auto-gptq" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "fbgemm-gpu" }, | ||
| ], | ||
| [ | ||
| { package = "axolotl" }, | ||
| { extra = "llmcompressor" }, | ||
| ], | ||
| ] |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
uv tool conflicts configuration syntax pyproject.toml
💡 Result:
The configuration syntax in pyproject.toml for handling dependency conflicts in uv is under the [tool.uv] section using the conflicts key. This declares mutually exclusive collections of extras or dependency groups, allowing uv to resolve them in separate forks without failing resolution. Syntax: [tool.uv] conflicts = [ [ { extra = "extra1" }, { extra = "extra2" }, ], [ { group = "group1" }, { group = "group2" }, ], ] This tells uv that "extra1" and "extra2" conflict (e.g., different numpy versions), and similarly for groups. Without this, uv fails if incompatible dependencies are detected across groups/extras during lock/sync. Use uv sync --extra extra1 or --group group1 to install specific non-conflicting sets. From official uv docs: conflicts are for cases where extras/groups have incompatible deps but aren't meant to be used together. uv resolves all groups together by default but uses separate resolution forks for declared conflicts. Note: Conflicts must be in pyproject.toml under [tool.uv]; not supported in uv.toml. Dependency groups use standard [dependency-groups] table (PEP 735).
Citations:
- 1: https://docs.astral.sh/uv/concepts/projects/config/
- 2: https://docs.astral.sh/uv/reference/settings/
- 3: https://docs.astral.sh/uv/concepts/projects/dependencies/
Fix uv conflicts syntax in [tool.uv] section.
The conflicts configuration syntax is incorrect. According to the official uv documentation, conflicts should directly list mutually exclusive extras or groups within each conflict entry, not nest them with a package specification.
Incorrect syntax:
[ { package = "axolotl" }, { extra = "vllm" } ]
Correct syntax:
[ { extra = "vllm" }, { extra = "flash-attn" } ]
Update each conflict entry to list only the conflicting extras, removing the { package = "axolotl" } constraint from each conflict list.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pyproject.toml` around lines 220 - 249, In the conflicts array update each
conflict entry that currently uses { package = "axolotl" } to instead list
extras only by replacing that object with { extra = "axolotl" } (i.e., change
entries like [{ package = "axolotl" }, { extra = "vllm" }] to [{ extra =
"axolotl" }, { extra = "vllm" }]) so every conflict list contains only { extra =
"..."} objects; apply this change to all entries in the conflicts block.
There was a problem hiding this comment.
@BrownianNotion , are you familiar whether this is an issue?
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
2 CI failures on nightly build upstream which this PR attempts to fix:
|
| axolotl = ["huggingface_hub"] | ||
| mamba-ssm = ["torch"] | ||
| causal-conv1d = ["torch"] | ||
| flash-attn = ["torch"] |
There was a problem hiding this comment.
Is it worth setting match_runtime = true to ensure packages like flash-attn/deepspeed use the same version of torch https://docs.astral.sh/uv/concepts/projects/config/#augmenting-build-dependencies?
This seems to be the preferred approach over using --no-build-isolation
There was a problem hiding this comment.
Thank you for the suggestion, I will look into this
There was a problem hiding this comment.
@BrownianNotion , I read through this, is this applicable only for uv sync? I'm deleting the lockfile and usually recommending folks to do uv pip install.. which may not be impacted by this?
It's tricky with lock file as we would need to maintain it updated per requirements change and it doesn't really work with N torch x cuda combinations. If anyone has better recommendation on how to handle this, we're all ears.
Edit: I'll add this anyways but I think it could be a no-op with no build isolation. I'll see if we still need that flag later.
There was a problem hiding this comment.
Thanks for looking into it! I think match_runtime is indeed for uv sync only.
Fair enough, I don't know of a nice way to support N torch x cuda combinations with the .lock file, you might have to enumerate all the combinations and their indices for example.
Description
Also fixes some nightly, see issues #3545 (comment)
Motivation and Context
How has this been tested?
AI Usage Disclaimer
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)
Summary by CodeRabbit
Release Notes
New Features
uv-based installation workflow for simplified dependency management; new Docker image variants with-uvtag now available.Documentation
uvas the recommended package manager; added migration guide from pip touv; revised Docker image documentation with-uvvariant details.Chores