[WIP] Fix SP tests#3309
Conversation
📝 WalkthroughWalkthroughA substantial update spanning dependency versions (huggingface-hub, transformers, accelerate, deepspeed, trl), Hugging Face CLI command migration, Mistral tokenizer backend class refactoring, FSDP configuration with new versioning and bidirectional sync logic, context parallel monkeypatch additions, and test updates reflecting tokenization changes and transformers v5 compatibility adjustments. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Tools execution failed with the following error: Failed to run tools: 14 UNAVAILABLE: read ECONNRESET Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/axolotl/monkeypatch/transformers/trainer_context_parallel.py (1)
49-63: Guard against emptyitems_to_importto avoidexecsyntax errorsUsing an isolated
namespaceforexecis a good improvement, but ifitems_to_importis ever empty, this line:exec(f"from {module_name} import ({', '.join(items_to_import)})", namespace)will evaluate to
from {module_name} import (), which is aSyntaxErrorand will break patching.A minimal fix is to skip the import
execwhen there’s nothing to import:- # Use a separate namespace to capture the exec'd function - namespace = {} - exec(f"from {module_name} import ({', '.join(items_to_import)})", namespace) - exec(patched_source, namespace) + # Use a separate namespace to capture the exec'd function + namespace: dict[str, object] = {} + if items_to_import: + exec( + f"from {module_name} import ({', '.join(items_to_import)})", + namespace, + ) + exec(patched_source, namespace)
🧹 Nitpick comments (9)
tests/e2e/multigpu/test_gemma3.py (1)
31-33: Make the skip easier to revisit (add TODO-style phrasing).The skip reason is clear and references the upstream PR; consider explicitly wording it as a TODO (e.g., “TODO: unskip once transformers includes fix from …”) so it’s easy to grep for when upgrading transformers and deciding which tests to re-enable.
tests/e2e/multigpu/test_fsdp1.py (1)
247-247: Clarify skip reason and make it actionable to unskip later.The blanket reason
"broken in transformers v5"works, but you may want to align with the Gemma3 test by referencing the specific upstream issue/PR and/or adding TODO-style wording (e.g., “TODO: unskip when transformers bug X is fixed”) so it’s obvious when this test should be re-enabled.src/axolotl/core/builders/causal.py (1)
438-440: Clarify early-return semantics for pretraining collator whenmicro_batch_size == 1In the
training_args.pretrainingbranch withpretraining_sample_concatenation=Trueandmicro_batch_size==1, this condition now returnsNonefor all training calls (is_eval is False), regardless ofsample_packing/pretrain_multipack_attn, because the right-hand side of theoris always true in that case. For eval (is_eval=True), behavior remainsreturn Noneonly whennot (sample_packing and pretrain_multipack_attn).If this is the intended behavior (i.e., rely on the Trainer’s default collator for that pretraining + mbsz=1 training case), you might want to make it more explicit to avoid future misreads:
- if not (self.cfg.sample_packing and self.cfg.pretrain_multipack_attn) or ( - self.cfg.micro_batch_size == 1 and is_eval is False - ): - return None + if self.cfg.micro_batch_size == 1 and not is_eval: + return None + if not (self.cfg.sample_packing and self.cfg.pretrain_multipack_attn): + return NoneSame logic, but easier to reason about when tweaking pretraining/sample-packing configs later.
src/axolotl/core/builders/base.py (1)
207-241: Warmup ratio →warmup_stepsfallback looks correct for transformers v5The restructuring of warmup logic (typed
warmup_steps: int | floatplus thewarmup_ratio > 0.0 and warmup_steps == 0fallback) cleanly preserves existing precedence (warmup_steps>warmup_ratio> default) while switching to a singlewarmup_stepskwarg, and should keep behaviour consistent whentotal_num_stepsis unknown.If you expect future readers to be confused by
warmup_stepssometimes being a ratio (float), a short comment explaining that transformers v5 interpretswarmup_steps < 1.0as a ratio would help maintainability, but the runtime logic itself looks sound.src/axolotl/utils/schemas/validation.py (1)
865-876: Consider adding conflict detection for version mismatches.The bidirectional synchronization logic works correctly for the tested scenarios. However, it doesn't detect conflicts when both
fsdp_versionandfsdp_config.versionare set to different values. For example, if a user setsfsdp_version: 2andfsdp_config.version: 1, neither synchronization condition triggers, potentially leaving an inconsistent state.Consider adding a check to detect and handle version conflicts:
@model_validator(mode="before") @classmethod def check_fsdp_version_in_fsdp_config(cls, data): fsdp_config = data.get("fsdp_config") or {} fsdp_version = data.get("fsdp_version", None) + + # Check for conflicts between top-level and nested version + if fsdp_version and fsdp_config.get("version") and fsdp_version != fsdp_config.get("version"): + raise ValueError( + f"Conflicting FSDP versions: fsdp_version={fsdp_version} but " + f"fsdp_config.version={fsdp_config.get('version')}. " + "Please set only one or ensure they match." + ) + if not fsdp_version and fsdp_config and fsdp_config.get("version"): data["fsdp_version"] = fsdp_config.get("version") if not fsdp_version and fsdp_config and fsdp_config.get("fsdp_version"): data["fsdp_version"] = fsdp_config.get("fsdp_version") elif fsdp_version and fsdp_config and not fsdp_config.get("version"): data["fsdp_config"]["version"] = fsdp_version return datatests/prompt_strategies/test_raw_io.py (1)
79-88: Token‑level expectations are very tokenizer‑version specificThe updated decode string,
input_ids(e.g.21558for"hello"), and corresponding labels correctly reflect the new single‑token behavior, but they tightly couple this test to the exact tokenizer version and vocab. To reduce future breakage when upgrading models/tokenizers, consider:
- Deriving specific ids via the tokenizer (e.g., with an encode call) instead of hard‑coding numeric ids.
- Asserting key subsequences / structure (presence and ordering of segments and
<eot>tokens) rather than the full list of ids where possible.Also applies to: 102-116
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
10-15: Base‑class switch toMistralCommonBackendlooks consistent with the new backend flowSub‑classing
MistralCommonBackendand updating thefrom_pretraineddocstrings / error messages to reference it lines up with the rest of the PR’s mistral‑common backend changes. A couple of follow‑ups:
- Please verify against the transformers version you pin that
transformers.tokenization_mistral_common.MistralCommonBackendexists and that its__init__/from_pretrainedexpectations are compatible with howHFMistralTokenizeris constructed here (in particular, themodekwarg andtokenizer_pathargument).- (Minor) The phrase
[`MistralCommonBackend.tokenization_mistral_common.save_pretrained`]in the docstring is slightly odd; consider simplifying to referenceMistralCommonBackend.save_pretrainedif that’s what you mean.Also applies to: 133-181
tests/prompt_strategies/test_chat_templates_advanced.py (1)
24-42: Prefer xfail/skip over commenting out thephi35_tokenizercaseCommenting out the
("phi35_tokenizer", "phi_35", None, "<|end|>")param avoids the current breakage but also removes coverage and makes it easy to forget. Consider keeping the case inPARAMETRIZE_PARAMSand marking it withpytest.mark.xfail(or a conditionalskip) explaining the transformers v5 incompatibility. That way CI still tracks when it starts passing again.src/axolotl/monkeypatch/accelerate/parallelism_config.py (1)
86-105: Consider guarding against missingcp_handlerattribute.If
cp_backendis not"deepspeed", line 94 accessesself.parallelism_config.cp_handler.cp_comm_strategy. Ifcp_handlerisNone, this will raiseAttributeError.Consider adding a guard:
def patched_prepare_cp(self, *args): if self.parallelism_config.cp_backend == "deepspeed": return args + if self.parallelism_config.cp_handler is None: + return args + from accelerate.big_modeling import _attach_context_parallel_hooks
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (37)
.github/workflows/preview-docs.yml(1 hunks).github/workflows/tests.yml(3 hunks)cicd/multigpu.sh(1 hunks)docs/amd_hpc.qmd(1 hunks)docs/installation.qmd(1 hunks)requirements.txt(1 hunks)src/axolotl/cli/checks.py(1 hunks)src/axolotl/core/builders/base.py(3 hunks)src/axolotl/core/builders/causal.py(1 hunks)src/axolotl/loaders/patch_manager.py(0 hunks)src/axolotl/loaders/processor.py(1 hunks)src/axolotl/monkeypatch/accelerate/parallelism_config.py(1 hunks)src/axolotl/monkeypatch/models/mistral3/mistral_common_tokenizer.py(4 hunks)src/axolotl/monkeypatch/transformers/trainer_context_parallel.py(1 hunks)src/axolotl/processing_strategies.py(2 hunks)src/axolotl/prompt_strategies/chat_template.py(1 hunks)src/axolotl/utils/callbacks/perplexity.py(1 hunks)src/axolotl/utils/mistral/mistral_tokenizer.py(6 hunks)src/axolotl/utils/schemas/fsdp.py(1 hunks)src/axolotl/utils/schemas/validation.py(1 hunks)src/axolotl/utils/trainer.py(1 hunks)tests/e2e/integrations/test_cut_cross_entropy.py(1 hunks)tests/e2e/integrations/test_hooks.py(1 hunks)tests/e2e/multigpu/test_fp8_fsdp2.py(2 hunks)tests/e2e/multigpu/test_fsdp1.py(1 hunks)tests/e2e/multigpu/test_fsdp2.py(1 hunks)tests/e2e/multigpu/test_gemma3.py(1 hunks)tests/e2e/utils.py(1 hunks)tests/hf_offline_utils.py(1 hunks)tests/monkeypatch/test_mistral_tokenizer_patch.py(0 hunks)tests/prompt_strategies/test_alpaca.py(3 hunks)tests/prompt_strategies/test_chat_templates.py(2 hunks)tests/prompt_strategies/test_chat_templates_advanced.py(1 hunks)tests/prompt_strategies/test_raw_io.py(2 hunks)tests/test_normalize_config.py(2 hunks)tests/test_tokenizers.py(3 hunks)tests/utils/schemas/validation/test_fsdp.py(1 hunks)
💤 Files with no reviewable changes (2)
- tests/monkeypatch/test_mistral_tokenizer_patch.py
- src/axolotl/loaders/patch_manager.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-31T11:48:46.313Z
Learnt from: SalmanMohammadi
Repo: axolotl-ai-cloud/axolotl PR: 2994
File: index.qmd:7-8
Timestamp: 2025-07-31T11:48:46.313Z
Learning: In documentation workflow testing PRs, temporary test content like "test test" may be intentionally added to trigger documentation preview actions and validate workflow fixes, rather than being accidental debug text.
Applied to files:
.github/workflows/preview-docs.yml
📚 Learning: 2025-09-09T13:50:26.197Z
Learnt from: djsaunde
Repo: axolotl-ai-cloud/axolotl PR: 3067
File: examples/llama-3/diffusion-3.2-1b-sft.yaml:12-18
Timestamp: 2025-09-09T13:50:26.197Z
Learning: Token ID 128002 in Llama 3.2 tokenizer corresponds to `<|reserved_special_token_0|>`, which is one of 250 reserved special tokens designed for custom usage in fine-tuning applications like diffusion masking.
Applied to files:
tests/test_tokenizers.py
📚 Learning: 2025-09-22T22:14:35.531Z
Learnt from: gholmes829
Repo: axolotl-ai-cloud/axolotl PR: 3167
File: src/axolotl/utils/schemas/validation.py:819-834
Timestamp: 2025-09-22T22:14:35.531Z
Learning: In the axolotl codebase, validation methods maintain separation of concerns - early validators focus on core logic while `check_fsdp_config_kwargs_prefix` handles deprecated prefix normalization. This pattern should be preserved for consistency rather than mixing prefix handling into individual validators.
Applied to files:
tests/test_normalize_config.pysrc/axolotl/utils/schemas/validation.py
🧬 Code graph analysis (10)
src/axolotl/utils/trainer.py (1)
src/axolotl/monkeypatch/accelerate/parallelism_config.py (1)
patch_prepare_cp(80-107)
src/axolotl/loaders/processor.py (1)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
HFMistralTokenizer(14-227)
tests/test_tokenizers.py (1)
tests/test_datasets.py (1)
tokenizer(31-35)
tests/e2e/integrations/test_hooks.py (1)
tests/e2e/utils.py (1)
check_model_output_exists(199-220)
tests/e2e/multigpu/test_fp8_fsdp2.py (1)
tests/e2e/utils.py (3)
most_recent_subdir(35-42)require_torch_2_7_0(81-90)supports_fp8(170-174)
src/axolotl/core/builders/causal.py (1)
src/axolotl/integrations/base.py (2)
cfg(339-340)cfg(343-344)
tests/e2e/integrations/test_cut_cross_entropy.py (1)
tests/e2e/utils.py (1)
check_model_output_exists(199-220)
src/axolotl/processing_strategies.py (1)
src/axolotl/utils/mistral/mistral3_processor.py (1)
Mistral3Processor(27-170)
src/axolotl/monkeypatch/models/mistral3/mistral_common_tokenizer.py (2)
src/axolotl/utils/mistral/mistral_tokenizer.py (1)
apply_chat_template(74-94)src/axolotl/utils/mistral/mistral3_processor.py (1)
apply_chat_template(74-148)
tests/utils/schemas/validation/test_fsdp.py (2)
src/axolotl/utils/dict.py (1)
DictDefault(6-38)src/axolotl/utils/config/__init__.py (1)
validate_config(259-305)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test-axolotl-multigpu (126, 12.6.3, 3.11, 2.7.1, vllm, 2, true)
- GitHub Check: test-axolotl-multigpu (128, 12.8.1, 3.11, 2.9.0, fbgemm-gpu, 2, true)
- GitHub Check: test-axolotl-multigpu (128, 12.8.1, 3.11, 2.8.0, fbgemm-gpu, 2, true)
- GitHub Check: preview
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.9.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
- GitHub Check: PyTest (3.11, 2.8.0)
- GitHub Check: PyTest (3.11, 2.9.0)
🔇 Additional comments (34)
tests/hf_offline_utils.py (1)
16-16: Remove the commented import and function call or provide an explanation.The
reset_sessionsfunction fromhuggingface_hub.utilshas not been removed—it remains available in the library. Commenting out working code without explanation (on lines 16 and 23) should be avoided. Either:
- Uncomment and restore the original functionality with a note explaining why it's needed
- Remove the lines entirely if this functionality is no longer required for the test
If there's a specific issue with
reset_sessionsin the updated huggingface-hub version, document it in the code or commit message.Also applies to: 23-23
src/axolotl/core/builders/base.py (1)
524-531: Alias from configinclude_tokens_per_secondto HFinclude_num_input_tokens_seenis consistentThe new
arg_mapentry correctly routes the existing config fieldinclude_tokens_per_secondinto the updated transformers kwarginclude_num_input_tokens_seen, preserving backward compatibility while aligning with the v5 API surface. This looks good, assuming the HF arg name is exactlyinclude_num_input_tokens_seenand there’s no need to support a new config key yet.src/axolotl/utils/schemas/fsdp.py (1)
15-18: LGTM!The new
versionfield is well-defined with appropriate type annotation, default value, and schema metadata.tests/test_normalize_config.py (2)
131-131: LGTM!The assertion change correctly reflects the new behavior where
versionis preserved infsdp_configafter migration, following the prefix normalization and bidirectional synchronization logic.
198-198: LGTM!Consistent with the previous test update, correctly asserting that
versionis retained infsdp_configafter migration.tests/utils/schemas/validation/test_fsdp.py (2)
16-25: LGTM!The new test appropriately validates that
fsdp_versionis correctly derived fromfsdp_config.version, exercising the upward synchronization logic.
27-38: LGTM!The updated test correctly validates bidirectional synchronization, ensuring that when
fsdp_versionis set at the top level, it propagates tofsdp_config.version.src/axolotl/utils/schemas/validation.py (1)
858-858: LGTM!The updated condition correctly includes
fsdp_versionin the prefix normalization, stripping thefsdp_prefix from all keys includingfsdp_versionto transform it toversion.tests/e2e/integrations/test_cut_cross_entropy.py (1)
13-13: Import path change to shared test utility looks goodUsing
tests.e2e.utils.check_model_output_existskeeps imports consistent across e2e tests and reduces relative‑import fragility.tests/e2e/integrations/test_hooks.py (1)
14-14: Consistent use of central test utility moduleUpdating the import to
tests.e2e.utils.check_model_output_existsaligns this test with the rest of the suite and should aid discoverability.tests/e2e/multigpu/test_fsdp2.py (1)
153-156: Explicitly disabling LoRA kernels in baseline test is a good stabilizationMaking
test_lora_sftexplicitly set all LoRA kernel flags toFalseisolates it from auto‑enable behavior and leavestest_lora_sft_kernelsto cover the kernel‑enabled path. This should reduce hardware/driver‑dependent flakiness in the baseline FSDP2 LoRA test.tests/e2e/multigpu/test_fp8_fsdp2.py (1)
14-15: Switch tosupports_fp8keeps FP8 gating centralizedUsing
supports_fp8instead of a Hopper‑specific decorator centralizes FP8 hardware gating intests.e2e.utilswhile leaving the smoke test logic unchanged. This should make future FP8 capability tweaks easier without touching individual tests.Also applies to: 51-53
docs/installation.qmd (1)
168-168: Authentication command updated consistently.The login command has been updated to
hf auth login, consistent with the change insrc/axolotl/cli/checks.py. This aligns with the HF CLI v2 command structure.Please verify this command works with the updated
huggingface_hub>=1.1.7dependency (same verification as requested forsrc/axolotl/cli/checks.py).requirements.txt (3)
16-16: Transformers installed from git main for v5 testing.The
transformerslibrary is now installed directly from the main branch of the git repository. This aligns with the PR title "[WIP] Fix SP tests" and the branch name "transformers-v5-fix-accelerate-cp", indicating this is intentional for testing transformers v5 compatibility.Since this is a WIP PR, installing from git main is appropriate for testing. However, verify that this doesn't introduce breaking changes by running the test suite. Consider pinning to a specific commit hash for reproducibility once testing is complete.
13-13: Verify compatibility with major huggingface_hub version update.The
huggingface_hubrequirement has been updated from>=0.36.0to>=1.1.7, which is a significant version jump. Please verify that allhuggingface_hubAPIs used in the codebase are compatible with v1.1.7 and that this update aligns with the HF CLI v2 migration referenced in this PR.
17-22: Multiple dependency version updates require manual verification.Several key dependencies have been updated:
accelerate: 1.11.0 → 1.12.0deepspeed: >=0.17.0 → >=0.18.2trl: 0.25.0 → 0.25.1kernels: >=0.9.0 → >=0.11.2These updates align with the transformers v5 compatibility work mentioned in the PR. Compatibility between these versions should be verified, particularly regarding the
kernelsconstraint change (>=0.9.0 to >=0.11.2) and whether any hardcoded version checks in the codebase need adjustment..github/workflows/preview-docs.yml (1)
30-33: Standard CI disk space optimization.This cleanup step removes unnecessary system tools to free up disk space on the GitHub Actions runner before building documentation. This is a common and recommended practice, especially for workflows that may download large models or datasets.
The same cleanup step is consistently applied in
.github/workflows/tests.yml, which is good for maintainability.cicd/multigpu.sh (1)
5-5: Fast-fail added for quicker CI feedback.The
--maxfail=3flag has been added, which will stop the test run after 3 failures. This speeds up CI feedback, especially useful for this WIP PR.Note that this only affects the first pytest run (line 5). The solo and patched test runs (lines 12 and 17) will continue to run all tests, ensuring full coverage when the initial tests pass.
.github/workflows/tests.yml (4)
112-112: HF CLI download command updated.The dataset download command has been updated from
huggingface-cli downloadtohf download, consistent with the HF CLI v2 migration across the repository.Verify this command syntax works with
huggingface_hub>=1.1.7(same verification requested for docs/amd_hpc.qmd).
116-116: Pytest parallelism reduced from -n8 to -n4.The test parallelism has been reduced in two places:
- Line 116: Main test run
- Line 195: Source dist test run
This reduction may be to address stability issues or resource constraints on CI runners.
Please verify whether this change is:
- Addressing flakiness/race conditions in tests with high parallelism
- Accommodating resource limitations on CI runners
- Or a temporary measure for debugging
If this is addressing test stability, consider documenting the reason in a comment for future reference.
Also applies to: 195-195
191-191: HF cache command updated.The command to show HF cache has been updated from
huggingface-cli scan-cachetohf cache ls, consistent with the HF CLI v2 migration.
200-200: Job dependency changes optimize CI execution flow.The dependency structure has been adjusted:
- Line 200:
gate-skip-e2enow only depends on[pre-commit](removedpytest)- Line 236:
docker-e2e-tests-1stnow depends on[pre-commit, pytest](addedpytest)This change allows the gate-skip-e2e job to run in parallel with pytest, while ensuring that expensive E2E tests only run after unit tests pass. This is a good optimization for CI cost and execution time.
The logic ensures:
- Gate evaluation can happen quickly without waiting for pytest
- E2E tests wait for both pre-commit and pytest to pass
- Expensive GPU resources are only used after cheaper validation passes
Also applies to: 236-236
src/axolotl/cli/checks.py (1)
47-47: The error message correctly instructs users to usehf auth login, which is the official authentication command forhuggingface_hub1.1.7 and later as documented by HuggingFace. No changes needed.tests/test_tokenizers.py (1)
20-42: Avoid hard‑coding tokenizer‑specific IDs in expectationsThe updated expectation to
[1, 32000, 1792]will be fragile to future tokenizer / added_tokens changes. To make this test more robust, derive the id dynamically from the tokenizer rather than hard‑coding1792(e.g., assert that the third element equals the tokenizer’s id for"<|im_start|>", or thatlen(tokenizer)increased by 1 and that encoding"<|im_start|>user"round‑trips as expected).[ suggest_recommended_refactor ]
Also applies to: 78-88
src/axolotl/utils/callbacks/perplexity.py (1)
10-14: Conditional PreTrainedTokenizer import looks goodThe try/except around
transformers.tokenization_pythonwith fallback totransformers.tokenization_utilsis a reasonable way to straddle older and newer transformers versions without changing the rest of the logic. Please just double‑check against the exact transformers versions you intend to support that the symbol is available in one of these two locations.src/axolotl/prompt_strategies/chat_template.py (1)
147-152: Tokenizedapply_chat_templateusage aligns with strategy expectationsSwitching the non‑processor path to:
self.tokenizer.apply_chat_template( conversation, tokenize=True, return_dict=False, **chat_template_kwargs, )matches what
_tokenize_single_promptexpects (alist[int]in the simple path) and is consistent with the updated tests. Just ensure all supported tokenizer backends (including mistral‑common viaHFMistralTokenizer) supporttokenize/return_dictinapply_chat_templateunder the transformers versions you target.src/axolotl/utils/trainer.py (1)
622-640: Context‑parallel patch hook is wired correctly; validate against runtime stackHooking
patch_prepare_cp()whencontext_parallel_size > 1after setting:
PARALLELISM_CONFIG_CP_SIZEACCELERATE_ALLOW_CP_STANDALONEis a reasonable place to integrate the custom
_prepare_cplogic shown in the monkeypatch. Since this path depends on specificaccelerateandtorch.distributed.tensor.experimentalAPIs, please sanity‑check on the full range of versions/setups you intend to support (e.g., CP backend"deepspeed"vs"torch", and older torch without CP) to ensure no unexpectedImportError/attribute errors at runtime.tests/prompt_strategies/test_alpaca.py (1)
71-120: Token ID updates are internally consistent across all three test cases.The updated token IDs are consistent:
input_idsintest_no_double_im_endmatchlabelsintest_w_train_on_input- The
-100label counts intest_no_train_on_inputcorrectly correspond to the token counts in the other testsThese changes align with the broader tokenization updates across the PR for transformers v5 compatibility.
src/axolotl/loaders/processor.py (1)
24-36: Patch target correctly updated to MistralCommonBackend.The change from
MistralCommonTokenizertoMistralCommonBackendis consistent with:
HFMistralTokenizernow inheriting fromMistralCommonBackend- The monkeypatch in
mistral_common_tokenizer.pytargeting the same classThe docstring accurately describes the purpose for transformers v5 compatibility.
tests/prompt_strategies/test_chat_templates.py (1)
140-172: Phi-3.5 token ID updates are consistent between input_ids and labels.The updated token IDs for the phi-3.5 test are internally consistent:
expected_labelsline 159 correctly mirrorsexpected_input_idsline 149 for the trained assistant output- All non-trained positions correctly use
-100These changes align with the broader token ID updates across the PR.
src/axolotl/monkeypatch/accelerate/parallelism_config.py (1)
90-92: Experimental PyTorch APIs require version compatibility monitoring.The imports
torch.distributed.tensor.experimental.context_parallelandtorch.distributed.tensor.experimental._attention.set_rotate_methodare explicitly experimental/prototype APIs. Per PyTorch documentation, these are subject to change across versions, making this code sensitive to PyTorch version upgrades. Ensure PyTorch version is pinned in dependencies or add compatibility checks to handle potential API breaking changes in future releases.src/axolotl/monkeypatch/models/mistral3/mistral_common_tokenizer.py (1)
14-19: Patch target updated to MistralCommonBackend - implementation details unverified.The change from
MistralCommonTokenizertoMistralCommonBackendaligns with the class hierarchy updates.MistralCommonBackend.apply_chat_templateexists in the transformers library and does support image handling for multimodal inputs.However, the exact string being patched (
pixel_values = torch.tensor(images)) could not be verified against the transformers source code. If this string does not match the actual implementation in the specific transformers version being used, the patch will silently fail without raising an error (line 85's warning path). Verify the patch target string against the transformers version used in this project.src/axolotl/processing_strategies.py (2)
431-431: Appropriate removal of type annotation to avoid import-time issues.The removal of the
Mistral3Processortype annotation aligns with moving the import to function scope (lines 463-464), which is a standard pattern for avoiding circular dependencies or optional dependency issues during module initialization.
463-464: LGTM: Local import pattern avoids import-time dependencies.Moving the
Mistral3Processorimport to function scope is an appropriate pattern for deferring imports and avoiding circular dependencies or issues with optional dependencies. This change is consistent with the untyped processor parameter at line 431.
|
|
||
| ```bash | ||
| huggingface-cli download meta-llama/Meta-Llama-3.1-8B --local-dir ~/hfdata/llama3.1-8B | ||
| hf download meta-llama/Meta-Llama-3.1-8B --local-dir ~/hfdata/llama3.1-8B |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd requirements.txtRepository: axolotl-ai-cloud/axolotl
Length of output: 258
🌐 Web query:
huggingface_hub version 1.1.7 hf download command syntax documentation
💡 Result:
Syntax (CLI, huggingface_hub v1.1.7)
- Basic: hf download <repo_id> [filename1 filename2 ...]
- Example: hf download gpt2 config.json
- Download multiple files (shows progress and returns snapshot path): hf download <repo_id> file1 file2
- Download to a local folder: --local-dir
- Example: hf download adept/fuyu-8b model-00001-of-00002.safetensors --local-dir fuyu. [1]
Common options
- --cache-dir (specify custom cache directory).
- --token <hf_token> (use explicit token for private/gated repos).
- --quiet (only print final path).
- Dry-run and other CLI helpers are available in the CLI guide. For download timeouts, use HF_HUB_DOWNLOAD_TIMEOUT environment variable. [1][2]
References
[1] Hugging Face Hub — Download guide and CLI (v1.1.7).
[2] Hugging Face Hub — CLI guide (hf download options).
Correct the hf download command syntax for huggingface_hub v1.1.7.
The command on line 89 omits required filenames. According to huggingface_hub v1.1.7 documentation, hf download requires specific filenames to be listed: hf download <repo_id> file1 file2 --local-dir <path>. Either specify the model files to download (e.g., model.safetensors, config.json) or verify if a different approach is needed for downloading entire repositories.
🤖 Prompt for AI Agents
In docs/amd_hpc.qmd around line 89, the hf download command is missing required
filenames for huggingface_hub v1.1.7; replace the single-repo argument with an
explicit list of files to fetch (for example the model binary like
model.safetensors or pytorch_model.bin, config.json, tokenizer files) or change
the approach to download the whole repo via the appropriate huggingface_hub
helper; update the command to include the exact filenames you want to download
and keep --local-dir <path> as the destination.
| def supports_fp8(test_case): | ||
| compute_capability = torch.cuda.get_device_capability() | ||
| return unittest.skipUnless( | ||
| compute_capability >= (9, 0), "test requires h100 or newer GPU" | ||
| )(test_case) |
There was a problem hiding this comment.
Guard supports_fp8 against non‑CUDA environments to prevent errors
Currently supports_fp8 calls torch.cuda.get_device_capability() unconditionally. On CPU-only or non-CUDA PyTorch builds this will raise a RuntimeError instead of cleanly skipping the test, unlike the existing skip_if_no_cuda pattern in the file which guards with torch.cuda.is_available().
Align with the established guard pattern so FP8 tests are properly skipped when CUDA is unavailable.
Suggested change:
def supports_fp8(test_case):
+ if not torch.cuda.is_available():
+ return unittest.skip("CUDA not available")(test_case)
compute_capability = torch.cuda.get_device_capability()
return unittest.skipUnless(
compute_capability >= (9, 0), "test requires h100 or newer GPU"
)(test_case)🤖 Prompt for AI Agents
In tests/e2e/utils.py around lines 170-174, the supports_fp8 decorator calls
torch.cuda.get_device_capability() unconditionally which raises on non-CUDA
builds; first guard with torch.cuda.is_available() like the existing
skip_if_no_cuda pattern, e.g. if not torch.cuda.is_available(): return
unittest.skip("CUDA not available")(test_case), otherwise fetch
compute_capability and then return unittest.skipUnless(compute_capability >= (9,
0), "test requires h100 or newer GPU")(test_case); this ensures tests are
cleanly skipped on CPU-only environments and still checks capability when CUDA
is present.
|
📖 Documentation Preview: https://693747a8b9f13a5dc02143dd--resonant-treacle-0fd729.netlify.app Deployed on Netlify from commit d483aae |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
* upgrade dependencies * don't use reset sessions * downgrade transformers, upgrade other deps * upgrade bnb to 0.49.0 * restore s3 cache * explicit use local files w hub * decompress and strip top level dir * use 2 levels for strip components * try to preserve permissions for symlinks * use updated tar * fix #3293 for distributed * downgrade bnb * fast fail after 4 * fix total tokens device * patch accelerate CP/SP (#3309) --------- Co-authored-by: salman <salman.mohammadi@outlook.com>
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
hfsyntax throughout documentation and codebase.Chores
✏️ Tip: You can customize this high-level summary in your review settings.