Upgrade transformers to 5.5.3 and refactor hf_transformers_utils into subpackage#21569
Conversation
/rerun-failed-ci
Code Review
This pull request updates the transformers dependency version from 5.3.0 to 5.4.0 across multiple configuration files, including pyproject.toml, pyproject_cpu.toml, pyproject_npu.toml, pyproject_other.toml, and pyproject_xpu.toml. I have no feedback to provide.
/rerun-failed-ci
Transformers v5.4.0 validates that rope_parameters contains rope_theta for yarn/llama3/longrope types. For unregistered model types (e.g. deepseek_v32), the generic PretrainedConfig lacks rope_parameters so the conversion that injects rope_theta is skipped, causing a KeyError. Patch PretrainedConfig.from_dict to inject rope_theta into rope_scaling before __init__ validation runs.
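The patch described above follows the standard classmethod-wrapping pattern. A minimal sketch of that pattern, shown against a stand-in class (the real patch wraps `transformers.PretrainedConfig.from_dict`, whose internals differ):

```python
# Sketch of the rope_theta injection patch. StandInConfig is a stand-in for
# transformers.PretrainedConfig so the pattern is self-contained; the
# validation in __init__ mimics what v5.4.0 does for yarn/llama3/longrope.
class StandInConfig:
    def __init__(self, **kwargs):
        rope_scaling = kwargs.get("rope_scaling")
        if rope_scaling and rope_scaling.get("rope_type") == "yarn":
            if "rope_theta" not in rope_scaling:
                raise KeyError("rope_theta")  # the v5.4.0 failure mode
        self.rope_scaling = rope_scaling

    @classmethod
    def from_dict(cls, config_dict, **kwargs):
        return cls(**config_dict)


_orig_from_dict = StandInConfig.from_dict.__func__


def _patched_from_dict(cls, config_dict, **kwargs):
    # Inject rope_theta into rope_scaling before __init__ validation runs,
    # mirroring the conversion that registered model types get for free.
    rope_scaling = config_dict.get("rope_scaling")
    if isinstance(rope_scaling, dict) and "rope_theta" not in rope_scaling:
        rope_scaling["rope_theta"] = config_dict.get("rope_theta", 10000.0)
    return _orig_from_dict(cls, config_dict, **kwargs)


StandInConfig.from_dict = classmethod(_patched_from_dict)

# An unregistered model type (e.g. deepseek_v32) now loads without KeyError.
cfg = StandInConfig.from_dict(
    {"rope_scaling": {"rope_type": "yarn", "factor": 4.0}, "rope_theta": 10000.0}
)
```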
Force-pushed 9782730 to 926cbda (compare)
/rerun-failed-ci
Transformers v5.4.0 now validates imports in remote model code during config loading (check_imports in dynamic_module_utils.py). DeepSeek-OCR's remote modeling code imports easydict, which must be installed.
Force-pushed 83a632b to 2106153 (compare)
Transformers v5.4.0 removed LlamaFlashAttention2. Remote model code (e.g. DeepSeek-OCR) imports it, and check_imports validates imports before our lazy shim in get_config() runs. Apply the patch at module load time so the symbol exists before any from_pretrained call.
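The load-time shim amounts to re-attaching the removed symbol before anything can import it. A minimal sketch of that idea, using a stand-in module (the real patch targets `transformers.models.llama.modeling_llama`, and `apply_llama_fa2_shim` is a hypothetical name):

```python
import types

# Stand-in for transformers.models.llama.modeling_llama, from which v5.4.0
# removed LlamaFlashAttention2.
modeling_llama = types.ModuleType("modeling_llama")


class LlamaAttention:  # stand-in for the surviving attention class
    pass


modeling_llama.LlamaAttention = LlamaAttention


def apply_llama_fa2_shim(mod):
    """Re-export a shim so remote model code importing LlamaFlashAttention2
    still resolves. Must run at module load time, before any from_pretrained
    call triggers check_imports."""
    if not hasattr(mod, "LlamaFlashAttention2"):
        # An alias is enough to satisfy import validation.
        mod.LlamaFlashAttention2 = mod.LlamaAttention


apply_llama_fa2_shim(modeling_llama)
apply_llama_fa2_shim(modeling_llama)  # idempotent: second call is a no-op
```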
Force-pushed 9bde741 to a225590 (compare)
The XPU pyproject was missing addict, needed by DeepSeek-OCR remote model code. Transformers v5.4.0 now validates remote code imports eagerly, surfacing this missing dependency.
/rerun-failed-ci
Transformers v5.4.0 passes new kwargs (e.g. device) to image processor preprocess(). Remote model code (e.g. KimiVL) that defines preprocess() without **kwargs crashes with TypeError. Patch __call__ to strip unsupported kwargs on TypeError and retry.
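A sketch of the strip-and-retry wrapper described above, with a stand-in remote processor (`call_with_supported_kwargs` is a hypothetical name; the real patch wraps the image processor's `__call__`):

```python
import inspect


class RemoteImageProcessor:
    """Stand-in for remote model code (e.g. KimiVL) that defines
    preprocess() without **kwargs."""

    def preprocess(self, images, do_resize=True):
        return {"images": images, "do_resize": do_resize}


def call_with_supported_kwargs(processor, images, **kwargs):
    try:
        return processor.preprocess(images, **kwargs)
    except TypeError:
        # v5.4.0 passes new kwargs such as device; drop anything the remote
        # preprocess() signature does not accept and retry.
        sig = inspect.signature(processor.preprocess)
        supported = {k: v for k, v in kwargs.items() if k in sig.parameters}
        return processor.preprocess(images, **supported)


out = call_with_supported_kwargs(
    RemoteImageProcessor(), ["img"], device="cuda", do_resize=False
)
```

The first call raises TypeError on the unexpected `device` kwarg; the retry keeps only `do_resize`, which the remote signature does accept.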
- Fix isort ordering in the import patch block
- Extend TokenizersBackend detection to also apply when trust_remote_code=True, since transformers 5.4.0 may return TokenizersBackend even with trust_remote_code enabled
- Fix black formatting for a one-line any() expression
- Retry tokenizer loading with use_fast=False when TokenizersBackend is returned, since transformers 5.4.0 may fail to resolve remote tokenizer classes with the fast tokenizer backend
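The slow-path retry reduces to a type check on the loaded object. A self-contained sketch with stand-in classes and a stand-in loader (the real code calls `AutoTokenizer.from_pretrained`):

```python
class TokenizersBackend:
    """Stand-in for the raw backend type transformers 5.4.0 may return when
    it cannot resolve a remote tokenizer class with the fast backend."""


class SlowTokenizer:
    """Stand-in for a properly resolved (slow) tokenizer."""


def fake_from_pretrained(name, use_fast=True):
    # Stand-in loader: the fast path yields the unresolved backend object.
    return TokenizersBackend() if use_fast else SlowTokenizer()


def load_tokenizer(name):
    tok = fake_from_pretrained(name, use_fast=True)
    if type(tok) is TokenizersBackend:
        # Fast backend failed to resolve the remote class; retry slow.
        tok = fake_from_pretrained(name, use_fast=False)
    return tok


tok = load_tokenizer("some/model")
```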
Move all monkey-patches from hf_transformers_utils.py top level into sglang/srt/utils/transformers_v54_compat.py with:
- Clear categorization (transformers bugs vs. remote-model-code compat)
- Upstream issue references for each patch
- An idempotent apply_all() entry point
- Individual patch functions for easy removal when upstream fixes land
Force-pushed e6e96aa to 76d74fe (compare)
Small refactors before splitting hf_transformers_utils.py into modules.
Break the 1270-line monolith into focused modules:
- compat.py: all transformers v5/v5.4 compatibility patches
- common.py: shared helpers (download, config detection, context length)
- config.py: get_config() and config loading logic
- tokenizer.py: get_tokenizer() and tokenizer v5 fixes
- processor.py: get_processor() and processor loading logic
- __init__.py: re-exports all public symbols

hf_transformers_utils.py is now a thin shim that re-exports from the package, so all existing import paths continue to work unchanged. The old transformers_v54_compat.py is merged into compat.py alongside other v5 patches (clean_up_tokenization, is_torch_fx_available, etc.).
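The thin-shim approach can be sketched with stand-in modules (the module names below are from the PR; the symbol list is illustrative). In the real tree the shim is roughly `from sglang.srt.utils.hf_transformers import *`:

```python
import sys
import types

# Stand-in for the new hf_transformers/ subpackage.
pkg = types.ModuleType("hf_transformers")
pkg.get_config = lambda: "config"
pkg.get_tokenizer = lambda: "tokenizer"
pkg.__all__ = ["get_config", "get_tokenizer"]
sys.modules["hf_transformers"] = pkg

# Stand-in for the old hf_transformers_utils module path: a thin shim that
# re-exports the package's public symbols so old import paths keep working.
shim = types.ModuleType("hf_transformers_utils")
for name in pkg.__all__:
    setattr(shim, name, getattr(pkg, name))
sys.modules["hf_transformers_utils"] = shim

# Existing callers import from the old path unchanged.
import hf_transformers_utils

result = hf_transformers_utils.get_config()
```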
- Add AutoConfig, _fix_v5_add_bos_eos_token, and _fix_added_tokens_encoding to the __init__.py re-exports (imported by lora/ and test/runners.py)
- Fix the is_flash_attn_greater_or_equal_2_10 shim: "2.1.0" -> "2.10.0"
- Replace the placeholder TODO(upstream) URL with actionable text
- Add logging to the TokenizersBackend retry loop
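The "2.1.0" vs. "2.10.0" fix matters because the shim compares version components, not strings. A sketch of why (hypothetical helper; the real shim's body may differ):

```python
def _version_tuple(v):
    # Compare numerically per component; lexicographic string comparison
    # would wrongly rank "2.5.0" above "2.10.0".
    return tuple(int(x) for x in v.split("."))


def is_flash_attn_greater_or_equal(installed, required="2.10.0"):
    return _version_tuple(installed) >= _version_tuple(required)


# With the buggy threshold "2.1.0", flash-attn 2.5.x would pass a check
# meant to require >= 2.10.
ok_new = is_flash_attn_greater_or_equal("2.10.1")
ok_old = is_flash_attn_greater_or_equal("2.5.0")
```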
/rerun-failed-ci [auto-monitor] flaky/infra-only failures detected; local repro checks passed where configured.
Summary
Refactor hf_transformers_utils.py into an hf_transformers/ subpackage and upgrade the pinned transformers from 5.3.0 to 5.5.3 with compatibility patches.

Refactoring
Split the monolithic hf_transformers_utils.py into focused modules; hf_transformers_utils.py now remains as a thin re-export shim, so existing import paths continue to work.

Transformers upgrade (5.3.0 -> 5.5.3)
- Pin transformers==5.5.3 in all platform pyproject*.toml files
- Add easydict where needed and addict on XPU for stricter remote-code import validation
- Add compatibility patches in hf_transformers/compat.py for v5 regressions and removed symbols

Compatibility fixes included
- RoPE validation (rope_theta handling)
- Removed symbols (LlamaFlashAttention2, is_flash_attn_greater_or_equal_2_10)
- Image processor kwargs (device and other unsupported kwargs)
- TokenizersBackend fallback handling with tokenizer class resolution and retries
- hybrid_override_pattern
- v5 shims (clean_up_tokenization, is_torch_fx_available)

Additional testing
- Add test/registered/unit/utils/test_hf_transformers.py

Test plan
PYTHONPATH=python pytest -q test/registered/unit/utils/test_hf_transformers.py