
Upgrade transformers to 5.5.3 and refactor hf_transformers_utils into subpackage#21569

Merged
Kangyan-Zhou merged 70 commits into sgl-project:main from JustinTong0323:upgrade/transformers-5.4.0
Apr 16, 2026

Conversation

@JustinTong0323
Collaborator

@JustinTong0323 JustinTong0323 commented Mar 27, 2026

Summary

Refactor hf_transformers_utils.py into an hf_transformers/ subpackage and upgrade the pinned transformers version from 5.3.0 to 5.5.3, with compatibility patches for the new release.

Refactoring

Split the monolithic hf_transformers_utils.py into focused modules:

python/sglang/srt/utils/hf_transformers/
├── __init__.py
├── compat.py
├── common.py
├── config.py
├── tokenizer.py
├── processor.py
└── mistral_utils.py

hf_transformers_utils.py is now a thin re-export shim, so existing import paths continue to work.
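The shim pattern can be sketched as follows. This is a minimal, self-contained illustration using stand-in module names (not the real sglang paths): the "subpackage" holds the implementation, and the shim module does nothing but `from ... import *`, so both import paths resolve to the same objects.

```python
import sys
import types

# Stand-in for the new subpackage (e.g. sglang.srt.utils.hf_transformers).
pkg = types.ModuleType("hf_transformers")
pkg.get_tokenizer = lambda name: f"tokenizer:{name}"  # stub public helper
pkg.__all__ = ["get_tokenizer"]
sys.modules["hf_transformers"] = pkg

# Stand-in for the shim file (hf_transformers_utils.py): re-export only.
shim = types.ModuleType("hf_transformers_utils")
exec("from hf_transformers import *\n", shim.__dict__)
sys.modules["hf_transformers_utils"] = shim

# The old import path keeps working and yields the very same objects.
import hf_transformers_utils
print(hf_transformers_utils.get_tokenizer("llama"))  # tokenizer:llama
assert hf_transformers_utils.get_tokenizer is pkg.get_tokenizer
```

Because the shim re-exports rather than copies, monkey-patches applied through either path affect the same underlying objects.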

Transformers upgrade (5.3.0 -> 5.5.3)

  • Bump transformers==5.5.3 in all platform pyproject*.toml files
  • Add easydict where needed and addict on XPU for stricter remote-code import validation
  • Add/organize compatibility patches in hf_transformers/compat.py for v5 regressions and removed symbols

Compatibility fixes included

  • Rope config compatibility for unregistered model types (rope_theta handling)
  • Removed symbol shims (LlamaFlashAttention2, is_flash_attn_greater_or_equal_2_10)
  • Image processor kwargs filtering (device and other unsupported kwargs)
  • CUDA tensor handling patch in image processing backend
  • TokenizersBackend fallback handling with tokenizer class resolution and retries
  • Nemotron-H pattern parsing patch for hybrid_override_pattern
  • Reintroduced helpers relied on by remote code (clean_up_tokenization, is_torch_fx_available)

Additional testing

  • Add test/registered/unit/utils/test_hf_transformers.py for:
    • subpackage re-export/shim behavior
    • rope/config helpers
    • compatibility patches
    • tokenizer special-token fixes

Test plan

  • PYTHONPATH=python pytest -q test/registered/unit/utils/test_hf_transformers.py
  • Full CI pass on all platforms

@github-actions github-actions Bot added the dependencies (Pull requests that update a dependency file) and npu labels Mar 27, 2026
@JustinTong0323
Collaborator Author

/tag-and-rerun-ci

@JustinTong0323
Collaborator Author

/rerun-failed-ci

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the transformers dependency version from 5.3.0 to 5.4.0 across multiple configuration files, including pyproject.toml, pyproject_cpu.toml, pyproject_npu.toml, pyproject_other.toml, and pyproject_xpu.toml. I have no feedback to provide.

@JustinTong0323
Collaborator Author

/rerun-failed-ci

Transformers v5.4.0 validates that rope_parameters contains rope_theta
for yarn/llama3/longrope types. For unregistered model types (e.g.
deepseek_v32), the generic PretrainedConfig lacks rope_parameters so
the conversion that injects rope_theta is skipped, causing a KeyError.

Patch PretrainedConfig.from_dict to inject rope_theta into rope_scaling
before __init__ validation runs.
@JustinTong0323 JustinTong0323 force-pushed the upgrade/transformers-5.4.0 branch from 9782730 to 926cbda on March 27, 2026 23:18
@JustinTong0323
Collaborator Author

/rerun-failed-ci

2 similar comments

Transformers v5.4.0 now validates imports in remote model code during
config loading (check_imports in dynamic_module_utils.py). DeepSeek-OCR's
remote modeling code imports easydict, which must be installed.
@JustinTong0323 JustinTong0323 force-pushed the upgrade/transformers-5.4.0 branch from 83a632b to 2106153 on March 27, 2026 23:48
Transformers v5.4.0 removed LlamaFlashAttention2. Remote model code
(e.g. DeepSeek-OCR) imports it, and check_imports validates imports
before our lazy shim in get_config() runs. Apply the patch at module
load time so the symbol exists before any from_pretrained call.
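The removed-symbol shim can be sketched like this, using a toy module in place of the real transformers llama module (the placeholder class and function names are illustrative). The point is that the shim is installed eagerly at import time, so the name already exists when remote code's import validation or its own `from ... import` runs.

```python
import sys
import types

# Toy stand-in for transformers.models.llama.modeling_llama after the
# v5.4.0 removal of LlamaFlashAttention2.
mod = types.ModuleType("toy_transformers_llama")
sys.modules["toy_transformers_llama"] = mod


def apply_removed_symbol_shims():
    # Idempotent: only attach the shim if the symbol is actually missing.
    if not hasattr(mod, "LlamaFlashAttention2"):
        class LlamaFlashAttention2:
            """Placeholder: remote code only needs the name to be importable."""

            def __init__(self, *args, **kwargs):
                raise RuntimeError(
                    "LlamaFlashAttention2 was removed in transformers v5"
                )

        mod.LlamaFlashAttention2 = LlamaFlashAttention2


# Apply at module load time, before any from_pretrained call can trigger
# remote-code import validation.
apply_removed_symbol_shims()

from toy_transformers_llama import LlamaFlashAttention2
print(LlamaFlashAttention2.__name__)  # LlamaFlashAttention2
```

Raising inside `__init__` keeps the failure loud if remote code ever tries to instantiate the removed class rather than merely import it.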
@JustinTong0323 JustinTong0323 force-pushed the upgrade/transformers-5.4.0 branch from 9bde741 to a225590 on March 28, 2026 00:23
The XPU pyproject was missing addict, needed by DeepSeek-OCR remote
model code. Transformers v5.4.0 now validates remote code imports
eagerly, surfacing this missing dependency.
@JustinTong0323
Collaborator Author

/rerun-failed-ci

Transformers v5.4.0 passes new kwargs (e.g. device) to image processor
preprocess(). Remote model code (e.g. KimiVL) that defines preprocess()
without **kwargs crashes with TypeError. Patch __call__ to strip
unsupported kwargs on TypeError and retry.
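A sketch of that strip-and-retry wrapper, with a toy processor that (like the described KimiVL case) defines `preprocess()` without `**kwargs` (all names here are illustrative): on `TypeError`, parse the offending kwarg out of the message, drop it, and retry.

```python
import re


class ToyProcessor:
    """Toy remote image processor whose preprocess() rejects new kwargs."""

    def preprocess(self, images, do_resize=True):  # note: no **kwargs
        return {"images": images, "do_resize": do_resize}

    def __call__(self, images, **kwargs):
        try:
            return self.preprocess(images, **kwargs)
        except TypeError as e:
            # Extract the unsupported kwarg name from the TypeError and retry
            # without it (e.g. the new `device` kwarg passed by v5.4.0).
            m = re.search(r"unexpected keyword argument '(\w+)'", str(e))
            if m and m.group(1) in kwargs:
                kwargs.pop(m.group(1))
                return self(images, **kwargs)
            raise


proc = ToyProcessor()
out = proc(["img0"], do_resize=False, device="cuda")  # device is dropped
print(out)  # {'images': ['img0'], 'do_resize': False}
```

Re-raising when no kwarg can be identified keeps genuine signature errors visible instead of looping.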
- Fix isort ordering in import patch block
- Extend TokenizersBackend detection to also apply when
  trust_remote_code=True, as transformers 5.4.0 may return
  TokenizersBackend even with trust_remote_code enabled
- Fix black formatting for one-line any() expression
- Retry tokenizer loading with use_fast=False when TokenizersBackend
  is returned, as transformers 5.4.0 may fail to resolve remote
  tokenizer classes with the fast tokenizer backend
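The fallback in the last bullet can be sketched with a toy loader in place of `AutoTokenizer.from_pretrained` (the classes and loader below are stand-ins, not the real transformers API): if the fast path returns the generic `TokenizersBackend` type instead of a resolved tokenizer class, retry with `use_fast=False`.

```python
class TokenizersBackend:
    """Toy stand-in for transformers' generic fast-tokenizer backend."""


class RemoteSlowTokenizer:
    """Toy stand-in for a properly resolved remote tokenizer class."""


def toy_from_pretrained(name, use_fast=True):
    # Mimic the v5.4.0 behavior described above: the fast path fails to
    # resolve the remote tokenizer class and returns the generic backend.
    return TokenizersBackend() if use_fast else RemoteSlowTokenizer()


def get_tokenizer(name, **kwargs):
    tok = toy_from_pretrained(name, **kwargs)
    if type(tok) is TokenizersBackend and kwargs.get("use_fast", True):
        # Fall back to the slow tokenizer when only the generic backend
        # came back; log-and-retry in the real patch.
        tok = toy_from_pretrained(name, use_fast=False)
    return tok


print(type(get_tokenizer("remote/model")).__name__)  # RemoteSlowTokenizer
```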
Move all monkey-patches from hf_transformers_utils.py top-level into
sglang/srt/utils/transformers_v54_compat.py with:
- Clear categorization (transformers bugs vs remote-model-code compat)
- Upstream issue references for each patch
- Idempotent apply_all() entry point
- Individual patch functions for easy removal when upstream fixes land
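An idempotent `apply_all()` entry point of the kind described can be sketched as below (the patch names and functions are illustrative, not the actual compat.py contents): each patch is recorded once applied, so any module may call `apply_all()` defensively without double-patching.

```python
# Registry of patches that have already been applied.
_APPLIED = set()


def _patch_once(name, fn):
    """Run a patch function exactly once, keyed by name."""
    if name not in _APPLIED:
        fn()
        _APPLIED.add(name)


calls = []  # records patch application order for this demo


def _patch_rope_config():
    calls.append("rope")  # would wrap PretrainedConfig.from_dict, etc.


def _patch_removed_symbols():
    calls.append("symbols")  # would reattach removed transformers names


def apply_all():
    # Safe to call from every entry point that might be imported first.
    _patch_once("rope_config", _patch_rope_config)
    _patch_once("removed_symbols", _patch_removed_symbols)


apply_all()
apply_all()  # second call is a no-op
print(calls)  # ['rope', 'symbols']
```

Keeping one small function per patch makes each one easy to delete once the corresponding upstream fix lands.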
@JustinTong0323 JustinTong0323 force-pushed the upgrade/transformers-5.4.0 branch from e6e96aa to 76d74fe on March 28, 2026 01:55
Small refactors before splitting hf_transformers_utils.py into modules.
Break the 1270-line monolith into focused modules:

- compat.py: All transformers v5/v5.4 compatibility patches
- common.py: Shared helpers (download, config detection, context length)
- config.py: get_config() and config loading logic
- tokenizer.py: get_tokenizer() and tokenizer v5 fixes
- processor.py: get_processor() and processor loading logic
- __init__.py: Re-exports all public symbols

hf_transformers_utils.py is now a thin shim that re-exports from the
package, so all existing import paths continue to work unchanged.

The old transformers_v54_compat.py is merged into compat.py alongside
other v5 patches (clean_up_tokenization, is_torch_fx_available, etc.).
- Add AutoConfig, _fix_v5_add_bos_eos_token, _fix_added_tokens_encoding
  to __init__.py re-exports (imported by lora/ and test/runners.py)
- Fix is_flash_attn_greater_or_equal_2_10 shim: "2.1.0" -> "2.10.0"
- Replace placeholder TODO(upstream) URL with actionable text
- Add logging to TokenizersBackend retry loop
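The "2.1.0" -> "2.10.0" fix matters because the shim compares flash-attn versions, and the buggy threshold made the check almost always true. A stdlib-only sketch of the comparison (toy `parse` in place of a real version parser such as `packaging.version`):

```python
def parse(v):
    """Toy semantic-version parse: '2.9.5' -> (2, 9, 5)."""
    return tuple(int(x) for x in v.split("."))


installed = "2.9.5"

# Buggy threshold: nearly every release is >= 2.1.0, so the shim
# named is_flash_attn_greater_or_equal_2_10 always reported True.
print(parse(installed) >= parse("2.1.0"))   # True

# Corrected threshold: 2.9.5 is below 2.10.0, matching the symbol's intent.
print(parse(installed) >= parse("2.10.0"))  # False
```

Note that a naive string comparison would get this wrong too ("2.9.5" > "2.10.0" lexicographically), which is why tuple-wise numeric comparison is used.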
@JustinTong0323
Collaborator Author

/rerun-failed-ci

[auto-monitor] flaky/infra-only failures detected; local repro checks passed where configured.

27 similar comments

@JustinTong0323 JustinTong0323 deleted the upgrade/transformers-5.4.0 branch April 29, 2026 06:39
@hnyls2002 hnyls2002 mentioned this pull request Apr 29, 2026

Labels

dependencies (Pull requests that update a dependency file), Multi-modal (multi-modal language model), npu, run-ci
