Dynamic weight conversion is recursive #44300

Merged: zucchini-nlp merged 48 commits into huggingface:main from zucchini-nlp:convert-weights-recursive on Mar 26, 2026

@zucchini-nlp (Member) commented Feb 26, 2026

What does this PR do?

I need the recursive feature in #44252 so that the timm backbone can define its conversion only once. It also already lets us delete "t5gemma2" from the conversion mapping and leave the weight renaming to its backbones.
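To make the idea concrete, here is a minimal sketch of recursive rename collection, not the code in this PR. `ToyModule`, `collect_renames`, and `rename_key` are hypothetical names invented for illustration and are not part of the transformers API. The sketch assumes each submodule declares its own rename rules once, and that the parent gathers them while prepending the submodule prefix.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Hypothetical stand-in for a model class; not a transformers type."""

    # (pattern, replacement) pairs that rename checkpoint keys for this module only
    renames: list = field(default_factory=list)
    # attribute name -> child ToyModule
    children: dict = field(default_factory=dict)


def collect_renames(module, prefix=""):
    """Gather rename rules from a module and, recursively, from its submodules."""
    rules = []
    for pattern, replacement in module.renames:
        # Scope the rule to the submodule it was declared on.
        rules.append((rf"^{re.escape(prefix)}{pattern}", f"{prefix}{replacement}"))
    for name, child in module.children.items():
        for rule in collect_renames(child, prefix=f"{prefix}{name}."):
            if rule not in rules:  # do not register the same pattern twice
                rules.append(rule)
    return rules


def rename_key(key, rules):
    """Apply the first matching rule to a checkpoint key; leave other keys alone."""
    for pattern, replacement in rules:
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key


# The backbone declares its conversion once and is reused under two parents.
backbone = ToyModule(renames=[(r"blocks\.", "layers.")])
model = ToyModule(children={"vision_tower": backbone, "aux_tower": backbone})

rules = collect_renames(model)
print(rename_key("vision_tower.blocks.0.attn.weight", rules))
print(rename_key("language_model.blocks.0.attn.weight", rules))
```

In this sketch the parent model carries no rules of its own: only keys under `vision_tower.` and `aux_tower.` are renamed, and a key under any other prefix is returned unchanged.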

Comment thread src/transformers/conversion_mapping.py Outdated
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp
Member Author

run-slow: colpali, colqwen2, emu3, ernie4_5_vl_moe, fuyu, gemma3, gemma3n, glm4v, qwen2_vl, maskformer, llava, sam3, qwen3_5

@zucchini-nlp
Member Author

run-slow: colpali, colqwen2, emu3, ernie4_5_vl_moe, fuyu, gemma3, gemma3n, glm4v, qwen2_vl, maskformer, llava, sam3, qwen3_5

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/colpali", "models/colqwen2", "models/emu3", "models/ernie4_5_vl_moe", "models/fuyu", "models/gemma3", "models/gemma3n", "models/glm4v", "models/llava", "models/maskformer", "models/qwen2_vl", "models/qwen3_5", "models/sam3"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | a7702f43 | workflow commit (merge commit) |
| PR | ea3ac63a | branch commit (from PR) |
| main | b812aa91 | base commit (on main) |

Model CI Report

8 new failed tests from this PR 😭

  • colpali:
    tests/models/colpali/test_modeling_colpali.py::ColPaliModelIntegrationTest::test_model_integration_test (✅ ⟹ ❌)

  • colqwen2:
    tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ForRetrievalModelTest::test_reverse_loading_mapping (✅ ⟹ ❌)

  • emu3:
    tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generate_images (❌ ⟹ ❌)
    tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation (❌ ⟹ ❌)
    tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_batched (❌ ⟹ ❌)
    tests/models/emu3/test_modeling_emu3.py::Emu3IntegrationTest::test_model_generation_multi_image (❌ ⟹ ❌)

  • fuyu:
    tests/models/fuyu/test_modeling_fuyu.py::FuyuModelIntegrationTest::test_greedy_generation (❌ ⟹ ❌)

  • llava:
    tests/models/llava/test_modeling_llava.py::LlavaForConditionalGenerationIntegrationTest::test_pixtral (❌ ⟹ ❌)

@zucchini-nlp
Member Author

run-slow: colpali, colqwen2, ernie4_5_vl_moe, fuyu, gemma3, gemma3n, glm4v, qwen2_vl, maskformer, llava, sam3, qwen3_5

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/colpali", "models/colqwen2", "models/ernie4_5_vl_moe", "models/fuyu", "models/gemma3", "models/gemma3n", "models/glm4v", "models/llava", "models/maskformer", "models/qwen2_vl", "models/qwen3_5", "models/sam3"]
quantizations: []

Comment thread src/transformers/conversion_mapping.py
Comment thread src/transformers/conversion_mapping.py Outdated
Comment thread src/transformers/core_model_loading.py Outdated
Comment thread src/transformers/core_model_loading.py Outdated
Comment thread tests/models/mllama/test_modeling_mllama.py Outdated
@zucchini-nlp
Member Author

This should be ready now. The failing qwen3-5-moe test is unrelated: it has been failing on main for a long time and is caused by the expert implementation.

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | a6785476 | workflow commit (merge commit) |
| PR | 500e96b0 | branch commit (from PR) |
| main | 7d511b6a | base commit (on main) |

⚠️ Model CI failed to report results

The test failure analysis could not be completed. Please check the workflow run for details.

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44300&sha=7e3d40

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

@zucchini-nlp
Member Author

@bot /repo

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Repo. Consistency bot fixed some files and pushed the changes.

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

@zucchini-nlp
Member Author

@bot /style

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Style fix ran successfully without modifying any files.

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44300&sha=df86ff

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

zucchini-nlp enabled auto-merge March 25, 2026 14:07
@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44300&sha=ebcb04

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44300&sha=99f0a0

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: conditional_detr, detr, edgetam, fast_vlm, maskformer, pe_audio_video, pe_video, perception_lm

zucchini-nlp added this pull request to the merge queue Mar 26, 2026
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 26, 2026
zucchini-nlp added this pull request to the merge queue Mar 26, 2026
Merged via the queue into huggingface:main with commit 09832b2 Mar 26, 2026
29 checks passed
zucchini-nlp deleted the convert-weights-recursive branch March 26, 2026 11:59
vasqu mentioned this pull request Mar 26, 2026
5 tasks
zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Mar 27, 2026
* split out from timm PR

* all other VLMs

* timm backbone is not here

* oops, extra key is breaking everything

* .

* this test

* maybe

* fix missing keys when loading from hub

* now fix fast tests

* merge gone wrong

* fix repo

* refine the regex again!

* close the bracket

* Apply suggestions from code review

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* revert unrelated

* !

* revert more

* add submodule prefix when recursing

* i'll need to fix maskformer later

* dont duplicate the same pattern twice

* fix modular

* detr

* colpali isn't working still!

* oke, so this can be fine for now

* !

* revert

* dot lost in regex and comments

* timm wrapper is weird

* skip these, timm wrapper

* bye bye timm

* make repo check happy

* Revert "bye bye timm"

This reverts commit ca68663.

* love timm!

* Apply repo consistency fixes

* oke, the bot can't fix it so here we go

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Mar 30, 2026