
Fixes configuration default values #43592

Merged
zucchini-nlp merged 14 commits into huggingface:main from zucchini-nlp:pad-token-ids
Jan 30, 2026

@zucchini-nlp
Member

zucchini-nlp commented Jan 29, 2026

What does this PR do?

Fixes #43525
Fixes #43572

Adds missing pad_token_id and tie_word_embeddings to config classes with their defaults
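
For readers skimming the thread, the failure mode in the linked issues can be sketched without the library. This is a toy illustration, not the actual diff: `ToyConfigBefore`, `ToyConfigAfter`, and `read_pad_token_id` are hypothetical names, and the default values shown (`None` for `pad_token_id`, `True` for `tie_word_embeddings`) are assumptions for the example, not the defaults this PR picks for any particular model.

```python
class ToyConfigBefore:
    """Hypothetical config: pad_token_id exists only if the caller passes it."""

    def __init__(self, vocab_size=32, **kwargs):
        self.vocab_size = vocab_size
        # Extra keyword arguments become attributes; nothing else is set.
        for key, value in kwargs.items():
            setattr(self, key, value)


class ToyConfigAfter:
    """Hypothetical config: the attributes are declared with explicit defaults."""

    def __init__(self, vocab_size=32, pad_token_id=None, tie_word_embeddings=True, **kwargs):
        self.vocab_size = vocab_size
        # Assumed defaults, for illustration only.
        self.pad_token_id = pad_token_id
        self.tie_word_embeddings = tie_word_embeddings
        for key, value in kwargs.items():
            setattr(self, key, value)


def read_pad_token_id(config):
    """Return config.pad_token_id, or the error message if the attribute is missing."""
    try:
        return config.pad_token_id
    except AttributeError as err:
        return f"AttributeError: {err}"


print(read_pad_token_id(ToyConfigBefore()))  # attribute was never set, so access fails
print(read_pad_token_id(ToyConfigAfter()))   # attribute always exists, here with its default
```

Declaring the attribute in the signature means downstream code can read `config.pad_token_id` unconditionally, which is the behavior the linked issues expected.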

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp added the "for patch" label (Tag issues / labels that should be included in the next patch) on Jan 29, 2026
@zucchini-nlp
Member Author

Adding fix to tie_word_embeddings, don't merge!

zucchini-nlp changed the title from "Fixes 'pad_token_id' issues" to "Fixes configuration default values" on Jan 29, 2026
@zucchini-nlp
Member Author

run-slow: cohere2, deformable_detr, emu3, exaone4, falcon_mamba, fast_vlm, flava, florence2, glm46v, got_ocr2, gpt_bigcode, gpt_neox, gptj, internvl, jetmoe, mamba

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ["models/cohere2", "models/deformable_detr", "models/emu3", "models/exaone4", "models/falcon_mamba", "models/fast_vlm", "models/flava", "models/florence2", "models/glm46v", "models/got_ocr2", "models/gpt_bigcode", "models/gpt_neox", "models/gptj", "models/internvl", "models/jetmoe", "models/mamba"]
quantizations: []

Member

Rocketknight1 left a comment

LGTM with one comment!

@github-actions
Contributor

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

@zucchini-nlp
Member Author

@bot /style

@zucchini-nlp
Member Author

deformable_detr is flaky now, apparently because of the random test order 😢 It is not reproducible locally when I run a single test case.

@zucchini-nlp
Member Author

@bot /repo

@github-actions
Contributor

github-actions bot commented Jan 29, 2026

Repo. Consistency bot fixed some files and pushed the changes.

zucchini-nlp enabled auto-merge (squash) January 30, 2026 09:55
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: cohere2, cohere2_vision, deepseek_vl, deepseek_vl_hybrid, deformable_detr, emu3, exaone4, falcon_mamba, fast_vlm, flava, florence2, glm46v, got_ocr2, gpt_bigcode, gpt_neox, gptj

zucchini-nlp disabled auto-merge January 30, 2026 10:35
@zucchini-nlp
Member Author

run-slow: llava_onevision, llava_next_video

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ["models/llava_next_video", "models/llava_onevision"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

zucchini-nlp enabled auto-merge (squash) January 30, 2026 11:02
zucchini-nlp merged commit 562106f into huggingface:main Jan 30, 2026
26 checks passed

Labels

for patch: Tag issues / labels that should be included in the next patch

Development

Successfully merging this pull request may close these issues.

missing pad_token_idx in StableLmConfig after 5.0 update
AttributeError: 'Llama4Config' object has no attribute 'pad_token_id'

3 participants