Refactor embedding input/output getter/setter #39339
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Boom! Unbloating at its best! Let's get this merged!
Down to a few tests failing, mostly because of audio embedding exceptions; solving those, then adding `set_` and `get_` encoder methods and merging this.
small nit but good to go otherwise!
```python
def get_output_embeddings(self):
    return None
```
`self.lm_head` is the output embedding, no?
Well, the answer is annoying 😬 The input embeddings are defined as the conv2d patch layer:

```python
def get_input_embeddings(self):
    return self.embeddings.patch_embeddings
```

So if `get_output_embeddings` stays defined, then since this param defaults to `True` in the default config:

```
tie_word_embeddings (`bool`, *optional*, defaults to `True`):
```

an attempt at tying the weights is always made at init:

```python
if getattr(self.config.get_text_config(decoder=True), "tie_word_embeddings", True):
    output_embeddings = self.get_output_embeddings()
    if output_embeddings is not None:
        self._tie_or_clone_weights(output_embeddings, self.get_input_embeddings())
```

which will fail because the conv2d patch embeddings have no `weight` attribute. WDYT of reworking/removing this part from `tie_weights`?

The simpler solution is enforcing `return None`, which is what I have done here (and same for the other comment).
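For reference, the rework floated above could look roughly like this guard (a hypothetical sketch, not what this PR does; the PR opts for returning `None` instead):

```python
# Hypothetical guard inside tie_weights (illustrative only):
if getattr(self.config.get_text_config(decoder=True), "tie_word_embeddings", True):
    output_embeddings = self.get_output_embeddings()
    input_embeddings = self.get_input_embeddings()
    # Skip tying when either side lacks a `weight`, e.g. conv2d-based
    # patch embedding wrappers on vision backbones.
    if (
        output_embeddings is not None
        and hasattr(output_embeddings, "weight")
        and hasattr(input_embeddings, "weight")
    ):
        self._tie_or_clone_weights(output_embeddings, input_embeddings)
```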
`None` works, but it needs a comment!
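i.e., something along these lines (the comment wording is illustrative):

```python
def get_output_embeddings(self):
    # No output embeddings: the input embeddings are conv2d patch
    # embeddings with no `weight` to tie, so return None to skip
    # weight tying at init.
    return None
```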
[For maintainers] Suggested jobs to run (before merge): run-slow: arcee, aria, aya_vision, bamba, bark, bart, beit, bigbird_pegasus, biogpt, bitnet, blenderbot, blenderbot_small, bloom, chameleon, clvp, codegen

run-slow: beit, bark, bart, clvp, glm4, glm4_moe, zamba2

This comment contains run-slow, running the specified jobs: models: ['models/bark', 'models/bart', 'models/beit', 'models/clvp', 'models/glm4', 'models/glm4_moe', 'models/zamba2']
Tested on a few slow models; the failures are identical to
* simplify common get/set
* remove some noise
* change some 5 years old modeling utils
* update examples
* fix copies
* revert some changes
* fixes, gah
* format
* move to Mixin
* remove smolvlm specific require grad
* skip
* force defaults
* remodularise some stuff
* remodularise more stuff
* add safety for audio models
* style
* have a correct fallback, you daft donkey
* remove this argh
* change heuristic for audio models
* fixup
* revert
* this works
* revert again
* 🧠
* aaah ESM has two modelings aaah
* add informative but short comment
* add `input_embed_layer` mixin attribute
* style
* walrus has low precedence
* modular fix
* this was breaking parser
What does this PR do?
TL;DR
`PreTrainedModel` now mixes in `EmbeddingAccessMixin`, providing default `get_input_embeddings`/`set_input_embeddings`/`get_output_embeddings`/`set_output_embeddings` for all models. These methods should be removed from the codebase unless the model is exceptionally weird.

Details
Uses a new attribute, `__input_embed_layer = "embed_tokens"` by default; one can change it to set which layer the embeddings should be read from and written to. Then, assuming `embed_tokens` is that layer, the getter resolves it in a fixed order. `get_output_embeddings` now auto-returns `lm_head` only if the input embeddings resolve (so pure audio/vision backbones still return `None`). A minimal sketch of the resulting behavior follows.
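A minimal sketch of the mixin's defaults, assuming a plain attribute lookup; the attribute is written `_input_embed_layer` here (single underscore) to keep the sketch free of Python name mangling, and the exact resolution order is an assumption, not taken from the PR:

```python
import torch.nn as nn


class EmbeddingAccessMixin:
    # Default name of the input embedding attribute; models using a
    # different name can override this (assumed mechanism).
    _input_embed_layer = "embed_tokens"

    def get_input_embeddings(self):
        # Illustrative resolution: try the attribute on the model itself,
        # then on a nested `self.model`.
        name = self._input_embed_layer
        if hasattr(self, name):
            return getattr(self, name)
        if hasattr(self, "model"):
            return getattr(self.model, name, None)
        return None

    def set_input_embeddings(self, value: nn.Module):
        name = self._input_embed_layer
        if hasattr(self, "model") and not hasattr(self, name):
            setattr(self.model, name, value)
        else:
            setattr(self, name, value)

    def get_output_embeddings(self):
        # Return lm_head only when the input embeddings resolve, so pure
        # audio/vision backbones fall back to None and skip weight tying.
        if self.get_input_embeddings() is None:
            return None
        return getattr(self, "lm_head", None)
```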
What you usually have to do: nothing.
Override only if your model stores its embeddings under a non-standard attribute or layout; in that case, point the mixin attribute at the right layer or override the getter/setter directly (see the sketch below).
Potential breakages: none expected for models following the standard layout; composite models are worth double-checking (see below).
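For instance, a model with a differently named embedding layer could be wired up like this (hypothetical class and attribute names):

```python
class MyVisionTextModel(PreTrainedModel):
    # Hypothetical: redirect the mixin to a non-standard embedding
    # attribute instead of hand-writing get/set_input_embeddings.
    _input_embed_layer = "word_embeddings"
```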
cc @vasqu @zucchini-nlp; minor, but something to keep in mind for composite models too.