Add AIMv2 to Transformers #35550

Open

AlanPonnachan wants to merge 4 commits into huggingface:main from AlanPonnachan:add_aimv2_b2

Conversation

@AlanPonnachan

What does this PR do?

Fixes #35351

This PR adds AIMv2 support to Transformers. AIMv2 has shown better performance than SigLIP.

TODO

  • Completed almost every step mentioned in this guide

Who can review?

@qubvel
@Rocketknight1

@AlanPonnachan (Author)

@qubvel @Rocketknight1 Could you help review this PR? Let me know if you have any suggestions. Thank you in advance for your time and assistance.

@qubvel (Contributor) left a comment

Hi @AlanPonnachan! Thanks for working on the model 🤗

Please see other model implementations in the repo to follow the code style and patterns (e.g. how the attention module should be implemented).
It's better to reuse existing blocks rather than defining new ones. You can also use the modular converter for inheritance, as sketched below.
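As a rough illustration of the modular approach, here is a minimal sketch, assuming AIMv2's vision tower is close enough to SigLIP to inherit from it. All class names here are hypothetical, not the final AIMv2 API:

```python
# Hypothetical modular_aimv2.py sketch. The modular converter expands
# classes that inherit from an existing model into a flat modeling_aimv2.py,
# so only the parts that actually differ need to be written out.
from transformers.models.siglip.modeling_siglip import (
    SiglipMLP,
    SiglipVisionEmbeddings,
)


class AIMv2MLP(SiglipMLP):
    # Identical to SigLIP's MLP; the converter copies the implementation.
    pass


class AIMv2VisionEmbeddings(SiglipVisionEmbeddings):
    # Override __init__/forward here only where AIMv2 actually differs.
    pass
```

The flat modeling file is then generated by the converter script in the repo (utils/modular_model_converter.py), which resolves the inheritance into standalone code.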

Please see similar PRs for reference.

Thanks!

@@ -0,0 +1,225 @@
# coding=utf-8
@qubvel (Contributor):

Please refactor it to follow the mllama model conversion format: it should be a KEY_MAPPING dict instead of create_rename_keys.
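For illustration, a minimal sketch of the dict-based pattern being requested. The regex keys below are hypothetical placeholders, not AIMv2's real checkpoint names:

```python
import re

# Hypothetical key mapping in the style of the mllama conversion script:
# original checkpoint key patterns on the left, Transformers names on the
# right. The actual AIMv2 key names will differ.
KEY_MAPPING = {
    r"preprocessor.patchifier.proj": r"embeddings.patch_embedding",
    r"trunk.blocks.(\d+).attn.qkv": r"encoder.layers.\1.attention.qkv",
    r"trunk.blocks.(\d+).mlp.fc1": r"encoder.layers.\1.mlp.fc1",
}


def convert_state_dict(state_dict):
    # Apply every pattern to each key; unmatched keys pass through unchanged.
    new_state_dict = {}
    for old_key, tensor in state_dict.items():
        new_key = old_key
        for pattern, replacement in KEY_MAPPING.items():
            new_key = re.sub(pattern, replacement, new_key)
        new_state_dict[new_key] = tensor
    return new_state_dict
```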

return image


@torch.no_grad()
@qubvel (Contributor):

Do we need no_grad here? One is already used in the code below.
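For illustration (the function name is hypothetical): a single no_grad suffices, either as a decorator on the top-level conversion function or as a context manager inside it, but not both:

```python
import torch


@torch.no_grad()  # disables autograd for everything the function runs
def convert_checkpoint(checkpoint_path: str) -> None:
    # Load weights, remap keys, and run a forward-pass sanity check here.
    # An inner `with torch.no_grad():` block would be redundant, since
    # the decorator already covers the whole function body.
    ...
```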

)
query, key, value = qkv.unbind(0)

context_layer = F.scaled_dot_product_attention(query, key, value, attn_mask=mask)
@qubvel (Contributor):

Please see other models for how the attention module should be structured.
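For illustration, a minimal sketch of the layout most Transformers vision models use: separate query/key/value/output projections instead of a fused qkv tensor. Class and argument names are hypothetical, not the final AIMv2 API:

```python
import torch
from torch import nn


class AIMv2Attention(nn.Module):
    # Hypothetical sketch of the usual Transformers attention layout.
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        self.k_proj = nn.Linear(hidden_size, hidden_size)
        self.v_proj = nn.Linear(hidden_size, hidden_size)
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states, attention_mask=None):
        batch, seq_len, _ = hidden_states.shape
        # Project and reshape to (batch, num_heads, seq_len, head_dim).
        q = self.q_proj(hidden_states).view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(hidden_states).view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(hidden_states).view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
        attn_output = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=attention_mask)
        # Merge heads back and apply the output projection.
        attn_output = attn_output.transpose(1, 2).reshape(batch, seq_len, -1)
        return self.out_proj(attn_output)
```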

@yaswanth19 (Contributor)

@AlanPonnachan Are you still working on this? If you don't have the bandwidth, I can take it from here and refine the code.

@AlanPonnachan (Author)

@yaswanth19 You can take over the modular converter part; I am facing some issues there. Beyond that, I can help you.

yaswanth19 mentioned this pull request Mar 10, 2025

Development

Successfully merging this pull request may close these issues:

Any plans to add AIMv2 in the model?