[gemma3] support sequence classification task #39465
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
    @unittest.skip("Loading nested configs with overwritten `kwargs` isn't supported yet, FIXME @raushan.")
    def test_load_with_mismatched_shapes(self):
        pass

    @unittest.skip("Loading nested configs with overwritten `kwargs` isn't supported yet, FIXME @raushan.")
    def test_mismatched_shapes_have_properly_initialized_weights(self):
        pass
Here I mean loading like `Gemma3Config.from_dict(config_dict, vocab_size=100)`, where `vocab_size` is actually part of `config.text_config`. It is a known issue, and I plan to support it.
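For illustration, here is a minimal toy sketch of the situation described in this comment. It does not use the real `transformers` classes: `ToyTextConfig`, `ToyCompositeConfig`, and the override logic are hypothetical stand-ins, and the sketch assumes that keyword overrides are matched against top-level attributes only.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTextConfig:
    # Hypothetical stand-in for the nested text config.
    vocab_size: int = 1000
    hidden_size: int = 8


@dataclass
class ToyCompositeConfig:
    # Hypothetical stand-in for a composite (text + vision) config.
    text_config: ToyTextConfig = field(default_factory=ToyTextConfig)
    image_token_id: int = 0

    @classmethod
    def from_dict(cls, config_dict, **kwargs):
        config = cls(
            text_config=ToyTextConfig(**config_dict.get("text_config", {})),
            image_token_id=config_dict.get("image_token_id", 0),
        )
        unused = {}
        for key, value in kwargs.items():
            # Assumption: overrides only apply to top-level attributes.
            if hasattr(config, key):
                setattr(config, key, value)
            else:
                unused[key] = value
        return config, unused


config_dict = {"text_config": {"vocab_size": 1000, "hidden_size": 8}}
config, unused = ToyCompositeConfig.from_dict(config_dict, vocab_size=100)

# The override never reaches the nested config in this toy model.
print(config.text_config.vocab_size)
print(unused)
```

In this toy model the `vocab_size=100` override is left unused because the attribute lives one level down, which is the kind of mismatch that would make tests relying on such overrides unreliable.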
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, gemma3
Cyrilvallez
left a comment
Alright, but don't we have any other VLM from which we can inherit directly with modular?
Nope, we usually don't add one when there's no official pretrained checkpoint.
* add seq clf class
* fix docs and add in auto-map
* skip tests
* optional pixels
What does this PR do?
As per the title. We can't copy from Llama or any other LLM because Gemma3 needs to obtain `text_config` params and needs to pass extra vision kwargs in `forward`. Thus the code was adapted from Llama, and the tests are green.

Fixes #36755
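The two differences named above can be sketched in plain Python. This is a toy illustration, not the code added in this PR: `ToyBackbone`, `ToySequenceClassifier`, the fixed weights, and the last-non-padding-token pooling are all assumptions made for the example.

```python
from types import SimpleNamespace


class ToyBackbone:
    """Hypothetical multimodal backbone: one hidden vector per token."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size
        self.saw_pixels = False

    def __call__(self, input_ids, pixel_values=None, **kwargs):
        # Record whether vision inputs were forwarded to the backbone.
        self.saw_pixels = pixel_values is not None
        return [[float(tok)] * self.hidden_size for tok in input_ids]


class ToySequenceClassifier:
    def __init__(self, config, num_labels=2):
        # Sizes are read from the nested text config, not the top level.
        hidden_size = config.text_config.hidden_size
        self.pad_token_id = config.text_config.pad_token_id
        self.backbone = ToyBackbone(hidden_size)
        # Fixed toy weights: label i scores (i + 1) * sum(pooled).
        self.weights = [[float(i + 1)] * hidden_size for i in range(num_labels)]

    def forward(self, input_ids, pixel_values=None, **kwargs):
        # Vision inputs are optional and passed through to the backbone.
        hidden = self.backbone(input_ids, pixel_values=pixel_values, **kwargs)
        # Assumption: pool on the last non-padding token.
        last = max(i for i, tok in enumerate(input_ids) if tok != self.pad_token_id)
        pooled = hidden[last]
        return [sum(w * h for w, h in zip(row, pooled)) for row in self.weights]


config = SimpleNamespace(text_config=SimpleNamespace(hidden_size=4, pad_token_id=0))
model = ToySequenceClassifier(config)

logits = model.forward([5, 3, 2, 0, 0])  # text-only call
text_only_saw_pixels = model.backbone.saw_pixels

model.forward([5, 3, 2, 0, 0], pixel_values="fake-image")
with_image_saw_pixels = model.backbone.saw_pixels
```

A text-only classifier head would read its sizes from the top-level config and would have no `pixel_values` argument to pass along, which is why a direct copy does not fit here.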