Fix OOM in CI by reducing intermediate_size and image token budget for tiny Gemma4 #5760
Changes from 1 commit
9d325ff
d89829c
5133955
e0971af
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -29,20 +29,35 @@ | |
| processor = AutoProcessor.from_pretrained(MODEL_ID) | ||
| generation_config = GenerationConfig.from_pretrained(MODEL_ID) | ||
| # Gemma4 image processor uses aspect-ratio-preserving resizing, not a fixed image size. max_soft_tokens controls | ||
| # the output token budget and must be one of (70, 140, 280, 560, 1120). The smallest value (70) gives | ||
| # max_patches = 70 × pooling_kernel_size² = 70 × 9 = 630, so position_embedding_size must be at least 630. | ||
| # intermediate_size mirrors Gemma3: without it the production value (text: 6144, vision: 3072) is inherited, causing | ||
| # training activations [batch, patches, intermediate_size] to dominate GPU memory and OOM in CI. | ||
| IMAGE_TOKENS = 70 # minimum supported max_soft_tokens | ||
| MAX_PATCHES = IMAGE_TOKENS * 3**2 # 630 | ||
| text_config = { | ||
| "num_hidden_layers": 2, | ||
| "hidden_size": 16, | ||
| "num_attention_heads": 4, | ||
| "num_key_value_heads": 2, | ||
| "intermediate_size": 32, | ||
| } | ||
| vision_config = { | ||
| "num_hidden_layers": 2, | ||
| "hidden_size": 16, | ||
| "num_attention_heads": 4, | ||
| "num_key_value_heads": 2, | ||
| "embed_dim": 64, | ||
| "intermediate_size": 32, | ||
| "position_embedding_size": MAX_PATCHES, # 630 | ||
| "default_output_length": IMAGE_TOKENS, # 70 | ||
| } | ||
| processor.image_processor.image_seq_length = IMAGE_TOKENS | ||
| processor.image_processor.max_soft_tokens = IMAGE_TOKENS | ||
Review comment (Cursor Bugbot, reviewed for commit e0971af): Processor attributes leak into tokenizer_config.json (Low Severity). Setting [...]

Reply (PR author, Member): This was the case after 5133955, but I have reverted it with e0971af. I have checked that the generated [...]
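The arithmetic in the diff's code comment can be checked with a small standalone sketch. It uses only the numbers quoted in that comment (the supported `max_soft_tokens` values, a pooling kernel size of 3, and the production vision `intermediate_size` of 3072). The helper names `max_patches` and `activation_bytes`, the batch size of 8, and the 4-bytes-per-element assumption are hypothetical and are not part of the PR or of any library API.

```python
# Values quoted in the diff comment; everything else below is illustrative.
SUPPORTED_SOFT_TOKENS = (70, 140, 280, 560, 1120)
POOLING_KERNEL_SIZE = 3  # 70 x 3**2 = 630 in the diff comment


def max_patches(max_soft_tokens: int, pooling_kernel_size: int = POOLING_KERNEL_SIZE) -> int:
    """Return the patch count implied by a soft-token budget (hypothetical helper)."""
    if max_soft_tokens not in SUPPORTED_SOFT_TOKENS:
        raise ValueError(
            f"max_soft_tokens must be one of {SUPPORTED_SOFT_TOKENS}, got {max_soft_tokens}"
        )
    return max_soft_tokens * pooling_kernel_size**2


def activation_bytes(batch: int, patches: int, intermediate_size: int, bytes_per_element: int = 4) -> int:
    """Size of one [batch, patches, intermediate_size] activation tensor.

    bytes_per_element=4 assumes float32; this is an assumption, not a measured value.
    """
    return batch * patches * intermediate_size * bytes_per_element


image_tokens = min(SUPPORTED_SOFT_TOKENS)  # 70, the smallest supported budget
patches = max_patches(image_tokens)        # 630, the lower bound for position_embedding_size

# Compare the inherited production vision width (3072) with the tiny override (32).
# batch=8 is an arbitrary illustrative batch size.
production = activation_bytes(batch=8, patches=patches, intermediate_size=3072)
tiny = activation_bytes(batch=8, patches=patches, intermediate_size=32)

print(patches, production // tiny)  # prints: 630 96
```

Under these assumptions, each such activation tensor shrinks by a factor of 96 once `intermediate_size` is overridden. The absolute sizes depend on the real batch size and dtype used in CI.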
| config = AutoConfig.from_pretrained(MODEL_ID) | ||
| for k, v in text_config.items(): | ||
| setattr(config.text_config, k, v) | ||
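The override loop at the end of the diff can be tried without downloading a model. The following is a minimal sketch that substitutes `types.SimpleNamespace` objects for the real config returned by `AutoConfig.from_pretrained`. The helper `apply_overrides` is hypothetical, and the starting values other than `intermediate_size` (6144, quoted in the diff comment) are placeholders.

```python
from types import SimpleNamespace


def apply_overrides(sub_config, overrides: dict):
    """Set each override on the sub-config in place, as the diff's loop does."""
    for key, value in overrides.items():
        setattr(sub_config, key, value)
    return sub_config


# Stand-in for the loaded config. Only intermediate_size=6144 comes from the
# diff comment; the other starting values are placeholders for illustration.
config = SimpleNamespace(
    text_config=SimpleNamespace(
        num_hidden_layers=999,
        hidden_size=999,
        intermediate_size=6144,
    )
)

# Same keys and values as the text_config dict in the diff.
text_config = {
    "num_hidden_layers": 2,
    "hidden_size": 16,
    "num_attention_heads": 4,
    "num_key_value_heads": 2,
    "intermediate_size": 32,
}

apply_overrides(config.text_config, text_config)
print(config.text_config.intermediate_size)  # prints: 32
```

Because `setattr` creates missing attributes silently, a misspelled key would not raise an error here. Any key omitted from the dict keeps its inherited value, which is how the production `intermediate_size` was being inherited before this change.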