Description
System Info
- `transformers` version: 4.52.3
- Platform: macOS-15.4.1-arm64-arm-64bit-Mach-O
- Python version: 3.13.0
- Huggingface_hub version: 0.32.0
- Safetensors version: 0.5.3
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (GPU?): 2.7.0 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
1. Install `pytorch`, `pillow`, and `transformers==4.52.3` using pip.
2. Execute the following script:
```python
import torch
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")
batch_features = processor(
    text="<image> What's in this image?",
    images=torch.zeros(3, 224, 224),
    suffix="Nothing",
    return_tensors="pt",
)
```

This yields an AttributeError with transformers==4.52.3:
```
File "/private/tmp/venv/lib/python3.13/site-packages/transformers/models/paligemma/processing_paligemma.py", line 313, in __call__
    labels = inputs["input_ids"].masked_fill(inputs["token_type_ids"] == 0, -100)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'masked_fill'
```
Expected behavior
The batch_features should be created without error.
There seems to be a recent bug in the `__call__` method of many processors, including, e.g., `PaliGemmaProcessor`.
This is likely caused by
transformers/src/transformers/models/paligemma/processing_paligemma.py, lines 301 to 307 in 31f8a0f:

```python
return_tensors = output_kwargs["text_kwargs"].pop("return_tensors", None)
inputs = self.tokenizer(
    input_strings,
    text_pair=suffix,
    return_token_type_ids=return_token_type_ids,
    **output_kwargs["text_kwargs"],
)
```
which was changed in commit 32eca71
I believe the intention was to call `.get()` instead of `.pop()` on `text_kwargs` on line 301. Calling `.pop()` removes the key from `text_kwargs` in place, so the subsequent tokenizer call never receives `return_tensors` and returns `inputs["input_ids"]` as a Python list instead of a PyTorch tensor. The `masked_fill` call below then fails, since lists have no such method.
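The side effect can be shown with plain dicts (a minimal sketch, not the actual transformers kwargs objects):

```python
# .pop() removes the key, so any later call that receives **text_kwargs
# no longer sees return_tensors and falls back to returning Python lists.
text_kwargs = {"padding": True, "return_tensors": "pt"}
return_tensors = text_kwargs.pop("return_tensors", None)
print("return_tensors" in text_kwargs)  # False: the downstream call loses the key

# .get() reads the value without mutating the dict, so the tokenizer
# invoked with **text_kwargs would still see return_tensors="pt".
text_kwargs = {"padding": True, "return_tensors": "pt"}
return_tensors = text_kwargs.get("return_tensors", None)
print("return_tensors" in text_kwargs)  # True: the key is preserved
```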
transformers/src/transformers/models/paligemma/processing_paligemma.py, lines 312 to 313 in 31f8a0f:

```python
if return_token_type_ids:
    labels = inputs["input_ids"].masked_fill(inputs["token_type_ids"] == 0, -100)
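For context, line 313 only works when `input_ids` is a tensor; the toy ids below (hypothetical values, not real PaliGemma tokenization) sketch what it computes when the inputs are tensors as intended:

```python
import torch

# Toy stand-ins for the processor's outputs (hypothetical values).
input_ids = torch.tensor([[1, 2, 3, 4]])
token_type_ids = torch.tensor([[0, 0, 1, 1]])  # 0 = prompt, 1 = suffix

# Prompt positions are masked to -100 so the loss only covers the suffix.
labels = input_ids.masked_fill(token_type_ids == 0, -100)
print(labels)  # tensor([[-100, -100,    3,    4]])
```

When `return_tensors` has been popped from `text_kwargs`, `input_ids` arrives here as a nested list and this call raises the `AttributeError` above.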