Merged
Changes from 5 commits
24 changes: 23 additions & 1 deletion docs/source/en/troubleshooting.mdx
@@ -173,4 +173,26 @@ tensor([[ 0.0082, -0.2307],
🤗 Transformers doesn't automatically create an `attention_mask` to mask a padding token if it is provided because:

- Some models don't have a padding token.
- For some use-cases, users want a model to attend to a padding token.

## ValueError: Unrecognized configuration class XYZ for this kind of AutoModel

Generally, we recommend using the [`AutoModel`] class to load pretrained instances of models. This class
can automatically infer and load the correct architecture from a given checkpoint based on the configuration. If you see
this `ValueError` when loading a model from a checkpoint, this means the Auto class couldn't find a mapping from
the configuration in the given checkpoint to the kind of model you are trying to load. Most commonly, this happens when a
checkpoint doesn't support a given task.
For instance, the following example raises this error because there is no GPT2 model class for question answering:

```py
>>> from transformers import AutoProcessor, AutoModelForQuestionAnswering

>>> processor = AutoProcessor.from_pretrained("gpt2-medium")
>>> model = AutoModelForQuestionAnswering.from_pretrained("gpt2-medium")
ValueError: Unrecognized configuration class <class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'> for this kind of AutoModel: AutoModelForQuestionAnswering.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BloomConfig, ...
```
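
To picture why the lookup fails, here is a toy sketch of a config-to-model mapping. It is not the 🤗 Transformers implementation: every name in it (`ToyBertConfig`, `ToyGPT2Config`, `ToyBertForQuestionAnswering`, `TOY_QA_MAPPING`, `toy_auto_model_for_qa`) is hypothetical and only illustrates the idea that each task-specific Auto class can resolve just the configuration classes registered for that task.

```py
# Toy illustration only: this is NOT how the library is implemented.
# All classes and names below are hypothetical stand-ins.


class ToyBertConfig:
    pass


class ToyGPT2Config:
    pass


class ToyBertForQuestionAnswering:
    def __init__(self, config):
        self.config = config


# A task-specific mapping from configuration class to model class.
# Only architectures that support the task are registered.
TOY_QA_MAPPING = {ToyBertConfig: ToyBertForQuestionAnswering}


def toy_auto_model_for_qa(config):
    """Return a model for `config`, or raise ValueError if no mapping exists."""
    model_cls = TOY_QA_MAPPING.get(type(config))
    if model_cls is None:
        supported = ", ".join(cls.__name__ for cls in TOY_QA_MAPPING)
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__} "
            f"for this kind of toy auto model. "
            f"Model type should be one of {supported}."
        )
    return model_cls(config)


# A registered configuration resolves to its model class.
model = toy_auto_model_for_qa(ToyBertConfig())

# An unregistered configuration fails with a ValueError.
try:
    toy_auto_model_for_qa(ToyGPT2Config())
except ValueError as err:
    error_message = str(err)
```

In this sketch, the fix mirrors the real-world one: either pick a checkpoint whose architecture is registered for the task, or use a class that matches what the checkpoint supports.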

In rare cases, this can also happen with less common models whose architectures don't map to any of the
`AutoModelForXXX` classes due to the specifics of their API. For example, you can use [`AutoProcessor`] to load BLIP-2's processor,
but to load a pretrained BLIP-2 model itself, you must explicitly use [`Blip2ForConditionalGeneration`].