From f75709605f320101ce43cb923ee07e4486fafc48 Mon Sep 17 00:00:00 2001
From: MKhalusova
Date: Mon, 20 Feb 2023 14:37:19 -0500
Subject: [PATCH 1/5] troubleshooting guide: added an error description for missing auto-mapping

---
 docs/source/en/troubleshooting.mdx | 31 +++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/docs/source/en/troubleshooting.mdx b/docs/source/en/troubleshooting.mdx
index 74346bccef97..b244a1e25868 100644
--- a/docs/source/en/troubleshooting.mdx
+++ b/docs/source/en/troubleshooting.mdx
@@ -173,4 +173,33 @@ tensor([[ 0.0082, -0.2307],
 🤗 Transformers doesn't automatically create an `attention_mask` to mask a padding token if it is provided because:
 
 - Some models don't have a padding token.
-- For some use-cases, users want a model to attend to a padding token.
\ No newline at end of file
+- For some use-cases, users want a model to attend to a padding token.
+
+## ValueError: Unrecognized configuration class for this kind of AutoModel
+
+Generally, we recommend using the AutoModel class to load pretrained instances of models. This class
+can automatically infer and load the correct architecture from a given checkpoint based on the configuration.
+However, there are a few exceptions to this. In rare cases, some models' architectures don't map to any of the
+AutoModelForXXX classes due to the specifics of their API. When trying to load such an exotic model with the AutoModel
+class, you'll get this `ValueError`.
+
+For example:
+
+```py
+>>> from transformers import AutoProcessor, AutoModel
+
+>>> processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
+>>> model = AutoModel.from_pretrained("Salesforce/blip2-opt-2.7b")
+ValueError: Unrecognized configuration class for this kind of AutoModel: AutoModel.
+Model type should be one of AlbertConfig, AltCLIPConfig, ASTConfig, BartConfig, BeitConfig, BertConfig, BertGenerationConfig, BigBirdConfig...
+```
+
+In this case, you can use `AutoProcessor` to load BLIP-2's processor, but to load a pretrained BLIP-2 model itself, you have to
+use `Blip2ForConditionalGeneration` explicitly:
+
+```py
+>>> from transformers import AutoProcessor, Blip2ForConditionalGeneration
+
+>>> processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
+>>> model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
+```
\ No newline at end of file

From 514d961333e0035d679d02dbea605afd8340009e Mon Sep 17 00:00:00 2001
From: MKhalusova
Date: Mon, 20 Feb 2023 14:43:18 -0500
Subject: [PATCH 2/5] minor polishing

---
 docs/source/en/troubleshooting.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/troubleshooting.mdx b/docs/source/en/troubleshooting.mdx
index b244a1e25868..8d545b7a36df 100644
--- a/docs/source/en/troubleshooting.mdx
+++ b/docs/source/en/troubleshooting.mdx
@@ -175,7 +175,7 @@ tensor([[ 0.0082, -0.2307],
 - Some models don't have a padding token.
 - For some use-cases, users want a model to attend to a padding token.
 
-## ValueError: Unrecognized configuration class for this kind of AutoModel
+## ValueError: Unrecognized configuration class XYZ for this kind of AutoModel
 
 Generally, we recommend using the AutoModel class to load pretrained instances of models. This class
 can automatically infer and load the correct architecture from a given checkpoint based on the configuration.

From 47fcfabdb8ce3a11057c8a06a551b41ae836bf42 Mon Sep 17 00:00:00 2001
From: MKhalusova
Date: Tue, 21 Feb 2023 09:15:40 -0500
Subject: [PATCH 3/5] changed the example

---
 docs/source/en/troubleshooting.mdx | 33 ++++++++++++------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/docs/source/en/troubleshooting.mdx b/docs/source/en/troubleshooting.mdx
index b244a1e25868..7649ec34909a 100644
--- a/docs/source/en/troubleshooting.mdx
+++ b/docs/source/en/troubleshooting.mdx
@@ -178,28 +178,21 @@ tensor([[ 0.0082, -0.2307],
 ## ValueError: Unrecognized configuration class for this kind of AutoModel
 
 Generally, we recommend using the AutoModel class to load pretrained instances of models. This class
-can automatically infer and load the correct architecture from a given checkpoint based on the configuration.
-However, there are a few exceptions to this. In rare cases, some models' architectures don't map to any of the
-AutoModelForXXX classes due to the specifics of their API. When trying to load such an exotic model with the AutoModel
-class, you'll get this `ValueError`.
-
-For example:
+can automatically infer and load the correct architecture from a given checkpoint based on the configuration. If you see
+this `ValueError` when loading a model from a checkpoint, this means that the Auto class couldn't find a mapping from
+the configuration in the given checkpoint to a kind of model you are trying to load. Most commonly this happens when a
+checkpoint doesn't support given task.
+For instance, you'll see this error in the following example because there is no GPT2 for question answering:
 
 ```py
->>> from transformers import AutoProcessor, AutoModel
+>>> from transformers import AutoProcessor, AutoModelForQuestionAnswering
 
->>> processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
->>> model = AutoModel.from_pretrained("Salesforce/blip2-opt-2.7b")
-ValueError: Unrecognized configuration class for this kind of AutoModel: AutoModel.
-Model type should be one of AlbertConfig, AltCLIPConfig, ASTConfig, BartConfig, BeitConfig, BertConfig, BertGenerationConfig, BigBirdConfig...
+>>> processor = AutoProcessor.from_pretrained("gpt2-medium")
+>>> model = AutoModelForQuestionAnswering.from_pretrained("gpt2-medium")
+ValueError: Unrecognized configuration class for this kind of AutoModel: AutoModelForQuestionAnswering.
+Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BloomConfig, ...
 ```
 
-In this case, you can use `AutoProcessor` to load BLIP-2's processor, but to load a pretrained BLIP-2 model itself, you have to
-use `Blip2ForConditionalGeneration` explicitly:
-
-```py
->>> from transformers import AutoProcessor, Blip2ForConditionalGeneration
-
->>> processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
->>> model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
-```
\ No newline at end of file
+In rare cases, this can also happen when using some exotic models with architectures that don't map to any of the
+AutoModelForXXX classes due to the specifics of their API. For example, you can use `AutoProcessor` to load BLIP-2's processor,
+but to load a pretrained BLIP-2 model itself, you have to use `Blip2ForConditionalGeneration` explicitly.

From f980ab5b0fa8f060e69d30732c7c3e0403706784 Mon Sep 17 00:00:00 2001
From: Maria Khalusova
Date: Tue, 21 Feb 2023 13:51:18 -0500
Subject: [PATCH 4/5] Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---
 docs/source/en/troubleshooting.mdx | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/source/en/troubleshooting.mdx b/docs/source/en/troubleshooting.mdx
index 64d7c6fa736b..e19c0299344c 100644
--- a/docs/source/en/troubleshooting.mdx
+++ b/docs/source/en/troubleshooting.mdx
@@ -177,11 +177,11 @@ tensor([[ 0.0082, -0.2307],
 
 ## ValueError: Unrecognized configuration class XYZ for this kind of AutoModel
 
-Generally, we recommend using the AutoModel class to load pretrained instances of models. This class
+Generally, we recommend using the [`AutoModel`] class to load pretrained instances of models. This class
 can automatically infer and load the correct architecture from a given checkpoint based on the configuration. If you see
-this `ValueError` when loading a model from a checkpoint, this means that the Auto class couldn't find a mapping from
-the configuration in the given checkpoint to a kind of model you are trying to load. Most commonly this happens when a
-checkpoint doesn't support given task.
+this `ValueError` when loading a model from a checkpoint, this means the Auto class couldn't find a mapping from
+the configuration in the given checkpoint to the kind of model you are trying to load. Most commonly, this happens when a
+checkpoint doesn't support a given task.
 For instance, you'll see this error in the following example because there is no GPT2 for question answering:
 
 ```py
@@ -194,5 +194,5 @@ Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig,
 ```
 
 In rare cases, this can also happen when using some exotic models with architectures that don't map to any of the
-AutoModelForXXX classes due to the specifics of their API. For example, you can use `AutoProcessor` to load BLIP-2's processor,
-but to load a pretrained BLIP-2 model itself, you have to use `Blip2ForConditionalGeneration` explicitly.
+AutoModelForXXX classes due to the specifics of their API. For example, you can use [`AutoProcessor`] to load BLIP-2's processor,
+but to load a pretrained BLIP-2 model itself, you must explicitly use [`Blip2ForConditionalGeneration`].

From 35f16c7eec31d7eac2b2aa0ade6fc7dad7db58f8 Mon Sep 17 00:00:00 2001
From: Maria Khalusova
Date: Thu, 23 Feb 2023 11:35:04 -0500
Subject: [PATCH 5/5] Update docs/source/en/troubleshooting.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---
 docs/source/en/troubleshooting.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/en/troubleshooting.mdx b/docs/source/en/troubleshooting.mdx
index e19c0299344c..068e382eec47 100644
--- a/docs/source/en/troubleshooting.mdx
+++ b/docs/source/en/troubleshooting.mdx
@@ -195,4 +195,4 @@ Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig,
 
 In rare cases, this can also happen when using some exotic models with architectures that don't map to any of the
 AutoModelForXXX classes due to the specifics of their API. For example, you can use [`AutoProcessor`] to load BLIP-2's processor,
-but to load a pretrained BLIP-2 model itself, you must explicitly use [`Blip2ForConditionalGeneration`].
+but to load a pretrained BLIP-2 model itself, you must explicitly use [`Blip2ForConditionalGeneration`] as even [`AutoModel`] won't work.
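
Reviewer note: the failure mode this series documents (an Auto class finding no mapping from a checkpoint's configuration to the kind of model being requested) can be sketched with a toy registry. This is only an illustration under stated assumptions: every class below, including `ToyAutoModelForQuestionAnswering` and the empty config stand-ins, is hypothetical and does not reproduce 🤗 Transformers internals. It shows the shape of the lookup and why an unregistered configuration ends in a `ValueError`.

```py
# Toy illustration only: every class below is a hypothetical stand-in,
# not code from the transformers library.


class BertConfig:
    """Stand-in for a configuration that has a question-answering model registered."""


class GPT2Config:
    """Stand-in for a configuration with no question-answering model registered."""


class BertForQuestionAnswering:
    def __init__(self, config):
        self.config = config


class ToyAutoModelForQuestionAnswering:
    # Maps a configuration class to the model class that implements this task.
    _model_mapping = {BertConfig: BertForQuestionAnswering}

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            # No entry for this configuration class: report what is supported.
            supported = ", ".join(c.__name__ for c in cls._model_mapping)
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__} "
                f"for this kind of AutoModel: {cls.__name__}.\n"
                f"Model type should be one of {supported}."
            )
        return model_class(config)


# A registered configuration resolves to its task-specific class.
model = ToyAutoModelForQuestionAnswering.from_config(BertConfig())

# An unregistered configuration fails the lookup and raises ValueError.
try:
    ToyAutoModelForQuestionAnswering.from_config(GPT2Config())
except ValueError as err:
    error_message = str(err)
```

In the toy version, as in the guide text, the remedy is not to catch the error but to request a kind of model the checkpoint's configuration actually maps to, or to load the model with its explicit class.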