ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported. #22222
Comments
I face the same issue |
Hi @candowu, thanks for raising this issue. This is arising because the tokenizer class in the checkpoint's tokenizer_config.json is set to LLaMATokenizer, which does not exist in the library; the class is named LlamaTokenizer. This is likely due to the configuration files being created before the final PR was merged in. |
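A minimal sketch of how to see the mismatch for yourself (hf_hub_download is from the huggingface_hub package; the repo is the one used in the reproduction below):

```python
import json
from huggingface_hub import hf_hub_download

# Download just the tokenizer config and inspect the class it declares.
cfg_path = hf_hub_download("decapoda-research/llama-7b-hf", "tokenizer_config.json")
with open(cfg_path) as f:
    print(json.load(f)["tokenizer_class"])  # -> "LLaMATokenizer" (stale casing)
```
|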
I cloned the repo and changed the tokenizer in the config file to LlamaTokenizer |
For anybody interested, I was able to load an earlier saved model with the same issue using my fork with the capitalization restored. That being said, going forward it's probably better to find or save a new model with the new naming. |
@yhifny Are you able to import the tokenizer directly using `from transformers import LlamaTokenizer`? If not, can you make sure that you are working from the development branch in your environment using `pip install git+https://github.com/huggingface/transformers` — more details here. |
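A minimal sketch of that check (the version string is the one reported later in this thread):

```python
import transformers

# LlamaTokenizer only exists in sufficiently recent builds,
# e.g. 4.28.0.dev0 when installed from the development branch.
print(transformers.__version__)

from transformers import LlamaTokenizer  # raises ImportError on older releases
```
|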
I can import the LlamaTokenizer now, but loading the tokenizer still fails with an error. |
As the error message probably mentions, you need to install sentencepiece: `pip install sentencepiece` |
Working now. I swear I had sentencepiece, but probably forgot to reset the runtime 🤦 My bad! |
Thanks, man, your link solved all the problems. |
Change the LLaMATokenizer in tokenizer_config.json to lowercase LlamaTokenizer and it works like a charm. |
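A minimal sketch of making that edit programmatically (the local path is hypothetical):

```python
import json

# Path to a local clone of the checkpoint, e.g. from `git clone` of the Hub repo.
path = "llama-7b-hf/tokenizer_config.json"

with open(path) as f:
    cfg = json.load(f)

cfg["tokenizer_class"] = "LlamaTokenizer"  # was "LLaMATokenizer"

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
```
|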
Thank you so much for this! Works! |
I assume this is applied to the llama-7b repo cloned from Hugging Face, right? How can I instantiate the model and the tokenizer after doing that, please? |
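For reference, a minimal sketch of instantiating both, assuming a local clone at ./llama-7b-hf with the patched config (the path is hypothetical):

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

# Point both loaders at the local clone whose tokenizer_config.json was fixed.
tokenizer = LlamaTokenizer.from_pretrained("./llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("./llama-7b-hf")
```
|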
You are a life saver. The docs on the site should be updated to reflect this. |
Thank you so much for this! Works! That's amazing! |
You can try this for a rather crazy way to find out the right casing for the module:

```python
import transformers
from itertools import product

def find_variable_case(s, max_tries=1000):
    # Generate every upper/lower casing permutation of the string.
    var_permutations = list(map("".join, product(*zip(s.upper(), s.lower()))))
    # Intuitively, any camel casing should minimize the no. of upper chars.
    # From https://stackoverflow.com/a/58789587/610569
    var_permutations.sort(key=lambda ss: (sum(map(str.isupper, ss)), len(ss)))
    for i, v in enumerate(var_permutations):
        if i > max_tries:
            return
        try:
            dir(transformers).index(v)  # raises ValueError if v is not an attribute
            return v
        except ValueError:
            continue

v = find_variable_case('LLaMatokenizer')
exec(f"from transformers import {v}")
vars()[v]
```

[out]:

```
transformers.models.llama.tokenization_llama.LlamaTokenizer
```
|
I encountered the same issue described in this thread today, 4/2/2023. The post #22222 (comment) fixed the problem for me. Thank you. |
Hi! I am facing the same problem. When I try to import LlamaTokenizer, I get: ImportError: cannot import name 'LlamaTokenizer' from 'transformers' (/usr/local/anaconda3/envs/abc/lib/python3.10/site-packages/transformers/__init__.py). My version of transformers is 4.28.0.dev0 (pypi). Please tell me how to fix it. |
You need to install the library from source to be able to use the LLaMA model. |
Can you please enlighten me on how this could be achieved? I'm new to this. |
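A minimal sketch of installing from source, matching the reproduction at the bottom of this issue (run in a Colab/Jupyter cell):

```python
# Installing transformers from source pulls in the LLaMA classes
# before they were part of a stable release.
!pip install git+https://github.com/huggingface/transformers
!pip install sentencepiece
```
|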
Hi, @nameless0704. First, I would like to thank you for the insightful comment above. I could not figure it out and would like to seek your help. Any information is appreciated. Thank you very much in advance! |
I'll share an experiment: just replacing your llama model with https://huggingface.co/elinas/llama-7b-hf-transformers-4.29 will solve errors like ImportError: cannot import name 'LLaMATokenizer' from 'transformers'. |
Example of how to use LLaMA AutoTokenizer
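A minimal sketch (the checkpoint is the updated one mentioned just above; any repo whose tokenizer_config.json names LlamaTokenizer should work):

```python
from transformers import AutoTokenizer

# AutoTokenizer reads the tokenizer_class field from tokenizer_config.json,
# so the checkpoint must reference the correctly cased LlamaTokenizer.
tokenizer = AutoTokenizer.from_pretrained("elinas/llama-7b-hf-transformers-4.29")
print(tokenizer("Hello, LLaMA!").input_ids)
```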
|
Check the file specified in the error message. |
Where is the tokenizer_config.json? |
I think this is the location: in the model repo on the Hub (under "Files and versions"), or locally in your Hugging Face cache, e.g. ~/.cache/huggingface/hub/models--<org>--<model>/snapshots/<hash>/tokenizer_config.json. |
Please, I'm facing the same issue. Can anyone help? I tried all the above methods. |
I had the same issue and it was solved by: |
In my case, transformers==4.30.0 fixed it. |
looked at
I had to install |
I had similar issues. The root cause was that I was using Python 3.7, where pip installed an older transformers version (4.18). Once I upgraded to Python 3.9, pip3 install transformers installed 4.30.0. Also make sure you have recent CUDA drivers; running only on the CPU was very slow for me. |
Do we need to load the tokenizer using LlamaTokenizer or can we use it with AutoTokenizer? If the latter is possible, what is the fully qualified tokenizer model on the hub? I get: |
@ndvbd you should be able to use AutoTokenizer. For anyone still getting the same error: make sure you are on a recent version of transformers and that the checkpoint's tokenizer_config.json points to LlamaTokenizer, not LLaMATokenizer.
If you have a different issue, make sure to open a new issue and ping me 🤗 |
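A minimal sketch of the AutoTokenizer route (meta-llama/Llama-2-7b-hf is a gated repo, so the access token shown is a placeholder you must supply yourself):

```python
from transformers import AutoTokenizer

# Requires a Hugging Face token with access granted to the gated repo.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", token="hf_...")
```
|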
same error on model codellama/CodeLlama-13b-hf |
No, the json is valid, you are just not working on a recent enough version of transformers. |
I got the same error. I read that we have to edit tokenizer_config.json. I did that on Hugging Face by changing the json file available in 'Files and versions' of decapoda-research/llama-7b-hf, but I am not sure how to use the edited changes. This is my code:

```python
import torch
print("Import Complete")
model = "meta-llama/Llama-2-7b-hf"
```

I get the same error: ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported. |
You should use |
Any solution to this error ValueError: Tokenizer class LlamaTokenizer does not exist or is not currently imported. ? |
Install this library: `pip install sentencepiece` |
System Info
transformers version: 4.27.1

Who can help?
No response

Information
- The official example scripts
- My own modified scripts

Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)

Reproduction
I tested LLaMA in Colab. Here is my code and output:

```python
!pip install git+https://github.com/huggingface/transformers
!pip install sentencepiece

import torch
from transformers import pipeline, LlamaTokenizer, LlamaForCausalLM

device = "cuda:0" if torch.cuda.is_available() else "cpu"
print(device)

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
generator = pipeline(model="decapoda-research/llama-7b-hf", device=device)
generator("I can't believe you did such a ")
```
```
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 # tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
      8 # model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
----> 9 generator = pipeline(model="decapoda-research/llama-7b-hf", device=device)
     10 generator("I can't believe you did such a ")

1 frames
/usr/local/lib/python3.9/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    675
    676         if tokenizer_class is None:
--> 677             raise ValueError(
    678                 f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported."
    679             )

ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
```
Expected behavior
Expected the pipeline to load and generate text output.