
Encountering error when loading Valley2 7b with transformers 4.28.0 dev0 #12

Open
BinZhu-ece opened this issue Sep 1, 2023 · 1 comment



BinZhu-ece commented Sep 1, 2023

I encountered the following error when loading Valley2 7b with transformers.

Code:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("luoruipu1/Valley2-7b", cache_dir='./')

Error:

Traceback (most recent call last):
File "/remote-home/zhubin/A_LVLM/Valley/tmp.py", line 3, in
model = AutoModelForCausalLM.from_pretrained("luoruipu1/Valley2-7b", cache_dir='./')
File "/root/anaconda3/envs/valley/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
File "/root/anaconda3/envs/valley/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/root/anaconda3/envs/valley/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 723, in getitem
raise KeyError(key)
KeyError: 'valley'
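
For context, AutoConfig reads the model_type field from the checkpoint's config.json and looks it up in transformers' CONFIG_MAPPING, so the KeyError means this transformers build has no entry for "valley". A minimal sketch to confirm the declared model type, assuming only that huggingface_hub is installed:

import json
from huggingface_hub import hf_hub_download

# Fetch just the config file and print the model type it declares.
cfg_path = hf_hub_download("luoruipu1/Valley2-7b", filename="config.json")
with open(cfg_path) as f:
    print(json.load(f)["model_type"])  # prints "valley", unknown to stock transformers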

pip list (excerpt):

torch 2.0.1
torchvision 0.15.2
tqdm 4.66.1
transformers 4.32.1
triton 2.0.0
typing_extensions 4.7.1
tzdata 2023.3
uc-micro-py 1.0.2
urllib3 2.0.4
uvicorn 0.23.2
valley 0.1.0 Valley
wandb 0.15.8
wavedrom 2.0.3.post3
wcwidth 0.2.6
websockets 11.0.3
wheel 0.38.4
yarl 1.9.2

RupertLuo (Owner) commented:

Because the "valley" model type is not supported by AutoModelForCausalLM, you need to download the model weights locally and then load the model with the following code:

import torch
from transformers import AutoTokenizer
from valley.model.valley import ValleyLlamaForCausalLM
# The DEFAULT_* special-token constants ship with the valley package; import
# them from wherever your checkout defines them (valley.util.config is an assumption).
from valley.util.config import (
    DEFAULT_IM_START_TOKEN, DEFAULT_IM_END_TOKEN,
    DEFAULT_VI_START_TOKEN, DEFAULT_VI_END_TOKEN,
    DEFAULT_VIDEO_FRAME_TOKEN, DEFAULT_IMAGE_PATCH_TOKEN,
)

def init_vision_token(model, tokenizer):
    # Register the ids of the special image/video tokens on the vision config.
    vision_config = model.get_model().vision_tower.config
    vision_config.im_start_token, vision_config.im_end_token = tokenizer.convert_tokens_to_ids([DEFAULT_IM_START_TOKEN, DEFAULT_IM_END_TOKEN])
    vision_config.vi_start_token, vision_config.vi_end_token = tokenizer.convert_tokens_to_ids([DEFAULT_VI_START_TOKEN, DEFAULT_VI_END_TOKEN])
    vision_config.vi_frame_token = tokenizer.convert_tokens_to_ids(DEFAULT_VIDEO_FRAME_TOKEN)
    vision_config.im_patch_token = tokenizer.convert_tokens_to_ids([DEFAULT_IMAGE_PATCH_TOKEN])[0]

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# input the query
query = "Describe the video concisely."
# input the system prompt
system_prompt = "You are Valley, a large language and vision assistant trained by ByteDance. You are able to understand the visual content or video that the user provides, and assist the user with a variety of tasks using natural language. Follow the instructions carefully and explain your answers in detail."

model_path = "THE MODEL PATH"    # path to the locally downloaded weights
video_file = "THE VIDEO PATH"    # path to the input video
model = ValleyLlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_path)
init_vision_token(model, tokenizer)
model = model.to(device)
model.eval()

# we support openai format input
message = [ {"role": "system", "content": system_prompt},
            {"role": "user", "content": "Hi!"},
            {"role": "assistant", "content": "Hi there! How can I help you today?"},
            {"role": "user", "content": query}]

gen_kwargs = dict(
    do_sample=True,
    temperature=0.2,
    max_new_tokens=1024,
)
response = model.completion(tokenizer, video_file, message, gen_kwargs, device)
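
If you would rather keep the AutoModelForCausalLM call from the original report, transformers also lets you register a custom model type with the Auto classes. A minimal sketch, assuming the valley package exposes a config class for the "valley" model type (called ValleyConfig here; adjust the import to whatever the repo actually defines):

import torch
from transformers import AutoConfig, AutoModelForCausalLM
# ValleyConfig is an assumed name for the config class behind model_type "valley".
from valley.model.valley import ValleyConfig, ValleyLlamaForCausalLM

AutoConfig.register("valley", ValleyConfig)
AutoModelForCausalLM.register(ValleyConfig, ValleyLlamaForCausalLM)

# With the registration in place, the original call can resolve model_type "valley".
model = AutoModelForCausalLM.from_pretrained("luoruipu1/Valley2-7b",
                                             torch_dtype=torch.float16,
                                             cache_dir='./')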
