Hello, and thank you for the great work! I ran into an issue while using the Pangea-7B-hf fine-tuned multilingual model. The model is multimodal, but I only need text input and do not need the image functionality. When I run the model, I get an error saying that image input is required, even though I do not intend to pass an image. Is there a way to disable image input and use only the text functionality? Below is the code template I am using:
# Assuming that you have text_input and image_path
from transformers import LlavaNextForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

image_input = Image.open(image_path)
model = LlavaNextForConditionalGeneration.from_pretrained(
    "neulab/Pangea-7B-hf",
    torch_dtype=torch.float16
).to(0)
processor = AutoProcessor.from_pretrained("neulab/Pangea-7B-hf")
model.resize_token_embeddings(len(processor.tokenizer))

# The <image> placeholder marks where the image features are inserted into the prompt.
text_input = f"<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<image>\n{text_input}<|im_end|>\n<|im_start|>assistant\n"
model_inputs = processor(images=image_input, text=text_input, return_tensors='pt').to("cuda", torch.float16)
output = model.generate(**model_inputs, max_new_tokens=1024, min_new_tokens=32, temperature=1.0, top_p=0.9, do_sample=True)
output = output[0]
result = processor.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=False)
print(result)
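For reference, here is a minimal sketch of the text-only call I am trying to get working. It has not been confirmed by the maintainers. It assumes that the installed transformers version lets `LlavaNextForConditionalGeneration.generate` run without `pixel_values`, and that calling `processor.tokenizer` directly (instead of the full processor) sidesteps the image requirement. The helper names `build_text_prompt`, `load_pangea`, and `generate_text_only` are made up for illustration.

```python
def build_text_prompt(user_text, system_text="You are a helpful assistant."):
    """Build the chat prompt from the template above, without the <image> placeholder."""
    return (
        f"<|im_start|>system\n{system_text}<|im_end|>\n"
        f"<|im_start|>user\n{user_text}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def load_pangea(model_id="neulab/Pangea-7B-hf", device="cuda"):
    """Load the model and processor as in the template above (heavy imports kept local)."""
    import torch
    from transformers import AutoProcessor, LlavaNextForConditionalGeneration

    model = LlavaNextForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id)
    model.resize_token_embeddings(len(processor.tokenizer))
    return model, processor


def generate_text_only(model, processor, user_text, max_new_tokens=256):
    """Tokenize with the tokenizer only, so no pixel_values are passed to generate().

    Assumption: the model's forward pass skips the vision tower when
    pixel_values is absent; this may depend on the transformers version.
    """
    prompt = build_text_prompt(user_text)
    inputs = processor.tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the newly generated reply is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return processor.tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Usage would be `model, processor = load_pangea()` followed by `print(generate_text_only(model, processor, text_input))`. If this is not the intended way to use the model for text-only input, please let me know.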