Replies: 1 comment
-
There seems to be a mismatch in the tensor names between this model and what the surgery script expects. The tensor names in
Whereas in https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf/blob/main/model.safetensors.index.json they look like this:
Could you try the following and see if it works for you:
$ git diff
diff --git a/examples/llava/llava_surgery_v2.py b/examples/llava/llava_surgery_v2.py
index 2d5b32fe..702ee16a 100644
--- a/examples/llava/llava_surgery_v2.py
+++ b/examples/llava/llava_surgery_v2.py
@@ -40,7 +40,7 @@ def clean_vision_tower_from_checkpoint(checkpoint_path):
     # file_type = 'pytorch'
     model_path = os.path.dirname(checkpoint_path)
     print(f"Searching for vision tower tensors in {checkpoint_path}")
-    clip_tensors = [k for k, v in checkpoint.items() if (k.startswith("model.vision_tower") or k.startswith("vit."))]
+    clip_tensors = [k for k, v in checkpoint.items() if (k.startswith("model.vision_tower") or (k.startswith("vision_tower")) or k.startswith("vit."))]
     if len(clip_tensors) > 0:
         print(f"Found {len(clip_tensors)} tensors to extract from {checkpoint_path}")
@@ -88,7 +88,7 @@ def newline_criteria(checkpoint):
     return any(k.startswith("model.image_newline") for k in checkpoint.keys())
 def proj_criteria(checkpoint):
-    return any(k.startswith("model.mm_projector") or k.startswith("vision_proj.") for k in checkpoint.keys())
+    return any(k.startswith("model.mm_projector") or k.startswith("multi_modal_projector") or k.startswith("vision_proj.") for k in checkpoint.keys())
 # Command-line interface setup
@@ -130,7 +130,7 @@ mm_tensors = []
 last_checkpoint = None
 if projector_checkpoint_path is not None:
     last_checkpoint, file_type = load_model(projector_checkpoint_path)
-    mm_tensors = [k for k, v in last_checkpoint.items() if k.startswith("model.mm_projector") or k.startswith("multi_modal_projector") or k.startswith("vision_proj.")]
+    mm_tensors = [k for k, v in last_checkpoint.items() if k.startswith("model.mm_projector") or k.startswith("multi_modal_projector") or k.startswith("vision_proj.")]
 if len(mm_tensors) == 0:
     if last_checkpoint is not None:
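To check which naming scheme a checkpoint uses before running the surgery script, something like the sketch below might help. It is not part of llama.cpp. It assumes the checkpoint ships a `model.safetensors.index.json` with the usual `weight_map` dictionary, and the helper names (`tensor_names_from_index`, `top_level_prefixes`, `summarize`) are made up for illustration. The prefix tuples are copied from the patched checks above.

```python
import json
from collections import Counter

# Prefixes copied from the patched checks in the diff above.
VISION_PREFIXES = ("model.vision_tower", "vision_tower", "vit.")
PROJECTOR_PREFIXES = ("model.mm_projector", "multi_modal_projector", "vision_proj.")


def tensor_names_from_index(index_path):
    """Read tensor names from a model.safetensors.index.json file.

    Assumes the standard layout where "weight_map" maps each
    tensor name to the shard file that contains it.
    """
    with open(index_path, encoding="utf-8") as f:
        return list(json.load(f)["weight_map"].keys())


def top_level_prefixes(tensor_names):
    """Count tensor names by their first dot-separated component."""
    return Counter(name.split(".", 1)[0] for name in tensor_names)


def summarize(tensor_names):
    """Report the top-level prefixes and how many names each check would match."""
    names = list(tensor_names)
    return {
        "prefixes": dict(top_level_prefixes(names)),
        # str.startswith accepts a tuple and matches any of its entries.
        "vision": sum(n.startswith(VISION_PREFIXES) for n in names),
        "projector": sum(n.startswith(PROJECTOR_PREFIXES) for n in names),
    }


if __name__ == "__main__":
    # Illustrative names only; replace with tensor_names_from_index(<path>).
    example = [
        "vision_tower.vision_model.embeddings.class_embedding",
        "multi_modal_projector.linear_1.weight",
        "language_model.model.embed_tokens.weight",
    ]
    print(summarize(example))
```

If `vision` or `projector` comes back as 0 for a real index file, the script's prefix checks are probably not matching that checkpoint's names, and the `prefixes` counts should show which names it uses instead.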
-
I am trying to convert llava-v1.6-mistral-7b-hf (downloaded from https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf/tree/main) to GGUF.
I followed the instructions at https://github.com/ggerganov/llama.cpp/tree/master/examples/llava for the conversion.
When I run "python examples/llava/llava_surgery_v2.py -C -m C:\Users\DELL\llama-factory\LLaMA-Factory\model\llava-next-mistral"
it shows
However, I can convert https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b successfully.
Why?