
MiniCPM-V-4_5 LoRA fine-tuning fails with RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x4096 and 1x4194304) #9108

@litterGuy

Description

Reminder

  • I have read the above rules and searched the existing issues.

System Info

Package Version Editable project location


accelerate 1.7.0
aiofiles 24.1.0
aiohappyeyeballs 2.6.1
aiohttp 3.12.15
aiosignal 1.4.0
annotated-types 0.7.0
anyio 4.10.0
attrs 25.3.0
audioread 3.0.1
av 15.1.0
bitsandbytes 0.47.0
brotli 1.1.0
certifi 2025.8.3
cffi 2.0.0b1
charset-normalizer 3.4.3
click 8.2.1
contourpy 1.3.3
cut-cross-entropy 25.1.1
cycler 0.12.1
datasets 3.6.0
decorator 5.2.1
diffusers 0.35.1
dill 0.3.8
docstring-parser 0.17.0
einops 0.8.1
fastapi 0.116.1
ffmpy 0.6.1
filelock 3.19.1
fire 0.7.1
fonttools 4.59.2
frozenlist 1.7.0
fsspec 2025.9.0
gradio 5.42.0
gradio-client 1.11.1
groovy 0.1.2
h11 0.16.0
hf-transfer 0.1.9
hf-xet 1.1.9
httpcore 1.0.9
httpx 0.28.1
huggingface-hub 0.35.0rc0
idna 3.10
importlib-metadata 8.7.0
jieba 0.42.1
jinja2 3.1.6
joblib 1.5.2
kiwisolver 1.4.10rc0
lazy-loader 0.4
librosa 0.11.0
llamafactory 0.9.4.dev0 /data/works/LLaMA-Factory
llvmlite 0.45.0rc1
markdown-it-py 4.0.0
markupsafe 3.0.2
matplotlib 3.10.6
mdurl 0.1.2
modelscope 1.29.1
mpmath 1.3.0
msgpack 1.1.1
msgspec 0.19.0
multidict 6.6.4
multiprocess 0.70.16
networkx 3.5
nltk 3.9.1
numba 0.62.0rc1
numpy 2.3.3
nvidia-cublas-cu12 12.9.1.4
nvidia-cuda-cupti-cu12 12.9.79
nvidia-cuda-nvrtc-cu12 12.9.86
nvidia-cuda-runtime-cu12 12.9.79
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.4.1.4
nvidia-cufile-cu12 1.14.1.1
nvidia-curand-cu12 10.3.10.19
nvidia-cusolver-cu12 11.7.5.82
nvidia-cusparse-cu12 12.5.10.65
nvidia-cusparselt-cu12 0.7.1
nvidia-nccl-cu12 2.27.3
nvidia-nvjitlink-cu12 12.9.86
nvidia-nvtx-cu12 12.9.79
omegaconf 2.4.0.dev3
orjson 3.11.3
packaging 25.0
pandas 2.3.2
peft 0.17.2.dev0
pillow 11.3.0
platformdirs 4.4.0
pooch 1.8.2
propcache 0.3.2
protobuf 6.32.0
psutil 7.0.0
pyarrow 21.0.0
pycparser 2.22
pydantic 2.10.6
pydantic-core 2.27.2
pydub 0.25.1
pygments 2.19.2
pyparsing 3.2.3
python-dateutil 2.9.0.post0
python-multipart 0.0.20
pytz 2025.2
pyyaml 6.0.2
regex 2025.9.1
requests 2.32.5
rich 13.9.4
rouge-chinese 1.0.3
ruff 0.12.11
safehttpx 0.1.6
safetensors 0.6.2
scikit-learn 1.7.1
scipy 1.16.1
semantic-version 2.10.0
sentencepiece 0.2.1
setuptools 80.9.0
shellingham 1.5.4
shtab 1.7.2
six 1.17.0
sniffio 1.3.1
soundfile 0.13.1
soxr 0.5.0.post1
sse-starlette 3.0.2
starlette 0.47.3
sympy 1.14.0
termcolor 3.1.0
threadpoolctl 3.6.0
tiktoken 0.11.0
tokenizers 0.21.4
tomlkit 0.13.3
torch 2.8.0+cu129
torchao 0.13.0
torchaudio 2.8.0+cu129
torchvision 0.23.0+cu129
tqdm 4.67.1
transformers 4.55.0
triton 3.4.0
trl 0.9.6
typer 0.17.3
typing-extensions 4.15.0
tyro 0.8.14
tzdata 2025.2
unsloth 2025.9.1
unsloth-zoo 2025.9.2
urllib3 2.5.0
uvicorn 0.35.0
websockets 15.0.1
wheel 0.45.1
xformers 0.0.32.post2
xxhash 3.5.0
yarl 1.20.1
zipp 3.23.0

Reproduction

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path /data/models/openbmb/MiniCPM-V-4_5 \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template minicpm_v \
    --flash_attn auto \
    --dataset_dir /data/woli/dataset \
    --dataset woli \
    --cutoff_len 2048 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --packing False \
    --enable_thinking True \
    --report_to none \
    --output_dir saves/MiniCPM-V-4_5/lora/train_2025-09-10-14-06-01 \
    --bf16 True \
    --plot_loss True \
    --trust_remote_code True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --optim adamw_torch \
    --quantization_bit 4 \
    --quantization_method bnb \
    --double_quantization True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --freeze_vision_tower True \
    --freeze_multi_modal_projector True \
    --image_max_pixels 589824 \
    --image_min_pixels 1024 \
    --video_max_pixels 65536 \
    --video_min_pixels 256

The error message is as follows:

  File "/data/works/LLaMA-Factory/.venv/bin/llamafactory-cli", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/works/LLaMA-Factory/src/llamafactory/cli.py", line 151, in main
    COMMAND_MAP[command]()
  File "/data/works/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/data/works/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/works/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 96, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 2238, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 2582, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 3796, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/src/llamafactory/train/sft/trainer.py", line 108, in compute_loss
    return super().compute_loss(model, inputs, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/transformers/trainer.py", line 3884, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/accelerate/utils/operations.py", line 818, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/accelerate/utils/operations.py", line 806, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/peft/peft_model.py", line 1885, in forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 228, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/MiniCPM-V-4_5/modeling_minicpmv.py", line 206, in forward
    vllm_embedding, vision_hidden_states = self.get_vllm_embedding(data)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/MiniCPM-V-4_5/modeling_minicpmv.py", line 127, in get_vllm_embedding
    vision_embedding = self.resampler(vision_embedding, tgt_sizes, all_temporal_ids)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1879, in _call_impl
    return inner()
           ^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1827, in inner
    result = forward_call(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/MiniCPM-V-4_5/resampler.py", line 232, in forward
    out = self.batch_attn_forward(q, k, v, pos_embed_temporal, temporal_ids, key_padding_mask)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/huggingface/modules/transformers_modules/MiniCPM-V-4_5/resampler.py", line 274, in batch_attn_forward
    out = self.attn(
          ^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/modules/activation.py", line 1380, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/functional.py", line 6191, in multi_head_attention_forward
    return handle_torch_function(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/overrides.py", line 1747, in handle_torch_function
    result = torch_func_method(public_api, types, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 397, in __torch_function__
    return super().__torch_function__(func, types, args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/works/LLaMA-Factory/.venv/lib/python3.11/site-packages/torch/nn/functional.py", line 6457, in multi_head_attention_forward
    attn_output = linear(attn_output, out_proj_weight, out_proj_bias)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x4096 and 1x4194304)
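
The shapes in the error are consistent with the resampler's `nn.MultiheadAttention` output projection (logically a 4096×4096 matrix) having been replaced by a bitsandbytes 4-bit packed buffer when `--quantization_bit 4` is applied: `F.multi_head_attention_forward` then calls `linear()` against the packed storage shape (1×4194304) instead of the logical weight shape, so the inner dimensions no longer agree. The element counts can be checked directly (an illustration of the arithmetic, not the model's actual code; the exact packing layout depends on bitsandbytes internals):

```python
# Logical out_proj weight of the resampler's cross-attention.
embed_dim = 4096
logical_elems = embed_dim * embed_dim  # 16_777_216 elements

# Shapes reported by the RuntimeError.
mat1_shape = (128, 4096)       # attention output: tokens x embed_dim
mat2_shape = (1, 4_194_304)    # the weight as seen by linear()

packed_elems = mat2_shape[0] * mat2_shape[1]

# The buffer that linear() received holds exactly 1/4 of the logical
# element count, i.e. the weight is in a packed quantized container
# rather than its (4096, 4096) form, so the matmul fails:
# mat1's inner dim (4096) cannot meet mat2's leading dim (1).
assert logical_elems == 4 * packed_elems
assert mat1_shape[1] != mat2_shape[0]
```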

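If this diagnosis holds, one possible direction (untested here, and whether LLaMA-Factory exposes it for this flow is not confirmed by the report) is to exclude the vision-side modules from bnb quantization so the resampler's `nn.MultiheadAttention` keeps a dense `out_proj`, for example via `BitsAndBytesConfig.llm_int8_skip_modules` (which also governs 4-bit skipping). The module names below are assumptions taken from the traceback and should be verified against the model's `named_modules()`:

```python
from transformers import BitsAndBytesConfig

# Hypothetical workaround sketch: quantize only the language model and
# leave the resampler / vision tower ("resampler", "vpm" are assumed
# names) in full precision, matching the frozen-vision setup above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype="bfloat16",
    llm_int8_skip_modules=["resampler", "vpm"],
)
```

Alternatively, dropping `--quantization_bit 4` / `--quantization_method bnb` from the command above should confirm whether quantization is the trigger.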
Others

No response

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working), pending (This problem is yet to be addressed)
