
Fix check_share_embedding #2232

Closed
wants to merge 1 commit into from

Conversation

lkm2835
Contributor

@lkm2835 lkm2835 commented Sep 17, 2024

Related to #2226: the `use_embedding_sharing` option is not working for the llama model.

Reproduce

I used the open Hugging Face model HuggingFaceTB/SmolLM-1.7B (`tie_word_embeddings=True`).

python /app/tensorrt_llm/examples/llama/convert_checkpoint.py \
                            --model_dir ${MODEL_DIR} \
                            --output_dir ${MODEL_DIR}/tensorrt/${TP_SIZE}-gpu \
                            --tp_size 1 \
                            --use_embedding_sharing \
                            --load_model_on_cpu \
                            --dtype float16
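
Whether a checkpoint ties its embeddings can be read from its `config.json`. A minimal self-contained sketch (a toy config is written to a temp directory here so the example runs on its own; in practice you would read `${MODEL_DIR}/config.json`):

```python
import json
import os
import tempfile

# Toy config standing in for a downloaded checkpoint's config.json.
config = {"model_type": "llama", "tie_word_embeddings": True}
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(config, f)

# Read the flag back; it defaults to False when absent.
with open(path) as f:
    tie = json.load(f).get("tie_word_embeddings", False)
print(tie)  # True for SmolLM-1.7B-style tied checkpoints
```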

Error Message

[TensorRT-LLM] TensorRT-LLM version: 0.14.0.dev2024091000
0.14.0.dev2024091000
[09/17/2024-15:30:41] [TRT-LLM] [I] Loading weights from Huggingface Llama safetensors...
[09/17/2024-15:30:44] [TRT-LLM] [I] Weights loaded. Total time: 00:00:02
Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 497, in <module>
    main()
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 489, in main
    convert_and_save_hf(args)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 431, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 438, in execute
    f(args, rank)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 417, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 373, in from_hugging_face
    check_share_embedding(weights, config)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/modeling_utils.py", line 1290, in check_share_embedding
    if (weights["lm_head.weight"] -
TypeError: unsupported operand type(s) for -: 'NoneType' and 'Tensor'

The error occurs because, when `tie_word_embeddings` is True, `model.safetensors` does not contain `lm_head.weight`:
https://huggingface.co/HuggingFaceTB/SmolLM-1.7B/blob/main/model.safetensors.index.json
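
The failure mode can be sketched in a few lines. This is a simplified, hypothetical stand-in for `check_share_embedding` (plain lists replace torch tensors, and the key names mirror the traceback rather than the actual TensorRT-LLM source):

```python
def check_share_embedding(weights):
    """Return True when lm_head can share the vocab embedding weights.

    Hypothetical simplification: the real function compares torch tensors;
    plain lists stand in here so the sketch is self-contained.
    """
    lm_head = weights.get("lm_head.weight")
    vocab_emb = weights.get("vocab_embedding.weight")
    # The original code subtracted the two tensors unconditionally, so a
    # checkpoint saved with tie_word_embeddings=True (lm_head missing, i.e.
    # None) raised:
    #   TypeError: unsupported operand type(s) for -: 'NoneType' and 'Tensor'
    # Guarding for None treats such a checkpoint as already shared.
    if lm_head is None:
        return True
    if vocab_emb is None:
        return False
    return lm_head == vocab_emb
```

With the guard in place, a tied checkpoint no longer crashes the conversion, and untied checkpoints still get the element-wise comparison.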

@Barry-Delaney
Collaborator

Hi @lkm2835, thanks for the PR!
We are going to unify the conversion scripts with the ModelWeightsLoader, and this PR can help with the legacy path to convert checkpoints with lm_head and without vocab_embedding.
We will merge it first, and thanks for your contribution!

@kaiyux kaiyux mentioned this pull request Sep 24, 2024
@lkm2835 lkm2835 closed this Sep 26, 2024
@lkm2835 lkm2835 deleted the fix branch October 27, 2024 09:45