
[Bug] chatglm4 mlc_llm shows error "TVMError: Check failed: append_length > 0 (0 vs. 0) : Append with length 0 is not allowed." during mlc_llm chat CLI #2517

@lihaofd

Description

mlc-ai-nightly-cu122 0.15.dev404
mlc-llm-nightly-cu122 0.1.dev1355
transformers 4.41.2

git clone https://huggingface.co/THUDM/glm-4-9b-chat
mlc_llm convert_weight ./dist/models/glm-4-9b-chat/ --quantization q4f16_1 -o dist/glm-4-9b-chat-MLC

mlc_llm gen_config ./dist/models/glm-4-9b-chat/ --quantization q4f16_1 --conv-template glm -o dist/glm-4-9b-chat-MLC/
It shows:
The repository for dist/models/glm-4-9b-chat contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/dist/models/glm-4-9b-chat.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

After adding trust_remote_code=True to the AutoTokenizer call:
fast_tokenizer = AutoTokenizer.from_pretrained(str(config.parent), use_fast=True, trust_remote_code=True)

it then fails with the error:
AttributeError: 'ChatGLM4Tokenizer' object has no attribute 'backend_tokenizer'
/workspace/mlc-llm/cpp/tokenizers/tokenizers.cc:154: Warning: Tokenizer info is not detected as tokenizer.json is not found. The default tokenizer info will be used.
Segmentation fault (core dumped)
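
ChatGLM4Tokenizer is a custom "slow" tokenizer shipped with the model repository, so it exposes no Rust-backed backend_tokenizer that could be saved out as tokenizer.json. A minimal sketch of a guard around that export step, using the model/output paths from the commands above (the actual gen_config code may differ, this is only an illustration):

from pathlib import Path
from transformers import AutoTokenizer

model_dir = Path("dist/models/glm-4-9b-chat")
output = Path("dist/glm-4-9b-chat-MLC")

tokenizer = AutoTokenizer.from_pretrained(
    str(model_dir), use_fast=True, trust_remote_code=True
)
if hasattr(tokenizer, "backend_tokenizer"):
    # Fast tokenizers wrap a tokenizers.Tokenizer object that can be dumped
    # directly to tokenizer.json.
    tokenizer.backend_tokenizer.save(str(output / "tokenizer.json"))
else:
    # ChatGLM4Tokenizer only ships tokenizer.model / tokenizer_config.json,
    # so later steps need to cope with a missing tokenizer.json instead of crashing.
    print("No fast backend available; skipping tokenizer.json export")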

The segmentation fault happens in mlc_chat_config.tokenizer_info = asdict(Tokenizer.detect_tokenizer_info(str(output))).
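
For reference, a minimal sketch of that call in isolation (the import path is an assumption; Tokenizer.detect_tokenizer_info and asdict come from the line above):

from dataclasses import asdict
from mlc_llm.tokenizers import Tokenizer  # import path is an assumption

# The output directory holds tokenizer.model and tokenizer_config.json but no
# tokenizer.json; instead of falling back to the "default tokenizer info" that
# the warning promises, this call crashes with a segmentation fault.
info = Tokenizer.detect_tokenizer_info("dist/glm-4-9b-chat-MLC")
print(asdict(info))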
