mlc-ai-nightly-cu122 0.15.dev404
mlc-llm-nightly-cu122 0.1.dev1355
transformers 4.41.2
git clone https://huggingface.co/THUDM/glm-4-9b-chat
mlc_llm convert_weight ./dist/models/glm-4-9b-chat/ --quantization q4f16_1 -o dist/glm-4-9b-chat-MLC
mlc_llm gen_config ./dist/models/glm-4-9b-chat/ --quantization q4f16_1 --conv-template glm -o dist/glm-4-9b-chat-MLC/
Running `gen_config` shows:
The repository for dist/models/glm-4-9b-chat contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/dist/models/glm-4-9b-chat.
You can avoid this prompt in future by passing the argument trust_remote_code=True.
Do you wish to run the custom code? [y/N] y
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Adding `trust_remote_code=True` to the tokenizer loading call:
fast_tokenizer = AutoTokenizer.from_pretrained(str(config.parent), use_fast=True, trust_remote_code=True)
It then fails with:
AttributeError: 'ChatGLM4Tokenizer' object has no attribute 'backend_tokenizer'
/workspace/mlc-llm/cpp/tokenizers/tokenizers.cc:154: Warning: Tokenizer info is not detected as tokenizer.json is not found. The default tokenizer info will be used.
Segmentation fault (core dumped)
The line `mlc_chat_config.tokenizer_info = asdict(Tokenizer.detect_tokenizer_info(str(output)))` is what hits the segmentation fault.