
[FT] Support GGUF in vllm, and use HF tokenizer together #943

@JIElite


Issue encountered

Currently, vLLM supports loading GGUF models directly, and its documentation suggests using the tokenizer from Hugging Face rather than the one embedded in the GGUF file. This does not seem to be supported in the current lighteval.
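For reference, the pattern in question looks roughly like the sketch below: the GGUF file is passed as model and a Hugging Face repository as tokenizer. The file path and repository name are placeholders, and the helper gguf_engine_kwargs is hypothetical; only the model and tokenizer keyword arguments of vllm.LLM come from vLLM itself.

```python
def gguf_engine_kwargs(gguf_path: str, hf_tokenizer: str) -> dict:
    """Build keyword arguments for vllm.LLM (hypothetical helper, not part of vLLM)."""
    return {"model": gguf_path, "tokenizer": hf_tokenizer}


kwargs = gguf_engine_kwargs(
    "./model.Q4_K_M.gguf",       # placeholder path to a local GGUF file
    "some-org/some-base-model",  # placeholder Hugging Face repository
)

# With vLLM installed, the engine would then be created with:
#   from vllm import LLM
#   llm = LLM(**kwargs)
```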


Solution/Feature

I want to use lighteval to evaluate a GGUF model with an HF tokenizer rather than the tokenizer embedded in the GGUF file.

To support this feature, we can modify the file src/lighteval/models/vllm/vllm_model.py.

First, add a tokenizer_name field after model_name at line 147:

tokenizer_name: str | None = None

Second, modify _create_auto_tokenizer at line 293 so that it falls back to model_name when tokenizer_name is not set:

tokenizer = get_tokenizer(
    config.tokenizer_name if config.tokenizer_name else config.model_name,
    tokenizer_mode="auto",
    trust_remote_code=config.trust_remote_code,
    revision=config.revision,
)
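The fallback in the snippet above can be checked in isolation. The following is a minimal sketch, assuming a stand-in config class: SketchConfig and resolve_tokenizer_source are hypothetical names used only for illustration and are not part of lighteval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Stand-in for the vLLM model config; not the real lighteval class."""

    model_name: str
    tokenizer_name: Optional[str] = None  # proposed optional field


def resolve_tokenizer_source(config: SketchConfig) -> str:
    """Prefer tokenizer_name; fall back to model_name when it is unset or empty."""
    return config.tokenizer_name if config.tokenizer_name else config.model_name


# Without tokenizer_name, the tokenizer is loaded from the model path as before.
only_model = SketchConfig(model_name="./model.gguf")

# With tokenizer_name, the HF repository (placeholder name) takes precedence.
with_tokenizer = SketchConfig(
    model_name="./model.gguf",
    tokenizer_name="some-org/some-base-model",
)
```

Because the check is a truthiness test, an empty string behaves like None, so existing configurations that never set tokenizer_name keep their current behavior.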

Third, move the self.use_chat_template = uses_chat_template( ... ) assignment so that it comes after self._tokenizer = self._create_auto_tokenizer(config), because it now needs the tokenizer:

self._tokenizer = self._create_auto_tokenizer(config)
self.use_chat_template = uses_chat_template(
    model_name=config.model_name,
    tokenizer=self.tokenizer,
    override_chat_template=config.override_chat_template,
)
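The reason the order matters can be shown with a small sketch. Everything here is a stand-in: SketchTokenizer, RightOrder, WrongOrder, and the chat-template check are hypothetical and only mimic the dependency between the two attributes, not the real lighteval or uses_chat_template logic.

```python
class SketchTokenizer:
    """Stand-in tokenizer that may or may not carry a chat template."""

    def __init__(self, chat_template=None):
        self.chat_template = chat_template


class RightOrder:
    """Creates the tokenizer first, then runs the check that reads it."""

    def __init__(self, chat_template=None):
        self._tokenizer = SketchTokenizer(chat_template)
        # Hypothetical check: a template is "used" if the tokenizer has one.
        self.use_chat_template = self._tokenizer.chat_template is not None


class WrongOrder:
    """Runs the check before the tokenizer exists, which fails."""

    def __init__(self, chat_template=None):
        # self._tokenizer has not been assigned yet, so this raises AttributeError.
        self.use_chat_template = self._tokenizer.chat_template is not None
        self._tokenizer = SketchTokenizer(chat_template)


ok = RightOrder(chat_template="{{ messages }}")

try:
    WrongOrder(chat_template="{{ messages }}")
    wrong_order_failed = False
except AttributeError:
    wrong_order_failed = True
```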

Known issues

  1. The log message needs to be changed, because it currently reports an error when loading a GGUF model.

Currently, vLLM doesn't support full-precision GGUF, which means that GGUF files in fp16, bf16, or fp32 cannot be loaded successfully in vLLM. Community developers are working on this issue, and I expect it to be resolved in the next vLLM release.

Possible alternatives

None, unknown.
