Output shows <|endoftext|> tokens (#10604)
UltraWelfare asked this question in Q&A · Unanswered · 2 comments · 1 reply
- This is most likely caused by an incorrect configuration of the model tokenizer. From a quick look at https://huggingface.co/openGPT-X/Teuken-7B-instruct-commercial-v0.4/blob/main/tokenizer_config.json it seems that this model does not have a …
- See also the discussion I've started at #10539: llama.cpp doesn't actually support the Teuken models.
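For reference, one quick way to verify the tokenizer-configuration point from the first reply is to fetch the model's tokenizer_config.json and look for a chat_template entry; a missing template is a common reason why a converted GGUF model never emits a proper stop token and keeps "answering itself". The snippet below is only an illustrative sketch, assuming the Hugging Face repo files are publicly downloadable without authentication.

```python
# Illustrative check (not part of llama.cpp): inspect tokenizer_config.json
# for a chat_template and the declared eos_token. Assumes the files can be
# downloaded anonymously from the Hugging Face repo.
import json
import urllib.request

URL = (
    "https://huggingface.co/openGPT-X/Teuken-7B-instruct-commercial-v0.4"
    "/resolve/main/tokenizer_config.json"
)

with urllib.request.urlopen(URL) as resp:
    cfg = json.load(resp)

print("chat_template present:", bool(cfg.get("chat_template")))
print("eos_token:", cfg.get("eos_token"))
print("added special tokens:", len(cfg.get("added_tokens_decoder", {})))
```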
- I'm trying two models converted to GGUF using the GGUF-my-repo space: Model 1 and Model 2. They both face the same issue: their output contains `<|endoftext|>` or `<|im_end|>` tokens, and they start questioning and answering themselves. I'm starting `llama-server` like this: `.\llama-server --model .\teuken-7b-instruct-commercial-v0.4-q6_k.gguf`. Setting the context size (parameter `-c`) doesn't change the output. Other models, such as Llama 3.2 and Llama 3.1, work correctly; I'm not sure what is up with these two specifically.
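As a stop-gap while the missing chat template is unresolved, the leaked special tokens can at least be trimmed by passing explicit stop strings to the server's /completion endpoint. This is only a sketch under stated assumptions: it assumes the `llama-server` started with the command above is listening on the default port 8080, and the prompt is a hypothetical placeholder.

```python
# Illustrative workaround: ask llama-server to stop generating as soon as
# one of the leaked special tokens appears. Assumes the server is running
# locally at http://localhost:8080 (the default).
import json
import urllib.request

payload = {
    "prompt": "What is the capital of Germany?",  # hypothetical prompt
    "n_predict": 128,
    # Treat the leaked special tokens as stop strings so they are cut from
    # the output and the model cannot keep questioning/answering itself.
    "stop": ["<|endoftext|>", "<|im_end|>"],
}

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

print(result["content"])
```

Note that this only hides the stray tokens; it does not apply the Teuken instruct formatting, so the underlying template/support issue remains.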