Gibberish responses with Llama-2-13B #596

Open
rlleshi opened this issue Aug 10, 2023 · 9 comments
Labels: model (Model specific issue), quality (Quality of model output)

Comments


rlleshi commented Aug 10, 2023

I am testing this nice Python wrapper for llama.cpp, but the model's responses don't make much sense.

llm = Llama(model_path="./models/llama-2-13b.ggmlv3.q4_0.bin", n_gpu_layers=35, n_ctx=2048)
output = llm("What is the capital of Germany? Answer only with the name of the capital.", echo=True, temperature=0, max_tokens=512)

Gives the following output:

What is the capital of Germany? Answer only with the name of the capital.
What is the capital of France? Answer only with the name of the capital.
What is the capital of Italy? Answer only with the name of the capital.
What is the capital of Spain? Answer only with the name of the capital.
What is the capital of Portugal? Answer only with the name of the capital.
....

I wonder if the default hyperparameters of llama-cpp-python differ significantly from those of llama.cpp?

Either way, this kind of response shouldn't happen. I tested similar prompts and the model easily breaks down as above.

Needless to say, the responses are as expected when using llama.cpp itself.

Am I missing something?

gjmulder added the model (Model specific issue) and quality (Quality of model output) labels on Aug 10, 2023
@rd-neosoft

I had a similar issue with Hugging Face; adding repetition_penalty along with temperature, top_p, and top_k solved it for me.


rlleshi commented Aug 14, 2023

I mean, temperature shouldn't really play a role in gibberish responses, right?

Otherwise, the other parameters are left at their defaults in llama-cpp-python (and I think repetition_penalty is already applied somewhere downstream). Did you use values different from the defaults of top_k=40 and top_p=0.95?

rlleshi changed the title from "Gibberish responses" to "Gibberish responses with Llama-2-13B" on Aug 15, 2023
rlleshi mentioned this issue on Aug 15, 2023

rd-neosoft commented Aug 17, 2023

@rlleshi Yes, repetition_penalty is used; I think its default is 1.0. But when you are getting gibberish responses, it's better to try different repetition_penalty, top_k, and top_p values.
In my case repetition_penalty=1.2, top_k=10, top_p=0.95 worked for me.
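
For reference, a minimal sketch of passing those values through llama-cpp-python's completion call, reusing the llm object from the original report. Note that llama-cpp-python names the parameter repeat_penalty (repetition_penalty is the Hugging Face transformers name), and the non-zero temperature here is just an illustrative choice:

# Sketch: sampling parameters suggested above, applied to the llm from the original report.
# repeat_penalty is llama-cpp-python's counterpart to Hugging Face's repetition_penalty.
output = llm(
    "What is the capital of Germany? Answer only with the name of the capital.",
    max_tokens=512,
    temperature=0.7,      # illustrative non-zero value
    top_k=10,
    top_p=0.95,
    repeat_penalty=1.2,
)
print(output["choices"][0]["text"])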


rlleshi commented Aug 18, 2023

So I've managed to avoid the repetition and gibberish output problems. But the model's output through this Python binding is still far less robust than the output of llama.cpp itself.

Just one example below:

Prompt: What is 10+10 - 100?

llama.cpp: The calculation is as follows:\n\n10 + 10 = 20\n\n20 - 100 = -80
llama-cpp-python: 20


geoHeil commented Sep 7, 2023

Interesting - I am facing similar issues. @rlleshi were you able to fully resolve these by now?


rlleshi commented Sep 18, 2023

@geoHeil nope


nkgrush commented Sep 20, 2023

@rlleshi @geoHeil I see a problem with your Llama prompting scheme. It was poorly documented at the Llama-2 release, but the system and user prompts must always be enclosed in [INST] all text that is not generated by the model goes here [/INST]. [INST] is a special sequence (multiple tokens) that marks user instructions. Otherwise you get next-token completions rather than chat responses, which seems to be the case here, rather than an issue with llama-cpp-python.

See docs for the complete prompting scheme: https://huggingface.co/blog/llama2#how-to-prompt-llama-2
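
To make that concrete, a minimal sketch of the template from the linked blog post applied to the example prompt from this issue, assuming a Llama-2 chat model is loaded into the same llm object; the system prompt text is only a placeholder:

# Sketch of the Llama-2 chat prompt format from the linked blog post.
# The [INST]/[/INST] and <<SYS>>/<</SYS>> markers are part of the official template.
# llama-cpp-python normally prepends the BOS token itself, so the literal <s> is omitted here.
system_prompt = "You are a helpful assistant."  # placeholder system prompt
user_message = "What is the capital of Germany? Answer only with the name of the capital."

prompt = (
    f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    f"{user_message} [/INST]"
)
output = llm(prompt, temperature=0, max_tokens=512)
print(output["choices"][0]["text"])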


rlleshi commented Sep 25, 2023

@nkgrush I actually tried the official prompting, but I was getting worse results that way.

I experimented with a bunch of other prompts. What worked best was """{user_input} \n\n### Response:\n""" with stop set to ["###"], and the content of user_input formatted like this: '### {m["role"]}: {m["content"]}'.
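
A minimal sketch of how that template could be assembled; build_prompt and messages are illustrative names, not part of llama-cpp-python:

# Sketch of the template described above.
# build_prompt and messages are illustrative names for this example only.
def build_prompt(messages):
    user_input = "\n".join(f'### {m["role"]}: {m["content"]}' for m in messages)
    return f"{user_input} \n\n### Response:\n"

messages = [{"role": "user", "content": "What is the capital of Germany?"}]
output = llm(build_prompt(messages), stop=["###"], temperature=0, max_tokens=512)
print(output["choices"][0]["text"])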

@earonesty (Contributor)

Version 0.2.7 got the chat/user/sys prompting right. Later versions seem to break it. Try downgrading.
