Gibberish responses with Llama-2-13B #596
Comments
Had a similar issue in huggingface; adding repetition_penalty along with temperature, top_p and top_k solved my issue.
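For reference, a minimal sketch of passing those sampling parameters through llama-cpp-python (the model path and values here are illustrative assumptions, not confirmed fixes; note the binding names the parameter `repeat_penalty`, not `repetition_penalty`):

```python
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-13b.ggmlv3.q4_0.bin")  # hypothetical path

out = llm(
    "Q: What is the capital of France? A:",
    max_tokens=64,
    temperature=0.7,     # illustrative values
    top_p=0.9,
    top_k=40,
    repeat_penalty=1.1,  # llama-cpp-python's name for the repetition penalty
)
print(out["choices"][0]["text"])
```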
I mean, temperature shouldn't really play a role in gibberish responses, right? Otherwise, the other parameters are set to their defaults in llama-cpp-python (and I think repetition_penalty is somehow used downstream). Did you use values different from the defaults?
@rlleshi For …
So I've managed to avoid the repetition and gibberish output problems. But still, the output of the model through this Python binding is far from being as robust as the output of llama.cpp itself. Just one example below. Prompt: What is 10+10 - 100?
Interesting - I am facing similar issues. @rlleshi, were you able to fully resolve these by now?
@geoHeil nope
@rlleshi @geoHeil I see a problem with your llama prompting scheme. It was poorly documented at the llama-2 release, but the system and user prompts must always be enclosed in `[INST] ... [/INST]`, with the system prompt additionally wrapped in `<<SYS>> ... <</SYS>>`. See the docs for the complete prompting scheme: https://huggingface.co/blog/llama2#how-to-prompt-llama-2
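A sketch of that template, following the linked Hugging Face post (the system prompt and question below are placeholders):

```python
def build_llama2_prompt(system: str, user: str) -> str:
    # Llama-2 chat format per the Hugging Face post linked above:
    # <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_llama2_prompt(
    "You are a helpful assistant.",  # placeholder system prompt
    "What is 10+10 - 100?",          # example question from this thread
)
```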
@nkgrush I actually tried out the official prompting, but I was getting worse results that way. Experimented with a bunch of other prompts. What was working best was: …
Version 0.2.7 got the chat/user/system prompting right; later versions seem to break it. Try downgrading (`pip install llama-cpp-python==0.2.7`).
I am testing this nice Python wrapper for llama.cpp, but the model's responses don't make much sense.
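A minimal sketch of the setup (the model path is a placeholder and the call uses the binding's defaults):

```python
from llama_cpp import Llama

# Placeholder path; any Llama-2-13B model file converted for llama.cpp applies.
llm = Llama(model_path="./llama-2-13b-chat.ggmlv3.q4_0.bin")

output = llm("Q: What is 10+10 - 100? A:", max_tokens=64)
print(output["choices"][0]["text"])
```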
Gives the following output:
I wonder if the default hyperparameters of llama-cpp-python significantly differ from llama.cpp?
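One way to check, assuming only the public `create_completion` signature: dump its keyword defaults and compare them with the sampling values llama.cpp's `./main` prints at startup.

```python
import inspect
from llama_cpp import Llama

# Print the default sampling parameters of create_completion so they can be
# compared against the values llama.cpp's ./main reports on launch.
for name, param in inspect.signature(Llama.create_completion).parameters.items():
    if param.default is not inspect.Parameter.empty:
        print(f"{name} = {param.default}")
```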
Either way, this kind of response shouldn't happen; I tested similar prompts and the model easily breaks down as above. Needless to say, the responses are as expected when using llama.cpp itself.
Am I missing something?