Some slightly odd results running inference (may just be prompt or tokenizing issue) #5
-
Continuing from #1 (comment): after converting the model to GGUF, I tried a few generations, but the results were a bit strange. I don't know that this implies any problem with the model; however, I can generally get pretty reasonable results from LLaMA models with similar prompting. Output from a few generations: https://gist.github.com/KerfuffleV2/a7dcbc9adab5506c0cb77de37653e51b

1. It started out pretty good, but then it got quite weird. The main thing I'd call strange in this one is "The fox jumped up and ran away with the two dead wolves in his mouth. [...] He was too weak to run very fast" - I'm not surprised he couldn't run fast, considering the size difference between wolves and foxes.
2. "小白总是会出现在小白的面前" ("Xiaobai would always appear right in front of Xiaobai") — he always popped up in front of himself? "这个时候小黑才意识到自己已经死了" ("only at this point did Xiaohei realize he was already dead") — it didn't seem to affect him too much. I can say something positive as well, though. I'm just learning Mandarin, so I don't know how much my impression is worth, but compared to the output from every other local LM I've tried, Yi's Mandarin writing feels really natural. The flow, use of particles, etc. I haven't seen other models write like that. When the rabbit said "我没事我只是被一只狼给咬了一口而已没什么大不了的" ("I'm fine, I just got bitten once by a wolf, it's no big deal"), I had to laugh. I also like how it came up with a moral (道理): "害人之心不可有防人之心不可无!" ("you shouldn't set out to harm others, but you can't let your guard down against them!") - not sure if that's a real saying or if it made that part up.
3. The first weird thing here is how it wrote some random numbers before the chapters, like "1.4 第二章" ("1.4 Chapter Two"). A lot of this one is actually really good, really imaginative. "于是,它们决定要好好照顾这只小鸟。" ("So, they decided to take good care of this little bird.") — but you guys just watched him die. How are you going to take good care of (好好照顾) him now?

The main thing I'd call strange in those examples is when it writes something that contradicts what it just wrote previously. I wonder if it's possible there's something weird going on with the tokenizing of the initial prompt, or the inclusion of stuff like the initial BOS token. Here's an example of how
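For reference, a quick way to check exactly what the initial prompt turns into, and whether a BOS token is being prepended, is a small script along these lines. This is only a sketch using llama-cpp-python (not something from this thread); the model path and prompt are placeholders, and the exact API may differ between versions.

```python
# Sketch: dump the token IDs a prompt produces, with and without a BOS token,
# to rule out tokenizer/BOS weirdness. Model path and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="yi-6b.Q8_0.gguf", vocab_only=True, verbose=False)  # vocab is enough for tokenizing
prompt = "第一章\n\n从前,森林里住着一只小狐狸。"  # placeholder story prompt

for add_bos in (False, True):
    toks = llm.tokenize(prompt.encode("utf-8"), add_bos=add_bos)
    print(f"add_bos={add_bos}: {len(toks)} tokens, first few: {toks[:12]}")
    # Round-trip to confirm the text survives tokenize/detokenize unchanged.
    print("  round-trip:", llm.detokenize(toks).decode("utf-8", errors="replace"))
```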
-
In the base model, we just use the EOS token to separate documents (no BOS at all).
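If I'm reading that right, the pretraining token stream would look roughly like the sketch below: documents joined with an EOS token between them, and no BOS anywhere. This is only my reading of the comment, not actual Yi training code, and the model path and documents are placeholders.

```python
# Sketch of EOS-separated document packing, as described above.
# Not actual Yi training code; model path and documents are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="yi-6b.Q8_0.gguf", vocab_only=True, verbose=False)
docs = ["第一篇文档的内容。", "第二篇文档的内容。"]  # placeholder documents

stream = []
for doc in docs:
    stream += llm.tokenize(doc.encode("utf-8"), add_bos=False)  # no BOS at all
    stream.append(llm.token_eos())                              # EOS separates documents
print(stream)
```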
-
I tested some benchmarks with a BOS token added at the front of the prompt, and found that it does affect model performance (by about 10%–20%). I'm not very familiar with llama.cpp, so I just hard-coded it (https://github.com/ggerganov/llama.cpp/blob/master/examples/main/main.cpp#L232). In my limited tests, the results feel a bit better, so I think you can give it a try~ By the way, the chat model will be released later this month, and we can see if it does better~
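For anyone who wants to try the same comparison without patching main.cpp, here is a rough equivalent using llama-cpp-python. It is only a sketch: llama-cpp-python isn't mentioned in this thread, the model path and prompt are placeholders, and the sampling just uses the library defaults.

```python
# Sketch: compare generations with and without a BOS token prepended to the prompt.
# Model path and prompt are placeholders; sampling uses llama-cpp-python defaults.
from llama_cpp import Llama

llm = Llama(model_path="yi-6b.Q8_0.gguf", n_ctx=2048, verbose=False)
prompt = "从前,森林里住着一只小狐狸。"  # placeholder story prompt

def run(tokens, max_new=128):
    out = []
    for tok in llm.generate(tokens):  # default sampling settings
        if tok == llm.token_eos() or len(out) >= max_new:
            break
        out.append(tok)
    return llm.detokenize(out).decode("utf-8", errors="replace")

base = llm.tokenize(prompt.encode("utf-8"), add_bos=False)
with_bos = [llm.token_bos()] + base  # same idea as hard-coding it in main.cpp

print("no BOS:  ", run(base))
llm.reset()  # clear cached state between runs
print("with BOS:", run(with_bos))
```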
-
The necessary support is now in the latest