Some slightly odd results running inference (may just be prompt or tokenizing issue) #5

Closed · Answered by loofahcus
KerfuffleV2 asked this question in Q&A

I ran some benchmarks with the BOS token added at the front of the prompt, and found that it does affect model performance (by about 10%–20%).
I'm not very familiar with llama.cpp, so I just hard-coded it (https://github.com/ggerganov/llama.cpp/blob/master/examples/main/main.cpp#L232). In my limited tests the results seem a bit better, so I think you could give it a try.
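For reference, the hard-coded change amounts to forcing a BOS token at the start of the tokenized prompt before evaluation. Here is a minimal sketch of that idea (not the actual patch linked above; the `force_bos` helper is hypothetical, and the `llama_token_bos()` signature has varied between llama.cpp versions, so adapt it to your checkout):

```cpp
// Minimal sketch: make sure the tokenized prompt starts with the BOS token.
// Assumption: the older no-argument llama_token_bos() API; newer llama.cpp
// versions take the model as an argument instead.
#include <vector>
#include "llama.h"

static void force_bos(std::vector<llama_token> & embd_inp) {
    const llama_token bos = llama_token_bos();
    if (embd_inp.empty() || embd_inp.front() != bos) {
        // Prepend BOS only if the tokenizer did not already add it.
        embd_inp.insert(embd_inp.begin(), bos);
    }
}
```

In main.cpp this would be applied to the prompt tokens (`embd_inp`) right after tokenization and before the evaluation loop.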

By the way, the chat model will be released later this month, so we can see whether it does better.
