Fix bug in main.cpp (penalize_nl=false doesn't work). Suppress warning on mingw. #1528
Conversation
* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.
* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.
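For the second point, the change is presumably the usual guard against macro redefinition; a minimal sketch of that pattern (my assumption, not a quote of the diff):

```cpp
// Guard the define so that headers (e.g. mingw's os_defines.h) that have
// already defined NOMINMAX don't trigger a macro-redefinition warning.
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
```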
So many little mingw oddities cropping up. I'm curious though, any idea what's including …
examples/main/main.cpp (outdated)

                candidates_p.data[idx].logit = nl_logit;
                break;
            }
        }
Hmm, the order of `candidates_p` does not seem to be changed. Why do you think it has?
It's not that the order of `candidates_p` has changed, although IMO it's not great to rely on the fact that `llama_sample_repetition_penalty` and `llama_sample_frequency_and_presence_penalties` preserve the order (without at least documenting that). The main issue is that on line 418 we've copied the logits into a local candidates vector (of `llama_token_data_array`, not raw floats). So modifying the original logit array from the context does nothing, right?
@tom7 Apologies for the delay and for not paying more attention to this - I think you are right and we've had this bug for quite some time now.
I do not see a bug here. The order of logits is not changed after penalization.
Thanks for the quick look!
Apparently it's via …
It looks like this PR didn't gain much traction, but I think I just discovered the same bug that this PR fixes. As I understand it, the current code saves the newline logit and then writes it back into the original logits array from the context. However, as far as I can see that write has no effect, because sampling operates on the candidates copy by that point. This PR changes it to find the nl token in `candidates_p` and restore its logit there instead.
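For reference, my reading of the fixed logic from the diff excerpt above, as a self-contained sketch (`token_data` is a simplified stand-in for `llama_token_data`, and `restore_nl_logit` is a hypothetical helper name, not code from the PR):

```cpp
#include <cstddef>
#include <vector>

struct token_data { int id; float logit; };

// Restore the newline logit inside the candidates copy, since that is
// what the sampler actually consumes (mirrors the loop in the diff excerpt).
void restore_nl_logit(std::vector<token_data> & candidates,
                      int nl_token, float nl_logit) {
    for (size_t idx = 0; idx < candidates.size(); idx++) {
        if (candidates[idx].id == nl_token) {
            candidates[idx].logit = nl_logit;
            break;
        }
    }
}
```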
@martindevans Thanks for bringing attention to this again.
@tom7 Sorry for the delay of ~3 months 😄
* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...
main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)

* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.
* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.
* main : fix indentation
* main : pass ctx to llama_token_nl()

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Pretty sure this is just a bug, but it's always possible I'm missing something!