
Fix bug in main.cpp (penalize_nl=false doesn't work). Suppress warning on mingw. #1528

Merged: 5 commits into ggerganov:master on Aug 26, 2023

Conversation

@tom7 (Contributor) commented May 19, 2023

Pretty sure this is just a bug, but it's always possible I'm missing something!

tom7 added 2 commits May 19, 2023 12:50

* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.
* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.
@DannyDaemonic (Contributor):
this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.

So many little mingw oddities cropping up. I'm curious though, any idea what's including os_defines.h?

candidates_p.data[idx].logit = nl_logit;
break;
}
}
@ggerganov (Owner):

Hmm, the order of candidates_p does not seem to be changed.
Why do you think it has?

@tom7 (Contributor, Author):

It's not that the order of candidates_p has changed, although IMO it's not great to rely on the fact that llama_sample_repetition_penalty and llama_sample_frequency_and_presence_penalties preserve the order (without at least documenting that). The main issue is that on line 418 we've copied the logits into the local candidates vector (of llama_token_data_array, not raw floats). So modifying the original logit array from the context does nothing, right?

@ggerganov (Owner):

@tom7 Apologies for the delay and for not paying more attention to this - I think you are right and we've had this bug for quite some time now.

@ivanstepanovftw (Collaborator) left a comment:

I do not see a bug here. The order of logits is not changed after penalization.

@tom7 (Contributor, Author) commented May 19, 2023

Thanks for the quick look!

this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.

So many little mingw oddities cropping up. I'm curious though, any idea what's including os_defines.h?

Apparently it's via <string>:

examples/main/main.cpp:27: warning: "NOMINMAX" redefined
   27 | #define NOMINMAX
      |
In file included from /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/c++config.h:586,
                 from /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/string:38,
                 from ./examples/common.h:7,
                 from examples/main/main.cpp:6:
/usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45: note: this is the location of the previous definition
   45 | #define NOMINMAX 1

@martindevans (Contributor):
It looks like this PR didn't gain much traction, but I think I just discovered the same bug that this PR fixes.

As I understand it, the current situation is:

  1. Get the logits
  2. Copy that into candidates vector
  3. Save the nl token logit
  4. Do stuff with candidates_p
  5. Overwrite nl token logit

However, as far as I can see, logits is never referenced again after the copy in step 2, which means that writing a value to it in step 5 is pointless.

This PR changes it to find the nl token in candidates_p and overwrite that value instead.

@ggerganov (Owner) left a comment:

@martindevans Thanks for bringing attention again to this

@tom7 Sorry for the ~3-month delay 😄

@ggerganov merged commit 72f895c into ggerganov:master on Aug 26, 2023
24 checks passed
mattgauf added a commit to mattgauf/llama.cpp that referenced this pull request Aug 26, 2023
* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...
akawrykow pushed a commit to akawrykow/llama.cpp that referenced this pull request Aug 29, 2023
main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)

* Fix bug in main.cpp where penalize_nl=false has no effect. It modifies the underlying logits array, but at this point we are already working on the candidates copy.

* Suppress redefinition warning for NOMINMAX on mingw. In my installation, this macro is already defined by /usr/lib/gcc/x86_64-w64-mingw32/11/include/c++/x86_64-w64-mingw32/bits/os_defines.h:45.

* main : fix indentation

* main : pass ctx to llama_token_nl()

---------

Co-authored-by: Georgi Gerganov <[email protected]>