Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gguf : general usability improvements #3409

Merged
merged 7 commits into from
Oct 2, 2023

Conversation

cebtenzzre
Copy link
Collaborator

  • avoid copy-pasted tensor names in MODEL_TENSOR_NAMES
  • accept str for path in SpecialVocab.__init__

Resolves the discussion at #2842 (review)

@cebtenzzre
Copy link
Collaborator Author

@ggerganov I updated the PR to remove MODEL_TENSOR_NAMES, since it generally makes more sense to use MODEL_TENSORS and TENSOR_NAMES separately. But this is a breaking change. What do you think?

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, why is it a breaking change? If it is just that python scripts have to switch to using gguf.TENSOR_NAMES then I think it's fine

@cebtenzzre cebtenzzre merged commit 0fe3210 into ggerganov:master Oct 2, 2023
for tensor, keys in self.mappings_cfg.items():
tensor_name = tensor_names.get(tensor)
if tensor_name is None:
if tensor not in MODEL_TENSORS[arch]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of late to say anything, but I'm curious why you'd make these changes. There's no usability improvement from the user's perspective except it'll be twice as slow now since it requires the key to get hashed and the dictionary searched twice compared to using dict.get().

Copy link
Collaborator Author

@cebtenzzre cebtenzzre Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code makes more sense this way - the standard name of the tensor is never dependent on the model it is from, so we should represent that fact by using separate data structures. You shouldn't have to know what model architecture you are working with to find the name of a standard tensor, either. I don't think this part of the code is performance-critical, so I didn't bother optimizing it.
This has confused developers at least, which is what I mean by usability. See #3417 (comment)

joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 5, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  readme : add project status link
  ggml : fix build after ggerganov#3329
  llm : add Refact model (ggerganov#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  finetune : readme fix typo (ggerganov#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  main : consistent prefix/suffix coloring (ggerganov#3425)
  llama : fix session saving/loading (ggerganov#3400)
  llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  metal : alibi for arbitrary number of heads (ggerganov#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  Work on the BPE tokenizer (ggerganov#3252)
  convert : fix vocab size when not defined in hparams (ggerganov#3421)
  cmake : increase minimum version for add_link_options (ggerganov#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  gguf : general usability improvements (ggerganov#3409)
  cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  finetune : fix ggerganov#3404 (ggerganov#3437)
  ...
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants