**README.md** (4 additions & 4 deletions)
```diff
@@ -60,7 +60,7 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)
 ### Hot topics

-- **`convert.py` has been deprecated and moved to `examples/convert-legacy-llama.py`, please use `convert-hf-to-gguf.py`** https://github.com/ggerganov/llama.cpp/pull/7430
+- **`convert.py` has been deprecated and moved to `examples/convert_legacy_llama.py`, please use `convert_hf_to_gguf.py`** https://github.com/ggerganov/llama.cpp/pull/7430
 - BPE pre-tokenization support has been added: https://github.com/ggerganov/llama.cpp/pull/6920
 - MoE memory layout has been updated - reconvert models for `mmap` support and regenerate `imatrix` https://github.com/ggerganov/llama.cpp/pull/6387
```
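For anyone reading this hunk as upgrade guidance, a minimal sketch of the renamed entry points; the model directory is a placeholder, not a path named in the diff:

```bash
# convert.py itself is gone; its legacy code now lives in examples/convert_legacy_llama.py
# and is only meant for Llama/Llama2/Mistral-style checkpoints.
# For most Hugging Face models the recommended converter is now:
python3 convert_hf_to_gguf.py ./models/mymodel/
```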
```diff
@@ -670,8 +670,8 @@ Building the program with BLAS support may lead to some performance improvements

 To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.

-Note: `convert.py` has been moved to `examples/convert-legacy-llama.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derivatives.
-It does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
+Note: `convert.py` has been moved to `examples/convert_legacy_llama.py` and shouldn't be used for anything other than `Llama/Llama2/Mistral` models and their derivatives.
+It does not support LLaMA 3, you can use `convert_hf_to_gguf.py` with LLaMA 3 downloaded from Hugging Face.

 ```bash
 # obtain the official LLaMA model weights and place them in ./models
```
```diff
@@ -688,7 +688,7 @@ ls ./models
 python3 -m pip install -r requirements.txt

 # convert the model to ggml FP16 format
-python3 convert-hf-to-gguf.py models/mymodel/
+python3 convert_hf_to_gguf.py models/mymodel/

 # quantize the model to 4-bits (using Q4_K_M method)
```
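The note in this file singles out LLaMA 3, which only the HF converter handles. A hedged end-to-end sketch of that path follows; the repository id, local directory, and the `--outfile`/`--outtype` flags are illustrative assumptions rather than commands taken from the diff:

```bash
# download a LLaMA 3 checkpoint from Hugging Face (repo id and target dir are hypothetical)
huggingface-cli download meta-llama/Meta-Llama-3-8B --local-dir ./models/Meta-Llama-3-8B

# convert it with the HF converter; examples/convert_legacy_llama.py will not work here
python3 convert_hf_to_gguf.py ./models/Meta-Llama-3-8B \
    --outfile ./models/Meta-Llama-3-8B/ggml-model-f16.gguf --outtype f16
```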
**docs/HOWTO-add-model.md** (1 addition & 1 deletion)
```diff
@@ -17,7 +17,7 @@ Also, it is important to check that the examples and main ggml backends (CUDA, M
 ### 1. Convert the model to GGUF

 This step is done in python with a `convert` script using the [gguf](https://pypi.org/project/gguf/) library.
-Depending on the model architecture, you can use either [convert-hf-to-gguf.py](../convert-hf-to-gguf.py) or [examples/convert-legacy-llama.py](../examples/convert-legacy-llama.py) (for `llama/llama2` models in `.pth` format).
+Depending on the model architecture, you can use either [convert_hf_to_gguf.py](../convert_hf_to_gguf.py) or [examples/convert_legacy_llama.py](../examples/convert_legacy_llama.py) (for `llama/llama2` models in `.pth` format).

 The convert script reads the model configuration, tokenizer, tensor names+data and converts them to GGUF metadata and tensors.
```
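To make the architecture-based choice above concrete, a short sketch with placeholder model directories (the exact flags of each script are not shown here; their `--help` output is authoritative):

```bash
# Hugging Face format models (config.json plus safetensors/bin weights): main converter
python3 convert_hf_to_gguf.py ./models/mymodel/

# original llama/llama2 checkpoints in .pth format: legacy converter
python3 examples/convert_legacy_llama.py ./models/llama-2-7b/
```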
The LLaVA image-encoder conversion step is renamed in the same way:

```diff
-3. Use `convert-image-encoder-to-gguf.py` with `--projector-type ldp` (for **V2** please use `--projector-type ldpv2`) to convert the LLaVA image encoder to GGUF:
+3. Use `convert_image_encoder_to_gguf.py` with `--projector-type ldp` (for **V2** please use `--projector-type ldpv2`) to convert the LLaVA image encoder to GGUF:
```
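A hedged sketch of what the renamed invocation might look like; only `--projector-type ldp` (or `ldpv2` for V2) comes from the step above, while the script location under `examples/llava/` and the `-m`/`--llava-projector`/`--output-dir` arguments and paths are assumptions to be checked against the actual example documentation:

```bash
# all paths below are placeholders; only --projector-type is taken from the step above
python3 examples/llava/convert_image_encoder_to_gguf.py \
    -m ./models/clip-vit-large-patch14-336 \
    --llava-projector ./models/MobileVLM-1.7B/llava.projector \
    --output-dir ./models/MobileVLM-1.7B \
    --projector-type ldp   # use ldpv2 for V2 models
```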