
Commit f9e892a

mikekgfb authored and malfet committed

Update README.md (pytorch#169)

Update readme load_gguf => gguf-path

1 parent d032898 commit f9e892a

File tree

1 file changed: +2 −2 lines changed


README.md

+2-2
@@ -107,7 +107,7 @@ specified using the `params-path ${PARAMS_PATH}` containing the appropriate mode

The parameter file should be in JSON format specifying the parameters. You can find the Model Args data class in [`model.py`](https://github.com/pytorch/torchat/blob/main/model.py#L22).

-The final way to initialize a torchat model is from a GGUF file, a new file format for storing models. You load a GGUF model with the option `--load_gguf ${MODELNAME}.gguf`. Presently, the F16, F32, Q4_0, and Q6_K formats are supported and converted into native torch-chat models. Please refer to section *Loading GGUF* for details.
+The final way to initialize a torchat model is from a GGUF file, a new file format for storing models. You load a GGUF model with the option `--gguf-path ${MODELNAME}.gguf`. Presently, the F16, F32, Q4_0, and Q6_K formats are supported and converted into native torch-chat models. Please refer to section *Loading GGUF* for details.

You may also dequantize GGUF models with the GGUF quantize tool, and then load and requantize with torchat native quantization options. (Please note that quantizing and dequantizing is a lossy process, and you will get the best results by starting with the original unquantized model checkpoint, not a previously quantized and then dequantized model.)
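
For illustration, a minimal sketch of using the renamed flag with the generate entry point; the script name and prompt below are assumptions for this example, not part of the diff:

```
# assumed invocation; only --gguf-path is taken from this change
python3 generate.py --gguf-path ${MODELNAME}.gguf --prompt "Hello, my name is"
```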

@@ -513,7 +513,7 @@ We invite contributors to submit established quantization schemes, with accuracy

GGUF is a nascent industry standard format and presently torchat can read the F16, F32, Q4_0, and Q6_K formats natively and convert them into native torch-chat models by using the load-gguf option:

```
---load_gguf <gguf_filename> # all other options as described elsewhere, works for generate and export, for all backends, but cannot be used with --quantize
+--gguf-path <gguf_filename> # all other options as described elsewhere, works for generate and export, for all backends, but cannot be used with --quantize
```

You may then apply the standard quantization options, e.g., to add embedding table quantization as described under quantization. (You cannot directly requantize already quantized formats. However, you may dequantize them using GGUF tools, and then load the model into torchat to quantize with torchat's quantization workflow.)
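
As a concrete sketch of those "standard quantization options": an assumed embedding-table quantization call on a native torchat checkpoint. The `--checkpoint-path` value and the exact `--quantize` JSON keys are illustrative assumptions, and per the note above `--quantize` is applied to the converted model rather than combined with `--gguf-path`:

```
# assumed example: quantize a native torchat checkpoint with an
# 8-bit embedding-table config; paths and JSON keys are illustrative
python3 generate.py --checkpoint-path ${MODEL_PATH} \
  --quantize '{"embedding": {"bitwidth": 8, "groupsize": 0}}' \
  --prompt "Hello, my name is"
```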
