Refactor and Fix the Readme (#563)
* refactoring the readme

* continued refining

* more cleanup

* more cleanup

* more cleanup

* more cleanup

* more cleanup

* more refining

* Update README.md

Update README.md

* move the disclaimer down

* remove torchtune from main readme
Fix pathing issues for runner commands

* don't use pybindings for et setup

---------

Co-authored-by: Michael Gschwind <[email protected]>
2 people authored and malfet committed Jul 17, 2024
1 parent d94ed28 commit d98a7b6
Showing 3 changed files with 54 additions and 2 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -8,6 +8,7 @@ __pycache__/

.model-artifacts/
.venv
.torchchat

# Build directories
build/android/*
25 changes: 23 additions & 2 deletions README.md
@@ -93,7 +93,6 @@ You can also remove downloaded models with the remove command:
`python3 torchchat.py remove llama3`



## Running via PyTorch / Python
[Follow the installation steps if you haven't](#installation)

@@ -199,7 +198,7 @@ export TORCHCHAT_ROOT=${PWD}
### Export for mobile
The following example uses the Llama3 8B Instruct model.

[#shell default]: echo '{"embedding": {"bitwidth": 4, "groupsize" : 32}, "linear:a8w4dq": {"groupsize" : 32}}' >./config/data/mobile.json
[comment default]: echo '{"embedding": {"bitwidth": 4, "groupsize" : 32}, "linear:a8w4dq": {"groupsize" : 32}}' >./config/data/mobile.json

```
# Export
@@ -250,8 +249,11 @@ Now, follow the app's UI guidelines to pick the model and tokenizer files from t
<img src="https://pytorch.org/executorch/main/_static/img/llama_ios_app.png" width="600" alt="iOS app running a LlaMA model">
</a>


### Deploy and run on Android



MISSING. TBD.


@@ -262,6 +264,8 @@ Uses the lm_eval library to evaluate model accuracy on a variety of
tasks. Defaults to wikitext and can be manually controlled using the
tasks and limit args.

See [Evaluation](docs/evaluation.md)

For more information run `python3 torchchat.py eval --help`
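
For instance, a minimal run over a small slice of wikitext might look like the following (a sketch only; the model alias and the `--tasks` / `--limit` flag spellings are assumed from the args described above):

```
# Evaluate the downloaded llama3 model on wikitext, limited to 10 samples
python3 torchchat.py eval llama3 --tasks wikitext --limit 10
```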

**Examples**
@@ -317,6 +321,7 @@ you can perform the example commands with any of these models.
**CERTIFICATE_VERIFY_FAILED**
Run `pip install --upgrade certifi`.


**Access to model is restricted and you are not in the authorized
list** Some models require an additional step to access. Follow the
link provided in the error to get access.
@@ -338,6 +343,22 @@ third-party models, weights, data, or other technologies, and you are
solely responsible for complying with all such obligations.


### Disclaimer
The torchchat Repository Content is provided without any guarantees about
performance or compatibility. In particular, torchchat makes available
model architectures written in Python for PyTorch that may not perform
in the same manner or meet the same standards as the original versions
of those models. When using the torchchat Repository Content, including
any model architectures, you are solely responsible for determining the
appropriateness of using or redistributing the torchchat Repository Content
and assume any risks associated with your use of the torchchat Repository Content
or any models, outputs, or results, both alone and in combination with
any other technologies. Additionally, you may have other legal obligations
that govern your use of other content, such as the terms of service for
third-party models, weights, data, or other technologies, and you are
solely responsible for complying with all such obligations.


## Acknowledgements
Thank you to the [community](docs/ACKNOWLEDGEMENTS.md) for all the
awesome libraries and tools you've built around local LLM inference.
30 changes: 30 additions & 0 deletions docs/torchtune.md
@@ -0,0 +1,30 @@
# Fine-tuned models from torchtune

torchchat supports running inference with models fine-tuned using [torchtune](https://github.com/pytorch/torchtune). To do so, we first need to convert the checkpoints into a format supported by torchchat.

Below is a simple workflow to run inference on a fine-tuned Llama3 model. For more details on how to fine-tune Llama3, see the instructions [here](https://github.com/pytorch/torchtune?tab=readme-ov-file#llama3).

```bash
# install torchtune
pip install torchtune

# download the llama3 model
tune download meta-llama/Meta-Llama-3-8B \
--output-dir ./Meta-Llama-3-8B \
--hf-token <ACCESS TOKEN>

# Run LoRA fine-tuning on a single device. This assumes the config's checkpoint_dir points to the ./Meta-Llama-3-8B directory downloaded above
tune run lora_finetune_single_device --config llama3/8B_lora_single_device

# convert the fine-tuned checkpoint to a format compatible with torchchat
python3 build/convert_torchtune_checkpoint.py \
--checkpoint-dir ./Meta-Llama-3-8B \
--checkpoint-files meta_model_0.pt \
--model-name llama3_8B \
--checkpoint-format meta

# run inference on a single GPU
python3 torchchat.py generate \
--checkpoint-path ./Meta-Llama-3-8B/model.pth \
--device cuda
```
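
The converted checkpoint should also work with torchchat's interactive chat mode. Below is a sketch that assumes `chat` accepts the same `--checkpoint-path` and `--device` flags as `generate`:

```bash
# Chat interactively with the fine-tuned model
# (assumes chat shares generate's checkpoint and device flags)
python3 torchchat.py chat \
  --checkpoint-path ./Meta-Llama-3-8B/model.pth \
  --device cuda
```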
