From 926111f9fc7f8c645a2f7b2e29c767ba1c964e9e Mon Sep 17 00:00:00 2001
From: "Wang, Yi A"
Date: Wed, 16 Oct 2024 03:20:52 -0700
Subject: [PATCH] add peft generation example

Signed-off-by: Wang, Yi A
---
 examples/text-generation/README.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/examples/text-generation/README.md b/examples/text-generation/README.md
index 22390d6497..b99d3d8a6a 100755
--- a/examples/text-generation/README.md
+++ b/examples/text-generation/README.md
@@ -214,6 +214,22 @@ python run_generation.py \
 > The prompt length is limited to 16 tokens. Prompts longer than this will be truncated.
 
+### Use PEFT models for generation
+
+You can also provide the path to a PEFT model to perform generation with the argument `--peft_model`.
+
+For example:
+```bash
+python run_generation.py \
+--model_name_or_path meta-llama/Llama-2-7b-hf \
+--use_hpu_graphs \
+--use_kv_cache \
+--batch_size 1 \
+--bf16 \
+--max_new_tokens 100 \
+--prompt "Here is my prompt" \
+--peft_model yard1/llama-2-7b-sql-lora-test
+```
 
 ### Using growing bucket optimization