Add cmdline arguments to run phi-2 by nprotasov · Pull Request #687 · huggingface/optimum-habana

nprotasov · 2024-02-05T14:14:05Z

To run the Phi-2 model we need some workarounds: because Phi-2 requires a development version of transformers (4.37.0.dev), we can't call adapt_transformers_to_gaudi, and we need to pass trust_remote_code to AutoModelForCausalLM.

trust_remote_code - passed to AutoModelForCausalLM so that the phi-2 model can be loaded
default_transformers - disables the adapt_transformers_to_gaudi() call and the Habana inference args
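
As a rough illustration of how the two flags described above could be wired into a generation script, here is a minimal sketch. It is not the code from this PR: the helper names build_parser and load_model are hypothetical, and only adapt_transformers_to_gaudi and AutoModelForCausalLM.from_pretrained are real library calls.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the two flags discussed in this PR."""
    parser = argparse.ArgumentParser(description="Sketch of the proposed flags")
    parser.add_argument("--model_name_or_path", required=True)
    parser.add_argument(
        "--trust_remote_code",
        action="store_true",
        help="Allow loading modeling code shipped in the model repository.",
    )
    parser.add_argument(
        "--default_transformers",
        action="store_true",
        help="Skip adapt_transformers_to_gaudi() and use stock Transformers.",
    )
    return parser


def load_model(args: argparse.Namespace):
    """Hypothetical loader: patch Transformers for Gaudi unless asked not to."""
    if not args.default_transformers:
        # Imported lazily so that --default_transformers avoids the patching code entirely.
        from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

        adapt_transformers_to_gaudi()

    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        args.model_name_or_path,
        trust_remote_code=args.trust_remote_code,
    )
```

Both flags default to off in this sketch, so existing invocations would keep the Gaudi-adapted behavior unless a user opts out explicitly.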

regisss · 2024-02-05T15:53:35Z

I think we should just wait for #651 to be merged, which should happen very soon

nprotasov · 2024-02-05T15:56:55Z

> I think we should just wait for #651 to be merged, which should happen very soon

These workarounds are not specific to Phi-2; they are useful for all models that require trust_remote_code or need a different transformers version.

regisss · 2024-02-05T16:23:47Z

> > I think we should just wait for #651 to be merged, which should happen very soon
>
> These workarounds are not specific to Phi-2; they are useful for all models that require trust_remote_code or need a different transformers version.

I get that, but the problem is that we have absolutely no guarantee about the API provided by this kind of model. The names of the inputs or methods could be different from what Transformers expects, so it's basically impossible to ensure that these models will work.
I think users should open an issue if they cannot make a specific model work. Or we can add a workaround for trendy models, as we did for Falcon.
Now, if customers and users are asking for it, we can move forward and do something with this PR. Otherwise, I would rather keep things as they are.
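
To make the concern about input names concrete, here is a small sketch of how one might check whether a model class's forward method accepts the argument names a generation loop typically passes. The helper missing_forward_args, the list of expected names, and the two toy classes are all illustrative assumptions, not part of Optimum Habana or Transformers.

```python
import inspect

# Argument names a generation loop commonly passes; an illustrative list only.
EXPECTED_FORWARD_ARGS = ("input_ids", "attention_mask", "past_key_values")


def missing_forward_args(model_cls, expected=EXPECTED_FORWARD_ARGS):
    """Return the expected argument names that model_cls.forward does not accept."""
    params = inspect.signature(model_cls.forward).parameters
    # A **kwargs parameter accepts any name, so nothing can be reported as missing.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in expected if name not in params]


class ConventionalModel:
    """Toy class whose forward follows the usual naming."""

    def forward(self, input_ids=None, attention_mask=None, past_key_values=None):
        ...


class CustomModel:
    """Toy class standing in for remote code with its own naming."""

    def forward(self, tokens, mask=None):
        ...
```

A check like this only catches naming mismatches; it says nothing about whether the semantics of each argument match, which is the harder part of the problem raised above.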

regisss · 2024-02-19T02:53:21Z

@nprotasov The release of Optimum Habana v1.10.2 is fully compatible with Transformers v4.37 so phi-2 can be used out of the box. I quickly tried the text-generation example with

python run_generation.py \
  --model_name_or_path microsoft/phi-2 \
  --use_hpu_graphs \
  --use_kv_cache \
  --max_new_tokens 100 \
  --do_sample \
  --prompt "Here is my prompt"

and it ran successfully. Note that it is not compatible with a static KV cache, which leads to a high number of generated HPU graphs (and thus high compilation time) because the size of the cache keeps increasing throughout the generation process. If needed, we could implement a static KV cache for phi-2 by overriding parts of its modeling, as we do for Llama and other optimized models.
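
The effect of a growing cache on graph count can be illustrated with a deliberately simplified model, assuming that each distinct cache sequence length forces one new graph compilation. The function below is a hypothetical toy, not a measurement of actual HPU behavior.

```python
def distinct_cache_shapes(prompt_len: int, max_new_tokens: int, static: bool) -> int:
    """Count distinct KV cache sequence lengths seen over one generation run.

    Simplifying assumption: one distinct length corresponds to one compiled graph.
    """
    seen_lengths = set()
    for step in range(max_new_tokens):
        if static:
            # Cache is preallocated to its final size, so the shape never changes.
            seq_len = prompt_len + max_new_tokens
        else:
            # Cache grows by one position per generated token.
            seq_len = prompt_len + step
        seen_lengths.add(seq_len)
    return len(seen_lengths)
```

Under this assumption, a run with `--max_new_tokens 100` would see 100 distinct shapes with a growing cache and a single shape with a static one, which is why the static variant reduces compilation time.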

add cmdline args to workaround phi-2

41809e2

nprotasov requested a review from regisss as a code owner February 5, 2024 14:14

regisss closed this Feb 19, 2024

regisss mentioned this pull request Feb 21, 2024

Update text generation example to support qwen model #729

Closed

3 tasks

12010486 deleted the nprotaso/phi-2_workaround branch August 23, 2024 13:22

nprotasov wants to merge 1 commit into huggingface:main from 12010486:nprotaso/phi-2_workaround
