Add cmdline arguments to run phi-2 #687

Closed
nprotasov wants to merge 1 commit into huggingface:main from 12010486:nprotaso/phi-2_workaround

@nprotasov
Contributor

Running the Phi-2 model requires some workarounds: because Phi-2 requires transformers 4.37.0.dev, we can't call adapt_transformers_to_gaudi, and we need to pass trust_remote_code to AutoModelForCausalLM. This PR adds two arguments:

  1. trust_remote_code - needed to load the phi-2 model
  2. default_transformers - disables the adapt_transformers_to_gaudi() method and the Habana inference args
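
As a rough illustration of how the two flags described above could be wired into a generation script, here is a minimal sketch. It is not the actual diff of this PR: the function names `build_parser` and `load_model` are hypothetical, and the sketch assumes that `adapt_transformers_to_gaudi` is importable from `optimum.habana.transformers.modeling_utils` and that the flags are plain booleans.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing the two flags proposed in this PR."""
    parser = argparse.ArgumentParser(description="Text generation sketch")
    parser.add_argument("--model_name_or_path", required=True)
    parser.add_argument(
        "--trust_remote_code",
        action="store_true",
        help="Allow loading modeling code shipped with the checkpoint.",
    )
    parser.add_argument(
        "--default_transformers",
        action="store_true",
        help="Skip adapt_transformers_to_gaudi() and use stock Transformers.",
    )
    return parser


def load_model(args):
    """Load the model, patching Transformers for Gaudi unless disabled."""
    if not args.default_transformers:
        # Assumption: this is where Optimum Habana exposes the patching helper.
        from optimum.habana.transformers.modeling_utils import (
            adapt_transformers_to_gaudi,
        )

        adapt_transformers_to_gaudi()

    # Imported lazily so that parsing arguments does not require Transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        args.model_name_or_path,
        trust_remote_code=args.trust_remote_code,
    )
```

With a parser like this, `--trust_remote_code --default_transformers` would load the checkpoint's own modeling code on top of unpatched Transformers, while omitting both flags keeps the existing behaviour.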

@nprotasov nprotasov requested a review from regisss as a code owner February 5, 2024 14:14
@regisss
Collaborator

regisss commented Feb 5, 2024

I think we should just wait for #651 to be merged, which should happen very soon

@nprotasov
Contributor Author

> I think we should just wait for #651 to be merged, which should happen very soon

These workarounds are not specific to Phi-2; they are useful for all models that require trust_remote_code or need another transformers version.

@regisss
Collaborator

regisss commented Feb 5, 2024

> > I think we should just wait for #651 to be merged, which should happen very soon
>
> These workarounds are not specific to Phi-2; they are useful for all models that require trust_remote_code or need another transformers version.

I get that, but the problem is that we have absolutely no guarantee about the API provided by these kinds of models. The names of the inputs or methods could differ from what Transformers expects, so it's basically impossible to ensure that these models will work.
I think an issue should be opened by users if they cannot make it work for a specific model. Or we can add a workaround for trendy models as we did for Falcon.
Now, if customers and users are asking for it, we can move forward and do something with this PR. Otherwise, I would rather keep it as it is.
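
To make the concern about input names concrete, here is a minimal sketch of a pre-flight check that a script could run on a remote-code model. The helper `missing_forward_args` and the list of expected argument names are assumptions chosen for illustration; this is not something Optimum Habana or Transformers provides.

```python
import inspect

# Assumption: argument names a generation loop typically passes to forward().
EXPECTED_FORWARD_ARGS = ("input_ids", "attention_mask", "past_key_values")


def missing_forward_args(model, expected=EXPECTED_FORWARD_ARGS):
    """Return the expected argument names that model.forward() does not accept."""
    params = inspect.signature(model.forward).parameters
    # A **kwargs catch-all accepts any name, so nothing can be reported missing.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in expected if name not in params]
```

A check like this only catches renamed or missing inputs. It says nothing about whether the model's methods behave the way the generation code expects, which is why support cannot be guaranteed in general.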

@regisss
Collaborator

regisss commented Feb 19, 2024

@nprotasov The release of Optimum Habana v1.10.2 is fully compatible with Transformers v4.37 so phi-2 can be used out of the box. I quickly tried the text-generation example with

python run_generation.py \
  --model_name_or_path microsoft/phi-2 \
  --use_hpu_graphs \
  --use_kv_cache \
  --max_new_tokens 100 \
  --do_sample \
  --prompt "Here is my prompt"

and it ran successfully. Note that it is not compatible with a static KV cache, which leads to a high number of generated HPU graphs (and thus high compilation time) due to the increasing size of this cache throughout the generation process. We could implement it and override parts of its modeling as we do for Llama and other optimized models if needed.
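
As a toy illustration of why a growing KV cache leads to many graphs, the sketch below counts the distinct cache shapes seen during generation. It assumes, as a simplification, that each new input shape triggers one graph capture; the function `cache_shapes` is hypothetical and does not model how HPU graphs are actually implemented.

```python
def cache_shapes(prompt_len, max_new_tokens, static):
    """Collect the distinct (batch, seq_len) cache shapes seen while generating."""
    shapes = set()
    for step in range(max_new_tokens):
        if static:
            # Static cache: allocated once at its final length.
            seq_len = prompt_len + max_new_tokens
        else:
            # Dynamic cache: grows by one position per generated token.
            seq_len = prompt_len + step
        shapes.add((1, seq_len))
    return shapes


dynamic = cache_shapes(prompt_len=5, max_new_tokens=100, static=False)
static = cache_shapes(prompt_len=5, max_new_tokens=100, static=True)
print(len(dynamic), len(static))  # prints "100 1"
```

Under this simplification, the dynamic cache produces one shape per generated token, while a static cache keeps a single shape for the whole run.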

@regisss regisss closed this Feb 19, 2024
@12010486 12010486 deleted the nprotaso/phi-2_workaround branch August 23, 2024 13:22