Run Llama2 with torch.compile on Gaudi2 by kausikmaiti · Pull Request #605 · huggingface/optimum-habana

kausikmaiti · 2023-12-18T13:03:13Z

What does this PR do?

This change allows users to run the Llama2 model with torch.compile on Gaudi2.

Signed-off-by: kausik <kmaiti@habana.ai>

HuggingFaceDocBuilderDev · 2023-12-19T06:56:47Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vivekgoe

@regisss, can you please help review this change? It is needed to enable torch.compile for text generation tasks (for Llama and other models).

vivekgoe · 2023-12-19T08:53:57Z

-                output_hidden_states=output_hidden_states,
-                **hpu_graphs_kwargs,
-            )
+            if torch_compile:


Wrapping the model only for greedy_search does not look right. It should probably be done in generate() so that it also works for other modes (such as beam_search).

I'm not even sure we should do it in generate() at all. If using the trainer, it should already be taken care of (see discussion above). Otherwise, for example in the text-generation example, I think we should just have a get_torch_compiled_model in text-generation/utils.py. That seems to be the way recommended by Transformers.

@regisss, thanks for your comments. We will check whether we can go with adding get_torch_compiled_model in text-generation/utils.py.

OK, I will create 'get_torch_compiled_model' in text-generation/utils.py.

vivekgoe · 2023-12-19T09:01:45Z

        negative_prompt_ids: Optional[torch.Tensor] = None,
        negative_prompt_attention_mask: Optional[torch.Tensor] = None,
        lazy_mode: Optional[bool] = False,
+        torch_compile: Optional[bool] = False,


For normal training, eval, and predict, models are wrapped within the accelerator.prepare_model() call, so adding new code to generate() may not be aligned with that. @regisss, any idea how direct model.generate() calls are handled in Transformers for compile mode? I tried to search there but did not find anything.

In the trainer, the link with Accelerate is made here:

optimum-habana/optimum/habana/transformers/training_args.py

Line 474 in 21238af

if self.torch_compile:

And then in Accelerate it happens here:

optimum-habana/optimum/habana/accelerate/accelerator.py

Line 371 in 21238af

if self.state.dynamo_plugin.backend != GaudiDynamoBackend.NO and not is_compiled_module(model):

It was introduced in #465.

Outside of the trainer, Transformers recommends simply using:

model = torch.compile(model)

https://huggingface.co/docs/transformers/v4.36.1/en/perf_torch_compile

As suggested, I will create 'get_torch_compiled_model()' in text-generation/utils.py. It will be called inside setup_model() in the same file.
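A minimal sketch of what such a helper could look like, assuming it simply wraps the model with torch.compile once, outside generate(). The `backend` parameter, the `compile_fn` hook (added here only so the sketch can be exercised without torch or Gaudi hardware), and the `_orig_mod` check are assumptions for illustration; the helper in #616 may differ, and the backend string to use on Gaudi2 is not stated in this thread.

```python
from typing import Any, Callable, Optional


def get_torch_compiled_model(
    model: Any,
    backend: str,
    compile_fn: Optional[Callable[..., Any]] = None,
) -> Any:
    """Wrap `model` with torch.compile once, before any generate() call.

    `backend` and `compile_fn` are hypothetical parameters for this sketch.
    """
    if compile_fn is None:
        # Imported lazily so the sketch can be loaded without torch installed.
        import torch

        compile_fn = torch.compile

    # torch.compile returns a wrapper that keeps the original module in
    # `_orig_mod`; used here as a simple "already compiled" heuristic.
    if hasattr(model, "_orig_mod"):
        return model

    return compile_fn(model, backend=backend)
```

In the text-generation example, setup_model() would then call this helper only when the torch.compile option is enabled, so that every generation mode (greedy search, beam search, and so on) sees the same compiled model.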

regisss · 2023-12-19T14:04:40Z

        help="Whether to use the key/value cache for decoding. It should speed up generation.",
    )
+    parser.add_argument(
+        "--use_torch_compile",


Suggested change

-        "--use_torch_compile",
+        "--torch_compile",

to be aligned with Transformers and GaudiTrainingArguments

OK, I will change it.
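For illustration, the renamed flag could be declared as below. This is a sketch only: the `store_true` action, the help text, and the `build_parser` function name are assumptions, since the diff excerpt above shows just the flag name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for the text-generation example script.
    parser = argparse.ArgumentParser(description="Text generation example (sketch)")
    parser.add_argument(
        "--torch_compile",  # name aligned with Transformers and GaudiTrainingArguments
        action="store_true",  # assumption: a plain boolean switch
        help="Wrap the model with torch.compile before generation.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"torch_compile={args.torch_compile}")
```

The parsed value is then available as args.torch_compile, matching the attribute name used by the trainer arguments.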

kausikmaiti · 2023-12-28T06:59:08Z

I created a separate PR after making the necessary changes. Please refer to #616.

… generation tests (huggingface#2200) (huggingface#605) Co-authored-by: Grzegorz Pluto-Prondzinski <gplutopx@habana.ai>

[SW-169007] Enable torch.compile support for Llama2

4444f1b

Signed-off-by: kausik <kmaiti@habana.ai>

kausikmaiti requested review from bhargaveede, regisss, ssarkar2 and vivekgoe as code owners December 18, 2023 13:03

vivekgoe added the run-test Run CI for PRs from external contributors label Dec 19, 2023

vivekgoe reviewed Dec 19, 2023

regisss reviewed Dec 19, 2023

kausikmaiti closed this Dec 28, 2023
