Expose Llama Fused OPs control from run_lora_clm.py by hlahkar · Pull Request #751 · huggingface/optimum-habana

hlahkar · 2024-03-03T09:48:08Z

Provide a variable to enable or disable FusedRoPE. This may be useful when integrating new models or configurations where FusedRoPE does not work out of the box.

* Expose Llama Fused OPs control from run_lora_clm.py
* Update as per review comments

HuggingFaceDocBuilderDev · 2024-03-03T09:51:28Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vivekgoe

Please add to the description the use case(s) where we may need to switch off "fused rope" (for example, to debug potential accuracy issues, or when fused rope does not work for a new input configuration).

regisss · 2024-03-07T05:57:48Z

@vivekgoe Why was this PR merged? I'm not convinced this is a good way to manage FusedRoPE; IMO it should be enabled or disabled automatically in the modeling file of each model.

Expose Llama Fused OPs control from run_lora_clm.py (HabanaAI#23)
* Expose Llama Fused OPs control from run_lora_clm.py
* Update as per review comments
Co-authored-by: Vivek Goel <vgoel@habana.ai>

vivekgoe · 2024-03-13T04:04:04Z

@regisss We need a way to disable FusedRoPE for corner cases where it causes a problem with a new feature. Ideally we would fix the problem in the FusedRoPE operator itself, but there are cases where we would like to release a feature with a workaround (disabling FusedRoPE) without waiting for the fix.

The current motivation is a failure we see with FusedRoPE during the evaluation phase of Llama-70B with FSDP. If you see a better way to disable FusedRoPE for a particular feature or test case, please let us know. We will revert this change and push a new one as per your recommendation.

regisss · 2024-03-13T09:38:32Z

@vivekgoe Thanks for the explanation. Since this kind of workaround is meant to be temporary, I would access self.generation_config.use_fused_rope directly from the forward methods in the modeling file. That way, we can remove use_fused_rope from the forward signatures (it will still be accessible through self.generation_config) and also remove the logic added in the trainer.
I would also avoid adding a new arg to apply_customized_rope and instead set FusedRoPE = None directly, as is done here for instance: https://github.com/huggingface/optimum-habana/pull/746/files#diff-bae02284f455b93397dbeafee178bd779671429602246f3ba60ea833a538eb68R35
The goal of all this is to keep changes minimal, since this is a short-term workaround rather than a long-term fix. It will be easier to revert when the fix is available.

vivekgoe · 2024-03-13T10:37:10Z

@regisss Thanks for the feedback. Let me explore your suggestion regarding self.generation_config.use_fused_rope and check whether it works for our use case.
Regarding not adding an arg to apply_customized_rope: if I understand correctly, applying the change made in modeling_gpt_neoX to modeling_llama would switch off FusedRoPE for all use cases (including inference), whereas we would like to switch it off only for the one use case that is failing and leave the rest untouched.

regisss · 2024-03-13T10:42:24Z

The example of GPT-NeoX is just to illustrate that we can modify FusedRoPE directly instead of inserting a new variable. You could use something such as:

# Disable the fused kernel only when the flag is explicitly False
if self.generation_config.use_fused_rope is False:
    FusedRoPE = None

That way it will be disabled only if use_fused_rope is explicitly set to False in the generation config.

vivekgoe · 2024-03-14T04:04:20Z

@regisss Thanks for explaining. We will try to make the required changes and get them in before the next major release.

Expose Llama Fused OPs control from run_lora_clm.py (#23)

aa1a1f8

* Expose Llama Fused OPs control from run_lora_clm.py
* Update as per review comments

hlahkar requested review from bhargaveede, libinta, mandy-li, ssarkar2 and vivekgoe as code owners March 3, 2024 09:48

hlahkar requested a review from a user March 3, 2024 09:48

hlahkar requested a review from regisss as a code owner March 3, 2024 09:48

hlahkar changed the title ~~Expose Llama Fused OPs control from run_lora_clm.py (#23)~~ Expose Llama Fused OPs control from run_lora_clm.py Mar 3, 2024

vivekgoe requested changes Mar 4, 2024

vivekgoe added the run-test (Run CI for PRs from external contributors) and synapse 1.15 labels Mar 4, 2024

Merge branch 'main' into expose_fused_ops

fbfe082

vivekgoe added and removed the run-test (Run CI for PRs from external contributors) label Mar 7, 2024

vivekgoe approved these changes Mar 7, 2024

vivekgoe merged commit 1a0c775 into huggingface:main Mar 7, 2024

This was referenced Mar 14, 2024

Control Fused Rope from Generation Config HabanaAI/optimum-habana-fork#108

Closed

Fix graph breaks in torch compile mode #806

Merged

This was referenced Jun 7, 2024

Expose Llama Fused OPs control from run_lora_clm.py HabanaAI/optimum-habana-fork#23

Merged

Change check to False explicitly for use_fused_rope HabanaAI/optimum-habana-fork#62

Merged

vivekgoe merged 2 commits into huggingface:main from HabanaAI:expose_fused_ops


4 participants
