Expose Llama Fused OPs control from run_lora_clm.py by vivekgoe · Pull Request #23 · HabanaAI/optimum-habana-fork

vivekgoe · 2024-02-05T10:56:58Z

What does this PR do?

Exposes FusedRoPE enable/disable for the Llama model from task scripts (currently only run_lora_clm.py). This capability is useful for debugging. The immediate motivation is to use it as a workaround for an issue we see with FusedRoPE in compile mode.
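As a rough illustration of what exposing such a toggle from a task script can look like, here is a minimal sketch using only the standard library. The flag name `--use_fused_rope`, the `LlamaRuntimeConfig` class, and the helper functions are assumptions made for this example; the PR description does not name the actual argument or show how it reaches the model.

```python
import argparse
from dataclasses import dataclass


@dataclass
class LlamaRuntimeConfig:
    """Hypothetical stand-in for the model-side settings a task script would set."""

    use_fused_rope: bool = True  # assumed default: fused kernel enabled


def str_to_bool(value: str) -> bool:
    """Parse common textual booleans so the flag can be passed as True/False."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a task-script toggle")
    parser.add_argument(
        "--use_fused_rope",  # hypothetical flag name
        type=str_to_bool,
        default=True,
        help="Enable or disable the fused RoPE kernel (useful when debugging).",
    )
    return parser


def configure(argv: list[str]) -> LlamaRuntimeConfig:
    """Translate command-line arguments into the config handed to the model."""
    args = build_parser().parse_args(argv)
    return LlamaRuntimeConfig(use_fused_rope=args.use_fused_rope)


if __name__ == "__main__":
    # Disable the fused op, e.g. as a workaround while running in compile mode.
    print(configure(["--use_fused_rope", "False"]))
```

Keeping the default at enabled means existing invocations behave as before, and the fused op is only switched off when the user asks for it explicitly.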

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

vivekgoe · 2024-02-07T06:57:37Z

@dvarshney-habana please review; we have already reviewed it with Puneesh.

Expose Llama Fused OPs control from run_lora_clm.py (#23)

* Expose Llama Fused OPs control from run_lora_clm.py
* Update as per review comments

Co-authored-by: Vivek Goel <vgoel@habana.ai>

astachowiczhabana · 2024-06-07T14:18:55Z

huggingface#751

* Fix clip test
* Skip falcon tests
* Fix clip test
* [SW-209062] Disable default sdpa in Albert (#23)
  Transformers' default sdpa implementation caused a performance drop in Albert. Adding Albert to the list of models which don't yet have an sdpa implementation in Gaudi and use eager attention.
* [SW-209210] skip first token in EOS check. (#25) (#27)
  * Problem: the output of the _sample function was filled with padding tokens for the Bart model.
  * Cause: the Bart model uses the same token as decoder_start_token_id and end of string. See: https://huggingface.co/facebook/bart-large-cnn/blob/main/config.json Because of that, the mechanism that fills the model output with padding tokens after the EOS (end of string) token was replacing the whole response with padding.
  * Solution: skip the EOS check for the first token in the padding-filling loop.
* Update CODEOWNERS
* Adding labels clone as workaround to avoid crash (#28)
* [SW-0] Fix style

---------

Co-authored-by: Urszula Golowicz <urszula.golowicz@intel.com>
Co-authored-by: Marcin Łapiński <mlapinskix@habana.ai>
Co-authored-by: Bhargav <beede@habana.ai>
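The EOS fix described in the commit message above can be pictured with a small, framework-free sketch. It is not the code from that change: the function name `pad_after_eos` and the token ids are illustrative assumptions, chosen only so that the first token equals the EOS id, as the message describes for Bart.

```python
def pad_after_eos(tokens, eos_id, pad_id, skip_first=True):
    """Return a copy of `tokens` with every position after the first EOS padded.

    When `skip_first` is True, position 0 is ignored during the EOS search,
    so a decoder start token that shares the EOS id does not trigger padding.
    """
    out = list(tokens)
    start = 1 if skip_first else 0
    for i in range(start, len(out)):
        if out[i] == eos_id:
            # Overwrite everything after the EOS token with padding.
            for j in range(i + 1, len(out)):
                out[j] = pad_id
            break
    return out


# Illustrative ids only: the sequence starts with a token equal to EOS.
sequence = [2, 0, 8, 9, 2, 7]

print(pad_after_eos(sequence, eos_id=2, pad_id=1, skip_first=False))
# → [2, 1, 1, 1, 1, 1]  (whole response replaced by padding)

print(pad_after_eos(sequence, eos_id=2, pad_id=1, skip_first=True))
# → [2, 0, 8, 9, 2, 1]  (only the text after the real EOS is padded)
```

The first call reproduces the reported problem, where the start token is mistaken for the end of the string; the second shows the effect of skipping the first position.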

Vivek added 2 commits February 5, 2024 12:54

Expose Llama Fused OPs control from run_lora_clm.py

e0ba75d

Update as per review comments

a7e3b5d

vivekgoe marked this pull request as ready for review February 7, 2024 06:53

vivekgoe requested review from bhargaveede, libinta, mandy-li and ssarkar2 as code owners February 7, 2024 06:53

vivekgoe requested a review from a user February 7, 2024 06:53

ghost approved these changes Feb 7, 2024

ghost merged commit e48398d into habana-main Feb 7, 2024

bhargaveede pushed a commit that referenced this pull request Feb 19, 2024

Expose Llama Fused OPs control from run_lora_clm.py (#23)

30cde4b

bhargaveede pushed a commit that referenced this pull request Feb 19, 2024

Expose Llama Fused OPs control from run_lora_clm.py (#23)

0e56c6b

dudilester pushed a commit that referenced this pull request Feb 29, 2024

Expose Llama Fused OPs control from run_lora_clm.py (#23)

a0a1658

vivekgoe deleted the fusedllama_fused_ops branch March 2, 2024 06:20

hlahkar pushed a commit that referenced this pull request Mar 3, 2024

Expose Llama Fused OPs control from run_lora_clm.py (#23)

aa1a1f8

vivekgoe added the ported_to_hf_oh PR has been ported to huggingface/optimum-habana label Mar 4, 2024

kalyanjk pushed a commit to kalyanjk/optimum-habana-fork that referenced this pull request Apr 12, 2024

Expose Llama Fused OPs control from run_lora_clm.py (HabanaAI#23)

1d87e48

kalyanjk pushed a commit to kalyanjk/optimum-habana-fork that referenced this pull request Apr 15, 2024

Expose Llama Fused OPs control from run_lora_clm.py (HabanaAI#23)

65117ae

This pull request was closed.
