Expose Llama Fused OPs control from run_lora_clm.py #751
* Expose Llama Fused OPs control from run_lora_clm.py * Update as per review comments
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
|
@vivekgoe Why was this PR merged? I'm not convinced this is a good way to manage FusedRoPE; IMO it should be enabled or disabled automatically in the modeling file of each model. |
Expose Llama Fused OPs control from run_lora_clm.py (HabanaAI#23) * Expose Llama Fused OPs control from run_lora_clm.py * Update as per review comments Co-authored-by: Vivek Goel <vgoel@habana.ai>
Expose Llama Fused OPs control from run_lora_clm.py (#23) * Expose Llama Fused OPs control from run_lora_clm.py * Update as per review comments Co-authored-by: Vivek Goel <vgoel@habana.ai>
|
@regisss We need a way to disable FusedRoPE for corner cases where it causes a problem with a new feature. Ideally we would fix the problem in the FusedRoPE operator itself, but there are cases where we would like to release a feature with a workaround (disabling FusedRoPE) without waiting for the fix. The current motivation is a failure we see with FusedRoPE during the Llama-70B-FSDP evaluation phase. If you see a better way to disable FusedRoPE for a particular feature or test case, please let us know. We will revert this change and push a new one as per your recommendation. |
|
@vivekgoe Thanks for the explanation. Since this kind of workaround is meant to be temporary, I would directly access |
|
@regisss Thanks for feedback. Let me explore your suggestion regarding |
|
The example of GPT-NeoX is just to illustrate that we can modify

```python
if self.generation_config.use_fused_rope == False:
    FusedRoPE = None
```

That way it will be disabled only if |
|
@regisss Thanks for explaining. We will try to make the required changes and get them in before the next major release.
Provide a variable to enable or disable FusedRoPE. This may be useful when integrating new models or configurations where FusedRoPE may not work out of the box.
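
For readers following the thread, here is a minimal sketch of the flag-gated pattern discussed above: use the fused kernel when it is available and enabled, otherwise fall back to an unfused rotary embedding. It is not the code merged in this PR. `GenerationConfig`, `reference_rope`, and `select_rope` are hypothetical names introduced for illustration, and only `use_fused_rope` comes from the conversation. The fallback uses the common rotate-half RoPE convention in plain Python so that the sketch runs without any accelerator library.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]


@dataclass
class GenerationConfig:
    # Hypothetical stand-in for the real generation config; only this flag matters here.
    use_fused_rope: bool = True


def reference_rope(x: Vector, position: int, base: float = 10000.0) -> Vector:
    """Unfused rotary embedding for one head vector (rotate-half convention)."""
    dim = len(x)
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        # Each pair (i, i + half) is rotated by its own position-dependent angle.
        angle = position * base ** (-2.0 * i / dim)
        cos, sin = math.cos(angle), math.sin(angle)
        out[i] = x[i] * cos - x[i + half] * sin
        out[i + half] = x[i + half] * cos + x[i] * sin
    return out


def select_rope(
    config: GenerationConfig,
    fused_kernel: Optional[Callable[[Vector, int], Vector]],
) -> Callable[[Vector, int], Vector]:
    """Return the fused kernel only if it exists and the flag allows it."""
    if fused_kernel is not None and config.use_fused_rope:
        return fused_kernel
    return reference_rope


def fake_fused_kernel(x: Vector, position: int) -> Vector:
    # Placeholder for a hardware-specific fused op; assumed to match the reference.
    return reference_rope(x, position)


if __name__ == "__main__":
    rope = select_rope(GenerationConfig(use_fused_rope=False), fake_fused_kernel)
    print(rope is reference_rope)  # the flag forces the unfused path
```

Keeping the choice in one helper means a workaround only has to flip `use_fused_rope`, and the modeling code never needs to know which implementation it received.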