Fix graph breaks in torch compile mode by scsudhak-intel · Pull Request #52 · HabanaAI/optimum-habana-fork

scsudhak-intel · 2024-02-19T11:29:15Z

The training script used dynamic control flow that depended on whether a module import succeeded, which caused a graph break.

This patch handles that control flow through boolean flags, which avoids the graph breaks.
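
A minimal sketch of the pattern described above, not the actual diff from this PR. It is written without torch so it runs anywhere. The module name `hypothetical_fused_ops` and its `scale` function are made up for illustration. The idea is to resolve the optional import once at module load, record the result in a boolean, and branch on that boolean inside the hot path instead of attempting the import there.

```python
# Resolve the optional dependency once, at import time, outside any
# function that would later be handed to a compiler/tracer.
# NOTE: `hypothetical_fused_ops` is an assumed name, not a real package.
try:
    import hypothetical_fused_ops as fused_ops
    HAS_FUSED_OPS = True
except ImportError:
    fused_ops = None
    HAS_FUSED_OPS = False


def scale_reference(values, factor):
    """Fallback path used when the fused implementation is unavailable."""
    return [v * factor for v in values]


def scale(values, factor, use_fused=HAS_FUSED_OPS):
    # Branch on a plain boolean flag; no import or try/except happens here.
    if use_fused:
        # Hypothetical fused call, only reached when the import succeeded.
        return fused_ops.scale(values, factor)
    return scale_reference(values, factor)


print(HAS_FUSED_OPS, scale([1, 2, 3], 2))
```

The branch inside `scale` depends only on a value fixed before the function is traced, which is presumably what lets torch compile mode follow a single path here rather than breaking the graph.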

* Enable Flash Attention in recompute and causal modes
* Add flash_attention_causal_mask to generation utils
* Propagate Flash Attention causal_mask to finetuning example
* Modify README example and provide additional description
* Add flash_attention_causal_mask to FT README

* enable loading falcon-180b ckpt in .safetensors format
* Address comments borrowing transformer's way of reading ckpt file
* address comments
* Update ckpt loading
  PR#15 reads a set of ckpt file names from the index json file. When OH downloads files from the hub instead of loading from a cache dir, get_repo_root() skips downloading the index json file, so PR#15 fails to load the file names. This PR scans the path and returns a list of names that match the pattern.
* import modeling_utils from transformers

* Further fixes for performance with internal bucketing. Also add clear cache() to save memory. make style changes also added.
  Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>
* Calculate kv cache sliding idx for the decode phase only.
  Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>
* Add hpu graphs check for clear cache.
  Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>

---------

Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>

* Initial commit => add overrides to support bnb on HPU
* Change quantizer file name
* Added HPU specific checks and updates (#52)
* supports hpu nf4 quant/dequant
* added transformer/quantizers to bitsandbytes
* Deleted transformer/quantizers
* Fix BnB inference (#60)
* update inference test (#63)
* Added 4-bit training script (#69)
* adapted test files with framework
* Adjusted test files format

---------

Co-authored-by: Vivek <vgoel@habana.ai>

Vivek Goel and others added 16 commits February 7, 2024 21:56

Expose Llama Fused OPs control from run_lora_clm.py (#23)

e48398d

* Expose Llama Fused OPs control from run_lora_clm.py
* Update as per review comments

enable internal kv bucket in llama (#24)

d5291ae

* enable internal kv bucket in llama
* initialize bucket_internal for CI
* make bucket_internal more clear
* further perf optim while max length is not multiple of bucket size

[SW-173358] add first token prints (#18)

fc91b28

* [SW-173358] add first token prints
* [SW-173358] rename x to outputs
* [SW-173358] make style

Fix inference command clip-roberta (#31)

64013ff

Changing backend name (#32)

64fd45a

enable falcon-180b inference (#15)

87443e3

* enable loading falcon-180b ckpt in .safetensors format
* Address comments borrowing transformer's way of reading ckpt file
* address comments

To fix LLAMA-V2-70B-FT-HF (8x) for eager mode (#35)

19c5e7e

mark_step() should not be called for eager mode

Signed-off-by: Manoj Kumar <mkumar@habana.ai>

Add support for safetensors and sharded checkpoints (#25)

f4e0239

Co-authored-by: Sun Choi <schoi@habana.ai>

Fix tests (huggingface#669) (#41)

e2de09b

Co-authored-by: Sayantan Sarkar <supersarkar@gmail.com>

Adding a flag whether to save checkpoint or not. (#37)

99e5643

* Adding a flag whether to save checkpoint or not
* Add the flag to a model run script

Update llama-7b command to include eval (#43)

8e694b9

[BridgeTower] Fix for NoneType in clip mediapipe (#45)

af2c2c2

* [SW-174850] Fix for Nonetype in image
* Using media external reader API
* minor fixes
* output info fix
* Update media reader function name
* make style

Fix graph breaks in torch compile mode

c774515

Signed-off-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>

scsudhak-intel requested a review from vivekgoe February 19, 2024 11:29

scsudhak-intel added 2 commits February 21, 2024 14:24

Fix graph breaks in torch compile mode

322e335

Signed-off-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>

Fix graph breaks in torch compile mode

5d4dafd

Signed-off-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>

bhargaveede force-pushed the habana-main branch from 8d30377 to 4cf8089 February 22, 2024 07:51

scsudhak-intel closed this Feb 22, 2024

scsudhak-intel deleted the fix-graph-breaks branch February 22, 2024 14:45

scsudhak-intel wants to merge 18 commits into habana-main from fix-graph-breaks
