enable HQT by bgoldberg-habana · Pull Request #3 · HabanaAI/optimum-habana-fork

bgoldberg-habana · 2024-01-09T12:29:37Z

Enable the Habana quantization flow for fp8 inference.
Add an option to use the KV cache in lm_eval.
Wrap the KV cache in a module for HQT.
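
A minimal sketch of what wrapping the KV cache in a module could look like. It assumes the point of the wrapping is that module-level tooling (forward hooks, module replacement) can only observe the cache update when it goes through an `nn.Module`. The class name `KVCache`, its `allocate` method, and the tensor layout are hypothetical and not taken from this PR. The hook stands in for whatever a quantization tool would attach.

```python
import torch
from torch import nn


class KVCache(nn.Module):
    """Hypothetical wrapper: the cache update is a module call, not a bare tensor op."""

    def __init__(self):
        super().__init__()
        self.cache = None

    def allocate(self, shape, dtype=torch.float32, device="cpu"):
        # Assumed layout: (batch, heads, max_tokens, head_dim)
        self.cache = torch.zeros(shape, dtype=dtype, device=device)

    def forward(self, new, start):
        # Write the new keys/values at position `start` and return the filled prefix.
        end = start + new.shape[2]
        self.cache[:, :, start:end, :] = new
        return self.cache[:, :, :end, :]


# Because the update is a module call, a forward hook can observe it.
calls = []
k_cache = KVCache()
k_cache.allocate((1, 2, 8, 4))
handle = k_cache.register_forward_hook(
    lambda module, args, output: calls.append(tuple(output.shape))
)

out1 = k_cache(torch.ones(1, 2, 3, 4), 0)            # prefill: 3 tokens
out2 = k_cache(torch.full((1, 2, 1, 4), 2.0), 3)     # decode: 1 more token
handle.remove()
```

After the two calls, `calls` holds the shape of each returned cache slice. The same indexing done directly on a tensor inside the attention code would not be visible to a hook.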

* add DPO and SFT of TRL support in Gaudi and example
* upgrade SFTTrainer/DPO trainer and stack_llama_2 example to v0.7.6

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* enable QwenimageLayered on gaudi.
* refine sdpa attention forward.
* use adapt transformers to gaudi.
* add param additional_t_cond
* refine text_encoder generate
* refine example code.
* refine code style
* add 3D rope.
* set Qwen2.5 VL cache static
* Add doc for Qwen-Image-Layered #1
* Add doc for Qwen-Image-Layered #2
* Add doc for Qwen-Image-Layered #3

Co-authored-by: Wei-Lin-Intel <wei2.lin@intel.com>

regisss and others added 14 commits December 21, 2023 12:23

Add Textual Inversion fine-tuning script (#243)

223a63c

Fix for Falcon error from PR #587 (#608)

cba7f00

* Fix for Falcon error from PR #587
* Reformatted

Add inheritance in Diffusers pipelines (#611)

5f772ac

Falcon graph compilation error fix for when bs>1 (#607)

cba23b3

Temporary fix for Diffusers CI (#618)

a5d7dec

Fix crash if gaudi_config is not passed to GaudiTrainer (#613)

0152688

Update generation config to enable flash attention for inference (#609)

6a1521b

Avoid falcon perf drop from PR#607 when BS=1 (#620)

b0421c1

Adding support for bf16_full_eval (#610)

dd02a7b

Enable fused rmsnorm in bf16 for llama (#621)

b72d8ea

Text-Generation Pipeline Example (#526)

e419599

Update CI diff file (#624)

8fb43a4

enable HQT

bf9ab7d

Change-Id: I5f952e2f8d2f9db6d6be41d4069d8f5a4e21dfa9

bgoldberg-habana requested a review from a user January 9, 2024 12:29

bgoldberg-habana closed this Jan 9, 2024

astachowiczhabana added a commit that referenced this pull request Nov 15, 2024

[SW-205356] Rebase to OH v1.14 (#3)

6bd9193

xinyu-intel pushed a commit that referenced this pull request Mar 4, 2025

[SW-205356] Rebase to OH v1.14 (#3)

81e1cb0

Wei-Lin-Intel added a commit to nc-BobLee/optimum-habana-fork that referenced this pull request Jan 7, 2026

Add doc for Qwen-Image-Layered HabanaAI#3

e7b826d

bgoldberg-habana wants to merge 14 commits into HabanaAI:main from huggingface:hqt

5 participants