Enable HQT #3

Closed
bgoldberg-habana wants to merge 14 commits into HabanaAI:main from huggingface:hqt

Conversation

@bgoldberg-habana

Enable the Habana quantization flow for fp8 inference.
Add an option to use the KV cache in lm_eval.
Wrap the KV cache in a module for HQT.
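
To illustrate the last item, here is a minimal sketch of what wrapping a KV cache in a module could look like. It is not the code from this PR: the class name `KVCache`, its `allocate` method, and the `(batch, heads, seq, head_dim)` layout are assumptions made for the example. The likely motivation, also an assumption, is that tools operating at `torch.nn.Module` granularity (for instance through forward hooks) can only observe or replace the cache update if it is a module call.

```python
import torch
from torch import nn


class KVCache(nn.Module):
    """Hypothetical wrapper holding one layer's key (or value) cache."""

    def __init__(self):
        super().__init__()
        self.cache = None  # allocated explicitly via allocate()

    def allocate(self, shape, dtype, device):
        # Pre-allocate a static buffer; assumed layout: (batch, heads, seq, head_dim).
        self.cache = torch.zeros(shape, dtype=dtype, device=device)

    def forward(self, cur, start):
        # Write the new tokens into [start, start + len) along the sequence
        # dimension, then return the filled prefix of the cache.
        end = start + cur.shape[2]
        self.cache[:, :, start:end, :] = cur
        return self.cache[:, :, :end, :]


# Because the update is a module call, a forward hook can observe it.
seen_shapes = []
k_cache = KVCache()
k_cache.register_forward_hook(lambda mod, args, out: seen_shapes.append(tuple(out.shape)))
k_cache.allocate((1, 2, 8, 4), torch.float32, "cpu")
k_cache(torch.ones(1, 2, 3, 4), 0)         # prompt: 3 tokens
k_cache(torch.full((1, 2, 1, 4), 2.0), 3)  # one decoded token
```

After these two calls, `seen_shapes` holds the shape of each returned cache slice, which shows that the hook fired on every update.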

@bgoldberg-habana bgoldberg-habana requested a review from a user January 9, 2024 12:29
astachowiczhabana added a commit that referenced this pull request Nov 15, 2024
xinyu-intel pushed a commit that referenced this pull request Mar 4, 2025
Wei-Lin-Intel added a commit to nc-BobLee/optimum-habana-fork that referenced this pull request Jan 7, 2026
Wei-Lin-Intel added a commit that referenced this pull request Jan 7, 2026
* enable QwenimageLayered on gaudi.

* refine sdpa attention forward.

* use adapt transformers to gaudi.

* add param additional_t_cond

* refine text_encoder generate

* refine example code.

* refine code style

* add 3D rope.

* set Qwen2.5 VL cache static

* Add doc for Qwen-Image-Layered #1

* Add doc for Qwen-Image-Layered #2

* Add doc for Qwen-Image-Layered #3

---------

Co-authored-by: Wei-Lin-Intel <wei2.lin@intel.com>
5 participants