Backport GPTBigCodeAttention from v4.53.0 to avoid major refactoring.#2271

Merged
regisss merged 2 commits into huggingface:main from AKloniecki:gpt-bigcode-attention on Sep 19, 2025

@AKloniecki
Collaborator

What does this PR do?

Between transformers v4.53 and v4.54, GPTBigCodeAttention underwent a major refactor, which made our GaudiGPTBigCodeAttention implementation incompatible.
This PR backports the _get_mask_value() function from v4.53 and overrides the attn_dropout attribute so that it is an operator again: v4.54+ stores it as a scalar and uses it differently in the new forward() implementation.
This re-enables text generation with bigcode models, e.g.:
cd examples/text-generation
pip install -r requirements.txt
export PT_HPU_ENABLE_REFINE_DYNAMIC_SHAPES=1
PT_HPU_LAZY_MODE=1 python3 ../gaudi_spawn.py --world_size 1 run_generation.py --batch_size 1 --bf16 --model_name_or_path bigcode/starcoder --use_hpu_graphs --use_kv_cache --n_iterations 1 --max_new_tokens 100
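
For readers who have not opened the diff, the two fixes described in this PR can be sketched roughly as below. This is a minimal illustration and not the merged code: `UpstreamAttentionStub` and `BackportedAttention` are hypothetical names, the stub only mimics the one detail stated above (v4.54+ keeps `attn_dropout` as a scalar), and the body of `_get_mask_value()` is an assumption about what a cached mask-value helper of this kind looks like.

```python
import torch
from torch import nn


class UpstreamAttentionStub(nn.Module):
    """Hypothetical stand-in for the refactored (v4.54+) attention class."""

    def __init__(self, attn_pdrop: float = 0.1):
        super().__init__()
        # Per the PR description, v4.54+ stores the dropout probability as a scalar.
        self.attn_dropout = attn_pdrop


class BackportedAttention(UpstreamAttentionStub):
    """Hypothetical subclass restoring the two v4.53-style pieces."""

    def __init__(self, attn_pdrop: float = 0.1):
        super().__init__(attn_pdrop)
        # Override the scalar so attn_dropout is a callable operator again.
        self.attn_dropout = nn.Dropout(attn_pdrop)
        # Cache for the mask value; filled lazily on first use.
        self.mask_value = None

    def _get_mask_value(self, device, dtype):
        # Assumed behaviour: return a cached 0-d tensor holding the most
        # negative finite value for `dtype`, rebuilt if dtype or device changes.
        if (
            self.mask_value is None
            or self.mask_value.dtype != dtype
            or self.mask_value.device != device
        ):
            self.mask_value = torch.full(
                [], torch.finfo(dtype).min, dtype=dtype, device=device
            )
        return self.mask_value


attn = BackportedAttention(attn_pdrop=0.1).eval()
scores = torch.zeros(1, 2, 2)
keep = torch.tensor([[[True, False], [True, True]]])

# Masked positions receive the large negative value before softmax.
mask_value = attn._get_mask_value(scores.device, scores.dtype)
masked_scores = torch.where(keep, scores, mask_value)

# attn_dropout can be called like any other module (identity in eval mode).
probs = attn.attn_dropout(torch.softmax(masked_scores, dim=-1))
```

Because the override lives in the subclass, a legacy forward() that calls `self.attn_dropout(...)` and `self._get_mask_value(...)` keeps working without changes to the upstream class.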

Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
@AKloniecki AKloniecki self-assigned this Sep 19, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread optimum/habana/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py Outdated
Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
Collaborator

@regisss regisss left a comment

LGTM

@regisss regisss merged commit 153b1f8 into huggingface:main Sep 19, 2025
2 of 5 checks passed
@AKloniecki AKloniecki deleted the gpt-bigcode-attention branch September 22, 2025 07:37
astachowiczhabana pushed a commit that referenced this pull request Sep 22, 2025
…#2271)

Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
gplutop7 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Oct 15, 2025
…huggingface#2271) (huggingface#706)

Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
Co-authored-by: Artur KlonieckiX <arturx.kloniecki@intel.com>
gplutop7 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Nov 6, 2025
…huggingface#2271) (huggingface#706)

Signed-off-by: Artur Kloniecki <arturx.kloniecki@intel.com>
Co-authored-by: Artur KlonieckiX <arturx.kloniecki@intel.com>
4 participants